# Upgrade 11.2 ZFS: i/o error - all block copies unavailable



## mxc (Sep 6, 2018)

During an upgrade from 11.1 to 11.2, upon first reboot, I get the following error:


```
ZFS: i/o error - all block copies unavailable

/boot/kernel/kernel text=0x1547d28 ZFS: i/o error - all block copies unavailable

elf64_loadimage: read failed
can't  load 'kernel'
```

I have tried the following suggestions found online and none of them worked.

1 - Copy over the boot loader code again after booting off live cd option with the memstick image
`gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0`
`gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1`

2 - Try copying the boot directory over again for "self-healing" after booting off the live cd option with the memstick image

`zpool import -f -R /mnt zroot`
`mv /mnt/boot /mnt/boot.orig`
`mkdir /mnt/boot`
`cd /mnt/boot.orig && cp -R * /mnt/boot`

From this post (https://forums.freebsd.org/threads/...m-zroot-after-applying-p25.54422/#post-308876) it appears that if you have a zpool striped across vdevs whose devices have different partition structures, and that pool is used for ZFS root, you are basically screwed?

I don't have enough spare capacity to copy the data off, rebuild the pool, and restore. So my questions are (please note I am not a FreeBSD nor ZFS expert, so apologies if my questions are conceptually confused):

1) Is this really the only way?
2) Could I remove the 2nd vdev that has the raw disk devices and try to force all the data to be written to the first vdev? Then I could re-partition the 2 raw disks and add them back as a similarly formatted vdev. I have read from several low-quality sources that it's not possible to remove a vdev. Can I remove all the disks in a vdev instead and then add them back to the vdev? If space is an issue and not all data can be copied over to the first vdev, which is what I suspect, can I remove the raw devices from the 2nd mirror vdev one at a time, re-partition them, and add them back in one at a time? Would any of these approaches actually solve the problem?

To clarify:

My zpool has two mirror vdevs. The first has two disks which have been partitioned with boot partitions, swap partitions and then a partition dedicated to ZFS vdev1. The 2nd vdev is also mirrored but the 2 entire disks have been added to the vdev.
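For reference, a pool laid out like that would originally have been created with something along these lines (device names are illustrative guesses, not taken from the thread):

```shell
# First mirror vdev: the freebsd-zfs partitions (p3) on two disks that
# also carry freebsd-boot (p1) and freebsd-swap (p2) partitions
zpool create zroot mirror ada0p3 ada1p3

# Second mirror vdev: two whole raw disks, no partition table at all
zpool add zroot mirror ada2 ada3
```

The mix matters: the boot partitions and bootcode only exist on the first pair of disks, while the second vdev's members have no partition table at all.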


----------



## SirDice (Sep 6, 2018)

Don't remove anything, don't install anything. Not yet at least. Not until we figure out what you really need to do. If you do the wrong thing your data will be toast.

Please boot from a rescue disk (the install media will do) and post the output from `gpart show` so we have a better understanding of what we're dealing with.


----------



## ShelLuser (Sep 6, 2018)

Also, while you're booting from said rescue disk try dropping down to the boot console (press escape) and run `lsdev`. Does it detect your disks (and your ZFS pool(s)) at all?

Don't do anything else, just run `boot` afterwards to continue booting and follow up on SirDice's question.

(edit) PS: are you running a regular version or did you customize stuff (like building your own kernel and/or base system)?


----------



## mxc (Sep 6, 2018)

Here is the output from lsdev, run not from the livecd but from the "rescue" console I get dumped to.

[screenshot attachment: lsdev output]
I can run `zpool import -R /mnt zroot` at the rescue prompt and all looks good.



> zpool status
> pool: zroot
> state: ONLINE
> scan: none requested
> ...



The output of `gpart show` from the livecd is:



> root@:~ # gpart show
> =>        34  3907029101  ada0  GPT  (1.8T)
> 34           6        - free -  (3.0K)
> 40        1024     1  freebsd-boot  (512K)
> ...



I am running stock-standard FreeBSD. I don't know enough to try anything too fancy yet.

Thanks for the help.


----------



## mxc (Sep 8, 2018)

Attempting to remove mirror-1 with
`zpool remove zroot mirror-1`

results in



> cannot remove mirror-1: root pool can not have removed devices, because GRUB does not understand them.



It's not looking good.


----------



## ShelLuser (Sep 8, 2018)

mxc said:


> attempting to remove mirror-1 with
> `zpool remove zroot mirror-1`


Why would you want to do that?

I'm also a little surprised about the error message, because I was pretty sure that zpool wouldn't know anything about Grub (Grub isn't part of the FreeBSD base system after all), but I checked and traced the error message back to /usr/src/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c. You learn something new every day.

Anyway... this was about booting. I'd leave the pool alone for now, until you can fully boot again.

The command you used in the OP (`gpart bootcode`) was the right one. However, you probably ran it from the rescue media, I assume? Which means you'd have written the rescue media's bootcode, and that code could be different from the one on your actual pool.

Therefore my suggestion would be to boot using a rescue system, mount your pool, and then use the bootcode as it is installed on your system, e.g. /mnt/boot/gptzfsboot. That would ensure that the versions never mismatch.
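Putting that together, the sequence would look roughly like this (pool name and adaX numbers taken from earlier in the thread; check them against `gpart show` first):

```shell
# From the rescue/live environment: import the pool under /mnt
zpool import -f -R /mnt zroot

# Write the bootcode that ships with the *installed* system,
# not the copy on the rescue media
gpart bootcode -b /mnt/boot/pmbr -p /mnt/boot/gptzfsboot -i 1 ada0
gpart bootcode -b /mnt/boot/pmbr -p /mnt/boot/gptzfsboot -i 1 ada1

# Export cleanly before rebooting
zpool export zroot
```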

(edit)

Almost forgot: the screenshot you shared tells us that the bootloader does recognize your system, which is good. Are you by any chance using a custom kernel, and if so, are you sure you built all the modules needed for proper ZFS support?

Worst case scenario would be to boot your system using the kernel from the live CD. After that you'd have full access again and can restore whatever it is that's broken. Here's how you could do that:

Boot from the cd and drop down to the boot prompt as before (when you used `lsdev`). Then use the following commands:
```
unload
load /boot/kernel/kernel
load /boot/kernel/opensolaris.ko
load /boot/kernel/zfs.ko
set currdev="disk0p3"
set vfs.root.mountfrom="zfs:zroot"
```
Careful with those set commands: do not add any extra spaces, just type them exactly as shown. After all this you can boot your system using either `boot` (boot normally) or `boot -s` if you want to be careful and boot into single user mode. Keep in mind that in single user mode you'd probably need to manually mount the rest of your file systems other than root.

This should allow you to at least boot your system and access your stuff. And this would also be the perfect environment to try and fix your boot code; if you run the `gpart bootcode` command here you'd be sure that it would use the right bootcode.

Hope this can help!


----------



## mxc (Sep 9, 2018)

Thanks ShelLuser - your post has helped improve my understanding of the FreeBSD boot process.

I had already proceeded on a different course of action to recover, and had to wait for the resilver to finish before confirming whether it worked.
*tl;dr*
I have a bootable system!
*Longer version*

Here is what I did from the live-cd environment:

1) zpool offline zroot ada3
2) gpart create -s GPT ada3 (could probably just have done step 3 instead)
3) gpart add -t freebsd-boot ... gpart add -t freebsd-swap ... gpart add -t freebsd-zfs (to mimic the existing partitioning of the disks in the first mirror vdev. I don't think this was completely necessary; just a GPT partition table with 1 partition may have done the trick.)
4) zpool detach zroot ada3
5) zpool attach zroot ada2 ada3 -> wait for rebuild to finish
6) repeat steps 1 to 5 for ada2
7) cross fingers and reboot 
8) crack open a can of the best!
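Spelled out for one disk, steps 1 to 5 look something like this (partition sizes are placeholders; match them to a surviving disk with `gpart show ada0`. Note that the attach should reference the new freebsd-zfs partition, e.g. ada3p3, rather than the bare disk, otherwise the fresh partition table gets overwritten):

```shell
# Take the raw disk out of service and give it a partition table
zpool offline zroot ada3
gpart create -s GPT ada3
gpart add -t freebsd-boot -s 512k ada3   # sizes are illustrative
gpart add -t freebsd-swap -s 4g ada3
gpart add -t freebsd-zfs ada3            # rest of the disk

# Detach the old raw-disk member, re-attach via the new ZFS partition
zpool detach zroot ada3
zpool attach zroot ada2 ada3p3

# Watch the resilver; repeat for ada2 once this reports "resilvered"
zpool status zroot
```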

I am not sure how GRUB plays a role in the boot process. I am pretty sure I just used the BSD boot loader? But it was a while ago and I was still learning, so maybe I followed a guide that used GRUB at some point.

What I do know about GRUB now is that it uses a block list to find core.img, its later-stage bootcode image, and it looks like it needs a partition table to do that. I am not sure why GRUB even features here though. Maybe it's a vestigial piece of code.


Hope this helps someone.


----------

