# "ZFS : i/o error - all block copies unavailable" - "fg not found"



## mxc (Dec 3, 2018)

Hi all,

After a system update with freebsd-update fetch/install and then a pkg update, all done via an ansible script, I can no longer boot into my system. I get the generic error message:


> ZFS: i/o error - all block copies unavailable
> fg not found
> error while including /boot/loader.4th in the line :
> if 7 fg 4 bg then



I loaded the old kernel located at /boot/kernel.old/kernel along with opensolaris.ko and zfs.ko, booted into the system, and all seemed good. I did a zpool upgrade and reinstalled the bootloader to see if that would sort out the problem, but no luck: I cannot boot with the new kernel. According to freebsd-version I am on 11.2-RELEASE-p5.

Any ideas?


----------



## ShelLuser (Dec 3, 2018)

When you say you loaded the old kernel did you actually drop down to the boot prompt (ok>) and set things up yourself, or did you simply use the boot option to switch to the previous kernel?

Alas, for all I know your method failed (I'm not much of an Ansible fan) so did you try to perform the steps manually as well to rule out issues?

(edit): what does `freebsd-version -urk` show you?


----------



## mxc (Dec 3, 2018)

Yes, I dropped down to the boot prompt and went "unload; load /boot/kernel.old/kernel" etc. The results of `freebsd-version -urk` are:



> 11.2-RELEASE-P5
> 11.2-RELEASE-P2
> 11.2-RELEASE-P5



thanks for the reply.
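
A note on reading those three lines: according to the freebsd-version man page, the output order is always installed kernel, running kernel, userland, regardless of the order of the flags. Read that way, the installed kernel and userland are at p5 while the running kernel (the one from kernel.old) is at p2. A minimal sketch of that comparison follows; `kernel_is_current` is a hypothetical helper name, not part of FreeBSD.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the running kernel matches the
# kernel installed on disk. Arguments are the first two lines printed by
# `freebsd-version -kr` (installed kernel, then running kernel).
kernel_is_current() {
    installed=$1
    running=$2
    if [ "$installed" = "$running" ]; then
        echo "running the installed kernel ($installed)"
    else
        echo "installed kernel is $installed but $running is running"
        return 1
    fi
}
```

On a FreeBSD box it could be called as `kernel_is_current $(freebsd-version -kr)`. The userland line is deliberately left out of the comparison, since a patch level that does not touch the kernel can leave the two legitimately different.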


----------



## ShelLuser (Dec 4, 2018)

I don't have a definitive answer yet but I strongly suspect that this is the result of a bug. /usr/src/UPDATING tells me that they did indeed change things in the loader:


```
20181127        p5      FreeBSD-SA-18:13.nfs
                        FreeBSD-EN-18:13.icmp
                        FreeBSD-EN-18:14.tzdata
                        FreeBSD-EN-18:15.loader
```
`svn log` (I use the source tree to maintain my system) then tells me:

```
r341093 | gordon | 2018-11-27 20:45:25 +0100 (Tue, 27 Nov 2018) | 6 lines

Fix deferred kernel loading breaks loader password. [EN-18:15.loader]
```
But if I then look at /usr/src/stand/forth/loader.4th I see that it contains the exact same line which is included in your error output above and nothing changed in there:

```
peter@zefiris:/usr/src/stand/forth $ freebsd-version -u
11.2-RELEASE-p4
peter@zefiris:/usr/src/stand/forth $ diff loader.4th /boot/loader.4th
peter@zefiris:/usr/src/stand/forth $ grep -n "7 fg" loader.4th
54:  if 7 fg 4 bg then
```
So, what you're seeing here is that I'm still at p4 where my userland is concerned, yet there are no differences between my p5 source tree and the installed version of loader.4th, even though it triggered an error on your end. Logical conclusion: if the config file didn't change and still triggered an error, then the cause has to be located elsewhere in the system.

What happens if you try to load & boot the default kernel manually?

If you don't mind trying something: reboot the system, drop down to the boot console again (ok>) then manually boot the default kernel. So follow all the options you used above (unload, load, etc.) and this time use /boot/kernel followed by `boot -s` (single user mode, just to be safe).

Does that result in the same problems?

Thanks in advance. Don't bother if this would be a hassle for you (like on a live system), because the main reason I'm asking is to try & determine the cause here. Still, if my theory is correct then the result would be that you'd be running a full p5 setup.


----------



## mxc (Dec 4, 2018)

When I try `load "/boot/kernel/kernel"` I get the message


> "ZFS: i/o error - all block copies unavailable"


and then I have to wait about 30 seconds before getting the command prompt back with the message:


> don't know how to load module '/boot/kernel/kernel'


Attempting to `load /boot/kernel/opensolaris.ko` results in


> can't load module file /boot/kernel/opensolaris.ko 'operation not permited'


and `load /boot/kernel/zfs.ko` appears to succeed. Trying to load the kernel again, after zfs.ko, results in the same error as reported at the top of this post.


----------



## mxc (Dec 6, 2018)

Sorry to bug. Anyone got a suggestion as to what the issue could be and how to fix it? Will waiting for 12.0 to come out and upgrading to that perhaps sort out the problem? Or should I resolve this first before upgrading?


----------



## VladiBG (Dec 6, 2018)

which bootcode did you load after the upgrade?


----------



## mxc (Dec 6, 2018)

Hi VladiBG,

I couldn't boot up after upgrade so had to load the previous kernel, 11.2-RELEASE-P2, to get to the login prompt. After that I ran the standard commands to load the bootcode.

`gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0`

I also wrote to the 2nd disk in the zfsroot mirror, but it only has a /dev/diskid/DISK-..... device id.
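
For reference, the same bootcode command can be repeated for every disk in the mirror with a small loop. This is only a sketch: `write_bootcode` is a hypothetical helper, `ada1` in the usage note stands in for whatever the second disk is actually called, and it assumes the freebsd-boot partition has index 1 on each disk, as in the command above. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: write the GPT ZFS bootcode to every disk given as
# an argument. Assumes partition index 1 is the freebsd-boot partition on
# each disk. DRY_RUN=1 (the default) prints the commands instead of
# running them; set DRY_RUN=0 to execute.
write_bootcode() {
    for disk in "$@"; do
        bootcode_cmd="gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$bootcode_cmd"
        else
            $bootcode_cmd
        fi
    done
}
```

Calling `write_bootcode ada0 ada1` prints one `gpart bootcode` line per disk, which makes it easy to confirm the device names before running with `DRY_RUN=0`.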


----------



## VladiBG (Dec 6, 2018)

what do you have in 
/boot/loader.conf
/boot/loader.conf.local


----------



## mxc (Dec 8, 2018)

/boot/loader.conf


> kern.geom.label.gptid.enable="0"
> zfs_load="YES"



/boot/loader.conf.local


> No such file or directory



I have also noticed that if I run `lsdev` it lists ada0 but hangs on the next disk and finally returns with the error



> 0panic free guard1 fail ...



I don't know what that means. I have booted off the memstick and reimported the zpool with

`zpool import -R /mnt -f zroot`
and then reapplied the boot code to ada0 and to diskid/DISK-XXXXX, but there was no change in status after rebooting.

EDIT: OK, rebooted into the boot prompt with the USB stick removed and lsdev now works.


----------



## mxc (Dec 8, 2018)

I tried to upgrade to 11.2-p6 but still have the same issue and now kernel.old is gone so can only boot off usb stick into single user mode for repair. The old kernel is no longer available.


----------



## VladiBG (Dec 8, 2018)

Can you take the FreeBSD-12.0-RC3 memstick and check if you can boot?
Also, when was the last time you updated your motherboard BIOS?


----------



## mxc (Dec 8, 2018)

Will give upgrading the bios a try. I have upgraded a 2nd machine from 11.2-p2 to 11.2-p5 and the results are the same.  The hardware is the same for both machines.


----------



## VladiBG (Dec 8, 2018)

what is the output of `gpart show`?


----------



## ShelLuser (Dec 9, 2018)

I looked up the 'guard1 fail' error and it's not good news: the 0 should be an indication of which disk failed, but that didn't happen. Source of this info:

http://freebsd.1045724.x6.nabble.co...hes-on-quot-lsdev-quot-command-td6288992.html

... which could indeed point to an issue with the partition table.


----------



## mxc (Dec 9, 2018)

Yeah, the "guard 1 fail" is a red herring, I think. It was something to do with the USB memstick I booted off. I am trying to upgrade the BIOS. They are HPE MicroServers Gen8, but this requires a bit of yak shaving because the BIOS update utility requires Windows and it works neither in a Windows 10 VM nor with Wine.

I have tried the 12.0-RC3 memstick but not sure how to "chroot" to my zroot dataset. Below is a pic of the hard disk layout.

The output of `gpart show` is attached.
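
On the chroot question, one possible sequence from the memstick shell is sketched below. It rests on assumptions that may not hold here: the pool is named zroot (as in the import command earlier in the thread) and the root filesystem lives in the installer's default dataset zroot/ROOT/default. `run` and `chroot_into_pool` are hypothetical helper names, and by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: import a root pool from the memstick shell and chroot into it.
# ASSUMPTIONS: the pool is called zroot and the root filesystem is the
# installer's default dataset zroot/ROOT/default; pass other names as
# arguments if the layout differs.
# DRY_RUN=1 (the default) prints the commands instead of running them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

chroot_into_pool() {
    pool=${1:-zroot}
    root_ds=${2:-$pool/ROOT/default}
    # Import under /mnt without mounting anything yet.
    run zpool import -f -N -R /mnt "$pool" || return 1
    # Mount the root dataset first so it ends up on /mnt itself.
    run zfs mount "$root_ds" || return 1
    # Then mount the remaining datasets below it.
    run zfs mount -a || return 1
    # Device nodes are needed inside the chroot for tools such as gpart.
    run mount -t devfs devfs /mnt/dev || return 1
    run chroot /mnt /bin/sh
}
```

Running `chroot_into_pool` with no arguments prints the five commands for the assumed layout; `DRY_RUN=0 chroot_into_pool` would execute them.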


----------



## VladiBG (Dec 9, 2018)

The BIOS update won't help in this case. You are not using UEFI because Gen8 doesn't support it. The only thing left is to take two free disks, install a fresh copy of 11.2 Release again with ZFS and then upgrade it with `freebsd-update` to see if the problem persists, or make a full backup of your data and then try to reinstall on those disks.


----------



## mxc (Dec 9, 2018)

Sad panda. I can boot the one machine into 11.2-p2 with `boot kernel.old`. The 2nd machine lost its kernel.old when I tried to upgrade to 11.2-p6 after booting into 11.2-p2. I tried copying over the working kernel.old but the boot loader comes back saying "can't find /boot/kernel.old".

I copied the directory over with `cp -a /boot/kernel.old /mnt/mnt` where /mnt/mnt is the mount point for a ufs formatted usb device. Then I mounted it via the memstick boot into the 2nd server and copied the files over. Should this work or is there some other way to copy over a working kernel?


----------



## tingo (Dec 10, 2018)

Your way of copying kernel.old should work. Please note that kernel.old is a directory which holds the kernel modules in addition to the kernel.
If you want to boot the old kernel from a bootloader prompt you would do `boot /boot/kernel.old/kernel`
HTH
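
One thing that may be worth ruling out after such a copy is the directory layout: the loader needs the kernel binary at /boot/kernel.old/kernel, and copying a directory onto a target that already exists nests it one level too deep (kernel.old/kernel.old). Whether that is what happened here is only a guess. A minimal sketch of the check, with `check_kernel_dir` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a copied kernel directory.
# The loader expects the kernel binary at <dir>/kernel. Copying a directory
# into an already existing target leaves it at <dir>/<name>/kernel instead.
check_kernel_dir() {
    dir=$1
    name=$(basename "$dir")
    if [ -f "$dir/kernel" ]; then
        echo "ok: $dir/kernel"
    elif [ -d "$dir/$name" ]; then
        echo "nested copy: found $dir/$name"
        return 1
    else
        echo "missing: $dir/kernel"
        return 1
    fi
}
```

With the pool imported under /mnt from the memstick, it could be run as `check_kernel_dir /mnt/boot/kernel.old`.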


----------

