# System won't boot after power outage - "can't load 'kernel'" (12.0 RELEASE amd64)



## Julf (Nov 12, 2019)

As the subject line says, my server didn't come up after a power outage, and refuses to boot, with the "can't load 'kernel'" message.


----------



## SirDice (Nov 12, 2019)

Things in general die when you suddenly turn them off and on again. Especially machines that have been running for quite some time 24/7. Are your disks still being detected? Is it possible your disk or your controller died due to the power spikes?


----------



## Julf (Nov 12, 2019)

SirDice said:


> Things in general die when you suddenly turn them off and on again. Especially machines that have been running for quite some time 24/7. Are your disks still being detected? Is it possible your disk or your controller died due to the power spikes?



Looks like disks are being detected, and ls shows /boot exists


----------



## Julf (Nov 12, 2019)

lsdev shows disk0 containing boot, swap and ZFS sub-devices


----------



## SirDice (Nov 12, 2019)

Ok, that's hopeful. As you mention ZFS, is this a full root-on-ZFS system? No encryption or anything like that?


----------



## Julf (Nov 12, 2019)

SirDice said:


> Ok, that's hopeful. As you mention ZFS, is this a full root-on-ZFS system? No encryption or anything like that?



No encryption. And yes, pretty sure it is root-on-ZFS.


----------



## SirDice (Nov 12, 2019)

It's a bit tricky as I don't exactly know what's wrong but try these on that loader prompt:

```
set currdev=zfs:zroot/ROOT/default
boot /boot/kernel/kernel
```

The zroot/ROOT/default assumes you've used the default settings during installation and still have the 'default' boot environment.


----------



## Julf (Nov 12, 2019)

SirDice said:


> It's a bit tricky as I don't exactly know what's wrong but try these on that loader prompt:
> 
> ```
> set currdev=zfs:zroot/ROOT/default
> ...



Thanks!

Unfortunately that gives "Failed to load kernel '/boot/kernel/kernel'".


----------



## SirDice (Nov 12, 2019)

Julf said:


> ls shows /boot exists


Does `ls /boot/` or `ls /boot/kernel/` show anything? Does `ls /boot/` perhaps show a kernel.old? You could try that as a fallback. Anything to get it booting, we can fix any resulting kernel issues more easily later on. The loader(8) is a bit spartan.


----------



## Julf (Nov 12, 2019)

SirDice said:


> Does `ls /boot/` or `ls /boot/kernel/` show anything? Does `ls /boot/` perhaps show a kernel.old? You could try that as a fallback. Anything to get it booting, we can fix any resulting kernel issues more easily later on. The loader(8) is a bit spartan.



All three show up (/boot/, /boot/kernel/ and /boot/kernel.old/), with lots of files (mostly .ko), but `boot /boot/kernel.old/kernel` still gives `can't load 'kernel'`.


----------



## Julf (Nov 12, 2019)

Hmm, the last response has been stuck in moderation for quite a while....


----------



## Julf (Nov 12, 2019)

If I boot using the boot image, the ZFS file system doesn't seem to be visible.


----------



## `Orum (Nov 13, 2019)

I'm going to guess that this system was upgraded at some point in the past, and the zpool was upgraded, but the new bootloader was never installed.  Running `zpool upgrade` shows a reminder message for a reason!

The only way I know of fixing this is to boot to external media (e.g. USB, network, etc.) that's running the same version of FreeBSD as your kernel/world, install the boot loader from that, and then reboot back to your normal system.


----------



## Julf (Nov 13, 2019)

`Orum said:


> I'm going to guess that this system was upgraded at some point in the past, and the zpool was upgraded, but the new bootloader was never installed.  Running `zpool upgrade` shows a reminder message for a reason!



You are probably right.



> The only way I know of fixing this is to boot to external media (e.g. USB, network, etc.) that's running the same version of FreeBSD as your kernel/world, install the boot loader from that, and then reboot back to your normal system.



Is there a way to install just the boot loader from the installer?


----------



## Julf (Nov 13, 2019)

Is there a way to determine the version of the kernel on the hard disk from the installer?


----------



## Julf (Nov 13, 2019)

Julf said:


> If I boot using the boot image, the ZFS file system doesn't seem to be visible.



This had me flummoxed - `gpart show` does show a `freebsd-zfs` partition, but `zpool list` results in `no pools available`. OK, `zpool import` shows it is there but in use by "other system", so clearly it was not shut down properly. Could that be why it doesn't boot?


----------



## SirDice (Nov 13, 2019)

Julf said:


> so clearly it was not shut down properly.


It has nothing to do with being shut down cleanly or not. Normally ZFS pools need to be exported on one system before you can import them on another. In this case, however, you simply need to force the import.


----------



## Julf (Nov 13, 2019)

SirDice said:


> It has nothing to do with being shut down cleanly or not. Normally ZFS pools need to be exported on one system before you can import them on another. In this case, however, you simply need to force the import.



Thanks! Is forcing it safe?


----------



## SirDice (Nov 13, 2019)

Julf said:


> Is forcing it safe?


It's relatively safe to do, and there's no other way to get to the data to fix the booting issue.


----------



## Julf (Nov 13, 2019)

SirDice said:


> It's relatively safe to do, and there's no other way to get to the data to fix the booting issue.



Importing it resulted in `internal error: failed to initialize ZFS library`, so I guess the imported zroot masks the zroot of the install image. 

It also seems to have somehow corrupted the installer image, so I am now rewriting the USB stick (to try ZFS import under another name).


----------



## Julf (Nov 13, 2019)

Julf said:


> Importing it resulted in `internal error: failed to initialize ZFS library`, so I guess the imported zroot masks the zroot of the install image.
> 
> It also seems to have somehow corrupted the installer image, so I am now rewriting the USB stick (to try ZFS import under another name).



Hmm... Even `zpool import -f zroot jroot` results in `internal error: failed to initialize ZFS library` for every ZFS command after the import - so it still replaces the install image root filesystem. There does seem to be a /boot/kernel directory, but no /boot/kernel/kernel.


----------



## SirDice (Nov 13, 2019)

None of the installer images use ZFS.


----------



## Julf (Nov 13, 2019)

SirDice said:


> None of the installer images use ZFS.



Ah, OK. I guess the `zpool import` still mounts the hard disk root over the installer root?


----------



## SirDice (Nov 13, 2019)

Julf said:


> I guess the `zpool import` still mounts the hard disk root over the installer root?


Oh, yeah. That's definitely possible. Make use of the `-R` option to mount it on /tmp/temproot or some other place:

```
-R root
                 Sets the "cachefile" property to "none" and the "altroot"
                 property to "root"
```
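As a rough sketch, the two hints so far (force the import, and use `-R` so nothing mounts over the installer's own root) could be combined as below. The pool name `zroot` and the boot environment `zroot/ROOT/default` are the installer defaults assumed earlier in the thread, `/tmp/temproot` is only an example path, and the `run` helper and `DRY_RUN` variable are made up for illustration: by default the commands are printed rather than executed. The final `zfs mount` is an assumption too; it only matters if the root dataset is not mounted automatically after the import.

```sh
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
# Assumptions: pool "zroot", boot environment "zroot/ROOT/default",
# and /tmp/temproot as a scratch mount point on the installer image.
POOL="${POOL:-zroot}"
BE="${BE:-$POOL/ROOT/default}"
ALTROOT="${ALTROOT:-/tmp/temproot}"

# Hypothetical helper: echo the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

import_pool() {
    run mkdir -p "$ALTROOT"
    # -f: force, because the pool was last in use by "another system"
    # -R: altroot, so datasets mount under $ALTROOT instead of over /
    run zpool import -f -R "$ALTROOT" "$POOL"
    # Only needed if the root dataset did not get mounted by the import.
    run zfs mount "$BE"
}

import_pool
```

Running `zpool import` with no arguments first lists the pools that are available for import, which is a cheap way to confirm the pool name before forcing anything.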


----------



## Julf (Nov 13, 2019)

SirDice said:


> Oh, yeah. That's definitely possible. Make use of the `-R` option to mount it on /tmp/temproot or some other place:
> 
> ```
> -R root
> ...



Thanks! Is that an option to zpool import?


----------



## SirDice (Nov 13, 2019)

Julf said:


> Is that an option to zpool import?


It is, yes.


----------



## Julf (Nov 13, 2019)

SirDice said:


> It is, yes.



OK, success. I now have the ZFS system mounted on /tmp/tmproot. Now, any advice on how to fix the boot issue?


----------



## SirDice (Nov 13, 2019)

Julf said:


> I now have the ZFS system mounted on /tmp/tmproot


That shows the pool itself and the data it contains is still good. It'll be easier to work from here. The loader(8) is quite versatile but a bit of a pain to use interactively when trying to solve issues. 

Can you show the output from `gpart show`?


----------



## Julf (Nov 13, 2019)

SirDice said:


> Can you show the output from `gpart show`?



Currently running stand-alone, single user, without network, so can't cut&paste, but I hope the attached pic is readable.


----------



## SirDice (Nov 13, 2019)

This is slightly dangerous but at worst the system won't boot, which it doesn't do now anyway, so the risk is minimal. Make sure the installer image you used is the same version as your system was (similar enough, it doesn't need to have the exact same patch level, just the same major version). 

`gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0`

Then reboot, cross your fingers and hope that was enough to get it booting again.
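The `-i 1` in that command has to be the index of the `freebsd-boot` partition, which is why the `gpart show` output was requested first. A minimal sketch of reading the index out of that output instead of typing it by hand follows; `boot_index` and `install_bootcode` are illustrative names, the disk `ada0` is the one from this thread, and the second function only prints the resulting command so it can be checked before running it.

```sh
#!/bin/sh
# Sketch only. Assumes a GPT disk with a freebsd-boot partition (BIOS boot),
# as in the command above; the function names are illustrative.

# Read `gpart show DISK` output on stdin, print the index of freebsd-boot.
boot_index() {
    awk '$4 == "freebsd-boot" { print $3; exit }'
}

# Print (not run) the bootcode command for the detected index.
install_bootcode() {
    disk="$1"
    idx=$(gpart show "$disk" | boot_index)
    if [ -z "$idx" ]; then
        echo "no freebsd-boot partition found on $disk" >&2
        return 1
    fi
    echo gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i "$idx" "$disk"
}
```

For example, `install_bootcode ada0` prints the full `gpart bootcode` line, or an error if the disk has no `freebsd-boot` partition at all.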


----------



## Julf (Nov 13, 2019)

SirDice said:


> This is slightly dangerous but at worst the system won't boot, which it doesn't do now anyway, so the risk is minimal. Make sure the installer image you used is the same version as your system was (similar enough, it doesn't need to have the exact same patch level, just the same major version).
> 
> `gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0`
> 
> Then reboot, cross your fingers and hope that was enough to get it booting again.



Before I do that, is there any way to check what version the system was?


----------



## SirDice (Nov 13, 2019)

Julf said:


> Before I do that, is there any way to check what version the system was?


freebsd-version(1) uses a little trick:

`what -qs /boot/kernel/kernel`

Point it at your imported filesystem, of course, or else you get the version of the boot image you used.
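With the pool imported under an altroot, "pointing it at the imported filesystem" just means prefixing the path. A small sketch, assuming the example mount point `/tmp/temproot` (adjust to wherever the pool was imported); `kernel_version` is a made-up helper name, and it reports a missing kernel file instead of letting `what` fail silently.

```sh
#!/bin/sh
# Sketch only. ALTROOT is wherever the pool was imported (example path).
ALTROOT="${ALTROOT:-/tmp/temproot}"

# Print the version string embedded in the kernel under the given root.
# Optional second argument: kernel directory name (default "kernel").
kernel_version() {
    kernel="$1/boot/${2:-kernel}/kernel"
    if [ ! -f "$kernel" ]; then
        echo "no kernel file at $kernel" >&2
        return 1
    fi
    what -qs "$kernel"
}

kernel_version "$ALTROOT" || echo "could not read a kernel version under $ALTROOT"
```

The same function can be aimed at the fallback kernel with `kernel_version "$ALTROOT" kernel.old`.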


----------



## Julf (Nov 13, 2019)

SirDice said:


> freebsd-version(1) uses a little trick:
> 
> `what -qs /boot/kernel/kernel`
> Point it to your imported filesystem of course or else you get the version of the boot image you used.



Hmm... There is no file called "kernel" in the boot/kernel directory of the imported filesystem.


----------



## SirDice (Nov 13, 2019)

Julf said:


> There is no file called "kernel" in the boot/kernel directory of the imported filesystem.


That would definitely explain the "can't load kernel" messages. Is the entire directory empty or are the *.ko kernel modules still there? If everything is mostly there but you're only missing kernel you can try copying the kernel from the install media you used. Yes, that would be a blank (patch-less) version but something is better than nothing. Alternatively, if you're still unsure about the version, copy the kernel file from the kernel.old directory.
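The "is it empty, or is only the kernel missing?" question can be answered in one go for both directories. The following is a sketch under the assumption that the pool is mounted somewhere like `/tmp/temproot`; `kernel_dir_report` is an illustrative name.

```sh
#!/bin/sh
# Sketch only. Reports, for each kernel directory under ROOT/boot, whether the
# kernel binary exists and how many .ko modules sit next to it.
kernel_dir_report() {
    root="$1"
    for dir in "$root"/boot/kernel "$root"/boot/kernel.old; do
        if [ ! -d "$dir" ]; then
            echo "$dir: missing"
            continue
        fi
        if [ -f "$dir/kernel" ]; then
            state="kernel present"
        else
            state="kernel MISSING"
        fi
        # Count only the modules directly inside this directory.
        count=$(find "$dir" -maxdepth 1 -name '*.ko' | wc -l | tr -d ' ')
        echo "$dir: $state, $count modules"
    done
}
```

Called as `kernel_dir_report /tmp/temproot`, it prints one line per directory, which makes it obvious whether `kernel.old` is a usable source for the copy.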


----------



## Julf (Nov 13, 2019)

SirDice said:


> That would definitely explain the "can't load kernel" messages. Is the entire directory empty or are the *.ko kernel modules still there? If everything is mostly there but you're only missing kernel you can try copying the kernel from the install media you used. Yes, that would be a blank (patch-less) version but something is better than nothing.



Yes, all the .ko files are there.


----------



## Julf (Nov 13, 2019)

Unfortunately copying the kernel didn't yet solve the problem. Now it seems the issue is inability to mount root. Of course the error message is just above the last console screen - is there any way to pause/scroll back boot console output?


----------



## SirDice (Nov 13, 2019)

Julf said:


> Unfortunately copying the kernel didn't yet solve the problem. Now it seems the issue is inability to mount root.


Well, it's some progress, now the kernel has started. 



Julf said:


> Of course the error message is just above the last console screen - is there any way to pause/scroll back boot console output?


Standard FreeBSD console allows scrolling if you hit the "Scroll lock" button on your keyboard. You can use the cursor keys or PgUp/PgDown to scroll up and down. Hit "Scroll lock" again to stop. That key might be tricky on a laptop, you may need to use some function (Fn) key for it.


----------



## Julf (Nov 14, 2019)

SirDice said:


> Standard FreeBSD console allows scrolling if you hit the "Scroll lock" button on your keyboard. You can use the cursor keys or PgUp/PgDown to scroll up and down. Hit "Scroll lock" again to stop. That key might be tricky on a laptop, you may need to use some function (Fn) key for it.



Thanks. Had to swap to a proper full keyboard instead of the slimline one I was using. What I get is:

`Trying to mount root from zfs:zroot/ROOT/default []...
Mounting from zfs:zroot/ROOT/default failed with error 2: unknown file system.`


----------



## Julf (Nov 14, 2019)

Is there a safe way to reinstall the system without overwriting existing application data (such as the dovecot mailboxes)?


----------



## SirDice (Nov 14, 2019)

Which kernel did you copy? The error looks like it has problems loading the zfs.ko kernel module. Which may be because the kernel you copied and the modules don't exactly match up. You probably need to copy a whole /boot/kernel/ directory, modules and all. 



Julf said:


> Is there a safe way to reinstall the system without overwriting existing application data (such as the dovecot mailboxes)?


Probably not necessary to do a complete reinstall. We just need to get the kernel in order, then we can boot the system further. On the installation media there's a /usr/freebsd-dist/kernel.txz, you can extract that one; `tar -C /your/tmp/root -zxvf /usr/freebsd-dist/kernel.txz`. Change the /your/tmp/root to where you imported your pool to. Unpacking those files is what the installer normally does.
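Spelled out as a sketch, with the example altroot `/tmp/temproot` standing in for /your/tmp/root: the `run` helper, the `DRY_RUN` variable and the `kernel.broken` name are made up for illustration, and moving the old directory aside first is an optional precaution rather than part of the advice above. The commands are only printed unless `DRY_RUN=0` is set. No compression flag is passed because FreeBSD's tar detects the archive format on extraction.

```sh
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
# ALTROOT is an example path; DIST is the location named in the post above.
ALTROOT="${ALTROOT:-/tmp/temproot}"
DIST="${DIST:-/usr/freebsd-dist/kernel.txz}"

# Hypothetical helper: echo the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

restore_kernel() {
    # Optional: keep the incomplete directory instead of unpacking over it.
    run mv "$ALTROOT/boot/kernel" "$ALTROOT/boot/kernel.broken"
    # The archive holds ./boot/kernel/..., so extract relative to the root.
    run tar -C "$ALTROOT" -xvf "$DIST"
}

restore_kernel
```

Extracting the whole archive replaces the kernel and its modules together, which avoids the kernel/module mismatch suspected above.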


----------



## Julf (Nov 14, 2019)

SirDice said:


> Which kernel did you copy? The error looks like it has problems loading the zfs.ko kernel module. Which may be because the kernel you copied and the modules don't exactly match up. You probably need to copy a whole /boot/kernel/ directory, modules and all.
> 
> 
> Probably not necessary to do a complete reinstall. We just need to get the kernel in order, then we can boot the system further. On the installation media there's a /usr/freebsd-dist/kernel.txz, you can extract that one; `tar -C /your/tmp/root -zxvf /usr/freebsd-dist/kernel.txz`. Change the /your/tmp/root to where you imported your pool to. Unpacking those files is what the installer normally does.



That worked! System boots OK. Many thanks!


----------



## SirDice (Nov 14, 2019)

Hurray! Good. Keep in mind that you now have a "plain" 12.0 kernel, no security patches. You may want to try running freebsd-update(8), hopefully that detects the "old" kernel and will update it for you. 

As you're here now, I often copy that "plain" kernel from the install media to a /boot/kernel.RELEASE/ or /boot/kernel.GOOD/. You can just copy it, then you can use it in emergencies like this one.
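A minimal sketch of that copy, using the `kernel.GOOD` name suggested above; `save_good_kernel` is a made-up function name, and it refuses to overwrite an existing copy so that a later, possibly broken, kernel cannot replace the known-good one by accident.

```sh
#!/bin/sh
# Sketch only. Copies BOOT/kernel to BOOT/kernel.GOOD unless a copy already
# exists. BOOT defaults to /boot; pass another directory to test it safely.
save_good_kernel() {
    boot="${1:-/boot}"
    if [ -e "$boot/kernel.GOOD" ]; then
        echo "$boot/kernel.GOOD already exists, leaving it alone" >&2
        return 1
    fi
    # -R: copy the whole directory, modules included; -p: keep modes and times
    cp -Rp "$boot/kernel" "$boot/kernel.GOOD"
}
```

Run `save_good_kernel` as root while the plain release kernel is still in /boot/kernel. In an emergency, something like `boot kernel.GOOD` at the loader prompt should then load it.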


----------



## Julf (Nov 14, 2019)

SirDice said:


> Hurray! Good. Keep in mind that you now have a "plain" 12.0 kernel, no security patches. You may want to try running freebsd-update(8), hopefully that detects the "old" kernel and will update it for you.



Done - just about to see if it still boots. 



> As you're here now, I often copy that "plain" kernel from the install media to a /boot/kernel.RELEASE/ or /boot/kernel.GOOD/. You can just copy it, then you can use it in emergencies like this one.



Excellent advice!


----------



## Julf (Nov 14, 2019)

Julf said:


> Done - just about to see if it still boots.



Success!


----------



## free-and-bsd (Nov 18, 2019)

Still, the question is, how could the system end up without any kernel. Surely a simple power outage won't do that. ZFS is extremely safe in that sense.
Neither will it upgrade automatically, and when you do upgrade it prompts for a reboot a couple of times. Further, a ZFS pool will NEVER be upgraded unless you ask for that EXPLICITLY and answer in the affirmative. So you still need to recall the chain of events, just to make sure this doesn't happen again.
I remember having that problem of the system not finding /boot/kernel/kernel due to the loader's weird ideas about where to look for boot/kernel/kernel. I don't know the mechanics behind it, but what helped me then was adding a `kern.bootfile=/boot/kernel/kernel` line to /etc/sysctl.conf.


----------



## SirDice (Nov 18, 2019)

free-and-bsd said:


> ZFS is extremely safe in that sense.


Reasonably safe, yes. Bulletproof, no.


----------



## `Orum (Nov 20, 2019)

free-and-bsd said:


> Still, the question is, how could the system end up without any kernel.


We can only speculate, but I'm guessing a `rm -r /boot/kernel/*` (or similar) happened at some point, and it went unnoticed until the power outage.

I know I've accidentally rm'ed a few files in the past, only to discover problems related to their absence weeks later.  Fortunately they weren't that important, and I had backups, but yes, he should be snapshotting his whole tree periodically.  Rolling snapshots can save you lots of headaches in the future.
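As a starting point for the snapshot habit, here is a sketch of a dated recursive snapshot. It assumes the pool is called `zroot`; the `auto-` prefix, the `run` helper and the `DRY_RUN` variable are arbitrary choices for illustration, and the command is only printed unless `DRY_RUN=0` is set.

```sh
#!/bin/sh
# Sketch only: prints the command by default; set DRY_RUN=0 to execute it.
# Assumes the pool is named "zroot"; the "auto-" prefix is arbitrary.
POOL="${POOL:-zroot}"

# Hypothetical helper: echo the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

snapshot_all() {
    stamp=$(date +%Y-%m-%d)
    # -r: snapshot every dataset in the pool under the same name
    run zfs snapshot -r "$POOL@auto-$stamp"
}

snapshot_all
```

This covers only the "take a snapshot" half. A rolling scheme also needs to prune old snapshots with `zfs destroy`, which is left out here.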


----------



## Julf (Nov 28, 2019)

I will definitely snapshot from here on. Pretty sure I didn't remove the kernel myself, still not sure what happened.


----------

