# I broke my FreeBSD installation



## abishai (Jul 18, 2015)

I don't know what's happened, but I somehow broke my FreeBSD installation and now it doesn't boot at all.

I have 2 ZFS pools, zroot and zdata. The latter is mounted as /home.
zroot is a mirror of 2 SSDs; zdata is a raidz of 3 HDDs.

In my previous session I did:
1. Imported zroot from the old SSD with the `zpool import zroot zroot-old` command.
2. Added cache and log devices from the free partitions I had left on my 2 SSDs.
3. Exported zroot-old.
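
The steps above would look roughly like this (a sketch only; the GPT labels for the spare SSD partitions are assumptions, not taken from the post):

```
zpool import zroot zroot-old                   # import the old pool "zroot" under the new name "zroot-old"
zpool add zroot log mirror gpt/log0 gpt/log1   # mirrored log (ZIL) from the spare SSD partitions (names assumed)
zpool add zroot cache gpt/cache0 gpt/cache1    # cache (L2ARC) devices need no redundancy
zpool export zroot-old
```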

The system does not boot. I receive this message on boot:

```
ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS of pool zdata
gptzfsboot: failed to mount default pool zdata
```
followed by a register dump and a BTX halted message.

That's very strange indeed - zdata was never a 'default pool', and it was added to the system after the initial setup.

I thought that by exporting zroot-old I had broken my zpool.cache. I physically removed the zdata HDDs, booted the installation media and recreated zpool.cache. I double-checked that bootfs points to zroot/ROOT/default. The problem persists.
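
Regenerating the cache file from the installation media can be sketched like this (the altroot and paths are assumptions about how it was done):

```
zpool import -f -R /mnt -o cachefile=/tmp/zpool.cache zroot   # import with a temporary cache file
cp /tmp/zpool.cache /mnt/boot/zfs/zpool.cache                 # put a copy where the loader expects it
zpool get bootfs zroot                                        # verify it reports zroot/ROOT/default
zpool export zroot
```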

I booted from the pen drive again and reinstalled the bootcode on both SSDs. Nothing!
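
Reinstalling the ZFS-aware bootcode would look something like this (device names and the index of the freebsd-boot partition are assumptions):

```
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0   # protective MBR + gptzfsboot into partition 1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1   # same on the second SSD
```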

I edited loader.conf and pointed it at zpool.cache manually. Yes, with the same results.
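
The loader.conf knobs involved would look roughly like this (a sketch; whether these exact entries were used is an assumption):

```
# /boot/loader.conf fragment (values assumed for this setup)
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot/ROOT/default"
zpool_cache_load="YES"                      # preload zpool.cache explicitly
zpool_cache_type="/boot/zfs/zpool.cache"
zpool_cache_name="/boot/zfs/zpool.cache"
```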

Now I've totally run out of ideas. Why, and where, is it looking for zdata?!

UPDATE: I put the zdata HDDs back; now it says that /boot/kernel is not found on the default path zdata:/boot/kernel/kernel and shows the boot prompt. I entered `zroot/ROOT/default:/boot/kernel/kernel`, and it showed a stack dump and a BTX halted error.
All of this is really strange. Why does it try to boot from zdata?

UPDATE2: I installed a fresh FreeBSD with the default configuration on one of the SSDs, breaking the mirror. It booted, and it loads zroot/ROOT/default of the mirrored pool, with the fresh install acting only as a kernel holder. An idiotic situation; I'm starting to hate the people who ported ZFS to FreeBSD.


----------



## abishai (Jul 18, 2015)

I wiped both SSDs containing the zroot pool (bye-bye, my 2 days of work) and booted from the installation media. zdata somehow knows about zroot: `zpool import` shows it in a FAULTED state. Maybe this is the root of the problem - the pools somehow know about each other. `zpool destroy zroot` fails with a 'no such pool' error. So now I'm stuck with the installation. How do I remove the ghost of zroot from zdata?
I think attempting to create a new zroot before fixing zdata is a bad idea.
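
One way to check whether any ZFS labels actually survive on the wiped SSDs is `zdb -l` (device names here are assumptions). The vdev labels live inside the partitions themselves, so destroying the partition table does not erase them:

```
zdb -l /dev/ada0      # dump any ZFS vdev labels found near the start/end of the raw disk
zdb -l /dev/ada0p2    # or from the partition that used to hold zroot, if it still exists
```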

UPDATE: It seems I was wrong about zdata knowing about zroot - the ghost showed up even without the HDDs attached. Strange, as the geom was destroyed with the `gpart destroy -F ada{N}` command.


----------



## protocelt (Jul 18, 2015)

You can use the `zpool labelclear` command (see zpool(8)) on the SSD drives to delete any old zpool label information that's hanging around, which seems to be the case here. Also, and this may have changed recently, this command will destroy all label metadata on the drive regardless of source.
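
A sketch of the cleanup (device names are assumptions; `-f` forces clearing even if the device looks active):

```
zpool labelclear -f /dev/ada0p2   # old zroot data partition on the first SSD
zpool labelclear -f /dev/ada1p2   # and on the second
```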


----------



## abishai (Jul 18, 2015)

Ok guys, I bring you a story.
I found the conditions that trigger the issue, and I was able to recreate them.

1. Have more than one ZFS pool.
2. The disks used by the boot ZFS pool have cache or log partitions at logical block addresses LESS than the partition used by the boot pool.
3. The cache and log partitions are added to the other ZFS pool.

Here is what I'm trying to say, in case that looks too confusing:
```
gpart add -s 222 -a 1m -t freebsd-boot -l boot0 ada0
gpart add -s 4g -a 1m -t freebsd-zfs -l log0 ada0
gpart add -s 40g -a 1m -t freebsd-zfs -l cache0 ada0
gpart add -a 1m -t freebsd-zfs -l system0 ada0
```
This scheme doesn't go against best practices (cache and log devices can be placed freely, don't require separate disks, and nothing is said about partition positioning).
If you follow steps 1-3, gptzfsboot will try to boot from the pool you added the log0 and cache0 devices to! If you manually specify the correct ZFS dataset at the boot prompt, it will BTX halt.
If the boot ZFS pool instead sits at lower logical block addresses, i.e.

```
gpart add -s 222 -a 1m -t freebsd-boot -l boot0 ada0
gpart add -s 40g -a 1m -t freebsd-zfs -l system0 ada0
gpart add -s 4g -a 1m -t freebsd-zfs -l log0 ada0
gpart add -a 1m -t freebsd-zfs -l cache0 ada0
```

the system boots normally.
Maybe it's worth a PR.


----------



## protocelt (Jul 18, 2015)

abishai said:


> Ok guys, I bring you a story.
> I found the conditions that trigger the issue, and I was able to recreate them.
> [...]


Nothing to add to this other than I'm curious about this as well. I wasn't aware partition order was important either.


----------



## wblock@ (Jul 19, 2015)

abishai said:


> gpart add -s 222 -a 1m -t freebsd-boot -l boot0 ada0


Please show the output of `gpart show ada0`.  When `-a 1m` is used, both the start and size of the partition will be rounded to multiples of 1M.  However, at least some FreeBSD bootcode does not deal well with a freebsd-boot partition larger than 512K.
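
The rounding of the start can be illustrated with a little shell arithmetic (a sketch; 1 MiB is 2048 sectors at 512 bytes each):

```shell
align=2048        # sectors per 1 MiB at 512 bytes/sector
first_free=34     # first usable LBA on a typical GPT disk
# round the requested start up to the next 1 MiB boundary
start=$(( (first_free + align - 1) / align * align ))
echo "$start"     # first 1 MiB-aligned start: 2048
```

The `gpart show` output in the following reply indeed starts the freebsd-boot partition at LBA 2048.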


----------



## abishai (Jul 19, 2015)

```
=>       34  234441581  ada0  GPT  (112G)
         34       2014        - free -  (1.0M)
       2048        222     1  freebsd-boot  (111K)
       2270       1826        - free -  (913K)
       4096  136314880     2  freebsd-zfs  (65G)
  136318976    8388608     3  freebsd-zfs  (4.0G)
  144707584   89733120     4  freebsd-zfs  (43G)
  234440704        911        - free -  (456K)
```


----------

