# Cannot boot from ZFS after zpool upgrade



## spin (Jul 30, 2016)

Hello.
Here is how it started. We decided to upgrade from 8.4-STABLE to 9.3 (releng). I downloaded the source via Subversion and upgraded successfully to 9.3, then rebooted. Then we decided to upgrade the ZFS pool. The upgrade was successful, but I didn't know how to update the boot code. I was looking for a solution on the internet while the server continued working. But the UPS did not protect against a power failure, so the server rebooted. And now we are in trouble.
Right after the update, here are some lines from the messages log:

```
kernel: ZFS filesystem version: 5
kernel: ZFS storage pool version: features support (5000)
```
and

```
kernel: GEOM: da0: the primary GPT table is corrupt or invalid.
kernel: GEOM: da0: using the secondary instead -- recovery strongly advised.
kernel: GEOM: da1: the primary GPT table is corrupt or invalid.
kernel: GEOM: da1: using the secondary instead -- recovery strongly advised.
kernel: GEOM: da2: the primary GPT table is corrupt or invalid.
kernel: GEOM: da2: using the secondary instead -- recovery strongly advised.
kernel: GEOM: da3: the primary GPT table is corrupt or invalid.
kernel: GEOM: da3: using the secondary instead -- recovery strongly advised.
kernel: Trying to mount root from zfs:raid-5 []...
```
That is why I was afraid to run this command: I was not sure whether it could do damage or not.
`gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0`
Here is the pool status:

```
# zpool status -v
  pool: raid-5
 state: ONLINE
  scan: scrub repaired 0 in 307445734561825859h27m with 0 errors on Thu Feb 12 10:41:09 2015
config:

        NAME        STATE     READ WRITE CKSUM
        raid-5      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0

errors: No known data errors
```
Here is some more info:

```
# egrep '(^ad.*|^da.*)' /var/run/dmesg.boot
da0 at ahc0 bus 0 scbus0 target 0 lun 0
da0: <SEAGATE ST336607LW 0007> Fixed Direct Access SCSI-3 device
da0: Serial Number 3JA961J600007503TCGJ
da0: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
da0: Command Queueing enabled
da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da1 at ahc0 bus 0 scbus0 target 1 lun 0
da1: <SEAGATE ST336607LW 0007> Fixed Direct Access SCSI-3 device
da1: Serial Number 3JA976N9000075032M8N
da1: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
da1: Command Queueing enabled
da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da2 at ahc0 bus 0 scbus0 target 2 lun 0
da2: <IBM IC35L036UWDY10-0 S23C> Fixed Direct Access SCSI-3 device
da2: Serial Number E3V36YLB
da2: 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
da2: Command Queueing enabled
da2: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C)
da3 at ahc0 bus 0 scbus0 target 3 lun 0
da3: <IBM IC35L036UWDY10-0 S23C> Fixed Direct Access SCSI-3 device
da3: Serial Number E3V2YT8B
da3: 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
da3: Command Queueing enabled
da3: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C)
```
And `camcontrol devlist`:

```
# camcontrol devlist
<SEAGATE ST336607LW 0007> at scbus0 target 0 lun 0 (da0,pass0)
<SEAGATE ST336607LW 0007> at scbus0 target 1 lun 0 (da1,pass1)
<IBM IC35L036UWDY10-0 S23C> at scbus0 target 2 lun 0 (da2,pass2)
<IBM IC35L036UWDY10-0 S23C> at scbus0 target 3 lun 0 (da3,pass3)
<HL-DT-ST DVDRAM GSA-H10N JL10> at scbus2 target 1 lun 0 (cd0,pass4)
```
Here is the info for the zpool:

```
# zpool get all raid-5
NAME PROPERTY VALUE SOURCE
raid-5 size 136G -
raid-5 capacity 39% -
raid-5 altroot - default
raid-5 health ONLINE -
raid-5 guid 4451690707634593073 default
raid-5 version - default
raid-5 bootfs raid-5 local
raid-5 delegation on default
raid-5 autoreplace off default
raid-5 cachefile - default
raid-5 failmode wait default
raid-5 listsnapshots off default
raid-5 autoexpand off default
raid-5 dedupditto 0 default
raid-5 dedupratio 1.00x -
raid-5 free 82,7G -
raid-5 allocated 53,3G -
raid-5 readonly off -
raid-5 comment - default
raid-5 expandsize 0 -
raid-5 freeing 0 default
raid-5 feature@async_destroy enabled local
raid-5 feature@empty_bpobj active local
raid-5 feature@lz4_compress enabled local
raid-5 feature@multi_vdev_crash_dump enabled local
raid-5 feature@spacemap_histogram active local
raid-5 feature@enabled_txg active local
raid-5 feature@hole_birth active local
raid-5 feature@extensible_dataset enabled local
raid-5 feature@bookmarks enabled local
raid-5 feature@filesystem_limits enabled local
```
After the UPS power failure, I got this on screen:

```
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
ZFS: unsupported feature: com.delphix:hole_birth
zfsboot: No ZFS pools located, can't boot
```
When I booted from the FreeBSD 10 DVD to a console, this is what `gpart show` gave me:

```
#gpart show da0
=>   63    71687389    da0    MBR (34G)
    63    71687389     -free- (34G)
#gpart show da1
=>   63    71687389    da1    MBR (34G)
    63    71687389     -free- (34G)
#gpart show da2
=>   63    71687277    da2    MBR (34G)
    63    71687277     -free- (34G)
#gpart show da3
=>   63    71687277    da3    MBR (34G)
    63    71687277     -free- (34G)
```
Can anyone please help me in this situation? Stupidly, I did not make even a single copy before the upgrade.
glabel list


----------



## spin (Jul 30, 2016)

After booting into a console from the DVD I ran:

```
zfs mount -a
#zpool import -D
no pools available to import
#zpool import -a
cannot import 'raid-5': pool may be in use from other system, it was last accessed by gw.local (hostid: 0x3f6b0956) on Fri Jul 29 13:49:55 2016
use '-f' to import anyway
```
So zpool can see my pool 'raid-5', and I need to recover the boot code.
How can I do this without any damage to my existing data pool?

Here is the output of "zpool import":

```
#zpool import
pool: raid-5
id: 4451690707634593073
state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and the '-f' flag.
see: http://illumos.org/msg/ZFS-8000-EY
config:
   raid-5           ONLINE
      raidz1-0     ONLINE
        da0          ONLINE
        da1          ONLINE
        da2          ONLINE
        da3          ONLINE
```
So I tried `zpool import -f raid-5` and it gave me a system panic:

```
#zpool import -f raid-5
Fatal double fault:
....
panic: double fault
....and so on...
```


----------



## ab2k (Jul 30, 2016)

Hi, very sad to hear that. Try burning FreeBSD 10.2 or 10.3 to a flash drive or DVD and booting from it. It has newer ZFS code that will probably allow you to import your pool. After that you should copy everything to another place and recreate the pool from scratch. The design of your pool relies on da* device names, which is bad in itself (they may change at any time, so you may get problems with your pool; it is better to make GPT labels and create the pool with those labels). Hope you will be able to get all your data off it. Good luck.

Small addition: it seems the GPT is corrupted too. Can you please post the output of this command: `gpart show`
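
To make the GPT-label suggestion above concrete, here is a rough dry-run sketch. It only prints the commands one might run after the data has been copied elsewhere; it executes nothing. The pool name `newpool` and the labels `disk0` to `disk3` are hypothetical, and the sketch leaves out the `freebsd-boot` partition that a bootable GPT disk would also need.

```shell
#!/bin/sh
# Dry run: only prints commands, nothing is executed against any disk.
# "newpool" and the disk0..disk3 labels are hypothetical names.
plan_labeled_pool() {
    pool=$1; shift
    vdevs=""
    n=0
    for disk in "$@"; do
        # One GPT scheme and one labeled freebsd-zfs partition per disk.
        echo "gpart create -s gpt $disk"
        echo "gpart add -t freebsd-zfs -l disk$n $disk"
        # Refer to the partition by its label, not by the daN name.
        vdevs="$vdevs gpt/disk$n"
        n=$((n + 1))
    done
    echo "zpool create $pool raidz1$vdevs"
}

plan_labeled_pool newpool da0 da1 da2 da3
```

Because the pool is then built from `gpt/diskN` names, it no longer depends on which `daN` number a disk receives at boot.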


----------



## spin (Jul 30, 2016)

I read the man page for zfsboot.
Boot from the DVD, drop to a console.
Then run these commands on da0:

```
dd if=/boot/zfsboot of=/dev/da0 count=1
dd if=/boot/zfsboot of=/dev/da0 iseek=1 oseek=1024
```
Then I removed the DVD and rebooted. It booted up.
On the working system `gpart show` shows no output.
And I am still worried about this part of the boot process:

```
GEOM: da0: the primary GPT table is corrupt or invalid.
GEOM: da0: using the secondary instead -- recovery strongly advised.
GEOM: da1: the primary GPT table is corrupt or invalid.
GEOM: da1: using the secondary instead -- recovery strongly advised.
GEOM: da2: the primary GPT table is corrupt or invalid.
GEOM: da2: using the secondary instead -- recovery strongly advised.
GEOM: da3: the primary GPT table is corrupt or invalid.
GEOM: da3: using the secondary instead -- recovery strongly advised.
Trying to mount root from zfs:raid-5 []...
```
I saw somewhere that it can be cleaned up with `dd if=/dev/null ........`
but I do not know how.
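
A side note on the two dd commands from zfsboot(8) earlier in this post: the small sketch below only does the offset arithmetic and touches no disk. It assumes dd's default 512-byte block size and the commonly documented ZFS on-disk layout (two 256 KiB labels at the front of the device, followed by a reserved boot block area); the variable names are mine.

```shell
#!/bin/sh
# Illustration only: prints byte offsets, reads and writes no device.
SECTOR=512                     # dd default block size in bytes
LABEL=$((256 * 1024))          # assumed size of one ZFS vdev label
BOOT_AREA=$((2 * LABEL))       # assumed start of the boot block area (after labels L0 and L1)

first_write_end=$((1 * SECTOR))         # "count=1": one block written at offset 0
second_write_start=$((1024 * SECTOR))   # "oseek=1024": skip 1024 output blocks

echo "first dd writes bytes 0..$((first_write_end - 1))"
echo "second dd starts at byte $second_write_start"
echo "boot block area assumed to start at byte $BOOT_AREA"
```

Under those assumptions the second write begins exactly where the reserved boot area begins, which would explain why the pool data was not affected.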


----------



## ab2k (Jul 30, 2016)

Hi again, check the gpart(8) manual page for recovery on GPT disks. I didn't quite understand: I see it's trying to boot from your pool, so does everything work now? Or do you still have problems with it?


----------



## spin (Jul 31, 2016)

As I discovered, these disks were previously GPT, but now they are not. All four are MBR, and the whole disk space, without partitioning, is used by ZFS.
So after rereading man zfsboot, I booted from the FreeBSD DVD media, dropped to a console, and ran the dd commands as described in man zfsboot.

```
dd if=/boot/zfsboot of=/dev/da0 count=1
dd if=/boot/zfsboot of=/dev/da0 iseek=1 oseek=1024
```
After that it was able to boot by itself from the da0 disk.
Now I need to find out how to remove the (secondary) GPT records at the end of the disks.
Would this be safe for my data in the ZFS pool?
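
For orientation, here is a sketch of where a standard backup GPT would sit on these disks. It assumes 512-byte sectors and the usual 128 partition entries of 128 bytes each, so the backup header is the last sector and the entry array is the 32 sectors before it. The sector counts come from the dmesg output earlier in the thread; the function name is made up. It prints numbers only and does not answer whether that region overlaps anything ZFS uses.

```shell
#!/bin/sh
# Prints where a standard backup GPT would sit; reads and writes no disk.
# Assumptions: 512-byte sectors, 128 partition entries of 128 bytes each.
ENTRY_SECTORS=$((128 * 128 / 512))    # 32 sectors of partition entries

backup_gpt_range() {
    total=$1                          # total sectors, as reported in dmesg
    last=$((total - 1))               # backup GPT header lives in the last LBA
    first=$((last - ENTRY_SECTORS))   # first sector of the backup entry array
    echo "$first $last"
}

backup_gpt_range 71687372   # da0, da1 -> 71687339 71687371
backup_gpt_range 71687340   # da2, da3 -> 71687307 71687339
```

So the leftover records would occupy the last 33 sectors of each disk. Checking that range against the ZFS label layout, or having a backup, would be prudent before writing anything there.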


----------



## surv (Jul 31, 2016)

This may help:
https://forums.freebsd.org/threads/52102/#post-292341


----------

