# My zpool exploded



## tarkhil (Dec 27, 2021)

Several hours ago my zpool suddenly exploded. The server rebooted and could not boot.


```
root@:~ # zdb -AAA -F -d -e iile-boot
zdb: can't open 'iile-boot': Integrity check failed
```


```
root@:~ # zpool import
   pool: iile-boot
     id: 4380822407036168996
  state: FAULTED
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

        iile-boot            FAULTED  corrupted data
          mirror-0           FAULTED  corrupted data
            gpt/iile-boot-1  UNAVAIL  cannot open
            gpt/iile-boot-0  ONLINE

   pool: iile
     id: 4721818964728306628
  state: FAULTED
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

        iile                     FAULTED  corrupted data
          mirror-0               DEGRADED
            8445171921478463808  UNAVAIL  cannot open
            gpt/iile-0           ONLINE
```


```
root@:~ # zdb -AAA -e iile-boot

Configuration for import:
        vdev_children: 1
        version: 5000
        pool_guid: 4380822407036168996
        name: 'iile-boot'
        state: 0
        vdev_tree:
            type: 'root'
            id: 0
            guid: 4380822407036168996
            children[0]:
                type: 'mirror'
                id: 0
                guid: 15675021958327973475
                whole_disk: 0
                metaslab_array: 256
                metaslab_shift: 32
                ashift: 9
                asize: 536866193408
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 1991294491525726088
                    path: '/dev/gpt/iile-boot-1'
                    whole_disk: 1
                    DTL: 2866
                    create_txg: 4
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 6740896146295478304
                    whole_disk: 1
                    DTL: 2172
                    create_txg: 4
                    path: '/dev/gpt/iile-boot-0'
        load-policy:
            load-request-txg: 18446744073709551615
            load-rewind-policy: 2
(very long time)
zdb: can't open 'iile-boot': Integrity check failed

ZFS_DBGMSG(zdb) START:
spa.c:5998:spa_import(): spa_import: importing iile-boot
spa_misc.c:411:spa_load_note(): spa_load(iile-boot, config trusted): LOADING
vdev.c:131:vdev_dbgmsg(): disk vdev '/dev/gpt/iile-boot-0': best uberblock found for spa iile-boot. txg 3110443
spa_misc.c:411:spa_load_note(): spa_load(iile-boot, config untrusted): using uberblock with txg=3110443
vdev.c:136:vdev_dbgmsg(): mirror-0 vdev (guid 15675021958327973475): metaslab_init failed [error=97]
vdev.c:136:vdev_dbgmsg(): mirror-0 vdev (guid 15675021958327973475): vdev_load: metaslab_init failed [error=97]
spa_misc.c:396:spa_load_failed(): spa_load(iile-boot, config trusted): FAILED: vdev_load failed [error=97]
spa_misc.c:411:spa_load_note(): spa_load(iile-boot, config trusted): UNLOADING
ZFS_DBGMSG(zdb) END
```

I'm currently saving images of the pool devices, but is there any chance of recovery? What should I try? Or has the kernel managed to destroy all the data, leaving nothing?

UPD.


```
root@:~ # sysctl vfs.zfs.spa.load_verify_metadata=0
vfs.zfs.spa.load_verify_metadata: 1 -> 0
root@:~ # sysctl vfs.zfs.spa.load_verify_data=0
vfs.zfs.spa.load_verify_data: 1 -> 0

root@:~ # zpool import -f -R /mnt -o readonly -N iile-boot
internal error: cannot import 'iile-boot': Integrity check failed
Abort (core dumped)

root@:~ # tail /var/log/messages
Dec 27 13:41:52  ZFS[2167]: pool I/O failure, zpool=iile-boot error=97
Dec 27 13:41:52  ZFS[2171]: vdev problem, zpool=iile-boot path= type=ereport.fs.zfs.vdev.corrupt_data
Dec 27 13:41:52  ZFS[2175]: failed to load zpool iile-boot
Dec 27 13:41:54  ZFS[2183]: pool I/O failure, zpool=iile-boot error=97
Dec 27 13:41:54  ZFS[2187]: vdev problem, zpool=iile-boot path= type=ereport.fs.zfs.vdev.corrupt_data
Dec 27 13:41:54  ZFS[2191]: failed to load zpool iile-boot
Dec 27 13:41:54  ZFS[2199]: pool I/O failure, zpool=iile-boot error=97
Dec 27 13:41:54  ZFS[2203]: vdev problem, zpool=iile-boot path= type=ereport.fs.zfs.vdev.corrupt_data
Dec 27 13:41:54  ZFS[2207]: failed to load zpool iile-boot
Dec 27 13:41:54  kernel: pid 2131 (zpool), jid 0, uid 0: exited on signal 6 (core dumped)
```

UPD2. Looks like I've hit https://github.com/openzfs/zfs/issues/12559, and the fix is not in the kernel yet. So beware of zstd!


----------



## tarkhil (Dec 28, 2021)

An attempt to recover with
`zpool import -R /mnt -o readonly -f -N -FX  iile`

resulted in a panic. However, `zdb -l` shows some pretty healthy uberblocks. Looking for help from ZFS developers.


----------



## grahamperrin@ (Dec 28, 2021)

`zfs --version`
`sysrc -f /boot/loader.conf zfs_load openzfs_load`

Also, maybe not directly relevant at the moment, but always good to know: 

`uname -aKU`


----------



## diizzy (Dec 28, 2021)

One of your drives seems dead?


----------



## grahamperrin@ (Dec 28, 2021)

tarkhil said:


> … like I've hit https://github.com/openzfs/zfs/issues/12559 and this issue is not in kernel yet. …



From GitHub there:



> … fixed in PR #12177. …



– _Livelist logic should handle dedup blkptrs #12177_

<https://cgit.freebsd.org/src/commit/?id=86b5f4c121885001a472b2c5acf9cb25c81685c9> (2021-06-07)



> Livelist logic should handle dedup blkptrs



<https://github.com/freebsd/freebsd-src/commit/86b5f4c121885001a472b2c5acf9cb25c81685c9> shows the commit in _main_.

<https://forums.freebsd.org/posts/536019> tarkhil mentioned _13.0RC1_; forum searches find no mention of STABLE or CURRENT, so let's assume:

- 13.0-RELEASE in this case
- patch level not yet known
- <https://bokut.in/freebsd-patch-level-table/#releng/13.0>


----------



## ralphbsz (Dec 28, 2021)

Recover the data? Theoretically perhaps, practically difficult. Not without developers with internals knowledge. That becomes a tradeoff between the value of the data, and the value of the time of a developer.

How did this happen? What kernel / ZFS version were you running when it happened? Did you have dedup and/or compression enabled? Were you creating/destroying datasets? This kind of information is useful for two reasons: (a) to help people identify what the root cause could be (which bug, which hardware failure), which might help you avoid the problem in the future; (b) to help others decide how to run their systems. For example, if you found this problem in FreeBSD version 314-PI, then I might decide to instead run version 2718-E. Or I might avoid using "squish" compression, and use "stomp" instead.


----------



## blind0ne (Dec 28, 2021)

Really, this is why I always hear "don't use software RAID"; after posts like this the scare factor is much bigger. On the other hand, I've heard of a few projects that have been running ZFS for a decade or two and are fine, as far as I've checked.
I don't know how lucky or smart you have to be to get either outcome; I hope the OP won't lose data.


----------



## grahamperrin@ (Dec 29, 2021)

ralphbsz said:


> … avoid the problem …



There's discussion of avoidance in two or more of the issues in GitHub.


----------



## tarkhil (Dec 29, 2021)

diizzy said:


> One of your drives seems dead?


That drive died long ago; that's not THE problem.


ralphbsz said:


> Recover the data? Theoretically perhaps, practically difficult. Not without developers with internals knowledge. That becomes a tradeoff between the value of the data, and the value of the time of a developer.
> 
> How did this happen? What kernel / ZFS version were you running when it happened? Did you have dedup and/or compression enabled? Were you creating/destroying datasets? This kind of information is useful for two reasons: (a) to help people identify what could be the root cause (which bug, which hardware failure), which might help you avoid the problem in the future; (b) to help others deciding how to run their system. For example, if you found this problem in FreeBSD version 314-PI, then I might decide to instead run version 2718-E. Or I might avoid using "squish" compression, and use "stomp" instead.


FreeBSD 13.0, no dedup; compression was on, but I can't recall if it was lz4 or zstd. I've thought of trying FreeBSD 12.2 to read the pool, yes.


blind0ne said:


> Really, why I always hear "don't use software raid's", after such posts the "scare" is much more larger. On the other hand heard about few projects that already few decades or less running zfs and are fine as far as I "checked".
> Don't know how lucky or smart you should be to run both variants, hope 'op' won't lost data.


The problem is not software RAID; ZFS somehow exploded on its own. I was able to import one pool, but that was the boot one, so the data on it is of very little value.


grahamperrin said:


> There's discussion of avoidance in two or more of the issues in GitHub.


Avoiding the problem is the best solution. Can I rent your time machine?


----------



## tarkhil (Dec 29, 2021)

`/usr/local/sbin/zdb -u -l /dev/vtbd1p3 > /root/uber.txt`

shows some pretty healthy uberblocks. After the panic, the pool seems to be a bit more alive.

Now I'll try the txgs listed in the `zdb -u` output.
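For anyone repeating this, harvesting the candidate txgs from saved `zdb -u -l` output can be scripted. This is only a sketch: the uberblock text below is fabricated sample data (3110443 is the txg seen earlier in this thread; 3110401 is made up for illustration), and the field layout is simply what zdb prints.

```shell
# Sketch (fabricated sample data): pull candidate txg values out of saved
# `zdb -u -l` output, newest first, for later use with `zpool import -T <txg>`.
cat > /tmp/uber.txt <<'EOF'
Uberblock[0]
        magic = 0000000000bab10c
        version = 5000
        txg = 3110443
        timestamp = 1640606512 UTC
Uberblock[1]
        magic = 0000000000bab10c
        version = 5000
        txg = 3110401
        timestamp = 1640606498 UTC
EOF
# Print each "txg = N" value, descending, without duplicates.
awk '$1 == "txg" {print $3}' /tmp/uber.txt | sort -rn | uniq
```

In a real run, `/tmp/uber.txt` would be the file produced by `zdb -u -l /dev/vtbd1p3 > /root/uber.txt` as above.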


----------



## grahamperrin@ (Dec 29, 2021)

tarkhil said:


> Can I rent your time machine?



;-) the avoidance advice was intended for other readers.



grahamperrin said:


> `zfs --version`
> `sysrc -f /boot/loader.conf zfs_load openzfs_load`
> `uname -aKU`



Compare with <https://forums.freebsd.org/threads/...s-freebsd-13-0-release-use.81726/#post-548577>



tarkhil said:


> … Looking for *help from ZFS developers*.



For data recovery in your case, I'm no expert, but I reckon:

- current data should be recoverable, with or without an import – excluding data that was, or should have been, intentionally destroyed; zfs-destroy(8)
- snapshot content might be trickier – gut feeling.


----------



## tarkhil (Dec 29, 2021)

grahamperrin said:


> ;-) the avoidance advice was intended for other readers.
> 
> 
> 
> ...


```
root@gw:/usr/ports # zfs --version
zfs-2.0.0-FreeBSD_gf11b09dec
zfs-kmod-v2021121500-zfs_f291fa658
root@gw:/usr/ports # sysrc -f /boot/loader.conf zfs_load openzfs_load
sysrc: unknown variable 'zfs_load'
openzfs_load: YES
root@gw:/usr/ports # uname -aKU
FreeBSD gw.iile.ru 13.0-RELEASE FreeBSD 13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f: Fri Apr  9 04:24:09 UTC 2021     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64 1300139 1300139
```

I don't care about the zfs destroy'ed data; it was meant to be destroyed. I'll continue the experiments; I have a copy of the data, and the New Year vacation, anyway.


----------



## tarkhil (Dec 29, 2021)

Okay, after some panics and attempts, zdb shows plenty of data, files, directories.

But

`/usr/local/sbin/zpool import -o readonly -R /mnt -f -N iile`

produces this in the logs:

```
Dec 29 15:42:04 recover kernel: vtbd1: hard error cmd=write 1048577080-1048577095
Dec 29 15:42:04 recover kernel: vtbd1: hard error cmd=write 11721044024-11721044039
Dec 29 15:42:04 recover kernel: vtbd1: hard error cmd=write 11721044536-11721044551
Dec 29 15:42:04 recover ZFS[780]: vdev I/O failure, zpool=iile path=/dev/gpt/iile-0 offset=270336 size=8192 error=5
Dec 29 15:42:04 recover ZFS[784]: vdev I/O failure, zpool=iile path=/dev/gpt/iile-0 offset=5464303345664 size=8192 error=5
Dec 29 15:42:04 recover ZFS[788]: vdev I/O failure, zpool=iile path=/dev/gpt/iile-0 offset=5464303607808 size=8192 error=5
Dec 29 15:42:04 recover ZFS[792]: vdev probe failure, zpool=iile path=/dev/gpt/iile-0
Dec 29 15:42:04 recover ZFS[796]: vdev state changed, pool_guid=4721818964728306628 vdev_guid=1211236719488683795
Dec 29 15:42:04 recover ZFS[800]: vdev problem, zpool=iile path= type=ereport.fs.zfs.vdev.no_replicas
Dec 29 15:42:04 recover ZFS[804]: failed to load zpool iile
Dec 29 15:42:04 recover ZFS[808]: failed to load zpool iile
Dec 29 15:42:04 recover ZFS[812]: failed to load zpool iile
```

The disk is pretty healthy; dd copied it without a problem.

```
# /usr/local/sbin/zpool import -o readonly -R /mnt -f -F -N iile

Dec 29 15:45:21 recover kernel: vtbd1: hard error cmd=write 11721044024-11721044039
Dec 29 15:45:21 recover kernel: vtbd1: hard error cmd=write 11721044536-11721044551
Dec 29 15:45:21 recover kernel: vtbd1: hard error cmd=write 1048577080-1048577095
Dec 29 15:45:21 recover ZFS[842]: vdev state changed, pool_guid=4721818964728306628 vdev_guid=1211236719488683795
```

Is it possible (not just "in theory", but with an actual description of how) to read files with zdb?


----------



## grahamperrin@ (Dec 29, 2021)

Thanks,



tarkhil said:


> `ea31abc261f`



<https://cgit.freebsd.org/src/commit/?id=ea31abc261f&h=releng/13.0> was a few months ago; if that's your usual boot pool, I'd recommend updating the base OS.


----------



## grahamperrin@ (Dec 29, 2021)

If you're confident that there's not a hardware issue, have you tried an extreme rewind?

(Proceed with caution; I haven't done so for years.)



tarkhil said:


> Is it possible (not "in theory", but with some description) to read files with zdb?



Does <https://www.google.com/search?q=zdb+recover+import+site:reddit.com/r/zfs/&tbs=li:1#unfucked> or some other combination of those search phrases lead to anything relevant? (I took a quick look at a handful, nothing immediately promising.)

If you have not already seen it: Turbocharging ZFS Data Recovery | Delphix (2018-03-14)



> Besides being able to display the new debug information, zdb has another new feature …



– some discussion of extreme rewinds, although that's not how I found the post.

<https://openzfs.github.io/openzfs-docs/man/8/zdb.8.html> or zdb(8).


----------



## tarkhil (Dec 29, 2021)

grahamperrin said:


> If you're confident that there's not a hardware issue, have you tried an extreme rewind?
> 
> (Proceed with caution; I haven't done so for years.)
> 
> ...


I've copied the data off the disk and am now running
`/usr/local/sbin/zpool import -R /mnt -o readonly -f -N -FX iile`

on the latest OpenZFS from ports. It has been running for 3.5 hours: no panic, no failure, no visible result (yet). Anyway, the image is on ZFS and snapshotted, so I can always roll back and start over.


----------



## grahamperrin@ (Dec 29, 2021)

Thanks, do I understand this correctly?

`zfs-2.1.99-1` and `zfs-kmod-v2021121500-zfs_f291fa658` were enough to import (read only) with an `-X` extreme rewind, after the inferior versions _failed_ to import with (outdated) `13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f`

(Sorry, I'm being slightly lazy with my readings of your notes, without looking properly at the manual pages!)


----------



## tarkhil (Dec 30, 2021)

grahamperrin said:


> Thanks, do I understand this correctly?
> 
> `zfs-2.1.99-1` and `zfs-kmod-v2021121500-zfs_f291fa658` were enough to import (read only) with an `-X` extreme rewind, after the inferior versions _failed_ to import with (outdated) `13.0-RELEASE #0 releng/13.0-n244733-ea31abc261f`
> 
> (Sorry, I'm being slightly lazy with my readings of your notes, without looking properly at the manual pages!)


Yes. But unfortunately only for one, unimportant pool. The second one crashed after several hours of importing; now I'm trying to rewind to the oldest transaction found by zdb.


----------



## grahamperrin@ (Dec 30, 2021)

tarkhil said:


> … Looking for help from ZFS developers. …



I pinged a couple of chat areas (Discourse and IRC), where those experts are likeliest to hang out, with reference to your most recent post:

<https://matrix.to/#/!onUfzoJXZaAXjP...?via=libera.chat&via=matrix.org&via=nixos.org>
<https://discord.com/channels/727023752348434432/757305697527398481/926040503936507954>



tarkhil said:


> … now I'm trying to rewind to the oldest transaction found by zdb.



I'll edit responses from IRC and Discord, as they arrive, into this post. (Don't want to flood this topic.) Here, pasted with permission:



> Recent zdb can just copy out whole files, …
> 
> zdb -r
> 
> ...





> I'm also curious because I've never actually seen "integrity check failed", so I'm guessing that's a FBSD-specific error code
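For what it's worth, the `zdb -r` mentioned in that reply copies a single file out of a pool without importing it. A hypothetical dry-run sketch (the dataset and file paths below are placeholders, not from this thread; the `echo` keeps this from touching anything):

```shell
# Hypothetical dry run: recent OpenZFS zdb can extract one file from an
# exported pool (-e) via -r <dataset> <path-in-dataset> <destination>.
# Dataset and paths are placeholders; drop the echo to run it for real.
echo zdb -e -r iile/somedataset some/dir/file.txt /tmp/file.txt
```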


----------



## tarkhil (Dec 30, 2021)

Ultimately, it helped. I don't know how much data I've lost, and I actually don't care; I have the most important data intact (or I hope to find it intact after copying it from the R/O-mounted pool).
So, `zdb -u -l ...`
gives a list of possible txgs, and
`/usr/local/sbin/zpool import -R /mnt -o readonly -f -N -FX -T txg pool`
after some time returns a seemingly working pool.
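Pulling the thread's recipe together, the rewind attempts can be looped. This is a dry-run sketch only: the pool name and txg list are placeholders (a real run would take txgs from `zdb -u -l`), and the `echo` must be removed to actually attempt each import.

```shell
# Dry-run sketch of the recovery loop: try progressively older txgs with an
# extreme rewind. POOL and the txg values are placeholders; remove the echo
# to attempt real imports.
POOL=iile
for txg in 3110443 3110401; do
    echo /usr/local/sbin/zpool import -R /mnt -o readonly -f -N -FX -T "$txg" "$POOL"
done
```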

And again I think that ZFS data protection was inspired by the defence of Stalingrad.


----------



## grahamperrin@ (Dec 30, 2021)

```
________________________________
< cowsay cow c o w copy on write >
 --------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
```


----------

