# ZFS bug or metadata corruption?



## boblog (Jun 21, 2010)

Hello

My ZFS pool has become degraded and I'm having difficulty recovering it.


```
        NAME        STATE     READ WRITE CKSUM
        storage     DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            ad2     ONLINE       0     0     0
            ad14    ONLINE       0     0     0
            ad14    FAULTED      0   389     0  corrupted data
          raidz1    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad20    ONLINE       0     0     0
```

Previously, the raidz1 vdev that now shows two ad14 devices had an ad18 device in it.

Running zpool replace ad18 ad10 fails with a complaint that the device is not in the pool.

I added a spare and tried replacing ad14, hoping it would replace the faulted device. No such luck: the spare attached to the healthy ad14 instead.


```
        NAME        STATE     READ WRITE CKSUM
        storage     DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            ad2     ONLINE       0     0     0
            spare   ONLINE       0     0     0
              ad14  ONLINE       0     0     0
              ad4   ONLINE       0     0     0
            ad14    FAULTED      0   389     0  corrupted data
          raidz1    ONLINE       0     0     0
            ad6     ONLINE       0     0     0
            ad12    ONLINE       0     0     0
            ad20    ONLINE       0     0     0
        spares
          ad4       INUSE     currently in use
```

If I pull the faulted device from the system, the pool status does not change.

Why did this happen, and can I recover?


----------



## boblog (Jun 23, 2010)

The solution was to add a spare and let it mirror the duplicated device, drop the original, then add a second drive and zpool replace the remaining "ghost" device. It took a few hours, but the array is now ONLINE.
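
For anyone landing here later, this is a rough sketch of one reading of that sequence. It only prints the commands rather than running them, so nothing touches a pool. The pool name and ad4 come from the status output above; using ad10 as the second drive, and `zpool detach` as the way to "drop the original", are assumptions on my part, and `plan` and `recover_plan` are made-up helper names.

```sh
#!/bin/sh
# Prints a possible command sequence; it does NOT execute anything.
# POOL and ad4 are from the thread; ad10 as the second drive is an assumption.
POOL=storage

# Hypothetical helper: echo a command line with a "+ " prefix.
plan() {
    echo "+ $*"
}

recover_plan() {
    plan zpool add "$POOL" spare ad4       # 1. add a hot spare
    plan zpool replace "$POOL" ad14 ad4    # 2. spare pairs up with the healthy ad14
    plan zpool status "$POOL"              #    check that the resilver has finished
    plan zpool detach "$POOL" ad14         # 3. drop the original ad14 (assumed command)
    plan zpool replace "$POOL" ad14 ad10   # 4. only the ghost ad14 is left to match
}

recover_plan
```

Which ad14 a bare name resolves to is exactly the ambiguity in this thread, so check zpool status between steps before running anything like this for real.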


----------



## Matty (Jun 23, 2010)

Did you offline the disk first?
AFAIK:
1. Offline the faulty disk:

```
zpool offline <pool> <disk>
```
2. Physically replace the disk and run:

```
zpool replace <pool> <disk>
```
3. The resilvering process should start.
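
To check whether step 3 has actually kicked off, one option is to look for the resilver line in zpool status. A minimal sketch, assuming the status text contains the phrase "resilver in progress" (worth confirming on your version); `resilver_running` is a made-up helper name.

```sh
#!/bin/sh
# Hypothetical helper: reads `zpool status` text on stdin and exits 0
# if it reports a running resilver, non-zero otherwise.
# Assumes the phrase "resilver in progress" appears in the status output.
resilver_running() {
    grep -q 'resilver in progress'
}

# Usage (not run here):
#   zpool status storage | resilver_running && echo "still resilvering"
```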


----------



## boblog (Jun 24, 2010)

I didn't dare, as the spare I used to replace ad14 attached itself to the online ad14.

I believe that if I had messed with ad14, I would have offlined one of the two remaining good drives in the vdev. I don't know what happened to ad18, the drive that became a "ghost" ad14.

I will have to run diagnostics next week.

This post looks somewhat related:
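
Since the whole problem is that one name points at two devices, it can help to spot that before running any offline or replace. A small sketch that reads zpool status text and prints any device name listed more than once in the config section. `dup_devices` is a made-up name, and the list of states and vdev labels it recognises is an assumption based on the output in this thread.

```sh
#!/bin/sh
# Hypothetical helper: reads `zpool status` text on stdin and prints every
# device name that appears more than once among the pool's vdevs.
dup_devices() {
    awk '
        $1 == "spares" { exit }   # stop before the spares list (END still runs)
        # Count rows that carry a device state, skipping grouping rows
        # such as raidz1, mirror, spare and replacing.
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
        $1 !~ /^(raidz[0-9]*|mirror|spare|replacing)(-[0-9]+)?$/ {
            seen[$1]++
        }
        END { for (d in seen) if (seen[d] > 1) print d }
    '
}

# Usage (not run here):
#   zpool status storage | dup_devices
```

Newer OpenZFS releases have zpool status -g to show vdev GUIDs instead of names. Whether the ZFS version in this thread lets you address a device by GUID is something I can't confirm.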

http://lists.freebsd.org/pipermail/freebsd-fs/2010-March/007895.html


----------

