# ZFS: One disk has faulted but I can't see spare disk



## ghostcorps (Jun 1, 2013)

Hi guys,

 I have found that my pool is currently degraded due to a disk faulting. I am currently scrubbing the pool but I am concerned about what I need to do to replace it if the scrub does not do the trick.

 I have 8 x 2 TB SATA disks on a RAID controller. I have been using 7 of them, with one spare (da2-da9). However, now that I may need to replace da6, I can't seem to find da9.

 Is it possible that da6 has already been replaced by da9?

`bsd# zpool status -v`

```
pool: datastore
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
  scan: scrub in progress since Sat Jun  1 17:20:33 2013
        69.8G scanned out of 8.52T at 66.8M/s, 36h51m to go
        0 repaired, 0.80% done
config:

        NAME                     STATE     READ WRITE CKSUM
        datastore                DEGRADED     0     0     0
          raidz2-0               DEGRADED     0     0     0
            da2                  ONLINE       0     0     0
            da3                  ONLINE       0     0     0
            da4                  ONLINE       0     0     0
            da5                  ONLINE       0     0     0
            9757540748441121428  FAULTED      0     0     0  was /dev/da6
            da6                  ONLINE       0     0     0
            da7                  ONLINE       0     0     0

errors: No known data errors
```


----------



## Toast (Jun 1, 2013)

`# zpool replace <pool name> <dead disk> <new disk>`

```
zpool replace [-f] pool device [new_device]

         Replaces old_device with new_device.  This is equivalent to attaching
         new_device, waiting for it to resilver, and then detaching
         old_device.

         The size of new_device must be greater than or equal to the minimum
         size of all the devices in a mirror or raidz configuration.

         new_device is required if the pool is not redundant. If new_device is
         not specified, it defaults to old_device.  This form of replacement
         is useful after an existing disk has failed and has been physically
         replaced. In this case, the new disk may have the same /dev path as
         the old device, even though it is actually a different disk.  ZFS
         recognizes this.

         -f      Forces use of new_device, even if it appears to be in use.
                 Not all devices can be overridden in this manner.
```


----------



## ghostcorps (Jun 1, 2013)

Thanks @Toast,

I have this command ready to go if I need it. The problem is that I should be able to see 8 x SATA disks, da2-da9, however I can only see 7 of them.

`# dmesg`

```
...
da2 at hpt27xx0 bus 0 scbus2 target 0 lun 0
da2: <HPT DISK 0_0 4.00> Fixed Direct Access SCSI-0 device
da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da3 at hpt27xx0 bus 0 scbus2 target 1 lun 0
da3: <HPT DISK 0_1 4.00> Fixed Direct Access SCSI-0 device
da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da4 at hpt27xx0 bus 0 scbus2 target 2 lun 0
da4: <HPT DISK 0_2 4.00> Fixed Direct Access SCSI-0 device
da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da5 at hpt27xx0 bus 0 scbus2 target 3 lun 0
da5: <HPT DISK 0_3 4.00> Fixed Direct Access SCSI-0 device
da5: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da6 at hpt27xx0 bus 0 scbus2 target 4 lun 0
da6: <HPT DISK 0_4 4.00> Fixed Direct Access SCSI-0 device
da6: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da7 at hpt27xx0 bus 0 scbus2 target 5 lun 0
da7: <HPT DISK 0_5 4.00> Fixed Direct Access SCSI-0 device
da7: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da8 at hpt27xx0 bus 0 scbus2 target 6 lun 0
da8: <HPT DISK 0_6 4.00> Fixed Direct Access SCSI-0 device
da8: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
...
```

What are the chances that the faulted disk is not even being detected, so the spare is being picked up as da6, and it is showing as faulted because it has not been imported?


----------



## ghostcorps (Jun 1, 2013)

Update:

The RAID card's BIOS is showing that one disk is no longer available. So it seems pretty conclusive that the old /dev/da6 is dead.

I can fix it with this:

```
# zpool export datastore
# zpool replace datastore 9757540748441121428 <Spare>
```

But how do I work out which drive is the `<Spare>`? I can see /dev/da2-/dev/da8 but I don't know if da9 became da6 or da8.


----------



## da1 (Jun 1, 2013)

Hi,

I think you can replace a disk without exporting/importing the pool.

To know which disk is the spare, I would first check the order of the HDDs in the controller BIOS and, from there, check the disks in the OS (e.g. if the controller reports disk5 as the spare, go to the OS and count to the 5th disk; that is your spare).

PS: You might want to think about using labels from now on as they do make life easier.
Also, to see the disks you can use:
`# camcontrol devlist`

PS(2): Rather than defining a spare in the controller BIOS, I would configure all disks as standalone and then create the ZFS pool with a spare. This way, `zpool status <pool_name>` actually shows which disk is the spare, making it easier to manage.
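For illustration, that layout might be set up like this (a sketch only: the device names, label names and the 7+1 split are assumptions based on this thread, not a tested recipe):

```shell
# Give each disk a GPT label so it keeps a stable name even if the
# controller renumbers the da devices (repeat for da3..da9):
gpart create -s gpt da2
gpart add -t freebsd-zfs -l disk0 da2

# Build the pool from the labels, declaring the last disk as a hot
# spare so it is tracked by ZFS rather than hidden in the controller:
zpool create datastore raidz2 gpt/disk0 gpt/disk1 gpt/disk2 \
    gpt/disk3 gpt/disk4 gpt/disk5 gpt/disk6 \
    spare gpt/disk7

# The spare now appears under its own 'spares' section:
zpool status datastore
```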


----------



## ghostcorps (Jun 1, 2013)

Thanks da1,

Funnily enough I just started using labels on another build. It makes life so much easier 

I think you have hit the nail on the head. Disk_5 on the controller is now missing, and that disk counted out to da6. So I am assuming that now da7 => da6, da8 => da7 and da9 => da8.

I can run this:

`# zpool replace datastore 9757540748441121428 da8`

I read somewhere that there was a flag to specify a live pool but I assume that this comes with its own problems.
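One way to double-check which physical disk is which, instead of counting ports, is to ask CAM directly (a sketch; behind a HighPoint HBA the reported identity may be the controller's virtual disk name, e.g. "HPT DISK 0_6", rather than the drive's own serial):

```shell
# List every device CAM sees, with its controller/target position:
camcontrol devlist

# Issue an INQUIRY to one device to see vendor/product/serial data,
# which can be matched against the controller BIOS listing:
camcontrol inquiry da8
```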


----------



## da1 (Jun 1, 2013)

I'm not aware of any flag like that. A simple replace command will do the job.


----------



## ghostcorps (Jun 1, 2013)

Cool.

I can't find it now, I must have been mistaken. 

So it is doing its thing, but 144 hours? Really?


```
bsd# zpool status
  pool: datastore
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jun  2 00:12:36 2013
        3.72G scanned out of 8.52T at 17.2M/s, 144h6m to go
        541M resilvered, 0.04% done
config:

        NAME                       STATE     READ WRITE CKSUM
        datastore                  DEGRADED     0     0     0
          raidz2-0                 DEGRADED     0     0     0
            da2                    ONLINE       0     0     0
            da3                    ONLINE       0     0     0
            da4                    ONLINE       0     0     0
            da5                    ONLINE       0     0     0
            replacing-4            UNAVAIL      0     0     0
              9757540748441121428  FAULTED      0     0     0  was /dev/da6
              da8                  ONLINE       0     0     0  (resilvering)
            da6                    ONLINE       0     0     0
            da7                    ONLINE       0     0     0

errors: No known data errors
```


----------



## da1 (Jun 1, 2013)

It depends on the write speed (it will increase over time, you will see), but on average I think it should take less time than scrubbing.


----------



## Terry_Kennedy (Jun 2, 2013)

ghostcorps said:

> So it is doing its thing, but 144 hours? Really?


Are you using deduplication, by any chance? How much memory is installed in the system?

You can always throw money at it: 


```
(0:29) rz3:/sysprog/terry# zpool status
  pool: data
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Feb 27 14:23:19 2013
        5.36T scanned out of 17.8T at 3.56G/s, 0h59m to go
        0 resilvered, 30.18% done

        ...
```


----------



## ghostcorps (Jun 2, 2013)

Terry_Kennedy said:
			
		

> Are you using deduplication, by any chance? How much memory is installed in the system?



No deduplication that I know of.

This is what my memory situation looks like

`# top`

```
...
Mem: 60M Active, 309M Inact, 2453M Wired, 61M Cache, 315M Buf, 80M Free
Swap:
...
```

`# dmesg | grep memory`

```
real memory  = 4294967296 (4096 MB)
avail memory = 3084144640 (2941 MB)
```


Memory is another issue I am looking at atm. I screwed up a recent recovery and mislabelled the swap! lol It is fixed and waiting for a reboot. The ETA is down to 35 hours now, but I get paranoid about stopping recovery processes, so I'll just wait it out.


----------



## Terry_Kennedy (Jun 2, 2013)

ghostcorps said:

> ```
> Mem: 60M Active, 309M Inact, 2453M Wired, 61M Cache, 315M Buf, 80M Free
> Swap:
> ```



You're running 9 TB of ZFS on an i386 with 3 GB of RAM? No wonder it's slow!

I think the developers should add a warning (similar to the WITNESS warning on -CURRENT) which displays if someone tries to load zfs.ko on i386 architecture. It's probably too late now, though.


----------



## ghostcorps (Jun 2, 2013)

Terry_Kennedy said:

> You're running 9TB of ZFS on an i386 with 3GB of RAM? No wonder it's slow!
> 
> I think the developers should add a warning (similar to the WITNESS warning on -CURRENT) which displays if someone tries to load zfs.ko on i386 architecture. It's probably too late now, though.



Lol 

It is an amd64 

`# sysinfo -a`

```
CPU information

Machine class:  amd64
CPU Model:      Intel(R) Core(TM)2 Duo CPU     E4600  @ 2.40GHz
No. of Cores:   2
Cores per CPU:
```

And it should be 4 GB of RAM, but you are right, it is ancient.


I have never had any issue with its performance or capability until now, and even this seems to be due to the missing swap. It is my media server and has no trouble streaming HiDef MKVs, either raw or transcoded, while also running SABnzbd, rtorrent and a handful of other services. Even with a missing HDD and no swap, the stream only started skipping while the pool was being scrubbed. This was all while Apache was reserving a gigabyte and a half all to itself! (Since been fixed.)

I'll migrate to my old game PC once this is all smoothed out. That should keep me rocking for a while.


----------



## phoenix (Jun 2, 2013)

Scrub and resilver work in the order the data was written to the pool (by transaction ID). So, if you have an old pool with lots of deletes and creates, and lots of snapshot creates and deletes, then your pool will be fragmented (old data stored physically on disk ahead of new data, etc.).

Thus, it can be slow.

Remember also that you are limited by the write speed of the new disk. You can read data off all the other drives in parallel, but you are limited to a single write queue.
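That bottleneck is easy to see while the resilver runs: per-device statistics show the reads spread across the surviving disks while the writes funnel into the replacement (a sketch; the output layout varies by ZFS version):

```shell
# Per-device bandwidth for the pool, sampled every 5 seconds.  During
# a resilver, the replacement disk (da8 here) takes nearly all of the
# write bandwidth while the other members mostly read:
zpool iostat -v datastore 5
```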


----------



## ghostcorps (Jun 3, 2013)

phoenix said:

> You can read data off all the other drives in parallel, but you are limited to a single write queue.



Thanks 

This is very good to know! I was trying to work out the impact of reading/writing while resilvering/scrubbing. I'll stick to reading only for now. Though there are only three hours to go, apparently.

`# zpool status`

```
pool: datastore
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jun  2 00:12:36 2013
        7.89T scanned out of 8.52T at 58.3M/s, 3h8m to go
        1.13T resilvered, 92.61% done
config:

        NAME                       STATE     READ WRITE CKSUM
        datastore                  DEGRADED     0     0     0
          raidz2-0                 DEGRADED     0     0     0
            da2                    ONLINE       0     0     1  (resilvering)
            da3                    ONLINE       0     0    37  (resilvering)
            da4                    ONLINE       0     0     0
            da5                    ONLINE       0     0     0
            replacing-4            UNAVAIL      0     0     0
              9757540748441121428  FAULTED      0     0     0  was /dev/da6
              da8                  ONLINE       0     0     0  (resilvering)
            da6                    ONLINE       0     0    28  (resilvering)
            da7                    ONLINE       0     0     0

errors: No known data errors
```


----------



## HarryE (Jun 3, 2013)

I would check da2, da3 and da6 with `smartctl` after the resilvering ends. There are checksum errors on them.
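A quick way to run that check over all three disks (a sketch; with the HighPoint controller in this thread, smartctl's `-d hpt,...` device type may be needed instead of addressing /dev/daX directly):

```shell
# Print SMART attributes and health for each disk that showed
# checksum errors during the resilver:
for d in da2 da3 da6; do
    echo "=== /dev/$d ==="
    smartctl -a /dev/$d
done
```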


----------



## ghostcorps (Jun 3, 2013)

Will do, thanks for the heads up.


----------



## ghostcorps (Jun 3, 2013)

I have finished replacing the dead drive with no trouble. 

When I check the drives, they all return the following:

`# smartctl -a /dev/da2`

```
smartctl 6.1 2013-03-16 r3800 [FreeBSD 8.3-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HPT
Product:              DISK 0_0
Revision:             4.00
User Capacity:        2,000,398,934,016 bytes [2.00 TB]
Logical block size:   512 bytes
(pass2:hpt27xx0:0:0:0): MODE SENSE(6). CDB: 1a 0 84 0 40 0
(pass2:hpt27xx0:0:0:0): CAM status: CCB request was invalid
(pass2:hpt27xx0:0:0:0): MODE SENSE(6). CDB: 1a 0 1c 0 40 0
(pass2:hpt27xx0:0:0:0): CAM status: CCB request was invalid
>> Terminate command early due to bad response to IEC mode page
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
```

I'm chasing up a few things now.

I was advised that hpt27xx.ko was much better than the RocketRaid driver that HighPoint puts out (rr272x_1x.ko). Does anyone have an alternative opinion?


----------



## da1 (Jun 3, 2013)

You have to specify the controller type to smartctl. Example `smartctl -a -d 3ware,0 /dev/sda`.


----------



## ghostcorps (Jun 3, 2013)

Ahh, that would explain it. Stupid mistake, sorry. The checksum errors have mysteriously disappeared since rebooting, but I still ran the tests. The following came back with no errors logged on all 7 disks:

`# smartctl -a -d hpt,1/1 /dev/hpt27xx`

I'll call this one solved. Thanks for all your help guys.  As always, I have learned a lot 

PS: The RocketRaid drivers on the HighPoint site are the devil; they are not 8.3 compatible.


----------



## HarryE (Jun 3, 2013)

Check the power cable connectors, too. Due to the cheap materials used these days, the contacts get rather loose, which, combined with vibration and heat, results in disk errors. (Thank you, ZFS, for noticing them.)
Been there, done that.


----------

