# ZFS and SMART



## pww (Jun 15, 2012)


I am using FreeBSD 8.2 with a 4-drive ZFS pool. SMART is telling me that one drive in the pool has an "offline_uncorrectable" sector during the short-offline self-test. (It seems to be somewhere in unused space since a zfs scrub doesn't cause drive errors or the sector to be spared out.)
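For reference, this is roughly how the check looks with `smartctl` from sysutils/smartmontools; `/dev/ada0` is just a placeholder for the suspect drive:

```shell
# Run the short self-test; it takes a couple of minutes.
# /dev/ada0 is a placeholder -- substitute your own device node.
smartctl -t short /dev/ada0

# Afterwards, look at the self-test log and the relevant attributes.
smartctl -l selftest /dev/ada0
smartctl -A /dev/ada0 | grep -E 'Offline_Uncorrectable|Reallocated|Pending'
```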

Anyway, I'm thinking of doing the following to see if I can clear the error:

1. `zpool offline` the drive
2. `dd` over the affected area (to cause the drive to spare out the sector)
3. `zpool online` the drive
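Sketched as commands, that plan would look something like the following. Every name here is a placeholder: pool `tank`, disk `ada1`, and the LBA `123456789`, which you would take from the smartctl self-test log. Double-check the device node before writing to it:

```shell
# Placeholder pool/device/LBA -- verify all three before running.
zpool offline tank ada1

# Overwrite the failing sector so the drive remaps it.
# bs=512 assumes 512-byte logical sectors; use 4096 on 4Kn drives.
dd if=/dev/zero of=/dev/ada1 bs=512 count=1 seek=123456789

zpool online tank ada1

# ZFS resilvers the stale interval automatically; a scrub then verifies.
zpool scrub tank
zpool status tank
```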

Is this a good idea? Will the zpool need to be manually resilvered or scrubbed? (And yes, I have backups, but I'd rather not incur system downtime through my own stupid mistake.)

Thanks
-phil


----------



## gkontos (Jun 15, 2012)

A resilver should happen automatically once you put the drive back in. You will need to reinitialize the HAST resource, though.

My suggestion is to fail over before you perform all these tasks, in order to avoid unnecessary sync traffic.
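With hastctl(8), the failover and reinitialization would look roughly like this; the resource name `storage` is a placeholder, and each command runs on the node indicated in the comments:

```shell
# Placeholder HAST resource name "storage".
# On the current primary: step down so the peer can take over.
hastctl role secondary storage

# On the peer, which becomes the new primary:
hastctl role primary storage

# After the disk work, on the repaired node: reinitialize the local
# HAST metadata and rejoin as secondary so it resyncs from the peer.
hastctl create storage
hastctl role secondary storage
```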

George


----------



## wblock@ (Jun 15, 2012)

SMART is known for being overly optimistic. If it's reporting errors, it's time to swap the drive out. Then you can work on it at leisure rather than hoping it won't fail in use.
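If you do swap it, the replacement itself is a one-liner; pool and device names below are placeholders:

```shell
# Placeholders: pool "tank", failing disk ada1, replacement disk ada4.
# ZFS resilvers onto the new disk automatically.
zpool replace tank ada1 ada4
zpool status tank   # watch the resilver progress
```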


----------



## badtux (Jun 16, 2012)

The sector in question will be spared when a write occurs, so no need to do any shenanigans to make that happen. In general all consumer drives come from the manufacturer with sectors that cannot be read and will give the SMART error in question when you tell the drive to do its full self-diagnostics (which reads every sector of the drive). This is not an issue as long as you don't try to read a sector that hasn't been written first -- and ZFS doesn't. As long as you don't see an alarming number of relocations happening on writes (check your SMART data) you're fine.

Why doesn't the drive spare the sector itself when it proves unable to read it? If it's a "RAID" drive, it expects the RAID subsystem to handle it: when that stripe gets read and returns a read error, a good RAID system rebuilds the sector from the stripes on the other drives and writes it back with the correct data, thereby avoiding a full RAID rebuild. If the drive spared the sector itself before a write happened to it, it would silently corrupt the stripe by replacing a sector with a blank one rather than with the correct data that the RAID system would have written.


----------



## wblock@ (Jun 16, 2012)

badtux said:

> The sector in question will be spared when a write occurs, so no need to do any shenanigans to make that happen. In general all consumer drives come from the manufacturer with sectors that cannot be read and will give the SMART error in question when you tell the drive to do its full self-diagnostics (which reads every sector of the drive).



That has not been my experience (I did not see existing remapped sectors reported as errors when running the long tests). There are two kinds of bad sectors. Drives have bad sectors that are mapped out at the factory; those I don't worry about, since they are just manufacturing defects. The ones that bother me are "grown" errors that suddenly appear. Maybe there will be just one, or maybe a little part of the disk surface peeling off foretells a bigger part of the surface peeling off. If the drive is not in use, you can grab the SMART stats, run a long test (or just dd(1) data to the whole drive), then get the stats again to see whether there are any new relocated sectors. Then decide whether it's a good idea to put that drive back in use, even in a RAID.
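A rough sketch of that before/after comparison, assuming the drive is `/dev/ada1` (a placeholder) and is out of service, since the `dd` below destroys its contents:

```shell
# Placeholder device /dev/ada1 -- this WIPES the drive.
smartctl -A /dev/ada1 > before.txt

# Either run the long self-test...
smartctl -t long /dev/ada1
# ...or write the whole drive, which forces remaps of pending sectors:
dd if=/dev/zero of=/dev/ada1 bs=1m

smartctl -A /dev/ada1 > after.txt
# Any growth in these counters is a newly relocated ("grown") sector.
diff before.txt after.txt | grep -E 'Reallocated|Pending|Uncorrect'
```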


----------



## badtux (Jun 16, 2012)

wblock, one of my past jobs was qualifying hard drives for a vendor of NAS systems. Every single component had to pass my qualification tests before it went into our NAS systems; I believe we were the first vendor to use "enterprise grade" SATA drives in a NAS system. One of the things my initial qualification test did was read every single sector of the drive. What I found was that approximately 40% of the hard drives that came into our shop from both of the major vendors we worked with -- Seagate and Western Digital -- had at least one bad sector on the initial read, right out of the box, and failed the initial qualification test I had devised.

Once we ran a disk scrub on the drives (i.e., wrote zeros to every sector), those sectors were replaced from the relocation list and all was well; they passed further tests without any issues. So the qualification test was modified to do the final read test after the scrub, check the SMART relocation counts, and reject only those drives whose relocation count exceeded our strict standards, rather than requiring a zero error count fresh from the manufacturer. (Note that this process ran for roughly three days straight -- yes, three days of reading and writing patterns to the hard drives before they were allowed to ship.)

In case you're wondering, we had 1 (one) system arrive DOA at a customer site during the entire time I was in charge of manufacturing quality there, and we don't know what happened out at the customer site, because when it came back it booted up immediately when we plugged it in. That's how persnickety I am about quality -- out of the first two hundred systems we shipped, not a single "real" DOA.

In any event, SMART implementations vary wildly between hard drive vendors, so it's always iffy to read too much into what they report unless you have extensive experience with that specific vendor (as in, you've qualified hundreds of drives of that type in the past). If read errors happen *WHEN READING DATA THAT HAS ALREADY BEEN WRITTEN*, then I view the hard drive as pretty much toast. If it's in service in a storage array, the RAID attempts to rebuild that stripe in the hope that the bad block will get relocated, logs the error, and once a certain number of these errors accumulate, we eventually fail the drive. But if read errors happen on sectors that have never been written, my experience is that this is completely and totally normal for SATA hard drives, "enterprise" or not, and the correct way to avoid it, if it matters for a specific application, is to scrub the drives before putting them into service.


----------



## pww (Jun 17, 2012)

Still, though, is it safe to "offline" the drive, *dd* part of it, and then "online" the drive? Or is there something else I need to do to make the drive look "new" to the zpool before reintroducing it?

Thanks 
-phil


----------



## badtux (Jun 18, 2012)

ZFS will handle writing it before reading it. No need to second-guess ZFS.


----------

