# Question about damaged SATA drive



## pkc (Jan 2, 2014)

I have a 1 TB SATA drive which is known to be damaged somehow. About 12 hours ago I started copying 150 GB from a USB drive to this drive using cp, but so far only 50 GB have transferred. This makes sense.

However, the terminal session from which I initiated cp over SSH is almost unresponsive -- keyboard input receives a response after a minute or so. This is not an issue because I can just abort the transfer if I want, and in fact if I log in on a separate session the system is as responsive as usual, but I was just wondering what the technical reason would be for this.

Thanks.


----------



## ralphbsz (Jan 5, 2014)

Educated guess: the reason it is so slow is that the damaged drive has to retry some I/Os many, many times. During this time, the process is stuck in the kernel. It can easily take a disk drive a few seconds to retry an I/O (if it recalibrates). The minute you are seeing is a bit extreme; the only way I can explain it is that the disk drive itself spends a few seconds on each I/O, and then the kernel (probably some parts of the SATA and block device stack) retries the I/O another few times.
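A rough back-of-the-envelope sketch of that reasoning, in Python. The transfer figures (50 GB in 12 hours) come from the original post; the per-attempt time and the kernel retry count are made-up illustrative values, not measurements, and the function names are hypothetical.

```python
def effective_throughput_mb_s(bytes_copied: float, elapsed_seconds: float) -> float:
    """Average transfer rate in MB/s (decimal megabytes)."""
    return bytes_copied / elapsed_seconds / 1e6


def stall_seconds(drive_seconds_per_attempt: float, kernel_retries: int) -> float:
    """Time one I/O can block if the kernel reissues it after each failure.

    Total attempts = the first try plus every kernel retry, and each attempt
    is assumed to cost the drive the same amount of internal retry time.
    """
    return drive_seconds_per_attempt * (1 + kernel_retries)


# Figures from the post: 50 GB copied in about 12 hours.
rate = effective_throughput_mb_s(50e9, 12 * 3600)

# Assumed values: 5 s of drive-internal retries per attempt, 5 kernel retries.
stall = stall_seconds(5.0, 5)

print(f"average rate: {rate:.2f} MB/s")
print(f"one bad I/O can block for about {stall:.0f} s")
```

With these assumed numbers a single bad I/O blocks for about half a minute, so a couple of them back to back would be enough to account for the minute-long pauses.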

My usual rule of thumb is that I/Os should finish (at the disk drive level, not counting kernel retries) within a few seconds, even under the most extreme workloads (with many dozens of I/Os queued) and even in error cases. Error handling in the kernel can exceed that, but only up to a few dozen seconds (20 or 30 seconds for an I/O is the absolute upper limit). The only reason for longer I/O times is kernel bugs; on one Unix-like OS (name withheld to protect the innocent), we ended up rebooting the machine after about 90 hours, and the I/O still hadn't finished.


----------

