# Hard disk failure?



## hurricane_sh (Feb 3, 2012)

My server was down for one hour without any warning; a reboot brought it back. After examining the logs, I found only the following unusual messages in /var/log/messages, and no errors in dmesg, the MySQL error log, etc. These error messages appeared two hours before the downtime. Can anyone help me with these questions?

1. Did the errors suggest a hard disk problem? Should I get a new server?

2. I'm confused about how these errors could shut down MySQL and Apache, since I couldn't find any other traces. Should I dig further, and if so, where should I look?

Thanks!


```
Feb 3 00:12:23 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=323957119
Feb 3 00:12:23 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=165866012672, length=16384)]error = 5
Feb 3 00:18:02 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=323957119
Feb 3 00:18:02 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=165866012672, length=16384)]error = 5
Feb 3 00:18:08 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=366849919
Feb 3 00:18:08 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=187827126272, length=16384)]error = 5
Feb 3 00:18:32 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=366849919
Feb 3 00:18:32 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=187827126272, length=16384)]error = 5
```


----------



## SirDice (Feb 3, 2012)

hurricane_sh said:

> 1. Did the errors suggest a hard disk problem? Should I get a new server?


It sure looks that way. A new server isn't necessary, a new disk probably is.
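
As a rough sanity check on the log lines above, here is a minimal Python sketch that counts the failures per LBA and compares each reported LBA with the matching `g_vfs_done()` offset. The 512-byte sector size is an assumption, and the variable names are only illustrative. Note that `error = 5` is `EIO`, a plain I/O error returned to the filesystem.

```python
import re
from collections import Counter

# The eight lines quoted from /var/log/messages in the first post.
LOG = """\
Feb 3 00:12:23 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=323957119
Feb 3 00:12:23 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=165866012672, length=16384)]error = 5
Feb 3 00:18:02 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=323957119
Feb 3 00:18:02 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=165866012672, length=16384)]error = 5
Feb 3 00:18:08 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=366849919
Feb 3 00:18:08 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=187827126272, length=16384)]error = 5
Feb 3 00:18:32 mail kernel: ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=366849919
Feb 3 00:18:32 mail kernel: g_vfs_done():ad4s1a[WRITE(offset=187827126272, length=16384)]error = 5
"""

SECTOR_SIZE = 512  # assumption: the log does not state the sector size

# Disk-level LBAs from the ad4 lines, byte offsets from the g_vfs_done() lines.
lbas = [int(value) for value in re.findall(r"LBA=(\d+)", LOG)]
offsets = [int(value) for value in re.findall(r"offset=(\d+)", LOG)]

# How often each LBA failed.
failures_per_lba = Counter(lbas)

# Difference, in sectors, between each disk LBA and the ad4s1a offset.
deltas = {lba - offset // SECTOR_SIZE for lba, offset in zip(lbas, offsets)}

print(failures_per_lba)
print(deltas)
```

On these lines the same two LBAs each fail twice, and every LBA sits exactly 63 sectors past its ad4s1a offset. That would be consistent with a slice starting at sector 63 (a guess, the log does not say so), which means both messages in each pair describe the same failed write rather than separate problems.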



> 2. I'm confused about how these errors could shut down MySQL and Apache, since I couldn't find any other traces. Should I dig further, and if so, where should I look?


Just replace the disk. These errors could pop up anywhere on the disk, including your swap space.


----------



## hurricane_sh (Feb 3, 2012)

Thanks, SirDice! Every time a server has a problem, I'm usually left with some confusion, and I hate that. x(

I will go with a new server. Since the current one is still working, moving to a new server requires little downtime, whereas replacing the disk means hours of interrupted service.


----------



## SirDice (Feb 3, 2012)

If you do get a new server, you might want to look into using the 'old' one as a 'hot' standby. MySQL data can easily be synced, and the websites could be transferred using rsync(1), for example.

If something happens to your 'primary' you could switch over to the other server. The services would still continue and you have some extra time to get the issue fixed.
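
A minimal sketch of the rsync(1) part, written as a small Python helper that only builds and prints the command. The host name `standby.example.org`, the `/usr/local/www` path, and the function name are hypothetical placeholders, not details of your setup. The flags used are standard rsync options: `-a` (archive), `-z` (compress), `--delete` (remove files on the standby that are gone from the source), and `--dry-run` (show what would change without copying).

```python
import shlex


def build_rsync_command(source_dir, standby_host, dest_dir, dry_run=True):
    """Build an rsync command that mirrors source_dir to the standby host."""
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        # Only report what would be transferred; nothing is copied.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{standby_host}:{dest_dir}")
    return cmd


# Hypothetical host and path, for illustration only.
command = build_rsync_command(
    "/usr/local/www", "standby.example.org", "/usr/local/www/"
)
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` once the dry run looks right, and then scheduled from cron(8) with `dry_run=False`. Be careful with `--delete`: it makes the standby an exact mirror, so a mistaken deletion on the primary is copied over too.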


----------



## hurricane_sh (Feb 3, 2012)

That would double the cost, and I can't afford it. RAID seems to be able to survive hard disk failures, but I've never tried it.


----------

