# Terrible clock skew



## syshackmin (Dec 6, 2010)

Hi everyone,

I've been experiencing some terrible clock skew and just can't figure it out. By terrible I mean I'm losing 30 minutes a day. The loss only occurs when I bring the system under heavy load. The load is multiple rsync backups to a ZFS pool (with gzip compression) backed by a 16-disk RAID 50. I'm using a hardware RAID controller for battery-backed write caching.

Mobo: Supermicro X8DTL
CPU: Dual Intel 5620 quad cores
Raid: Areca 1620

I have ntp enabled, but the skew happens too fast and it stops trying. I've tried a bunch of stuff from the various lists. I tried all my clock sources: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0). I tried lowering the kern.hz setting. I also tried disabling the Enhanced SpeedStep feature of this chip. Nothing works.
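For a sense of scale, here is a rough back-of-the-envelope sketch of why ntp would give up on a skew like this. It converts the 30 minutes per day reported above into parts per million and compares it with the 500 ppm maximum frequency correction that ntpd applies. The function and constant names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# ntpd's maximum frequency correction (slew rate), in parts per million.
NTPD_MAX_SLEW_PPM = 500


def drift_ppm(seconds_lost_per_day: float) -> float:
    """Convert a daily clock loss in seconds into a drift rate in ppm."""
    return seconds_lost_per_day / SECONDS_PER_DAY * 1_000_000


def ntpd_can_track(ppm: float) -> bool:
    """True if the drift is within what ntpd can correct by slewing."""
    return abs(ppm) <= NTPD_MAX_SLEW_PPM


if __name__ == "__main__":
    ppm = drift_ppm(30 * 60)  # 30 minutes lost per day
    print(f"drift: {ppm:.0f} ppm, trackable by ntpd: {ntpd_can_track(ppm)}")
```

Under these assumptions the drift comes out around 20,800 ppm, roughly forty times more than ntpd can slew away, so a daemon that keeps losing sync under load is what you would expect.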

Currently I'm on defaults except for the following settings:

EIST Disabled in BIOS
kern.hz="100"
kern.timecounter.hardware=i8254
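The numbers in parentheses in the clock source list earlier in this post are the kernel's quality ratings for each timecounter. As a small sketch, assuming the list has the same `NAME(quality)` layout that `sysctl kern.timecounter.choice` prints, the following parses it and reports the highest-rated source. The helper names are hypothetical.

```python
import re
from typing import Dict


def parse_timecounters(choice: str) -> Dict[str, int]:
    """Parse a 'NAME(quality) NAME(quality) ...' string into a dict."""
    pairs = re.findall(r"(\S+?)\((-?\d+)\)", choice)
    return {name: int(quality) for name, quality in pairs}


def best_timecounter(choice: str) -> str:
    """Return the name of the timecounter with the highest quality rating."""
    counters = parse_timecounters(choice)
    return max(counters, key=counters.get)


if __name__ == "__main__":
    # The list exactly as quoted in the post.
    choice = "TSC(-100) HPET(900) ACPI-fast(1000) i8254(0)"
    for name, quality in sorted(parse_timecounters(choice).items(),
                                key=lambda item: item[1], reverse=True):
        print(f"{name:10s} {quality}")
    print("highest rated:", best_timecounter(choice))
```

On the list from the post this picks ACPI-fast, while the i8254 currently set in `kern.timecounter.hardware` carries a rating of 0.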

Is there anything else I could do to debug this? I can't really blame the hardware because I have these same boards/chips running in Linux with no clock issues.

Thanks!
Dave


----------



## Terry_Kennedy (Dec 7, 2010)

syshackmin said:

> I've been experiencing some terrible clock skew and just can't figure it out. By terrible I mean I'm losing 30 minutes a day. The loss only occurs when I bring the system under heavy load. The load is multiple rsync backups to a ZFS pool (with gzip compression) backed by a 16-disk RAID 50. I'm using a hardware RAID controller for battery-backed write caching.
> 
> Mobo: Supermicro X8DTL
> CPU: Dual Intel 5620 quad cores
> Raid: Areca 1620


FWIW, I have a bunch of X8DTH-iF boards here (a very close relation to yours) with E5520 CPUs and 16-port 3Ware 9650SE-16ML controllers, and they stay NTP-synchronized with no complaints regardless of load. We're running different RAID controllers, though.

The CMOS clock on my X8DTHs drifts slightly when not disciplined: a powered-off system will come up and sync with an NTP offset of a second or so if it has been off for a couple of hours. Nowhere near as bad as what you're seeing, though.


----------



## danbi (Dec 7, 2010)

Is there any reason why you disabled EIST?


----------



## syshackmin (Dec 8, 2010)

I disabled EIST from a suggestion on the FreeBSD site.

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#CALCRU-NEGATIVE-RUNTIME

Didn't help. 

The fact that load causes it is the strangest part. There is very little drift while the server is idle.


----------



## Orum (Dec 9, 2010)

Clocks have been known to change their drift due to differences in temperature. So maybe you have a cooling problem?

However, this sounds like fairly extreme drift for heating alone to be the issue. I would make sure you sync with an NTP server on startup, run ntpd to keep the clock in sync, and be aware that certain securelevels (if you're using them at all) restrict clock adjustments to <= 1 second, which can keep ntp from syncing when you have really bad clocks.


----------



## acheron (Dec 9, 2010)

What version of FreeBSD do you have?
I'm running current as of r216223 and I have the same problem under heavy load, i.e. my clock cannot be kept in sync.


----------



## syshackmin (Dec 9, 2010)

I'm running the release version.

8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010

I may be experiencing heat problems, so I'm going to look into that more. I've ordered a new fan shroud for the case and will hopefully be installing it this weekend.


----------



## syshackmin (Jan 9, 2011)

Figured this out. I got a new heat sink and a fan shroud for my case (it was a giant 24-drive chassis) and the skew is gone. The CPU must have been cooking under load and reducing its clock.


----------

