# High latency on disk 0



## belon_cfy (Jun 27, 2012)

Hi,

One of my servers is experiencing extremely high latency on ada0; the problem occurs very often (almost every minute).


```
dT: 1.001s  w: 1.000s  filter: ada[0-9]$
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
   10     23      1      4   1543     22    591  571.2  104.4| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada1
    0      0      0      0    0.0      0      0    0.0    0.0| ada2
    0      3      3     12    0.1      0      0    0.0    0.0| ada3
    0    125    125   1026    0.1      0      0    0.0    1.7| ada4
```

It is running FreeBSD 9.0 + ZFS v28.

I replaced the disk, but that didn't solve the problem after it resilvered. I also moved the data to another identical machine, and that one is running fine.

*diskinfo -cvt* on all four disks shows consistent performance. All get very similar results, so I assume the problem is not due to the SATA cable or interface.

Is there anything wrong with ZFS or FreeBSD? Or could it still be the SATA cable?

Below is my server spec and *zpool status*:

```
pool: vol
 state: ONLINE
 scan: scrub repaired 0 in 7h34m with 0 errors on Wed Jun 13 20:00:30 2012
config:

        NAME        STATE     READ WRITE CKSUM
        vol         ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada2p3  ONLINE       0     0     0
            ada3p3  ONLINE       0     0     0
        cache
          ada4      ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
 scan: scrub repaired 0 in 0h3m with 0 errors on Wed Jun 13 12:26:49 2012
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0


# zdb | grep ashift
            ashift: 12
            ashift: 12
            ashift: 12
            ashift: 12
```

- Intel Xeon 5335 Quad Core
- 8GB RAM
- 4 X 1TB Seagate HDD


----------



## SirDice (Jun 27, 2012)

What's on ada0p1?


----------



## Sebulon (Jun 27, 2012)

SirDice said:

> What's on ada0p1?



My money is on the bootsector.

@belon_cfy

Maybe try watching *top* and *zpool iostat -v* at the same time to see what process is generating the IO?
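For example (a sketch; run each in a separate terminal, and note that `vol` is the busy pool from the *zpool status* output above):

```shell
# Terminal 1: per-process view including system processes, command lines shown
top -aS

# Terminal 2: per-vdev IO statistics for the pool, refreshed every second
zpool iostat -v vol 1

# Terminal 3: GEOM-level latency per disk, filtered to the adaN devices
gstat -f 'ada[0-9]$'
```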

/Sebulon


----------



## belon_cfy (Jun 27, 2012)

Sebulon said:

> My money is on the bootsector.
> 
> @belon_cfy
> 
> ...



Hi Sebulon,

It is only serving NFS. No suspicious processes in *top* either.

The latency increases randomly, causing the server to freeze for a few seconds. Have you encountered similar cases before?


----------



## kpa (Jun 27, 2012)

What kind of NFS traffic is it: lots of reads, writes, or both at the same time? If it's writes and they are synchronous, ZFS doesn't cope well with them under heavy load unless you add a separate log device (or devices).

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices
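Adding a separate log device to an existing pool is a one-liner. A sketch, assuming a spare SSD shows up as ada5 (the device names here are hypothetical):

```shell
# Add a dedicated SLOG to absorb synchronous NFS writes
zpool add vol log ada5

# Or, safer: mirror the log device so a failed SLOG can't lose in-flight writes
zpool add vol log mirror ada5 ada6
```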


----------



## phoenix (Jun 27, 2012)

Try swapping cables/SATA ports with another drive, say ada2.  Then monitor gstat(8) output to see if the latency stays with the drive (dead/dying drive?) or with the port (dead/dying port/controller?).
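One way to track this (a sketch; device names assume ada0 and ada2 were swapped) is to confirm which physical disk ended up on which port, then watch just those two devices:

```shell
# Confirm disk identity (model/serial) after the swap
camcontrol identify ada0
camcontrol identify ada2

# Watch GEOM latency (ms/r, ms/w) for only the two swapped drives
gstat -f 'ada[02]$'
```

If the high ms/r follows the disk to the new port, suspect the drive; if it stays on the old port, suspect the cable or controller channel.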


----------



## Sebulon (Jun 27, 2012)

belon_cfy said:

> Hi Sebulon,
> 
> It is only serving NFS. No suspicious processes in *top* either.
> 
> The latency increases randomly, causing the server to freeze for a few seconds. Have you encountered similar cases before?



How about:
`# top -aS`
Can you see anything more interesting?


What was it about NFS and 9.0... Memory, bug, something, dark side...

Maybe you are IO-bound. SATA II or SAS 1 is good for about 250-300 MB/s, which about two modern hard drives can saturate when ZFS flushes IO. Getting warm?

/Sebulon


----------



## t1066 (Jun 28, 2012)

How full is the pool? Could this be a problem of fragmentation?


----------



## belon_cfy (Jun 28, 2012)

t1066 said:

> How full is the pool? Could this be a problem of fragmentation?



Only 34% in use.


```
zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
vol    1.77T   629G  1.15T    34%  1.00x  ONLINE  -
zroot  39.8G  2.62G  37.1G     6%  1.00x  ONLINE  -
```

Possibly caused by AHCI?
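One quick check for that theory (a sketch; the exact driver strings can vary by controller) is to look for AHCI command timeouts or device resets in the kernel log:

```shell
# AHCI timeouts and resets show up in the kernel message buffer
dmesg | grep -i -E 'ahcich|timeout|reset'

# And in the persisted log, in case the buffer has wrapped
grep -i -E 'ahcich|timeout' /var/log/messages
```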


----------



## t1066 (Jun 28, 2012)

If you are willing to experiment, you could add a hard disk as an SLOG (separate log device).
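Since v28 pools support log-device removal, the experiment is easy to undo. A sketch (ada5 is a hypothetical spare disk):

```shell
zpool add vol log ada5    # add the experimental SLOG
zpool iostat -v vol 1     # watch whether the log vdev absorbs the sync writes
zpool remove vol ada5     # undo the experiment if it doesn't help
```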


----------

