# ZFS resilvering slows down the whole system



## belon_cfy (Dec 3, 2012)

Hi
Is there any way to force-stop the resilvering process? `zpool scrub -s` doesn't work in this case. The resilvering seriously impacts the storage server's performance.

Or is there any way to lower the priority or speed of the resilvering?

I'm running FreeBSD 9.0-RELEASE-p4 with ZFS v28 on the root and data volumes.


----------



## usdmatt (Dec 3, 2012)

I can't help, but I'm interested in this question. I've heard that FreeBSD has no disk I/O scheduling, so it can't run scrub/resilver reads/writes at a lower priority like other OSes.

I had an old NAS come in a while back that had 12 disks and an entry-level CPU (it had a RAID controller, so no real need for CPU horsepower). I thought I'd test it out as a backup server, but the system was pretty much unusable if you tried a scrub, as the disk performance outstripped the processor.

If disk scheduling is not planned, is difficult to add, or is not going to help, maybe ZFS on FreeBSD should have some sort of tuning knob to directly control how quickly the scrub/resilver code runs. On live systems with multiple redundancy, some people may prefer not to have a resilver maxing out I/O.


----------



## User23 (Dec 3, 2012)

This may help: http://forums.freebsd.org/showthread.php?t=31628

Please post your zpool layout. Is the pool more than 80% full? How many hard drives per raidz?


----------



## phoenix (Dec 3, 2012)

usdmatt said:

> I can't help, but I'm interested in this question. I've heard that FreeBSD has no disk I/O scheduling, so it can't run scrub/resilver reads/writes at a lower priority like other OSes.



gsched(8): GEOM-based disk scheduling.

ZFS also includes its own disk scheduler, so don't run gsched underneath ZFS, or bad things happen.



> If disk scheduling is not planned, is difficult to add, or is not going to help, maybe ZFS on FreeBSD should have some sort of tuning knob to directly control how quickly the scrub/resilver code runs. On live systems with multiple redundancy, some people may prefer not to have a resilver maxing out I/O.



There are sysctls to tune scrub/resilver.  I have not delved into them, nor tried to understand them.  But, they are there.


----------



## belon_cfy (Dec 3, 2012)

User23 said:

> This may help: http://forums.freebsd.org/showthread.php?t=31628
> 
> Please post your zpool layout. Is the pool more than 80% full? How many hard drives per raidz?



Maybe I should try the setting you have suggested. 

This is my zpool layout:

```
        NAME                       STATE     READ WRITE CKSUM
        vol                        DEGRADED     0     0     0
          mirror-0                 ONLINE       0     0     0
            ada0p3                 ONLINE       0     0     0
            ada1p3                 ONLINE       0     0     0
          mirror-1                 DEGRADED     0     0     0
            ada2p3                 ONLINE       0     0     0
            replacing-1            OFFLINE      0     0     0
              9030038177899120958  OFFLINE      0     0     0  was /dev/ada3p3
              gpt/data-disk0       ONLINE       0     0     0  (resilvering)
        logs
          ada4p1                   ONLINE       0     0     0
        cache
          ada4p2                   ONLINE       0     0     0


NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
vol    1.77T   821G   987G    45%  1.00x  DEGRADED  -
```


----------



## belon_cfy (Dec 4, 2012)

phoenix said:

> gsched(8)  GEOM-based disk scheduling.
> 
> ZFS also includes its own disk scheduler, so don't run gsched underneath ZFS, or bad things happen.
> 
> ...



Does this mean the ZFS disk scheduler should work on FreeBSD too?


----------



## Sfynx (Dec 4, 2012)

I once managed to stop a heavy resilver by simply running `zpool remove` on the target drive of the replace, in your case gpt/data-disk0. (In fact, it continued at a very high speed without any more drive activity until it reached 100%, after which the drive was removed, but the server was responsive again.) Of course, you could be left with a degraded pool if the drive you're replacing no longer exists or is defective, so in that case completing the replace ASAP is still recommended. If it has to wait, double redundancy (e.g. RAID-Z2) calms the mind.


----------



## phoenix (Dec 4, 2012)

Completely untested, but interesting reading nonetheless.  Feel like being the guinea pig?   Put the following in /etc/sysctl.conf then `# service sysctl start`


```
# ZFS tuning
#         Mostly taken from:  http://broken.net/uncategorized/zfs-performance-tuning-for-scrubs-and-resilvers/
vfs.zfs.l2arc_write_boost=160000000             # Set the L2ARC warmup writes to 160 MBps
vfs.zfs.l2arc_write_max=320000000               # Set the L2ARC writes to 320 MBps
vfs.zfs.resilver_delay=0                        # Prioritise resilver over normal writes (default 2)
vfs.zfs.scrub_delay=0                           # Prioritise scrub    over normal writes (default 4)
vfs.zfs.top_maxinflight=128                     # Up the number of in-flight I/O (default 32)
vfs.zfs.resilver_min_time_ms=5000               # Up the length of time a resilver process takes in each TXG (default 3000)
vfs.zfs.vdev.max_pending=24                     # Set the queue depth (number of I/O) for each vdev
  #-->                                          # Set this really high.  Then monitor the L(q) column in gstat under load.
  #-->                                          # Set it to just slightly lower than the highest number you see.
```
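The last comment in the block above (watch the L(q) column in gstat under load, then set the queue depth just below the highest value seen) could be scripted roughly as follows. This is only a sketch: the helper name `max_queue_length` is made up for illustration, and it assumes that L(q) is the first column of `gstat -b` (batch mode) output, so check that against your own gstat output first.

```shell
#!/bin/sh
# Hypothetical helper: print the largest queue length found in the first
# column (assumed to be L(q)) of gstat batch output read from stdin.
# Header lines and other non-numeric rows are skipped.
max_queue_length() {
    awk '$1 ~ /^[0-9]+$/ && $1 + 0 > max { max = $1 + 0 } END { print max + 0 }'
}

# On FreeBSD, while the pool is under load, one sample could be taken with:
#   gstat -b | max_queue_length
```

A single batch sample is only a snapshot, so it would make sense to repeat this several times during a scrub or resilver and keep the largest result before choosing a value for vfs.zfs.vdev.max_pending.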


----------



## gkontos (Dec 4, 2012)

Am I the only one here who strongly believes that resilvering should not be interrupted?


----------



## belon_cfy (Dec 5, 2012)

Sfynx said:

> I once managed to stop a heavy resilver by simply running `zpool remove` on the target drive of the replace, in your case gpt/data-disk0. (In fact, it continued at a very high speed without any more drive activity until it reached 100%, after which the drive was removed, but the server was responsive again.) Of course, you could be left with a degraded pool if the drive you're replacing no longer exists or is defective, so in that case completing the replace ASAP is still recommended. If it has to wait, double redundancy (e.g. RAID-Z2) calms the mind.



I did detach the replaced drive, but that didn't stop the resilvering process; it turned into an unstoppable scrubbing process instead.


----------



## Crivens (Dec 5, 2012)

gkontos said:

> Am I the only one here who strongly believes that resilvering should not be interrupted?



Nope.


----------



## t1066 (Dec 6, 2012)

Be sure to check what sysctl tunables are available for your system. In 9.1, try what phoenix describes in his post.
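One way to check which of these tunables exist on a given release is to filter the output of `sysctl -a`. The sketch below does that; the function name `scan_tunables` is made up for illustration, and the pattern only covers the names mentioned in this thread, so other relevant tunables may exist on your system.

```shell
#!/bin/sh
# Hypothetical helper: keep only the ZFS tunables discussed in this thread
# from "name: value" lines read on stdin.
scan_tunables() {
    grep -E '^vfs\.zfs\.(scrub_delay|resilver_delay|resilver_min_time_ms|top_maxinflight|vdev\.max_pending|l2arc_)'
}

# Only query the running system if sysctl is available.
if command -v sysctl >/dev/null 2>&1; then
    sysctl -a 2>/dev/null | scan_tunables || echo "no matching ZFS tunables found"
fi
```

If a name does not show up in the output, the running kernel does not expose that tunable.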


@phoenix

If you set vfs.zfs.l2arc_write_max this high, you may want to set vfs.zfs.l2arc_feed_min_ms=1000. Otherwise, the maximum L2ARC write speed would be 1600 MBps.
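The 1600 MBps figure can be reproduced with a little arithmetic, assuming that the L2ARC feed thread may write up to vfs.zfs.l2arc_write_max bytes once per feed interval and that the minimum interval defaults to 200 ms (that default is an assumption here; check vfs.zfs.l2arc_feed_min_ms on your system). The function name `l2arc_peak_mbps` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: peak L2ARC write rate in MB/s, assuming one write of
# up to write_max_bytes every feed_ms milliseconds.
l2arc_peak_mbps() {
    write_max_bytes=$1
    feed_ms=$2
    # bytes per millisecond -> bytes per second -> MB per second
    echo $(( write_max_bytes / feed_ms * 1000 / 1000000 ))
}

l2arc_peak_mbps 320000000 200    # assumed 200 ms default: prints 1600
l2arc_peak_mbps 320000000 1000   # with vfs.zfs.l2arc_feed_min_ms=1000: prints 320
```

With the feed interval raised to 1000 ms, the peak matches the 320 MBps that the comment in the posted configuration describes.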


----------



## phoenix (Dec 10, 2012)

Ah, you're right.  Didn't catch that.  Thanks.

Note:  the sysctls I posted will prioritise scrub/resilver over normal I/O, making the resilver finish faster.  I believe this is the opposite of what the OP wanted, so don't just copy/paste the code without reading it.
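For the opposite goal (slowing the resilver down so normal I/O stays responsive), the same knobs would presumably be moved the other way: larger delays and a shorter per-TXG resilver time than the defaults quoted earlier in the thread. The values below are untested, illustrative assumptions rather than recommendations, and the function name `throttle_settings` is made up for this sketch. By default the script only prints what it would do.

```shell
#!/bin/sh
# Hypothetical, untested values that move the knobs in the throttling
# direction relative to the defaults quoted in this thread
# (resilver_delay 2, scrub_delay 4, resilver_min_time_ms 3000).
throttle_settings() {
    cat <<'EOF'
vfs.zfs.resilver_delay=10
vfs.zfs.scrub_delay=10
vfs.zfs.resilver_min_time_ms=1000
EOF
}

# Dry run by default; set APPLY=1 (as root, on FreeBSD) to apply the values.
throttle_settings | while IFS= read -r setting; do
    if [ "${APPLY:-0}" = "1" ]; then
        sysctl "$setting"
    else
        echo "would run: sysctl $setting"
    fi
done
```

A slower resilver also means the pool stays degraded for longer, which is the trade-off behind the earlier objections to interrupting it.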


----------



## belon_cfy (Jul 1, 2013)

phoenix said:

> Completely untested, but interesting reading nonetheless.  Feel like being the guinea pig?   Put the following in /etc/sysctl.conf then `# service sysctl start`
> 
> ...



Hi phoenix, I don't see these parameters defined in FreeBSD 9.1. However, I found them in FreeBSD 10-CURRENT.


----------



## phoenix (Jul 9, 2013)

The above is taken from my 9.1 systems (although now running 9.1-STABLE) and they work just fine.


----------

