# ZFS slow - what am I doing wrong?



## flyflytn (Jul 29, 2010)

Hello all, I'm a long-time Linux admin who started with FreeBSD just a week ago.
I have two servers with Intel S5000-variant motherboards; both have only 2GB of RAM. One uses the onboard Intel embedded RAID controller and the other has a 3ware 9550.

These are the cards I've been dealt - it wasn't my choice to put so little memory in them.

I just cannot get decent performance out of ZFS on these servers, and after searching around and trying the various tunables, I am no closer.

I'll concentrate on the machine without the 3ware card, as I've done most of my testing there.
It has two RE2 400GB drives mirrored as the UFS2 system volume, and four RE2 400GB drives for the ZFS pool.

I created the pool:
`zpool create data raidz ad{8,10,12,14}`
I then ran some simple dd tests, which gave me writes around 30-35MB/s and reads sometimes even slower. I used bonnie++ as a better test, with very similar results.
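For reference, a simple dd test of this kind looks roughly like the following - the paths and sizes here are illustrative, not the ones from the actual runs (those used a multi-GB file on the pool mount):

```shell
# Illustrative sequential write/read test with dd.
TARGET=$(mktemp)                              # stand-in for a file on the ZFS pool
dd if=/dev/zero of="$TARGET" bs=1M count=64   # sequential write, 64 MiB
dd if="$TARGET" of=/dev/null bs=1M            # sequential read
rm -f "$TARGET"
```

Note that with a test file smaller than RAM, the read pass can be served from the ARC/page cache, so files well beyond RAM size give more honest numbers.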

I then tried the recommended ZFS tunings for amd64:

```
vm.kmem_size_max="1024M"
vm.kmem_size="1024M"
vfs.zfs.arc_max="100M"
```
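These are boot-time loader tunables set in /boot/loader.conf, so a reboot is needed; one way to confirm the effective values afterwards is:

```
sysctl vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max
```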
Now it gets a little weird and inconsistent:
If I then run the benchmarks again, bonnie reports around 65-130MB/s writes and about the same on reads, and consistently does this, though with large variation.
If I then do a simple copy of a 4GB file from the UFS2 system volume to the ZFS mount, it's terribly slow... 25MB/s or less.
Then, and only sometimes, subsequent benchmarks waver between 65-90MB/s all the way down to 25MB/s again. A reboot seems to clean things up as far as the benchmarks are concerned, bringing tests back to 130MB/s.

I'm currently trying to borrow some memory from another server to get up to 4GB, but I still cannot understand why I get such poor and inconsistent performance from ZFS - the same drives in a Linux software RAID5 will consistently give me 150-200MB/s throughput, and copying from volume to volume is very fast.

So are there any definitive tunings posted elsewhere? I've only found bits and pieces, some limiting the write speed, some limiting the ARC and kernel pools, but nothing so far that makes it 'work'.

What am I expecting? Probably 150MB/s throughput as a ballpark. I don't care what the benchmarks give me, they're just benchmarks, but file copying, both large and small, is what matters, and I've never achieved anything acceptable so far.


----------



## Savagedlight (Jul 29, 2010)

Have you tried [post=63019]this[/post]?


----------



## flyflytn (Jul 29, 2010)

Thanks for the reply.

Unfortunately that seems to have exacerbated the problem - while benchmarks are now a bit faster, file copying is even slower than before.

I then added 8GB of memory, so now I have 10GB. I removed my previous tunings, since anything above 4GB is supposed to be tuned automatically. Performance has improved a bit, but still nothing I find usable.


----------



## Savagedlight (Jul 29, 2010)

Here's something else you can try.

*If you use AHCI* (have loaded ahci.ko and drives are recognized as using AHCI):
/boot/loader.conf

```
vfs.zfs.vdev.min_pending=1   # default = 4
vfs.zfs.vdev.max_pending=1   # default = 35
```
*If you don't use AHCI* (didn't load ahci.ko and/or the drives are not recognized as using AHCI):
/boot/loader.conf

```
vfs.zfs.vdev.min_pending=4   # default = 4
vfs.zfs.vdev.max_pending=8   # default = 35
```

If you do not know which to use, I'd recommend trying the second alternative and seeing if it helps.
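If unsure, one way on FreeBSD to check whether the disks attached via AHCI is to look at the boot messages and loaded modules:

```
dmesg | grep -i ahci     # look for ahci0 / ahcich attach lines
kldstat | grep ahci      # shows whether ahci.ko is loaded
```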


----------



## flyflytn (Jul 29, 2010)

OK, I disabled AHCI in the BIOS, but I couldn't really test the same setup: two drives were no longer detected, so I was left with the system on two drives and only two drives for a pool.

So I tried your first configuration, which didn't seem to have any effect on my system.

What I did change, though, was completely disabling the onboard Intel RAID BIOS. I didn't have any drives configured in onboard RAID volumes in the previous tests - they were just single drives, so I assumed it would act as direct access - but I started to suspect it might be causing issues nonetheless.

This made a significant change: bonnie reports about 100MB/s writes and 170MB/s reads, and most file copies are making sense now - repeatable and reasonable results. However, there's still some strange behaviour when copying to the volume, which I'll investigate some more. Copying from the ZFS volume to the mirrored system did what I expected, maxing out at the system mirror's throughput of about 70MB/s. Going the other way, however, was quite slow again.

I still think there is more room for improvement, as the drives are not working as hard as they did on my old Linux setup (RAID5). Every few seconds the drives go inactive for a second during the bonnie and dd tests.
(I also made a slight misreport in my first post - the Linux system was performing at around 200-230MB/s on the same hardware.)
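Those idle gaps can be watched directly while a benchmark runs, e.g. with (FreeBSD tools; `data` is the pool name from the first post):

```
gstat -I 1s            # per-disk busy% and throughput, 1-second interval
zpool iostat data 1    # pool-level read/write bandwidth every second
```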

I'll continue to see if I can tweak some more things in ZFS remotely over the weekend.
I'm also going to try a regular PC at home to see whether this is something only my servers have issues with.


----------



## flyflytn (Aug 2, 2010)

*[SOLVED] ZFS slow - what am I doing wrong ?*

Gah! It turns out that the server in question simply has horrible bottleneck problems. I couldn't get any performance out of it with Linux either - it was even worse. I believed this server should perform the same as another we have, since it was supposedly the same hardware; on closer inspection the hardware is slightly different.

Anyway, thanks to Savagedlight and another post about the write speed limit: with judicious use of vfs.zfs.vdev.max_pending and vfs.zfs.txg.write_limit_override I was able to make this server go at the maximum speed it was capable of. I actually found that tuned ZFS raidz1 was more than twice as fast as untuned Linux/XFS/RAID5, but still only about 2/3 the speed of a comparable server - so this particular server has bottlenecked throughput, which iostat and gstat confirmed after a lot of testing.
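For anyone landing here later: both tunables go in /boot/loader.conf. The values below are purely illustrative - the post does not say which values were finally used:

```
vfs.zfs.vdev.max_pending=8                  # cap per-vdev queue depth
vfs.zfs.txg.write_limit_override=268435456  # cap txg write size (example: 256MB)
```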


----------

