# zfs compression read performance



## Noodle (Dec 9, 2010)

I have a RAIDz configuration with five SATA hard disks. I created two file systems, one with compression and checksum on, and another with both off.

I used the following commands to run read and write tests on both file systems. Without compression and checksum, the performance is good:

```
write:
/usr/bin/time -h dd if=/dev/zero of=test.file bs=1024 count=3000000
3000000+0 records in
3000000+0 records out
3072000000 bytes transferred in 29.320816 secs (104771982 bytes/sec) (99.9MBps)

read:
/usr/bin/time -h dd if=test.file of=/dev/zero bs=1024 count=3000000
3000000+0 records in
3000000+0 records out
3072000000 bytes transferred in 8.882904 secs (345832856 bytes/sec) (329MBps)
```

With compression and checksum on, write is good, but read is very poor:

```
Write:
/usr/bin/time -h dd if=/dev/zero of=test.file bs=1024 count=3000000
3000000+0 records in
3000000+0 records out
3072000000 bytes transferred in 23.105969 secs (132952659 bytes/sec) (126MBps)

read:
/usr/bin/time -h dd if=test.file of=/dev/zero bs=1024 count=3000000
3000000+0 records in
3000000+0 records out
3072000000 bytes transferred in 246.525695 secs (12461176 bytes/sec) (11MBps)
4m6.52s real	 1.10s user	 4m4.70s sys
```

After some testing, I found it's the compression. As long as I have compression on, the write speed is very stable, but if I set count = 300000 the read speed can reach 219MBps, while if I set count = 524288 (512M) the read speed drops to 11MBps again. So is compression only good for small files?
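
Checking the sizes implied by those counts — the smaller test file fits easily in 8G of RAM (so those fast reads may simply be coming from cache), while the 3G file does not. Quick arithmetic on the dd parameters:

```shell
# File sizes implied by the dd counts used above (bs=1024)
small=$((300000 * 1024))
large=$((3000000 * 1024))
echo "count=300000:  $((small / 1024 / 1024)) MiB"
echo "count=3000000: $((large / 1024 / 1024)) MiB"
```
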

AHCI enabled, kmem_size: 5G, total memory 8G.

Any idea what I did wrong?

Thanks

Noodle


----------



## AndyUKG (Dec 10, 2010)

Hi,

Reading and compressing data from /dev/zero is meaningless WRT benchmarking your system; you should test on real-world data, or failing that, /dev/random would at least be better.
Anyway, without going into the details of your specific config, the following holds true for a lot of systems: if the data you want to store on your ZFS volume isn't already compressed, then turning on compression will probably give you good performance. If you want still more performance, using more disks and not using RAIDz (i.e. using mirrors instead) will probably be the way to go.
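
A sketch of the two pool layouts (device names da0–da5 are hypothetical, and only one layout can exist at a time; this just shows the syntax):

```shell
# RAIDz: one parity vdev across five disks (the OP's layout)
zpool create tank raidz da0 da1 da2 da3 da4

# Alternative: striped mirrors -- reads can be satisfied by either
# side of each mirror, so read performance scales with the vdev count
zpool create tank mirror da0 da1 mirror da2 da3 mirror da4 da5
```
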

thanks Andy.


----------



## Galactic_Dominator (Dec 12, 2010)

AndyUKG said:

> Reading and compressing data from /dev/zero is meaningless WRT benchmarking your system; you should test on real-world data, or failing that, /dev/random would at least be better.



You're right on the first point, but benchmarking /dev/random would probably be even worse, as it's going to spit out an essentially incompressible stream.  In this scenario, gzip is going to do much worse than lzjb, but neither will fare as well as a standard non-compressed fs.

The standard rule of thumb is you don't compress files twice, especially if you want performance.  ZFS is no exception to this.

/dev/zero and /dev/random are essentially the best- and worst-case scenarios, respectively.
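
For something in between those two extremes, one can interleave random and zero blocks to get test data that's roughly 50% compressible (a sketch; the file name `mixfile` is made up for this example):

```shell
# Build ~8 MiB of half-compressible data: alternate 1 MiB of random
# bytes with 1 MiB of zeros
: > mixfile
i=0
while [ "$i" -lt 4 ]; do
    dd if=/dev/urandom bs=65536 count=16 2>/dev/null >> mixfile  # 1 MiB random
    dd if=/dev/zero    bs=65536 count=16 2>/dev/null >> mixfile  # 1 MiB zeros
    i=$((i + 1))
done
wc -c mixfile
```
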


----------



## fronclynne (Dec 12, 2010)

You can't do a fair, or even coherent, test of read & write speeds with your methods.  I would suggest (for starters) having two distinct filesystems (preferably a tmpfs(5), UFS, a thumbdrive, or a DVD; the other being the ZFS you're testing).  Second, read a large file into cache with something like `# dd if=/path/to/large/file of=/dev/zero bs=1m` (assuming the file is smaller than your main memory), then write that file to your ZFS like `# dd if=/path/to/large/file of=/zfs/path bs=1m`.
Fiddle with bs= values, run the test three or more times for each value of bs=, and also play with things like conv=block,sync,notrunc etc.  Then do it all again for a compressed volume.  Create a new file each time (on the ZFS) and leave it for your read-speed test.
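
The write half of that procedure might be scripted roughly like this (a sketch with made-up file names, byte-valued bs=, and a small self-created source file so it runs anywhere; on FreeBSD you'd use real paths, `/usr/bin/time -h`, and BSD-style `bs=1m`):

```shell
# Create the source file once (8 MiB of random data as a stand-in
# for the large file read into cache)
dd if=/dev/urandom of=src.dat bs=65536 count=128 2>/dev/null

# Three runs per block size; dd's stderr transfer stats go to a log,
# and every output file is left in place for the later read tests
for bs in 8192 65536 1048576; do
    for run in 1 2 3; do
        echo "bs=$bs run=$run" >> bench.log
        dd if=src.dat of=out.$bs.$run bs=$bs 2>> bench.log
    done
done
```
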

Then try testing read speeds with the various files you've left lying about *after a reboot* (to assure that your cache is clear), again playing with various bs= & conv= values, compressed and uncompressed.

Put all of the command lines into a text file followed by the output (I guess you can do this with script(1), I dunno) and start crunching your numbers.

The other option is to just use the system for a while (days or weeks) in each of the two states you're trying to test and keep an eye on performance.

_addendum_: someone who understands the inner workings of ZFS is welcome to correct me, but on UFS there can be massive differences in read speed when you fiddle with the *vfs.read_max* sysctl(8).
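
On FreeBSD that tuning looks like the following (the value 32 is illustrative, not a recommendation):

```shell
sysctl vfs.read_max       # show the current UFS read-ahead setting
sysctl vfs.read_max=32    # raise it, then repeat the read benchmarks
```
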


----------



## rusty (Dec 12, 2010)

fronclynne said:

> _addendum_: someone who understands the inner workings of ZFS is welcome to correct me, but on UFS there can be massive differences in read speed when you fiddle with the *vfs.read_max* sysctl(8).



Indeed, there's a good post at http://ivoras.sharanet.org/blog/tree/2010-11-19.ufs-read-ahead.html where it's noted that NCQ is the main cause of the performance increase.


----------

