# Please give me some suggestions for ZFS



## jackie (Nov 16, 2010)

I have a Dell R510 with 12 disks and a PERC H700 RAID controller.
The server will mainly be used as a web server and for file backup, and I want to install FreeBSD with root on ZFS.

I found this:
12x 1TB disks can be done in multiple ways:
12-disk raidz2 vdev (10 TB, can lose any 2 disks, horribly slow, don't do it) 
2x 6-disk raidz2 (8 TB, can lose 2 disks in each vdev, faster than above) 
3x 4-disk raidz2 (6 TB, can lose 2 disks in each vdev, faster than above) 
3x 4-disk raidz1 (9 TB, can lose 1 disk in each vdev, faster than above) 
4x 3-disk raidz1 (8 TB, can lose 1 disk in each vdev, faster than above) 
6x 2-disk mirror (6 TB, can lose 1 disk in each vdev, fastest)
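
The capacity figures in the list above follow from simple arithmetic: each raidz vdev gives up one, two, or three disks to parity, and each mirror vdev holds one disk's worth of data. A minimal sketch, assuming 1 TB disks and ignoring ZFS metadata and padding overhead (the function and variable names are illustrative only). The last row adds 2x 6-disk raidz1, which the list does not cover:

```python
# Usable capacity for pools built from identical vdevs.
# Assumption: 1 TB disks, no allowance for metadata or padding overhead.

PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}

def usable_tb(vdevs, width, kind, disk_tb=1):
    """Return usable capacity in TB for `vdevs` vdevs of `width` disks each."""
    if kind == "mirror":
        data_disks = 1                      # a mirror stores one disk's worth of data
    else:
        data_disks = width - PARITY[kind]   # raidz loses `parity` disks per vdev
    return vdevs * data_disks * disk_tb

layouts = [
    (1, 12, "raidz2"),
    (2, 6, "raidz2"),
    (3, 4, "raidz2"),
    (3, 4, "raidz1"),
    (4, 3, "raidz1"),
    (6, 2, "mirror"),
    (2, 6, "raidz1"),   # not in the list above
]

for vdevs, width, kind in layouts:
    print(f"{vdevs}x {width}-disk {kind}: {usable_tb(vdevs, width, kind)} TB")
```

By this arithmetic 2x 6-disk raidz1 yields 10 TB and can lose 1 disk in each vdev.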

What about 2x 6-disk raidz1?

Another question: I read this article: http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/RAIDZ1
Why is the boot code installed on every disk, and how do I replace a disk when one fails?
If I use two groups (vdevs), does the second group also need the boot code installed?


----------



## danbi (Nov 16, 2010)

In your classification 'faster' should really be 'more IOPS'. This is not the same as delivering data faster. There is also a very big difference in performance between read and write operations, for both speed and IOPS.

In addition, there is also the question of administration. For example, 6 mirror vdevs will give you the best IOPS, suitable for database storage. It will also give you the flexibility to upgrade drives in pairs, so you do not have to replace many drives in order to increase capacity. It will, however, be slower in 'speed', such as copying large files, than any of the raidz variants, because you can write to only as many as 6 drives, whereas with raidz you actually write to all disks concurrently.
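
The argument above can be put as a back-of-the-envelope model. This is only a sketch of the reasoning, not a measurement: it assumes random IOPS scale with the number of vdevs and streaming writes scale with the number of disks that hold data rather than redundancy. Both assumptions, and the names used, are illustrative.

```python
# Crude first-order model (an assumption, not a benchmark):
#   random IOPS      ~ number of vdevs
#   streaming writes ~ number of disks holding data (not parity or mirror copies)

def model(vdevs, width, redundancy):
    """`redundancy` = disks per vdev used for parity or mirror copies."""
    data_disks = vdevs * (width - redundancy)
    return {"iops_units": vdevs, "stream_units": data_disks}

for name, args in {
    "6x 2-disk mirror": (6, 2, 1),
    "3x 4-disk raidz1": (3, 4, 1),
    "2x 6-disk raidz1": (2, 6, 1),
}.items():
    print(name, model(*args))
```

Real pools depend on caching, record size, and the controller, so treat the units as relative at best.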

About boot code: you may choose to apply the 'root on zfs' guides to only one of the vdevs. For example, create a 3-drive raidz1 following that guide, with boot code, swap etc. Then add the next vdevs to the pool (or you may add vdevs to the pool before populating it, in order to benefit from spreading the root over all your drives, although there may not be any measurable gain).
Then you need to instruct your server to boot from any of those three drives, and make sure that when you replace one of these (three) drives you again follow the proper procedure (gpart, write boot code etc). You can replace the other drives without doing anything more specific, because they do not have any gpart, boot or swap configuration.
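
As a rough illustration of the difference between the two replacement cases, the helper below only builds the command lines as strings; it does not run anything. The pool name, device names, labels, and partition sizes are hypothetical, and the `gpart` layout is an assumption modelled on the usual GPT root-on-ZFS guides, so check it against the guide you actually followed.

```python
# Builds (but does not execute) the commands for replacing a failed disk.
# All names and sizes below are hypothetical examples.

def boot_disk_replacement(pool, old_dev, new_disk, new_label, swap_size="4G"):
    """Disk in the bootable vdev: repartition, write boot code, then replace."""
    return [
        f"gpart create -s gpt {new_disk}",
        f"gpart add -s 64k -t freebsd-boot {new_disk}",
        f"gpart add -s {swap_size} -t freebsd-swap -l swap-{new_label} {new_disk}",
        f"gpart add -t freebsd-zfs -l {new_label} {new_disk}",
        f"gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 {new_disk}",
        f"zpool replace {pool} {old_dev} gpt/{new_label}",
    ]

def data_disk_replacement(pool, old_dev, new_disk):
    """Disk in a plain data vdev: no partitioning or boot code needed."""
    return [f"zpool replace {pool} {old_dev} {new_disk}"]

for cmd in boot_disk_replacement("zroot", "gpt/disk1", "da1", "disk1b"):
    print(cmd)
for cmd in data_disk_replacement("zroot", "da7", "da7"):
    print(cmd)
```

The point is only that a boot-vdev disk needs the partitioning and boot code steps before `zpool replace`, while a whole-disk data vdev member does not.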

Or, you may create a separate mirrored pool for the root and another pool with the rest of the drives for data.


----------



## dennylin93 (Nov 16, 2010)

I remember someone (phoenix I think) mentioning that 6~7 drives should be used for the best RAIDZ performance. Something like 2x 6-disk raidz1.


----------



## phoenix (Nov 16, 2010)

danbi said:

> In addition, there is also the question of administration. For example, 6 mirror vdevs will give you the best IOPS, suitable for database storage. It will also give you the flexibility to upgrade drives in pairs, so you do not have to replace many drives in order to increase capacity. It will, however, be slower in 'speed', such as copying large files, than any of the raidz variants, because you can write to only as many as 6 drives, whereas with raidz you actually write to all disks concurrently.



Everything I've found online and on the zfs-discuss mailing list indicates that multiple mirror vdevs gives the best performance, for both IOps and throughput.

If you want speed, you use mirror vdevs.
If you want space, you use raidz1 vdevs.
If you want redundancy, you use raidz2 or raidz3 vdevs.



> Or, you may create a separate mirrored pool for the root and another pool with the rest of the drives for data.



Unless you absolutely need ZFS for the / filesystem, I'd recommend putting the OS onto UFS using CompactFlash or USB sticks, with gmirror to keep them safe. 

And, even if you do need ZFS on root, then I'd still recommend separating it out into its own pool, using a single mirror vdev on CompactFlash or USB sticks.  That's the setup that Solaris uses (separate rpool for OS, data pool for the rest).

The OS only needs a couple of GB of disk space, no sense "wasting" 2 entire hard drives for it.


----------



## phoenix (Nov 16, 2010)

dennylin93 said:

> I remember someone (phoenix I think) mentioning that 6~7 drives should be used for the best RAIDZ performance. Something like 2x 6-disk raidz1.



The "rule of thumb" for raidz vdevs is "lots of small vdevs", aka "try to have fewer than 9 drives per raidz vdev".  Every couple of months a thread about the width of raidz vdev crops up on the zfs-discuss mailing list.

The general consensus is that 4 drives is the sweet spot for raidz1, 6 drives is the sweet spot for raidz2, and 8 drives is the sweet spot for raidz3.

However, there's nothing stopping you from using more or fewer drives per vdev.  It all depends on how much storage space you need, what your IOps requirements are, and how often you expect drives to fail.

raidz1 gives you the most storage space as it has the least amount of redundancy.  It's also faster than the other raidz variants.  However, if you lose a drive while resilvering another drive, the entire vdev is lost.  For this reason, most people tend to use raidz1 for small drives, and in narrow vdevs (4-6 drives).

raidz2 is the middle ground between storage space and redundancy.  Due to the nature of the resilvering (and scrubbing) code, you don't want raidz2 vdevs to be too wide.  The exact definition of "too wide" varies from person to person.  Some find the performance of a single 21-disk raidz3 vdev to be okay.  Others find a 12-disk raidz2 to be horribly slow.

Personally, I recommend multiple vdevs in a pool, each with under 9 disks.  My current plan for our next storage box is to use 4x 6-disk raidz2 vdevs using 4 separate controllers.
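
To see which layouts the "under 9 disks per vdev" rule leaves open for a given chassis, a small enumeration helps. This is a sketch under two assumptions that are mine, not part of the rule: all vdevs have the same width, and each vdev keeps at least two data disks.

```python
# Enumerate equal-width raidz layouts with fewer than 9 disks per vdev.
# Assumptions: identical vdev widths, at least two data disks per vdev.

def equal_width_layouts(total_disks, parity=1, max_width=8):
    """Return [(vdev_count, width), ...] that use all disks exactly."""
    layouts = []
    for width in range(parity + 2, max_width + 1):
        if total_disks % width == 0:
            layouts.append((total_disks // width, width))
    return layouts

print("12 disks, raidz1:", equal_width_layouts(12, parity=1))
print("12 disks, raidz2:", equal_width_layouts(12, parity=2))
print("24 disks, raidz2:", equal_width_layouts(24, parity=2))
```

For 24 disks with raidz2 the output includes the 4x 6-disk layout mentioned above.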


----------



## jackie (Nov 17, 2010)

Thanks!

3x 4-disk raidz1, with FreeBSD root on ZFS installed on one of the vdevs. Is this suitable?


----------



## olav (Nov 17, 2010)

If you want a lot of IOPS, I suggest adding a mirrored ZIL log device and an L2ARC cache device to the pool:
http://blogs.sun.com/brendan/entry/test


----------



## nORKy (Nov 17, 2010)

What do you think about
root: 1x2 (mirror) (600 MB max * 2)
data: 3x4 (raidz1)


----------



## danbi (Nov 17, 2010)

phoenix said:

> Everything I've found online and on the zfs-discuss mailing list indicates that multiple mirror vdevs gives the best performance, for both IOps and throughput.
> 
> If you want speed, you use mirror vdevs.
> If you want space, you use raidz1 vdevs.



This was my religious belief just a week or two ago too 

On the same system:

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ (2712.33-MHz K8-class CPU)
4 GB RAM
3ware 9650SE-4LPML in single-drive volume mode
4xST3320620SV (strange choice of drives for RAID, I know, but this was a legacy from a "smart" colleague who was applying the rule "if they are more expensive, they are better" to any hardware spec)

striped mirror ZFS (raid10):


```
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ns.digsys.bg     8G    84  93 89131  32 50621  18   235  99 105501  16 193.2   4
Latency               393ms    6056ms    3086ms     131ms   92341us     537ms
Version  1.96       ------Sequential Create------ --------Random Create--------
ns.digsys.bg        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 19635  99 +++++ +++ 16986  99 18327  98 +++++ +++ 15952  99
Latency              9192us     108us     158us   21934us      54us     108us
1.96,1.96,ns.digsys.bg,1,1281974480,8G,,84,93,89131,32,50621,18,235,99,105501,16,193.2,4,16,,,,,19635,99,+++++,+++,16986,99,18327,98,+++++,+++,15952,99,393ms,6056ms,3086ms,131ms,92341us,537ms,9192us,108us,158us,21934us,54us,108us
```
raidz1 ZFS (RAID5):


```
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
test.digsys.bg   8G    84  98 147408  54 68057  26   217  99 195744  34 212.6   6
Latency               270ms    5439ms    7464ms   90377us     107ms     535ms
Version  1.96       ------Sequential Create------ --------Random Create--------
test.digsys.bg      -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 20228  99 +++++ +++ 18078  99 18725  99 +++++ +++ 17200  99
Latency             13442us      91us     135us   24977us      53us     100us
1.96,1.96,test.digsys.bg,1,1288947984,8G,,84,98,147408,54,68057,26,217,99,195744,34,212.6,6,16,,,,,20228,99,+++++,+++,18078,99,18725,99,+++++,+++,17200,99,270ms,5439ms,7464ms,90377us,107ms,535ms,13442us,91us,135us,24977us,53us,100us
```

Go figure... almost double the read/write performance with raidz1.
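
For anyone who wants the ratios spelled out, the block throughput columns from the two bonnie++ runs above can be compared directly. The figures are copied from the tables (K/sec); the dictionary keys are just labels chosen for this sketch.

```python
# Sequential block throughput (K/sec) copied from the two bonnie++ runs above.
striped_mirror = {"block_write": 89131, "rewrite": 50621, "block_read": 105501}
raidz1 = {"block_write": 147408, "rewrite": 68057, "block_read": 195744}

def speedups(base, other):
    """Ratio of `other` to `base` for every metric in `base`."""
    return {metric: other[metric] / base[metric] for metric in base}

for metric, ratio in speedups(striped_mirror, raidz1).items():
    print(f"{metric}: raidz1 is {ratio:.2f}x the striped mirror")
```

Note that the random seek and file create/delete columns differ far less between the two runs than the sequential block columns do.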

`zpool iostat` sometimes shows over 200 MB/s read or write on the pool.

I am starting to look more favorably at raidz for backup/storage servers now 

By the way, the same system had abysmal performance using UFS on a 3ware-managed RAID10 or RAID5 volume.

With regard to the "do you need ZFS for the root" question, I believe anyone who cares about the consistency of their servers should be glad that FreeBSD now has ZFS, which can verify that the data on disk is correct. Finding out that you have a bad copy of, say, the ftpd executable is not a problem: you can recompile it, or copy it from the install media or another server, and have it behave as expected. The problem is a corrupted copy of that same executable that goes unnoticed, because the file system did not tell you that what it reads is not what it wrote. It is very important to have the OS files free of corruption.
One of the drives from a former 3ware-based RAID system of mine turned out to be writing/reading bad data when used under ZFS. That made me feel uneasy, because the same drive had been happily sitting in a very important data processing system before that. I am glad other considerations mandated the move to ZFS.
ZFS also behaves better with flash media than UFS, especially for writing.


----------



## jackie (Nov 18, 2010)

nORKy said:

> What do you think about
> root: 1x2 (mirror) (600 MB max * 2)
> data: 3x4 (raidz1)



But I only have 12 disks.


----------



## AndyUKG (Nov 18, 2010)

> Dell R510 with 12 disks and PERC H700 raid controller




I don't think the H700 is supported yet, however you can choose other older cards instead when you customise the server...

ta Andy.


----------



## jackie (Nov 19, 2010)

AndyUKG said:

> I don't think the H700 is supported yet, however you can choose other older cards instead when you customise the server...
> 
> ta Andy.



It looks like it is supported! I found it in this thread:
http://forums.freebsd.org/showthread.php?t=13720&highlight=R510


----------



## AndyUKG (Nov 19, 2010)

Ah ok, well if they had the system running then cool. It's a new gen card and I recently had an issue where the H200 isn't supported; neither is on the hardware compatibility list (at least under their Dell names).

ta Andy.


----------

