# Requesting tips for ZFS



## Zare (May 12, 2011)

What kind of ZFS setup would you use for an array that needs to survive only a single disk failure, starts with 2 disks, and can be expanded with additional drives while online? Also, do all of ZFS's corruption detection and correction mechanisms still work if the pool is built on GEOM ELI devices?

That hypothetical array, two 1.5TB disks at first and growing to four in a few months, would run on an Intel D510MO embedded board: a dual-core 1.66GHz 64-bit Atom with 4 GB of 800MHz RAM (the maximum). The SATA controller can be used with ahci(4); are there any performance issues? The board also has one PCI slot, so I can install another SATA controller if necessary.

I can get a good price on Samsung EcoGreen F2 1.5TB drives; if anyone has experience with those and ZFS, I would appreciate hearing about it. Thanks in advance.


----------



## phoenix (May 12, 2011)

Your only option, if starting with 2 disks, is to use a single mirror vdev:
`# zpool create mypool mirror ada0 ada1`

Later, when you want to add 2 more disks, you simply add a second mirror vdev to the pool:
`# zpool add mypool mirror ada2 ada3`

Simple as that.

If you want to use GELI, you can.  You configure the GELI devices first.  Then add the *ada0.eli* devices to the pool instead of the *ada0* devices.  However, I believe you won't be able to boot from that pool, and will need a separate UFS partition with the boot loader and kernel installed there.
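As a rough sketch of that order of operations: the script below assumes the disks are ada0 and ada1, a 4096-byte GELI sector size, and a passphrase-only setup, none of which the post specifies. The `run` helper and the `create_encrypted_mirror` function are hypothetical names used for illustration. By default the script only prints the commands; setting `DRY_RUN=0` would execute them, which needs root and overwrites the disks.

```shell
#!/bin/sh
# Sketch only: pool name, device names and sector size are assumptions.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# create_encrypted_mirror POOL DISK1 DISK2
create_encrypted_mirror() {
    pool="$1"; shift
    for disk in "$@"; do
        # One-time setup: writes GELI metadata and asks for a passphrase.
        run geli init -s 4096 "/dev/${disk}"
        # Attaching creates the decrypted provider /dev/${disk}.eli.
        run geli attach "/dev/${disk}"
    done
    # Build the mirror from the .eli providers, not the raw disks.
    run zpool create "$pool" mirror "$1.eli" "$2.eli"
}

create_encrypted_mirror mypool ada0 ada1
```

ZFS then checksums and repairs data as seen through the `.eli` providers, so its self-healing operates on the decrypted view of each disk.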


----------



## bbzz (May 12, 2011)

But if you end up with 4 disks total in a mirror configuration, you "waste" 50% on redundancy.
Wouldn't it be better to use raidz1 (raid5)? That way only one disk's worth of space goes to redundancy, leaving 3x1.5TB for raw storage.
You would have to start with 3 disks initially, however. Maybe something to think about?


----------



## phoenix (May 12, 2011)

Sure, but you also get better performance out of a multi-mirror pool than you do out of a multi-raidz pool.  

And if you only have 2 disks to start ... it's pretty hard to create a raidz vdev.


----------



## usdmatt (May 12, 2011)

As phoenix says, with the original requirement of 2 disks with an additional 2 disks later, the only real option is a mirror with an additional mirror later on.

However, for personal use I wouldn't really be happy spending money on four 1.5TB disks to get under 3TB of space.
If it were me I'd do the following when the additional disks turned up:


1. Scrounge a 1.5TB USB drive from somewhere.
2. Create a single-disk pool on the USB disk.
3. `zfs send` the live pool to the USB pool.
4. Scrub the temp pool to make sure it's OK (USB disks are prone to being useless).
5. Got any critical data? Back that up somewhere else as well.
6. Scrap the live pool and rebuild it as raidz.
7. `zfs send` the data back.

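The steps above could be sketched as the command sequence below. The pool names (`mypool`, `usbpool`), the USB device (`da0`), the snapshot name (`migrate`) and the function name `print_migration_plan` are all assumptions for illustration. The script deliberately only prints the commands, because the destroy step must not be run until the scrub has finished cleanly, and that check is best done by hand.

```shell
#!/bin/sh
# Sketch only: prints a migration plan, executes nothing.
# print_migration_plan LIVE_POOL TEMP_POOL USB_DISK DISK...
print_migration_plan() {
    live="$1"; temp="$2"; usb="$3"; shift 3
    cat <<EOF
zpool create $temp $usb
zfs snapshot -r $live@migrate
zfs send -R $live@migrate | zfs receive -F $temp
zpool scrub $temp
zpool status $temp    # wait until the scrub has completed with 0 errors
zpool destroy $live
zpool create $live raidz $*
zfs send -R $temp@migrate | zfs receive -F $live
EOF
}

print_migration_plan mypool usbpool da0 ada0 ada1 ada2 ada3
```

The recursive snapshot with `zfs send -R` is one way to carry every dataset and its properties across in a single stream; a per-dataset send would also work.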

From my own experience with 4 disks, and from what I've read on the net, a 4-disk raidz actually outperforms a 2 x 2 mirror for raw throughput. (Look for posts by submesa; he did some graphs on it at some point.) IOPS might be higher with mirrors, but the advantage doesn't really kick in unless you've got loads of disks and heavy usage.

@bbzz
Just to make clear, if you start with 3x1.5TB disks you would get a pool made up of 1 raidz vdev with 3TB of space. The only way to increase space on that pool would be to add another raidz vdev of 3 disks. You can't add a 4th disk to that existing vdev. You could probably add a 2 disk mirror but it's not recommended to mix vdev types.
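The capacity figures in this thread follow from simple arithmetic, sketched below. The function names are hypothetical, sizes are in whole GB (1.5TB taken as 1500), and the numbers are raw capacity that ignores metadata overhead and TB/TiB rounding.

```shell
#!/bin/sh
# Raw usable capacity in GB, ignoring metadata overhead and TB/TiB rounding.

# mirror_usable_gb DISK_GB: an n-way mirror stores one disk's worth of data.
mirror_usable_gb() {
    echo "$1"
}

# raidz_usable_gb DISKS PARITY DISK_GB: parity consumes PARITY disks' worth.
raidz_usable_gb() {
    echo $(( ($1 - $2) * $3 ))
}

echo "2 x 2-disk mirrors: $(( 2 * $(mirror_usable_gb 1500) )) GB"
echo "4-disk raidz1:      $(raidz_usable_gb 4 1 1500) GB"
echo "3-disk raidz1:      $(raidz_usable_gb 3 1 1500) GB"
echo "2 x 3-disk raidz1:  $(( 2 * $(raidz_usable_gb 3 1 1500) )) GB"
```

A pool's capacity is the sum of its vdevs, which is why growing a raidz pool means adding a whole new vdev.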


----------



## UNIXgod (May 12, 2011)

Some links:

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide


----------



## bbzz (May 12, 2011)

usdmatt said:

> Just to make clear, if you start with 3x1.5TB disks you would get a pool made up of 1 raidz vdev with 3TB of space. The only way to increase space on that pool would be to add another raidz vdev of 3 disks. You can't add a 4th disk to that existing vdev. You could probably add a 2 disk mirror but it's not recommended to mix vdev types.



This proves I'm still 'noobing' this thing!
I figured you could just add another disk to the pool if you wanted to grow it!
So how does this actually work? Seems to me that you end up with raid50 instead of raid5, right?
What about raidz2? I was planning to get 6x2TB disks and do raid6. So if I wanted to grow that pool later, I would need another 6 disks? And I would end up with raid60, not raid6?

@UNIXgod 
Thanks for the links!


----------



## Zare (May 12, 2011)

Thanks people.

I'm probably going to do a mirrored setup with two drives initially. When the two additional drives arrive, I can back up the data from the pool, create a 4-disk raidz vdev and restore the data.



> If you want to use GELI, you can



I know that I can. I'd like someone to assure me that ZFS's self-healing capabilities will work on top of ELI devices, since the geom_eli module does its own translation.


----------



## usdmatt (May 12, 2011)

@Zare
As far as ZFS is concerned, there's no difference between using GELI devices and raw disks. The GELI device just acts like a normal disk. This is down to the 'stackable' design of GEOM. A raw disk is accessed using the GEOM interface, and every GEOM module provides that same interface. This means you can stack a bunch of GEOM modules on top of a raw device pretty much however you like (a geli'd gmirror, for example) and the device you end up with still acts just like a raw disk.

@bbzz
Yes, with a pool made up of a 6 disk raidz2, you could only expand by adding another raidz2 vdev which would effectively give you raid60. You could probably get away with adding a new vdev of 4 disks as there's no requirement for the size of each vdev to match, although it's probably recommended somewhere to have even vdevs if at all possible.

A 'single disk' vdev can be expanded to a mirror, then to a 3 or 4 way mirror. You can also reduce a 4-way mirror to 3-way, to 2-way, then back to a single disk. Adding or removing a disk from a raidz{1,2,3} vdev is not possible.
The ZFS community has been waiting for the mythical 'block pointer rewrite' feature to allow this for years but I've not heard anything for a long time. If Oracle do ever implement it, I wouldn't be surprised if they close the source first, leaving the community to go without or re-implement.
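The mirror growing and shrinking described above maps onto `zpool attach` and `zpool detach`. This is a minimal sketch assuming a pool called `mypool` that starts as the single disk ada0, with ada1 and ada2 as the added disks; the `run` helper is a hypothetical dry-run wrapper that only prints commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: pool and device names are assumptions.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Grow: attach a new disk alongside an existing member of the vdev.
run zpool attach mypool ada0 ada1   # single disk -> 2-way mirror
run zpool attach mypool ada0 ada2   # 2-way -> 3-way mirror

# Shrink: detach members one at a time. On a real system, let each
# resilver finish (see 'zpool status') before detaching anything.
run zpool detach mypool ada2        # 3-way -> 2-way mirror
run zpool detach mypool ada1        # 2-way -> single disk
```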


----------



## bbzz (May 12, 2011)

I guess this is something to think about. With raid60, two parity disks in each raid6 chain mean I could lose 4 disks in total (two in each raid6) before losing data. But I'm losing another 4TB ...


----------



## danbi (May 13, 2011)

With raidz2, no matter how many vdevs you have, assume you can lose only 2 disks before you start losing data. The chance that a third failed disk will be in the same vdev is not to be ignored.

The idea of raidz2 (and higher) is that, with today's huge drive capacities and not-so-great transfer speeds, if a drive fails there's a real chance another one will fail while you resilver the replacement for the first. RAID as such was designed to work with many small drives. But times change...


----------



## bbzz (May 13, 2011)

But this is not the case with raid60. With raid60 you can lose 4 disks before you start losing data (that's two disks on each side of the striped raid6). Is this the same with ZFS?


----------



## phoenix (May 13, 2011)

Yes, it's the same.  You can lose 2 disks per raidz2 vdev before you lose the pool.

However, if you lose 3 disks in the same vdev, you lose the pool.  Same with RAID60:  if you lose 3 disks in one of the sub-arrays, you lose the entire RAID60 array.
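The rule in this exchange can be written as a small check: the pool survives only if no single vdev has lost more disks than its parity level. The function name `pool_survives` is hypothetical, and the sketch assumes every vdev in the pool has the same parity.

```shell
#!/bin/sh
# pool_survives PARITY FAILED_IN_VDEV1 [FAILED_IN_VDEV2 ...]
# Returns 0 (true) if no vdev has lost more disks than its parity level.
pool_survives() {
    parity="$1"; shift
    for failed in "$@"; do
        if [ "$failed" -gt "$parity" ]; then
            return 1
        fi
    done
    return 0
}

# Two raidz2 vdevs (parity 2):
pool_survives 2 2 2 && echo "2+2 failures: pool survives"
pool_survives 2 3 0 || echo "3+0 failures: pool lost"
```

So four failures can be survivable and three can be fatal; what matters is how the failures are distributed across vdevs.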


----------

