# zfs raidz2 mixing different disk vendors



## wszczep (Apr 16, 2014)

Hi Everybody,

I know that a RAIDZ vdev should consist of disks of the same size; if needed, I can at least `gpart` them down to the same size. It is also often advised to use disks from different vendors, or models from different production dates.

But what about disks from different vendors with different features? I mean mixing, for example (I know there are only three models here):
Toshiba 500 GB, 4K sector, 7200 rpm, 32 MB cache
ST Barracuda 500 GB, 4K sector, NCQ, 7200 rpm, 16 MB cache
WD Black 500 GB, 4K sector, 7200 rpm, 64 MB cache

How will different cache sizes, or the presence of NCQ, affect the efficiency of a RAIDZ2 vdev? As far as I have read, the models above also differ in the maximum transfer rate of the controller-to-disk path, though they are all SATA-3 compliant.


----------



## SirDice (Apr 16, 2014)

I have never tried this, but it should work. I'm guessing the performance will be quite bad, though, due to the different timings of the various drives. This could result in one drive being hit harder than the others, which would be bad for its lifetime. Ideally you want the disks in one pool to be the same, or at least similar in features and performance.


----------



## ralphbsz (Apr 16, 2014)

There is a whole slew of issues here.

First, the issue of devices of different size.  It is theoretically possible to use those "efficiently", but only if there are enough of them, and the definition of "efficiently" may not be what you wanted.  Trivial example 1: You have two disks, one 1TB, one 4TB, they are pretty much the same performance (disk performance has not changed radically in the last few years), and you order your storage system to do mirroring.  It will store 1TB mirrored on two disks, and leave the extra 3TB on the second disk unused.  Non-trivial example 2: You have 5 disks, one 4TB, and four 1TB, and again you order mirroring.  An intelligent storage system will put the first copy of each piece of data on the big disk, and spread the second copies over the four other drives (I think ZFS will actually be able to do this, but I've never tried it).  This is very space-efficient, as every byte on disk is utilized.  It is terribly performance-inefficient: the four 1TB disks will be idle 3/4 of the time, waiting for the 4-fold overloaded 4TB disk.
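The arithmetic behind the two mirroring examples can be sketched in a few lines of Python. This is a toy model, not anything ZFS exposes: the function name `two_way_mirror_layout` is made up for illustration, and it assumes the first copy always lands on the big disk, the second copies are spread over the small disks in proportion to their size, and access is uniform.

```python
def two_way_mirror_layout(big_tb, small_tbs):
    """Toy model: first copy on the big disk, second copy spread over the small disks."""
    small_total = sum(small_tbs)
    # Every mirrored byte needs one copy on each side, so the smaller side limits capacity.
    usable = min(big_tb, small_total)
    # Raw space that is left without a mirror partner.
    unused = big_tb + small_total - 2 * usable
    # Assuming uniform access, a small disk's load relative to the big disk
    # equals the share of the second copies it holds.
    relative_load = [s / small_total for s in small_tbs]
    return usable, unused, relative_load


# Example 1: one 4TB disk mirrored against one 1TB disk.
print(two_way_mirror_layout(4, [1]))
# Example 2: one 4TB disk mirrored against four 1TB disks.
print(two_way_mirror_layout(4, [1, 1, 1, 1]))
```

In the second case each small disk carries a quarter of the big disk's load, which is the "idle 3/4 of the time" figure from the post.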

Extending this example to wider codes (in particular parity-based codes, like RAID-Z or RAID-Z2) gets complicated very quickly.  The exact performance (in particular in the vitally important case of rebuild) depends crucially on how the data is laid out on disk (in particular whether it is reclustered or not).  But the basic problem remains: Efficient use of space, and efficient use of performance contradict each other.  As far as I know, ZFS will always use the whole "disk" (or partition or LV or logical unit) presented to it for RAID-Z and RAID-Z2, and will not do any strange space allocation, so it will default towards bad space efficiency and good performance efficiency (for identical-performance disks).
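A rough way to see the space cost for a RAID-Z style vdev is the sketch below. It assumes every member contributes only as much as the smallest member and ignores metadata, padding and reserved space; `raidz_usable` is a hypothetical helper, not a ZFS command or API.

```python
def raidz_usable(sizes_gb, parity):
    """Approximate usable and stranded space for one RAID-Z vdev (simplified model)."""
    if len(sizes_gb) <= parity:
        raise ValueError("need more disks than parity devices")
    per_disk = min(sizes_gb)  # every member is treated as if it were the smallest one
    usable = (len(sizes_gb) - parity) * per_disk
    # Space on the larger members that the vdev cannot use.
    stranded = sum(sizes_gb) - len(sizes_gb) * per_disk
    return usable, stranded


# Four equal 500GB disks in raidz2: nothing is stranded.
print(raidz_usable([500, 500, 500, 500], parity=2))
# One 500GB disk among three 1TB disks: the extra 500GB on each big disk is stranded.
print(raidz_usable([500, 1000, 1000, 1000], parity=2))
```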

Now let's switch to the case of disks of different performance (and for simplicity let's make them the same size).  Already non-trivial example 1: Two disks, same size, one slow and one fast, using mirroring.  First, consider writes: they will have to hit both disks synchronously, so the better performance of the fast drive will be completely wasted, as the system will mostly be waiting for the slow disk.  Second, consider reads: Here the storage system can *in principle* send the read to either of the two disks, and a smart storage system would send it preferentially to the fast disk.  I don't know whether ZFS will do adaptive feedback, disk performance measurement, and smart steering of read IOs to the faster disk.  If it does not, the extra performance of the faster disk is completely wasted; if it does, the extra performance is at least available for reads (and thereby helps writes a little bit too, since no disk is badly overloaded by reads).
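The slow/fast mirror case can be put into numbers with a deliberately simple throughput model. The sketch below assumes large sequential transfers and a fixed MB/s figure per disk; the 100 and 150 MB/s values are made up for illustration, and `mirror_throughput` is a hypothetical helper that says nothing about what ZFS actually does.

```python
def mirror_throughput(slow_mbps, fast_mbps):
    """Toy throughput model for a two-way mirror of one slow and one fast disk."""
    # Writes must land on both disks, so the slower one sets the pace.
    write = min(slow_mbps, fast_mbps)
    # Reads split 50/50: the slow disk finishes its half last, so the pair
    # behaves like two slow disks.
    reads_even_split = 2 * min(slow_mbps, fast_mbps)
    # Reads steered in proportion to disk speed: both disks finish together.
    reads_steered = slow_mbps + fast_mbps
    return write, reads_even_split, reads_steered


print(mirror_throughput(100, 150))
```

Under these assumptions only the steered case gets anything out of the faster disk, and only for reads.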

As far as I know, ZFS will use the whole disk for RAID-Z and RAID-Z2, which means full-track (and therefore full-width) writes will run at the performance of the slowest disk.  But ZFS handles small writes and full-track writes differently.  This means that in theory it could be performance-adaptive, and send small writes preferentially to the fastest 2 or 3 disks (which again would indirectly help reads and full-track writes).  I have no idea whether it does that or not.
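The "full-width writes run at the speed of the slowest disk" point can be estimated the same way. This is a simplified sketch with hypothetical per-disk speeds: it assumes each full-width write puts an equal amount on every member and completes only when the slowest member is done. `full_stripe_write` is an illustrative name, not a ZFS interface.

```python
def full_stripe_write(disk_mbps, parity):
    """Estimate full-width write throughput and unused headroom for one RAID-Z vdev."""
    slowest = min(disk_mbps)
    data_disks = len(disk_mbps) - parity
    # Only the data portion counts as useful throughput; parity is overhead.
    throughput = data_disks * slowest
    # Bandwidth the faster disks could deliver but never get to use.
    idle_headroom = sum(speed - slowest for speed in disk_mbps)
    return throughput, idle_headroom


# Four-disk raidz2 with slightly mismatched drives (speeds are made-up numbers).
print(full_stripe_write([100, 120, 130, 130], parity=2))
```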

Once you go into storage systems that can use more disks in a RAID group than the width of the RAID code (for example a system that uses a 4+P RAID-5 code, or a 4+P+Q RAID-6 code spread over 100 disks), things get extremely complex.  I know that commercial systems exist that measure disk performance, and use it as an input to data placement decisions, trying to find a compromise between using the full capacity of all drives, and utilizing the speed of the faster drives optimally.

Once you allow disks to be different in both performance and size, the problem goes into combinatorial runaway.  I don't know of any storage system that uses a pool of non-uniform disks in a single RAID tier intelligently (with active capacity and performance control), but it might exist.

Fortunately, in your case the three disks are very similar in performance: Same RPM, same sector sizes, similar caches.  Their performance will not differ by more than 20% or 30% probably, so optimizing the system to wring the last possible bit of performance out of the fastest one is not terribly important.  I would just put them together, and performance will probably be acceptable (fundamentally limited by the slowest drive).

Once you get into disks of significantly different capacity and speed, the common solution to the problem is tiering, HSM, separate storage pools, policies, and disaggregation.  Imagine a system that has some Violin boxes (very low capacity, extremely fast), a few shelves of SSDs, a few shelves of 15K RPM SAS drives, a few shelves of 4TB SATA disks, and some tape.  Nobody would run a single RAID group over this storage, but turn them into separate storage tiers, and move data around "appropriately".


----------



## wszczep (Apr 17, 2014)

ralphbsz said:

> There is a whole slew of issues here.
> [...]
> Fortunately, in your case the three disks are very similar in performance: Same RPM, same sector sizes, similar caches.  Their performance will not differ by more than 20% or 30% probably, so optimizing the system to wring the last possible bit of performance out of the fastest one is not terribly important.  I would just put them together, and performance will probably be acceptable (fundamentally limited by the slowest drive).



In the beginning I thought about different vendors in order to lower the probability of disk failure. Last year I had a problem with two disks failing one after the other. They were from the same vendor and had similar serial numbers. Fortunately I had raidz2.

After reading your and SirDice's posts I have decided to go for the same model and vendor again, but ordered the disks from two different suppliers, hoping for different production batches. That way I will not lose performance. I also hope that extending the current raidz2 pool of four 500GB disks with another raidz2 of four 1TB disks will not hurt performance badly. As far as I know, ZFS is not smart enough to spread existing data over the expanded pool, so newly added data will probably be served faster than old data, because the current "old" 500GB disks are not all of the same type, and there the slowest disk "rules".
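For the planned expansion, the capacity side is easy to sketch. The code below ignores metadata and reserved space, and the proportional split of new writes by free space is only a simplifying assumption about how allocation favours the emptier vdev, not the exact ZFS allocator behaviour. Both helper names are hypothetical.

```python
def raidz_usable_gb(n_disks, disk_gb, parity=2):
    """Approximate usable space of one RAID-Z vdev built from equal-sized disks."""
    return (n_disks - parity) * disk_gb


def new_write_share(free_old_gb, free_new_gb):
    """ASSUMPTION: new writes are split across vdevs in proportion to free space."""
    total_free = free_old_gb + free_new_gb
    return free_old_gb / total_free, free_new_gb / total_free


old_vdev = raidz_usable_gb(4, 500)    # existing four 500GB disks in raidz2
new_vdev = raidz_usable_gb(4, 1000)   # added four 1TB disks in raidz2
print(old_vdev, new_vdev, old_vdev + new_vdev)

# If the old vdev is half full when the new one is added (a made-up fill level):
print(new_write_share(old_vdev / 2, new_vdev))
```

Existing data stays on the old vdev in this model, which matches the expectation that old data keeps the old vdev's performance.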

Thank you for your help.


----------

