# 36 TB NAS Server



## re0 (Jan 13, 2012)

I need to build an almost infinitely expandable NAS for a client. The budget for parts is $3000, and storage needs to start at 12 TB.

So here is the parts list:

Mobo

SUPERMICRO MBD-X8DTH-6F-O Dual LGA 1366 Intel 5520 Extended ATX Dual Intel Xeon Server Motherboard
http://www.newegg.ca/Product/Product...82E16813182174

Cpu

Intel Xeon E5504 Nehalem 2.0GHz 4 x 256KB L2 Cache 4MB L3 Cache LGA 1366 80W Quad-Core Server Processor BX80602E5504
http://www.newegg.ca/Product/Product...16819117187CVF

CPU cooler
Intel BXSTS100A Active heat sink with fixed fan
http://www.newegg.ca/Product/Product...16835203002CVF


Ram

Kingston 24GB (3 x 8GB) 240-Pin DDR3 SDRAM ECC Registered DDR3 1333 (PC3 10600) Server Memory Model KVR1333D3D4R9SK3/24G
http://www.newegg.ca/Product/Product...82E16820139273

Case

NORCO RPC-4224 4U Rackmount Server Case with 24 Hot-Swappable SATA/SAS Drive Bays
http://www.newegg.ca/Product/Product...82E16811219038

PSU

SuperMicro PWS-865-PQ 865W Single Server Power Supply - OEM
http://www.newegg.ca/Product/Product...82E16817377004


Hard Drives
8 x HITACHI Deskstar 5K3000 HDS5C3020ALA632 (0F12117) 2TB 32MB Cache SATA 6.0Gb/s

http://www.newegg.ca/Product/Product...82E16822145475


Cache Drive

Crucial M4 CT064M4SSD2 2.5" 64GB SATA III MLC Internal Solid State Drive (SSD)
http://www.newegg.ca/Product/Product...82E16820148441

USB Boot device
Mushkin Enhanced Mulholland 4GB USB 2.0 Flash Drive 
http://www.newegg.ca/Product/Product...82E16820226077


Raid Type

Software RAID: 3 x 8 drives in raidz2 (software RAID 6), all in the same zpool

File System

ZFS
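
As a rough sanity check on this layout, the sketch below estimates usable space for a pool made of identical raidz vdevs. It assumes each vdev gives up exactly its parity drives' worth of capacity and ignores ZFS metadata overhead and the TB/TiB difference. The function name `raidz_usable_tb` is made up for illustration.

```python
# Rough usable-capacity estimate for a pool of identical raidz vdevs.
# Assumption: parity costs exactly `parity` drives per vdev; metadata,
# padding, and TB-vs-TiB rounding are ignored.

def raidz_usable_tb(vdevs: int, drives_per_vdev: int, drive_tb: float, parity: int) -> float:
    """Return the approximate usable capacity in TB."""
    if drives_per_vdev <= parity:
        raise ValueError("a raidz vdev needs more drives than parity disks")
    return vdevs * (drives_per_vdev - parity) * drive_tb

# Starting point from the post: one raidz2 vdev of 8 x 2 TB drives.
print(raidz_usable_tb(vdevs=1, drives_per_vdev=8, drive_tb=2.0, parity=2))  # 12.0
# All 24 bays populated: three such vdevs in one pool.
print(raidz_usable_tb(vdevs=3, drives_per_vdev=8, drive_tb=2.0, parity=2))  # 36.0
```

Under these assumptions, the first 8-drive raidz2 vdev covers the 12 TB starting requirement, and filling the case gives the 36 TB in the thread title.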


So my software needs are

1. A web-accessible torrent downloader. Is it possible to have the client click a torrent link and have it download on the NAS, rather than just pointing the file save location at the NAS? They currently use Vuze. Or is it better to remote-desktop into the server and then use Vuze on it?

2. NFS access

3. Samba access

4. Easy replacement of failed disks

5. Email alerts for failed disks

6. Monthly/weekly scrubbing

7. STABILITY!

8. The best way to take advantage of the server qualities of this mobo

#7 is why I think FreeBSD is the right choice of OS. I have read lots over the last 2 weeks and have zero Unix-like OS experience. I realize that this information is scattered all over the web, but it's quite a difficult task to find the RIGHT information. Any help, suggestions, or comments would be great!
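
For item 5, one possible approach is a small script run from cron that checks `zpool status -x` and mails the output when anything is wrong. This is a minimal sketch, not a tested monitoring setup: it assumes `zpool status -x` prints `all pools are healthy` when nothing needs attention, and the helper names and e-mail addresses are placeholders.

```python
import subprocess
from email.message import EmailMessage
from typing import Optional

# Assumption: this is what `zpool status -x` prints when every pool is fine.
HEALTHY = "all pools are healthy"

def read_zpool_status() -> str:
    """Run `zpool status -x` and return its combined output (needs ZFS installed)."""
    result = subprocess.run(["zpool", "status", "-x"], capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()

def build_alert(status: str, sender: str, recipient: str) -> Optional[EmailMessage]:
    """Return an alert message if the status is anything but healthy, else None."""
    if status.strip() == HEALTHY:
        return None
    msg = EmailMessage()
    msg["Subject"] = "ZFS pool needs attention"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(status)
    return msg

# Demonstration with canned output instead of a live pool; addresses are placeholders.
alert = build_alert("pool: tank\n state: DEGRADED", "nas@example.com", "admin@example.com")
print(alert["Subject"] if alert else "healthy")
```

On the NAS itself, the message could be built from `read_zpool_status()` and handed to a local mail server with `smtplib.SMTP("localhost").send_message(alert)`.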


----------



## brd@ (Jan 13, 2012)

Hi,

A few notes:

- Hitachi makes the best 2TB drives on the market, so good job picking those. I would see if you could get the UltraStar drives instead though.
- More RAM always helps. Especially if you want to do things like ZFS dedupe, you will need more than 24 GB.
- I don't know about FreeBSD 9.0 with ZFS version 28, but in the previous version you wanted to avoid raidz2: it caused slow performance. You'd be better off performance-wise doing a raidz1 with a hotspare.


----------



## Sylhouette (Jan 13, 2012)

@brd


> raidz1 with a hotspare


FreeBSD does not have hot spares!
It is a cold spare, and human intervention is needed to replace a faulted drive!

A lot of people think it is hot, but it is not.
On the FreeBSD mailing list I argued for an adjustment to the FreeBSD ZFS wiki page to make this clear for all.
http://lists.freebsd.org/pipermail/freebsd-fs/2012-January/013428.html
No response.

A lot of FreeBSD pages point to the Solaris manuals and the man page, which state that the spare is hot, so people think it is hot. Even `zpool add` accepts a spare without any warning that the disk is a cold standby disk.

Sooner or later someone gets bitten by this.

regards
Johan Hendriks


----------



## phoenix (Jan 13, 2012)

brd@ said:

> - Hitachi makes the best 2TB drives on the market, so good job picking those. I would see if you could get the UltraStar drives instead though.
> - More RAM always helps. Especially if you want to do things like ZFS dedupe, you will need more than 24 GB.
> - I don't know about FreeBSD 9.0 with ZFS version 28, but in the previous version you wanted to avoid raidz2: it caused slow performance. You'd be better off performance-wise doing a raidz1 with a hotspare.



It all depends on what you call "slow". And the "slowness" can be mitigated by adding multiple raidz2 vdevs to the pool, as writes will be striped across all the vdevs.

With drive sizes over 1 TB, you really don't want to be using raidz1, unless you have really good backups and you trust your drives completely. It can take several days to resilver a 2 TB SATA drive in a multi-TB pool like this. And your entire pool will be running without any redundancy for the entire resilver process. And you'll be hammering the other drives in the raidz1 vdev during the resilver process. Lose another drive in the raidz1 vdev while resilvering ... and lose everything in the pool.
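
To put "several days" in perspective, here is a back-of-the-envelope sketch of how long a raidz1 vdev would run without redundancy while a full 2 TB drive is rewritten. The sustained rates are illustrative assumptions, not measurements from this thread; a pool that is also serving clients may resilver far more slowly than a drive's sequential speed.

```python
def resilver_hours(data_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to rewrite data_tb terabytes at a sustained rate (decimal units)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = data_tb * 1_000_000 / rate_mb_per_s
    return seconds / 3600

# Hypothetical sustained resilver rates in MB/s, from an idle pool to a busy one.
for rate in (100, 25, 10):
    hours = resilver_hours(2.0, rate)
    print(f"{rate:>3} MB/s: {hours:.1f} h ({hours / 24:.1f} days)")
```

The whole of that window is time during which a second failure in the same raidz1 vdev would take the pool with it.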


----------



## brd@ (Jan 13, 2012)

Sylhouette said:

> @brd
>
> FreeBSD does not have hot spares!
> It is a cold spare, and human intervention is needed to replace a faulted drive!
> ...



I guess it depends on how you define a hot spare. I think of a hot spare as a drive plugged in and running, waiting to be used.

Thanks for the info, I will have to find a way to test and see if we can get this fixed.


----------



## brd@ (Jan 13, 2012)

phoenix said:

> With drive sizes over 1 TB, you really don't want to be using raidz1, unless you have really good backups and you trust your drives completely. It can take several days to resilver a 2 TB SATA drive in a multi-TB pool like this. And your entire pool will be running without any redundancy for the entire resilver process. And you'll be hammering the other drives in the raidz1 vdev during the resilver process. Lose another drive in the raidz1 vdev while resilvering ... and lose everything in the pool.



Good point. I forgot to take drive size into account. Hopefully the performance is fixed in v28.


----------



## Sylhouette (Jan 13, 2012)

> I guess it depends on how you define a hot spare. I think of a hot spare as a drive plugged in and running, waiting to be used.
> 
> Thanks for the info, I will have to find a way to test and see if we can get this fixed.



This is from the zpool man page.

```
Hot Spares
     ZFS allows devices to be associated with pools as "hot spares".  These
     devices are not actively used in the pool, but when an active device
     fails, it is automatically replaced by a hot spare. To create a pool with
     hot spares, specify a "spare" vdev with any number of devices. For exam-
     ple,

       # zpool create pool mirror da0 da1 spare da2 da3

     Spares can be shared across multiple pools, and can be added with the
     "zpool add" command and removed with the "zpool remove" command. Once a
     spare replacement is initiated, a new "spare" vdev is created within the
     configuration that will remain there until the original device is
     replaced. At this point, the hot spare becomes available again if another
     device fails.
```
So the definition in the man page tells me it is hot, really hot.

It looks like something is in the works; I found the following link.
ZFS fault monitoring and management daemon
http://svnweb.freebsd.org/base?view=revision&revision=222836
If I interpreted it right, it should do the monitoring and management and, hopefully, the replacement of faulted disks.

I found this link on the "What's cooking" site: http://ivoras.net/freebsd/freebsd10.html

regards
Johan


----------



## FBSD (Jan 20, 2012)

re0,

Have you looked at FreeNAS at all? FreeNAS is based on FreeBSD but geared and tweaked for NAS use.

Keep us updated about progress.


----------



## gkontos (Jan 21, 2012)

re0 said:

> #7 is why I think FreeBSD is the right choice of OS. I have read lots over the last 2 weeks and have zero Unix-like OS experience. I realize that this information is scattered all over the web, but it's quite a difficult task to find the RIGHT information. Any help, suggestions, or comments would be great!!!



Just out of curiosity, how are you planning on setting up, tuning, and maintaining such an installation with zero experience?


----------



## throAU (Feb 1, 2012)

What gkontos said.

The man page probably needs to be updated.  As suggested above, the drive is installed as a spare, but it will not automatically fail over onto the hot spare on FreeBSD without user intervention.

On a side-note...


I love seeing client requests like this.  "I want infinitely expandable storage.  I have $3000!".

Good luck with it.  Sounds like what they really want is a Netapp or similar.

However, my advice would be:

Get a proper server, not one that is built with random parts. Going with backyard self-built hardware may save you a couple of $ up front, but it WILL burn you when it breaks and random component X or Y is no longer available, and certainly not available on 4 hrs notice.

Get one from HP/Dell/IBM/etc that has a 4hr response time (or at least option for such), 3-5 yrs support time-frame, with remote lights out management (that will enable you to remotely power it up, fix an OS that fails to boot, and notify you of hardware failures), redundant hot-swap power supplies, plenty of NICs, etc.  Add disks/controllers/shelves to it as required.

I suspect your budget will need to grow somewhat.

If someone is looking to store 12 TB of data, my guess is they'll be upset when it is unavailable. Don't cheap out on parts: if it is not achievable to get the correct enterprise-level parts within the budget, tell them so. If they persist, fair enough, but make sure you raise the issue (potential support problems), and don't end up being the "support guy" for a piece of junk that is not supportable.

Building a FreeBSD NAS like the one you are proposing (i.e., out of parts from Newegg) is fine for home.  For business ("a client"), I wouldn't touch it with a 10 foot pole.  I've been there, been burnt before.  Please learn from my experience 

2c.


----------



## phoenix (Feb 1, 2012)

Actually, $3200 CDN will get you 48 TB of raw disk storage in a 4U rackmount with redundant power, and full IPMI remote management.   Doesn't come with any support beyond standard hardware warranties, but that's what IT staff is for.  


- SuperMicro 836 chassis with 24 hotswappable SATA drive bays and dual PSU
- SuperMicro H8DGi-F motherboard
- AMD Opteron 6128 CPU (8 cores)
- 32 GB ECC DDR3 RAM
- 3x SuperMicro AOC-USAS2-8Li SATA controllers (multilane, 8 port)
- 24x 2.0 TB Western Digital Caviar Black 6.0 Gbps SATA harddrives
- Kingston SSDnow 64 GB SSD

Motherboard supports two CPUs, up to 12 cores per CPU, and up to 256 GB of ECC RAM. There are also 6 PCIe slots, so if you need more storage, just add more SATA controllers with external connectors, and buy a JBOD chassis full of drives. When you expand to 48 drives and need more, then replace the controllers with ones that support M8 ports (8 channels per port; 2 ports per card). If you need to expand beyond 96 drives, then you'll have to look into RAID controllers that support up to 24 channels per card, or look into SAS expanders.
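
The 48- and 96-drive thresholds follow from simple slot arithmetic. The sketch below is one reading of those figures, assuming every one of the 6 PCIe slots holds a storage controller and each channel drives exactly one disk (no expanders); the function name is made up for illustration.

```python
def max_direct_attached_drives(pcie_slots: int, channels_per_card: int) -> int:
    """Upper bound on drives when each controller channel connects to one disk."""
    return pcie_slots * channels_per_card

SLOTS = 6  # PCIe slots on the motherboard, per the post

# 8-channel controllers in every slot.
print(max_direct_attached_drives(SLOTS, 8))      # 48
# Cards with 2 ports of 8 channels each.
print(max_direct_attached_drives(SLOTS, 2 * 8))  # 96
```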

Makes one hell of a ZFS storage system. We're just putting the finishing touches on our third one.  32 GB of RAM and 32 GB of L2ARC provide lots of breathing room for dedupe and compression.  I've seen "zpool iostat" go over 220 MBps of read bandwidth, and just a hair less than 200 MBps of write bandwidth.

You don't need fancy $10,000 US+ storage boxes from the name brands.  You just have to know your hardware, keep a few spare parts on hand, and be willing to monitor things yourself.

With IPMI, you get full remote management, right into the BIOS, POST, power control, fan control, temperature monitoring, etc., through either a proprietary Windows GUI (IPMIView) or OSS CLI tools (ipmitool).

It's actually quite fun to listen to presentations from "the big boys" (EMC, Nimbus, DataDomain, etc) about their fancy $100,000 US boxes ... that do exactly the same thing as our $3000 CDN boxes.    While the expensive boxes have pretty management GUIs, there's really nothing they do that FreeBSD + ZFS can't.


----------



## throAU (Feb 1, 2012)

Don't get me wrong.  It can be done, and if you have dedicated IT staff who know FreeBSD to make it happen, fair enough.



> that's what IT staff is for.



I suspect that getting someone with "zero unix experience" to set up and maintain such a solution however, is a recipe for disaster.  Having someone on-hand to monitor the solution, keep spares, etc needs to be factored into your budget.  If you don't have someone for that job, they will need to be paid (and/or trained - learning on the job with company data is not an ideal situation).

Also, the cost over there must be a lot lower. Over here, disk is around $125 AU/TB (a 2 TB drive costs 250 bucks) for 7200 rpm 3.5" SATA. Power supplies here are not cheap, nor are decent cases, etc.

Again, I'm not intending to poo-poo FreeBSD as a NAS.  All I'm saying is that you need to know what you're doing, and buy proper hardware for it.  From this side of the world, a 3k budget for the solution required is a little on the low side (will barely even buy the drives, if you take mirroring or RAID-Z into account), even if you have a competent admin for it.  
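
The "barely even buy the drives" point can be checked with the prices quoted above. This is a rough sketch that assumes 2 TB drives at $250 AU each, a 12 TB usable target, a single raidz vdev or plain two-way mirrors, and no spares; the helper names are hypothetical.

```python
import math

DRIVE_TB = 2           # drive size used in the thread
DRIVE_PRICE_AUD = 250  # price per 2 TB drive quoted in the post
TARGET_TB = 12         # usable capacity the client wants to start with

def drives_for_raidz(usable_tb: int, parity: int) -> int:
    """Data drives plus parity drives for a single raidz vdev."""
    return math.ceil(usable_tb / DRIVE_TB) + parity

def drives_for_mirrors(usable_tb: int) -> int:
    """Two drives for every drive's worth of usable space."""
    return 2 * math.ceil(usable_tb / DRIVE_TB)

layouts = {
    "raidz1": drives_for_raidz(TARGET_TB, parity=1),
    "raidz2": drives_for_raidz(TARGET_TB, parity=2),
    "mirrors": drives_for_mirrors(TARGET_TB),
}
for name, count in layouts.items():
    print(f"{name}: {count} drives, ${count * DRIVE_PRICE_AUD} AU")
# raidz1: 7 drives, $1750 AU
# raidz2: 8 drives, $2000 AU
# mirrors: 12 drives, $3000 AU
```

On those numbers, drives alone take between roughly 60% and 100% of a 3k budget before the chassis, board, RAM, and power supply are counted.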

With an EMC/Netapp/etc., a lot of what you are paying for is the software integration, setup, third-party support tick (ESX is supported on <Foo> storage), and maintenance contract. Plus the ability to upgrade to new controllers without downtime, for example (haven't seen FreeBSD do that yet). Even my low-end Equallogic has been running here for 4 years with only 2 outages, due to extended power failures exhausting our UPS.


edit:
On another note - I am actually looking into building my own NAS here as well (for VM test lab/non-critical disk archive stuff). What sort of IOPS are you getting on one of those boxes?


----------



## gkontos (Feb 1, 2012)

@phoenix,

What you just described sounds like an excellent system. However, as you have said before, it takes someone who has found out the hard way how ZFS works.

Speaking as a client, sometimes I may end up paying much more than I actually should because someday, I bought a solution for my company from a guy who _read lots over the last 2 weeks and has zero unix like os experience_.

My point is that we, as engineers, often drive clients to large solution providers through our actions.


----------



## Adrculda (Feb 3, 2012)

Here...
32 Bay Storage Server

System is about $1000 now, you just have to put the drives in and install an OS. I would suggest FreeNAS. 

I still don't see why people don't do what *I* do: look for retired enterprise hardware, as it's still fast enough for a media server or small storage server.

Personally, I'm using two HP MSA20 drive shelves with an HP MSA1500 controller; two Fibre Channel switches (data path redundancy) are used as well. NAS backbone is in 2GB - 60/125 fiber and server backbone is 10G (Mellanox MHET2x-1TC) to and from the 48 port switch.

And for all those who wonder: yes, this is at my house, and it's the family media server. At the bottom of the rack there are two HP R5500 UPSes plus two ERMs (Extended Run-time Modules).

Servers are as follows:

- 1 x DL360 G4 - 3.6 GHz Xeon Duo Core / 8 GB RAM -> Firewall
- 2 x DL580 G4 - Quad 3.4 GHz Duo Core / 32 GB RAM -> Windows Home Server / 2008 Enterprise Server (Apache / IIS)

Everything set me back under $5000.


----------



## frankpeng (Feb 25, 2012)

I have built several FreeBSD servers using non-ECC memory. I have had no problems so far. I used 3 TB hard drives and gmirrored them.


----------



## throAU (Feb 25, 2012)

frankpeng said:

> I have built several FreeBSD servers using non-ECC memory. I have had no problems so far. I used 3 TB hard drives and gmirrored them.



By using non-ECC memory and gmirror instead of ZFS, what you're saying is that you have either had no issues yet, *or* you have not detected data corruption yet.

It's entirely possible that your disk subsystem has correctly written 2 copies of corrupted data to disk...
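
A toy illustration of that failure mode: a mirror only guarantees that both copies match each other, while a checksum recorded when the data was still good can reveal that both copies are wrong. This is a conceptual sketch, not a model of how gmirror or ZFS is implemented. SHA-256 is used here only as a convenient stand-in checksum, and the helper names are hypothetical.

```python
import hashlib

def mirror_write(data: bytes) -> tuple:
    """A mirror faithfully stores two identical copies of whatever it is handed."""
    return data, data

def flip_bit(data: bytes, index: int) -> bytes:
    """Simulate a single-bit memory error in the byte at `index`."""
    damaged = bytearray(data)
    damaged[index] ^= 0x01
    return bytes(damaged)

def matches_checksum(data: bytes, expected_hex: str) -> bool:
    """Compare data against a checksum taken while the data was known good."""
    return hashlib.sha256(data).hexdigest() == expected_hex

original = b"important data block"
checksum = hashlib.sha256(original).hexdigest()  # assumed to be recorded before the error

# The bit flips in RAM before the write reaches the mirror.
copy_a, copy_b = mirror_write(flip_bit(original, 0))

print(copy_a == copy_b)                    # True: the mirror sees two matching copies
print(matches_checksum(copy_a, checksum))  # False: the checksum exposes the damage
```

The sketch only covers corruption that happens after the checksum is computed. An error that hits the data before that point would be checksummed as if it were correct, which is part of the argument for ECC memory.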


----------



## jrm@ (Jun 20, 2012)

I'm soliciting suggestions for a storage solution for our group. We work in an educational setting on statistical models that use lots of genome data as input and produce lots of data as output. Since most of this data is text, ZFS with compression would be a nice fit. Funds are limited, so I've been asked to put something together that will give us lots of space and just work for something under $3500 CDN. I'm not a system administrator, I'm just stuck with the job, so I might be overlooking something trivial.



			
phoenix said:

> SuperMicro 836 chassis with 24 hotswappable SATA drive bays and dual PSU
> SuperMicro H8DGi-F motherboard
> AMD Opteron 6128 CPU (8 cores)
> 32 GB ECC DDR3 RAM
> ...



I'm told that Western Digital doesn't recommend using the Caviar Black hard drives for more than a RAID 1 or RAID 0 in a desktop computer (relevant for JBOD and zpools?), and it's not on Supermicro's tested list. Based on comments here and elsewhere, I'm leaning towards either WD RE4 or Hitachi UltraStar drives. Will one perform better than the other in this setup?

What about LSI 9211-8i host bus adapters instead of the SuperMicro AOC-USAS2-8Li SATA controllers? I've read about the combination @phoenix described elsewhere, but the SuperMicro controllers work in a UIO slot, and I don't see any mention of those slots on the SuperMicro H8DGi-F motherboard.

Is there anything obvious I'm overlooking?

Thanks


----------



## phoenix (Jun 20, 2012)

UIO slots are PCIe slots.  UIO cards work in PCIe slots, and vice-versa.  The only difference between UIO cards and regular PCIe cards is which side of the PCB the chips are on, and which side the bracket connects to.  You can remove the UIO bracket and stick on a PCIe bracket.

However, I've read good things about the 9211-8i on the zfs-discuss lists, and it uses the same LSI 2008 chipset as the SuperMicro controller, so it will work just as well.

We use WD RE drives when connected to hardware RAID controllers.  And we use Caviar Blue or Caviar Black (or Seagate 7200.11 and 7200.12) when connected to plain SATA controllers.  Between smartd and ZFS we have not run into any issues using non-RE drives.


----------

