# Questions about users' experience with ZFS on FreeBSD



## wonslung (Jul 26, 2009)

I've been using ZFS on FreeBSD for a while as my main home file server.  So far I've had NO problems whatsoever and I absolutely love it.  It's been fast on cheap hardware that doesn't work on Solaris, and I've really enjoyed using it for jails and other FreeBSD-specific stuff.

I've been reading the Sun ZFS mailing list; I subscribed to it a while back when I was first looking into using ZFS.  There are a few threads there where users have made various mistakes and lost all their data, and more where catastrophic failures have happened with no real user error.  I know that ZFS is technically still considered experimental in FreeBSD and therefore comes with no guarantee, but I was wondering if these types of failures have been happening to folks on FreeBSD.  I plan to build a backup server for my main fileserver when I can afford it, but that isn't an option right now.  I do have a redundant setup, but so did some of the others.

My main question is, has anyone had a pool get lost due to a power outage?

Is there a problem with using consumer-grade hardware (desktop RAM/motherboard/processor) with ZFS on FreeBSD?  I see a lot of threads saying that ZFS should have ECC memory, and I plan to use server-grade hardware on the next fileserver, but is using non-ECC memory going to put me at major risk from memory errors?

My system is built on FreeBSD 7.2: an Intel Q9550 processor, 8 GB of DDR2-800 RAM, twelve 1 TB hard drives, and two CompactFlash cards mirrored with gmirror as the boot device.
I configured ZFS with 3 raidz1 vdevs of 4 drives each, plus an 80 GB log device.
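
For reference, a pool like that can be built in one command; a minimal sketch, assuming the pool is named `tank` and the drives show up as `da0`..`da12` (hypothetical names; a real setup would use glabel labels instead):

```
# Hypothetical device names -- three raidz1 vdevs of four drives each,
# plus a dedicated log device:
zpool create tank \
    raidz1 da0 da1 da2 da3 \
    raidz1 da4 da5 da6 da7 \
    raidz1 da8 da9 da10 da11 \
    log da12
```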

I've had a few power outages with no problems; there are a lot of thunderstorms where I live.  I plan to invest in a UPS, but right now I just have a normal surge protector.

I know I should have asked all of these questions to begin with.  I should be able to build a complete backup server for this in 2-4 months' time.

Thanks


----------



## hedwards (Jul 26, 2009)

I've been using ZFS on a sub-$500 consumer desktop without really any problems. I have yet to have any issues with ZFS itself, even though the computer will crash from time to time.

I've had more problems with my own actions, with the weirdness related to mixing MBR and GPT on the same disk, and with a short foray into ZFS version 13 on one half of my mirror.


----------



## wonslung (Jul 26, 2009)

I've had no problems either, but if you read the Sun ZFS mailing list you'd understand my concern. I get the impression that on Solaris, ZFS is one of those things that "works great until it doesn't", if you know what I mean.


I've found it to be amazingly fast and wonderful; I love snapshots and clones. But that list makes me paranoid. I just hope it'll be OK until I can build a backup server; I have about 5 TB of videos and music on it so far =)


----------



## Voltar (Jul 27, 2009)

I've been using ZFS (RAID-Z) for a few months on a development box without issues. It does take a lot of tuning and tweaking to get it running smoothly at first, but it runs like a dream afterward. I've gone as far as pulling the plug on the machine, pulling a drive out, and dd'ing random parts of a drive to corrupt it, and each time the file system has come back without issues. The one thing I do have a complaint about is the cheap HighPoint RocketRAID card in said development machine; it didn't want to cooperate, but that's because of their shitty card, not ZFS.

I'm currently building a new file server for media and offsite/at-home backups of a few servers; it will be running 16 drives in four-drive vdevs, all in one pool.

So far I have about 8 TB of data on ZFS file systems, and overall I'm very satisfied with ZFS on FreeBSD thus far.


----------



## wonslung (Jul 27, 2009)

Voltar said:

> I've been using ZFS (RAID-Z) for a few months on a development box without issues. It does take a lot of tuning and tweaking to get it running smoothly at first, but it runs like a dream afterward. I've gone as far as pulling the plug on the machine, pulling a drive out, and dd'ing random parts of a drive to corrupt it, and each time the file system has come back without issues. The one thing I do have a complaint about is the cheap HighPoint RocketRAID card in said development machine; it didn't want to cooperate, but that's because of their shitty card, not ZFS.
> 
> I'm currently building a new file server for media and offsite/at-home backups of a few servers; it will be running 16 drives in four-drive vdevs, all in one pool.
> 
> So far I have about 8 TB of data on ZFS file systems, and overall I'm very satisfied with ZFS on FreeBSD thus far.



Cool, yeah. When I originally set mine up I did it wrong... sort of.  It's very hard to find a good but cheap JBOD SATA card; most multi-drive cards are designed for hardware RAID or only hold 4 drives.  I ended up getting a second-hand HighPoint RocketRAID 2340.  It's not the best card in the world, but even though it has no real dedicated hardware, with my system I get speeds up to 350 MB/s.

The problem is, when I set it up I hooked 12 new, unconfigured drives up to the system, and the card didn't pass the empty drives through to FreeBSD, so I couldn't format them for what I'd consider a "normal" JBOD setup.  The RAID controller has a JBOD setting, but it's not what I should have used.  I made one "JBOD" per drive; then they showed up in FreeBSD and I was able to create the ZFS filesystem.

LATER I found out that if you format/partition a drive on a normal system and hook it up to the RAID card, it shows up as "LEGACY", which is what I SHOULD have done. So what I need to do is fail one drive at a time, format it on another system with one large FreeBSD slice, and add it back.  I've been kind of paranoid about trying this because right now everything is working perfectly.

Thanks to phoenix I was smart enough to at least use glabel when I set this system up. I guess I've been putting off the resilvering until I get the backup done. Anyway, glad to hear that it's working well for you; I have 8 more slots for hard drives to fill.

Currently, the 3 vdevs I have are each made up of four 1 TB drives.

If I added a new vdev of four 1.5 TB drives, would it make a difference, or should I only use 1 TB drives unless I replace the others?


----------



## graudeejs (Jul 27, 2009)

On my desktop PC (25GB RAM, 1x160GB + 1x250GB HDD) I have almost no problems...

I do have some small lags when writing fast to the HDD, but they are much smaller and less frequent since BETA1...

Also I have this problem:
http://www.freebsd.org/cgi/query-pr.cgi?pr=137037
But even with that... I still use ZFS, and I have never lost even a single bit of my data.


I don't plan to switch back to the old GPT/UFS setup anymore.


----------



## Voltar (Jul 27, 2009)

wonslung said:

> Cool, yeah. When I originally set mine up I did it wrong... sort of.  It's very hard to find a good but cheap JBOD SATA card; most multi-drive cards are designed for hardware RAID or only hold 4 drives.  I ended up getting a second-hand HighPoint RocketRAID 2340.  It's not the best card in the world, but even though it has no real dedicated hardware, with my system I get speeds up to 350 MB/s.
> 
> The problem is, when I set it up I hooked 12 new, unconfigured drives up to the system, and the card didn't pass the empty drives through to FreeBSD, so I couldn't format them for what I'd consider a "normal" JBOD setup.  The RAID controller has a JBOD setting, but it's not what I should have used.  I made one "JBOD" per drive; then they showed up in FreeBSD and I was able to create the ZFS filesystem.
> 
> LATER I found out that if you format/partition a drive on a normal system and hook it up to the RAID card, it shows up as "LEGACY", which is what I SHOULD have done. So what I need to do is fail one drive at a time, format it on another system with one large FreeBSD slice, and add it back.  I've been kind of paranoid about trying this because right now everything is working perfectly.



Son of a ... I just went through the same issue with my RocketRAID 2320. I bought the card a while ago for use as just a controller card, with no RAID features used, because I didn't trust it and didn't want to be stuck with a dead card and no access to my data.

After painstaking tinkering, I came up with a solution that works so far. The RocketRAID card will force you to initialize a disk if it doesn't have a partition table, but then you're tied into using the controller if you do that. I also wanted to use the full disks, not slices, because I've read that ZFS enables certain features (write caching?) when using a whole disk instead of a slice. I don't know that for certain, but I figured I would just go with the entire disk for simplicity.

Since that wasn't an option, and with HighPoint's customer support being completely lackluster and clueless, I took a few old drives and did some experimenting. You can pop a single drive into a system and create a partition on it using the entire disk. Now pop that disk into your HighPoint controller and let it boot once. Check to make sure you have the drive showing (daX most likely). I don't know about all RocketRAID controllers, but the 2320 writes info to sector 9 of the hard drive when a disk is used in legacy mode.

Now that you have a drive that the controller sees in legacy mode, with a partition table, use dd to copy the MBR and partition table to a file (`# dd if=/dev/daX of=~/rr_mbr bs=512 count=1`). Then, when you want to add a drive on the RocketRAID controller, just hook the drive up to your system's SATA bus (or eSATA/USB, etc.), dd the saved MBR/partition table to it, and when you hook it up to the RocketRAID controller it'll already be in legacy mode.
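
The save-and-restore cycle might look like this (a sketch; `da0` and `ad4` are placeholders for the actual device nodes on a given system):

```
# Save the first sector (MBR + partition table) from a drive the
# RocketRAID already sees in legacy mode (da0 is a placeholder):
dd if=/dev/da0 of=~/rr_mbr bs=512 count=1

# Later, with a fresh drive on the onboard SATA bus (ad4, also a
# placeholder), write the saved sector back before moving it to the card:
dd if=~/rr_mbr of=/dev/ad4 bs=512 count=1
```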

After that, you can create your vdevs/zpool(s) using the full disks. In my tests using the above method I made a four-drive raidz vdev and tortured it, corrupted it, and resilvered it, and it held up time after time.

This leads me to believe that even though you give ZFS the entire disk, it doesn't touch the partition table. The plus is that you can use whole disks instead of slices in your vdevs, and your drives can be swapped from controller to controller.

That's basically a rough write-up of what I did; I have an article going up soon that goes into a bit more detail about all of this.




> Thanks to pheonix i was smart enough to at least use glabels when i set this system up



I neglected to do that on my original setup for my current storage server, so it's on my list of things to do. Not terribly important, but it'll be nice if I go moving a bunch of drives around.




> If i added a new vdev of 4 1.5 tb drives would it make a difference or should i only use 1 tb drives unless i replace the others?



The size of the drives doesn't matter as long as the replication level is the same.


----------



## wonslung (Jul 27, 2009)

Voltar said:

> Son of a ... I just went through the same issue with my RocketRAID 2320. I bought the card a while ago for use as just a controller card, with no RAID features used, because I didn't trust it and didn't want to be stuck with a dead card and no access to my data.




A lot of people knock HighPoint cards because they aren't "hardware RAID", but for me it's been amazing.  I get great speeds; the card itself is PCIe x8 and quite fast. So what if it passes everything off to the CPU? I have 4 of them =)



> I also wanted to use the full disks, not slices, because I've read that ZFS enables certain features (write caching?) when using a disk instead of a slice.



Well, that's not true for FreeBSD.  On Solaris it's SOMEWHAT true: by default, if you give ZFS a slice, Solaris disables write caching because it has no way of splitting the cache, and due to the way ZFS writes data in short bursts every 5-30 seconds that can be REALLY BAD. But if you have a single slice on Solaris you can re-enable write caching without problems.  I've read in these forums and elsewhere that this is NOT a problem in FreeBSD, so using slices is fine.  I intend to make one large FreeBSD slice and not format it, just so the card will "give" it to FreeBSD as a da device.  From everything I've read, the write cache still works in FreeBSD; maybe someone else would care to comment.


> This leads me to believe that even though you give ZFS the entire disk, it doesn't touch the partition table. The plus is that you can use whole disks instead of slices in your vdevs, and your drives can be swapped from controller to controller.


See, this is EXACTLY why I want it to be in legacy mode. But you're saying I can use the whole disk and HighPoint's controller will still see it? What about when you hook it to another machine; how does THAT work?  Doesn't it still think it's a HighPoint device and therefore not read it?

When I initialized mine I made each drive a single JBOD... maybe I didn't really understand what you were saying and I should read it again =)



> I neglected to do that on my original setup for my current storage server, so it's on my list of things to do. Not terribly important, but it'll be nice if I go moving a bunch of drives around.


I did it because I was scared that I might somehow end up with the device names switching; after asking around, I learned that can be catastrophic to ZFS.


> The size of the drives doesn't matter as long as the replication level is the same.


OK, another question then: my case holds 20 drives, and I have 12 right now.  I'd like to have 1 or 2 hot spare drives.  Would it be a problem to use smaller raidz vdevs?
Right now I have 3 groups of 4; would it be OK to add another group of 4 and one group of 3?
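
Either way, growing the pool is just a `zpool add` per new vdev; a sketch with hypothetical pool and device names:

```
# Add a fourth raidz1 vdev of four drives to an existing pool "tank":
zpool add tank raidz1 da12 da13 da14 da15

# Or dedicate a drive as a hot spare instead:
zpool add tank spare da16
```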


----------



## wonslung (Jul 27, 2009)

Oh, after re-reading what you did, it sounds like you're doing exactly what I plan on doing, just in a different way.

When you have it in legacy mode it HAS to have some kind of slice on it, even if it's the entire DISK as one slice.
That's exactly what I plan to do; I just need to do it one drive at a time, but so far I've been way too much of a wuss to try.

When I bought the card I didn't know about "legacy mode", but that's EXACTLY what I want. Did you get it into this mode WITHOUT putting a slice on it? If so, I'm totally confused...


----------



## phoenix (Aug 3, 2009)

The only 2 major issues I've run into so far were due to user error:

- using 24 drives in a single raidz2 vdev
- using device nodes directly instead of labelled devices via glabel

The first issue gave us all kinds of performance problems, and killed the system when the first drive died.

The second issue bit us when a drive died, I pulled the drive, and then the server was rebooted.  The 3Ware card re-numbered all the drives, the device nodes were shifted down by one for all the drives after the missing one, and the pool became corrupted.  Thankfully, putting any drive back into the slot and booting allowed the 3Ware card to number the drives correctly, and the pool came back up in a degraded state.

Beyond that, we've gone through lots of system lockups during the initial tuning phase on FreeBSD 7.0-STABLE/7.1-RELEASE, lots of power failures, a couple of drive replacements, and things are still running along nicely.

We now do a weekly "zpool scrub" via crontab to pick up any data corruption.  In just over a month of doing that, we haven't found any issues.
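
A weekly scrub like that is a one-line cron job; a sketch assuming a pool named `storage`:

```
# /etc/crontab -- scrub the pool every Sunday at 03:00
# minute  hour  mday  month  wday  who   command
0         3     *     *      0     root  /sbin/zpool scrub storage
```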

So far, things are working quite nicely.


----------



## wonslung (Aug 4, 2009)

I'm pretty happy with my setup. It's just amazing to read the ZFS mailing list for Solaris; there seem to be LOTS of issues there. It's probably because more people on OpenSolaris are using it, especially since ZFS is the default there. I really love how well it works for jails.  If any other users come along who have large pools, I'd be interested in hearing from you.


----------



## ernie (Aug 17, 2009)

I am looking at building a FreeBSD mail server for 20-30 users. I am really keen on using ZFS, but I was wondering what the best FreeBSD version to use for this application is. I have 8 GB RAM, so I assume I need an amd64 FreeBSD version to support that much RAM.

The hardware is as follows:

Intel Core2 Quad core Q8300 CPU
8GB RAM
1 * 64GB SSD boot drive
4 * 1.5TB SATA drives for zfs


I was looking at raidz1 but am open to suggestions.

- Ernie.


----------



## wonslung (Aug 17, 2009)

ernie said:

> I am looking at building a FreeBSD mail server for 20-30 users. I am really keen on using ZFS, but I was wondering what the best FreeBSD version to use for this application is. I have 8 GB RAM, so I assume I need an amd64 FreeBSD version to support that much RAM.
> 
> The hardware is as follows:
> 
> ...



7.2 is fine, but one thing I'd suggest is this:
get 7.2-STABLE or 8.0, and instead of using the SSD as a boot drive, get a CompactFlash card as your boot drive and use the SSD as your ZIL (slog).

For a mail server, having the SSD as either ZIL or L2ARC is going to be much better than having it as your boot drive.
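
Attaching the SSD to an existing pool is a one-liner either way; a sketch, assuming the pool is named `tank` and the SSD shows up as `ad4` (both hypothetical):

```
# Use the SSD as a dedicated intent log (slog):
zpool add tank log ad4

# ...or as an L2ARC cache device instead:
zpool add tank cache ad4
```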


----------



## phoenix (Aug 17, 2009)

ernie said:

> I am looking at building a FreeBSD mail server for 20-30 users. I am really keen on using ZFS, but I was wondering what the best FreeBSD version to use for this application is. I have 8 GB RAM, so I assume I need an amd64 FreeBSD version to support that much RAM.



I'd suggest either waiting for FreeBSD 8.0 to be released, or installing an 8.0-BETA (BETA3 should be released soon) in order to get the best / most current ZFS support (ZFSv13).

If you can't wait, installing FreeBSD 7.2-RELEASE, and then updating to 7-STABLE (tag=RELENG_7 in the cvsup supfile) will get you ZFSv13 support as well, but without the major kernel changes that 8.0 will have.

ZFSv13 has quite a few little extras compared to ZFSv6 (available in FreeBSD 7.0-7.2).



> The hardware is as follows:
> 
> Intel Core2 Quad core Q8300 CPU
> 8GB RAM
> ...



For the best performance, you should consider creating two mirrored pairs instead of a raidz vdev:

```
# zpool create storage mirror da0 da1 mirror da2 da3
```
That will create a pool named *storage* with two mirrored vdevs.  The pool automatically stripes across the two mirrors, in effect creating a RAID10 array.

Writes should be much faster in this setup than in a 4-drive raidz1 or raidz2 setup.  You'll have 3.0 TB of disk space and still be able to lose up to 2 drives (one from each mirror), similar to a raidz2 setup.

If you absolutely need the disk space, then a 4-drive raidz1 would be doable.  That would give you 4.5 TB of space, but then you could only lose 1 drive, and your disk I/O will be lower.

If you have space in the case, then I'd also recommend getting a CF-to-SATA adapter and using a 4 GB CompactFlash disk for the OS install.  Then you can use the SSD as either a separate intent log (ZIL/slog device) or as a cache device (L2ARC).  Leave / and /usr on the CF disk, and create ZFS filesystems for /var, /usr/local, /usr/src, /usr/obj, /usr/ports, /home, and so on.  And then maybe use tmpfs(4) for /tmp.
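
That filesystem layout might be sketched like this, following the *storage* pool from the `zpool create` example earlier (mountpoints as listed above; the dataset names are assumptions):

```
# ZFS filesystems for everything except / and /usr (which stay on the CF disk):
zfs create -o mountpoint=/var       storage/var
zfs create -o mountpoint=/usr/local storage/local
zfs create -o mountpoint=/usr/src   storage/src
zfs create -o mountpoint=/usr/obj   storage/obj
zfs create -o mountpoint=/usr/ports storage/ports
zfs create -o mountpoint=/home      storage/home

# and in /etc/fstab, tmpfs(4) for /tmp:
# tmpfs  /tmp  tmpfs  rw,mode=1777  0  0
```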


----------



## jb_fvwm2 (Aug 17, 2009)

I am "only just curious" as to how (GPT, gjournal, softupdates)
might be (irrelevant to/useful for/able to be used with/not to
be used with) the setup mentioned above.  (Just curious because
I could probably plan for it but maybe not find the time to
put it to use.)


----------



## ernie (Aug 18, 2009)

phoenix said:

> I'd suggest either waiting for FreeBSD 8.0 to be released, or installing an 8.0-BETA (BETA3 should be released soon) in order to get the best / most current ZFS support (ZFSv13).
> 
> If you can't wait, installing FreeBSD 7.2-RELEASE, and then updating to 7-STABLE (tag=RELENG_7 in the cvsup supfile) will get you ZFSv13 support as well, but without the major kernel changes that 8.0 will have.
> 
> ZFSv13 has quite a few little extras compared to ZFSv6 (available in FreeBSD 7.0-7.2).


Can do. I have a few weeks to get this ready; if 8.0-BETA3 is near, I can run with that. I will run the system alongside the existing mail server in case 8.0-BETA3 has some issues.




> For the best performance, you should consider creating two pairs of mirrored vdevs instead of a raidz vdev:
> 
> ```
> # zpool create storage mirror da0 da1 mirror da2 da3
> ...


Sounds good. I will try both raidz1 and what you suggest, and run iozone on them to see how they feel.



> If you have space in the case, then I'd also recommend getting a CF-to-SATA adapter, and using a 4 GB CompactFlash disk for the OS install.  Then you can use the SSD as either a separate intent log (ZIL/slog device), or as a cache device (L2ARC).  Leave / and /usr on the CF disk, and create ZFS filesystems for /var, /usr/local, /usr/src, /usr/obj, /usr/ports, /home, and so on.  And then maybe use tmpfs(4) for /tmp.



What are the characteristics of a ZIL/slog device that make an SSD suitable? Can the SSD be partitioned to hold both / and the ZIL/slog, or does that require the whole drive? I know little about the ZIL.

- Ernie.


----------



## phoenix (Aug 18, 2009)

jb_fvwm2 said:

> I am "only just curious" as to how (GPT, gjournal, softupdates)
> might be (irrelevant to/useful for/able to be used with/not to
> be used with) the setup mentioned above.  (Just curious because
> I could probably plan for it but maybe not find the time to
> put it to use.)



Why would you need GPT for a 4 GB flash disk?    There's nothing wrong with using it, as it will (hopefully/probably) eventually replace the BIOS/DOS/PC (whatever the technical term is) partition table.  But it's only needed for the disks that are not being used by ZFS.

GJournal can be used, but since / and /usr are only updated during OS upgrades, there's not really much need for it.  (Obviously, the ZFS filesystems don't need/use it.)

GMirror is a handy tool to use for the CF disks, to create a RAID1 array between two of them (that's what we do).

Softupdates isn't really needed either, again, since the / and /usr filesystems should only be updated during OS upgrades.


----------



## wonslung (Aug 18, 2009)

Phoenix is correct.  I used his method for my boot device and it works brilliantly.  I also benefited hugely from his experience when it came to glabel and raidz vdevs (I used glabel for my ZFS drives and didn't make one giant raidz group, thanks to his helpful posts).


----------



## trev (Aug 20, 2009)

wonslung said:

> I've been reading the Sun ZFS mailing list, ...  Theres a few threads there ... where catastrophic failures have happened due to no real user error.



I haven't used ZFS with FreeBSD yet (waiting for 8.0), but at work we have a couple of 12 and 13 TB ZFS pools under Solaris 10 (SPARC). No problems so far...


----------



## ernie (Aug 21, 2009)

trev said:

> I haven't used ZFS with FreeBSD yet (waiting for 8.0), but at work we have a couple of 12 and 13 TB ZFS pools under Solaris 10 (SPARC). No problems so far...



I have been trying 8.0-BETA2, but it's too unstable; threads just keep stopping. ZFS worked fine, though.

I just tried 7.2-STABLE; much better. ZFS is version 13 in 7.2-STABLE as well, so it was happy importing the pool I made in 8.0-BETA2. That was after I did a buildworld to update all the command-line tools; just updating the kernel to 7.2-STABLE was not enough, as the old tools wouldn't talk cleanly to it.


----------



## avilla@ (Aug 25, 2009)

Do any of you use a zvol as a swap device? I do, but swapping makes my system really slow, even locking up for minutes sometimes... any suggestions?


```
$ zfs get all system/swap 
NAME         PROPERTY              VALUE                  SOURCE
system/swap  type                  volume                 -     
system/swap  creation              Mon Jun  1 13:26 2009  -     
system/swap  used                  3G                     -     
system/swap  available             14.4G                  -     
system/swap  referenced            654M                   -
system/swap  compressratio         1.00x                  -
system/swap  reservation           none                   default
system/swap  volsize               3G                     -
system/swap  volblocksize          8K                     -
system/swap  checksum              off                    local
system/swap  compression           off                    local
system/swap  readonly              off                    default
system/swap  shareiscsi            off                    default
system/swap  copies                1                      default
system/swap  refreservation        3G                     local
system/swap  primarycache          all                    default
system/swap  secondarycache        all                    default
system/swap  usedbysnapshots       0                      -
system/swap  usedbydataset         654M                   -
system/swap  usedbychildren        0                      -
system/swap  usedbyrefreservation  2.36G                  -
system/swap  org.freebsd:swap      on                     local

$ zpool status
  pool: system
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        system      ONLINE       0     0     0
          ada0p2    ONLINE       0     0     0

errors: No known data errors

$ gpart show
=>       34  156301421  ada0  GPT  (75G)
         34        128     1  freebsd-boot  (64K)
        162  156301293     2  freebsd-zfs  (75G)
```

Perhaps acting on the primarycache and secondarycache properties could help?


----------



## wonslung (Aug 26, 2009)

I use a zvol as swap.

I added it like this:

```
zfs set org.freebsd:swap=on pool/swap
```
as mentioned in another thread, and I've had no problems.

But my pool is 3 raidz vdevs; I don't know if that makes a difference. It might, because it's likely a lot faster than a single drive.
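
For anyone setting this up from scratch, the whole thing is only a couple of commands; a sketch (the size and pool name are assumptions; the properties match the listing earlier in the thread):

```
# Create a 3 GB zvol with checksums and compression off, marked as swap
# (the org.freebsd:swap property lets the rc scripts enable it at boot):
zfs create -V 3G -o org.freebsd:swap=on -o checksum=off -o compression=off pool/swap

# Enable it right away without rebooting:
swapon /dev/zvol/pool/swap
```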


----------



## avilla@ (Aug 26, 2009)

wonslung said:

> I added it like this:
> 
> ```
> zfs set org.freebsd:swap=on pool/swap
> ...



I have this set as well, as you can see.



> But my pool is 3 raidz vdevs; I don't know if that makes a difference. It might, because it's likely a lot faster than a single drive.



Yes, that's probably the reason...


----------



## bigearsbilly (Aug 26, 2009)

Anecdotal evidence:
I've used Linux for about 10 years, Solaris 10 for a while, and
for the last 2 years or so primarily BSD.
The only catastrophic loss of data I have ever had was with ZFS on Solaris 10.


Not sure why. I still have the same disks but a new mobo.
One disk has uncorrectable sectors (per smartd).

Also, be careful not to rm zpool.cache.


----------



## avilla@ (Aug 27, 2009)

bigearsbilly said:

> Anecdotal evidence:
> I've used Linux for about 10 years, Solaris 10 for a while, and
> for the last 2 years or so primarily BSD.
> The only catastrophic loss of data I have ever had was with ZFS on Solaris 10.



That was years ago, if I understand correctly; things have changed, and ZFS was much younger then.
And it wouldn't be a problem anyway: all my files are stored on a server. This is just a laptop, and ZFS has too many nice features (like snapshots) to go back to UFS.



> Also, be careful not to rm zpool.cache.



I know that the system won't boot... are there any other risks?


----------



## wonslung (Aug 27, 2009)

bigearsbilly said:

> Anecdotal evidence:
> I've used Linux for about 10 years, Solaris 10 for a while, and
> for the last 2 years or so primarily BSD.
> The only catastrophic loss of data I have ever had was with ZFS on Solaris 10.
> ...



Yeah, that's part of why I started this thread.  I follow the ZFS mailing list for Solaris, and every other thread is about some sort of pool loss or data loss.  I've watched the threads in this forum, and it's just not even close; I use ZFS under FreeBSD and it's rock solid for me.  I'm sure part of the reason is that it's experimental and only used by a few on FreeBSD, and those users probably go in expecting there might be an issue, whereas on Solaris it's the default and people expect it to work.  BUT even taking that into account, it seems to me that it just doesn't have the same issues on FreeBSD.  I wonder if it's due to better device drivers?


----------



## trash (Oct 13, 2009)

> "works great until it doesn't" if you know what i mean

Yes! I know what you mean. I'm in the "doesn't" stage right now!
Does anyone have experience trying to import a Solaris v13 ZFS mirror into FreeBSD 7.2-STABLE? I get 'corrupt GPT tables'. I'm a bit worried about doing anything to them because 1) I don't know what they are, and 2) there's no backup of my mirror; in fact, the mirror *was* the backup.


----------



## astadtler (Oct 16, 2009)

I've been running it for quite a while. I had 7.2 with a single-vdev pool for a long time just to play around.  I upgraded to 8.0 in the beta stages a few months ago, made a 4-drive raidz1, and it's great.

Current Specs:
Athlon 64 3200+
Supermicro H8SSL-i
2gb 4x512mb DDR266 RAM
4x1.5tb Seagate 7200.11 SATA in Raidz1
1x120gb Western Digital IDE (system drive)

The only problem I've really had is that it seems to run out of RAM running rtorrent with 600 torrents plus AFP/SMB/NFS/iSCSI.  I had to get rid of my gmirror for the system drive since GEOM ate even more memory.  Unless anyone has other suggestions, I've tweaked down the memory usage; that took a performance hit, but it no longer runs out of RAM:


```
vm.kmem_size_max="1024M"
vfs.zfs.arc_max="512M"
vfs.zfs.prefetch_disable="1"
```


----------



## dennylin93 (Oct 16, 2009)

I'm setting up a new mail server right now, and I'm planning to use ZFS for the storage.

Currently I'm waiting on 2 more 500 GB drives so that I can use RAID-Z. I didn't do any tuning (apart from setting kern.maxvnodes=400000 in /etc/sysctl.conf), and it has worked nicely so far.

My specs (HP DL320 G5p):

```
FreeBSD 7.2-RELEASE amd64
Intel Xeon X3210 2.13GHz
4 GB RAM
```


----------



## wonslung (Oct 18, 2009)

Yeah, I'm LOVING my raidz media NAS.

Currently I have the following setup:
Intel Q9550
8 GB DDR2-800
12 1 TB hard drives in 3 raidz vdevs of 4 drives each (3x4=12)

FreeBSD is installed to a gmirror of CompactFlash cards; /, /usr, and /etc are on the CF cards, while /usr/local, /usr/ports, /usr/src, /tmp, and /var are on ZFS filesystems, along with a lot of other filesystems. I have multiple jails running via ZFS clones, which is REALLY cool. Performance is much better than I'd imagined it would be with such cheap hardware and so much stuff; I'm running 7 jails and a TON of music/video, and it's been great. I have about 10 clients on my network, with a peak of about 6 at a time accessing the data, including a lot of 1080p video, and I haven't had any problems to speak of.


----------



## ents (Oct 26, 2009)

killasmurf86 said:

> On my desktop PC (25GB RAM, 1x160GB + 1x250GB HDD) I have almost no problems...
> 
> I do have some small lags when writing fast to the HDD, but they are much smaller and less frequent since BETA1...
> 
> ...




Have you solved your "small lags" problem yet?
I believe I have/had the same problem, and found this on the OpenSolaris mailing list:

http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26485.html

I solved my problem with vfs.zfs.txg.synctime=1 in loader.conf.


I don't know if this was the right solution, but there are no stalls anymore.
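
That is, the setting goes in /boot/loader.conf, in the same style as the other ZFS tunables in this thread:

```
vfs.zfs.txg.synctime="1"
```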


----------



## ernie (Nov 9, 2009)

Is it possible to make a ZFS vdev from a network volume like an NFS share or iSCSI?

The traditional ZFS setup seems to use a single chassis with lots of drives, but I am trying to think of a high-availability solution using multiple chassis, say 3 of them, so that if one fails from, say, a motherboard fault, the other two can keep going.

Obviously a network volume is going to have a performance hit, but that's the price you pay.

- Ernie.


----------



## phoenix (Nov 9, 2009)

Yes, you can create vdevs using iSCSI exports.  And if Solaris/FreeBSD ever get support for AoE (ATA over Ethernet), FCoE (Fibre Channel over Ethernet), and other such storage technologies, then those could be used as well.

You can create vdevs using any block device (physical hard drive, iSCSI export, hardware RAID array/LUN, etc).

You can also create vdevs using files, however, that puts you at the mercy of the host filesystem.  This is generally recommended only for testing purposes.

One could create an NFS or SMB/CIFS share, create files on those shares, and then create vdevs using those files.  But that would defeat the end-to-end checksumming and other features of ZFS, as it would no longer control the entire storage stack.
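As a concrete illustration of the file-backed case (a sketch for testing only; the file paths and pool name here are made up), note that `zpool create` requires absolute paths for file vdevs:

```shell
# Create two sparse 1 GB backing files and build a mirrored test pool
# on them -- fine for experiments, not for real data, since the files
# live on another filesystem that ZFS does not control.
truncate -s 1g /tmp/vdev0 /tmp/vdev1
zpool create testpool mirror /tmp/vdev0 /tmp/vdev1
zpool status testpool

# Tear it down when done.
zpool destroy testpool
rm /tmp/vdev0 /tmp/vdev1
```

Requires root and the ZFS kernel module loaded.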


----------



## VictorM (Dec 28, 2009)

just my 2c (currently managing around 30 TB of redundant data under FreeBSD and NetBSD): don't use ZFS with off-the-shelf controllers (integrated and/or cheap). 3ware for bootable RAID devices and Marvell for non-bootable JBODs are the entry level for reliable operation in a production environment. Always make sure you have the latest firmware, a working cache battery, and a UPS. If that doesn't sound like a home setup, then stop complaining or go to some cheap ReiserFS.


----------



## dmdx86 (Jan 2, 2010)

VictorM said:

> just my 2c (currently managing around 30TB of redundant data under FreeBSD and NetBSD) - don't use ZFS with off-the-shelf controllers (integrated and/or cheap). 3ware for bootable RAID devices and Marvell for non-bootable JBODs are the entry level for reliable operation in production environment. always make sure you have the latest firmware, working cache battery and UPS. if it does not sound like a home setup, then stop complaining or go to some cheap ReiserFS.



My understanding is that with RAIDZ, there is no RAID-5 write hole, therefore worrying about keeping a battery-backed cache to protect your data is no longer an issue. Right?


----------



## wonslung (Jan 2, 2010)

VictorM said:

> just my 2c (currently managing around 30TB of redundant data under FreeBSD and NetBSD) - don't use ZFS with off-the-shelf controllers (integrated and/or cheap). 3ware for bootable RAID devices and Marvell for non-bootable JBODs are the entry level for reliable operation in production environment. always make sure you have the latest firmware, working cache battery and UPS. if it does not sound like a home setup, then stop complaining or go to some cheap ReiserFS.



I don't think there is a problem with using cheap or dumb controllers.


ZFS works well at many levels.  I know PLENTY of home users who use cheap 4-port PCI cards across 3 or 4 PCI slots, or use 8-port PCI-X cards in normal PCI slots, and have no problems.

They don't get massive speeds or anything, but all they care about is being able to watch a single HD stream (4-10 MB/s) across Samba.

I think it is all about what you need and what you're willing to put into it.


----------



## FLAGEL (Jan 10, 2010)

dmdx86 said:

> My understanding is that with RAIDZ, there is no RAID-5 write hole, therefore worrying about keeping a battery-backed cache to protect your data is no longer an issue. Right?



I might very well be misinformed, but if you're using SSD devices for the ZIL you will want to turn off the SSD write cache, because it's volatile (unless you have really expensive battery-backed SSD devices), and a power loss will very likely end in disaster. The downside of turning off the write cache is a huge loss in IOPS, hence the need for a battery-backed write cache (RAID card) that takes care of the IOPS problem.


----------



## danbi (Apr 25, 2010)

After days of downtime because of corrupted 3ware RAID arrays, I decided to move all these setups to ZFS. All for the better, especially with regard to storage management. The only downside is the need for more RAM, but that is trouble only with old servers using now-exotic RAM types. RAM is very cheap these days.

In my opinion, the only significant benefit of using a battery-backed RAID controller with ZFS is the added layer of management. You may, for example, verify each drive separately. But you can do the same with smartmontools on any controller. The IOPS benefit is something I did not observe, but I plan to test more extensively soon.
Of course, there is nothing wrong with using 3ware-type RAID controllers with ZFS, even if you don't use any of their RAID functionality. If you need to attach more disks, this is the way to go.

For the ZIL, probably the best solution is a battery-backed RAM disk. You don't need huge capacity here, and small-capacity, fast SSDs are probably not easy to find anymore. For lower-end/cheap systems, any flash device is good for this purpose. The trouble is, you cannot remove the ZIL after you start using it.

Same for the L2ARC. Here, an SSD is wonderful! But for lower-end systems you may use USB flash sticks, even many of them; commodity motherboards come with 6 or more USB ports. If you get, say, 20 MB/s read from a $10 4 GB flash stick, then with 6 of these you get 120 MB/s and 24 GB for $60. Not bad, eh? Still, for larger sizes SSDs win.
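Attaching (and later dropping) such cache devices is painless, since the L2ARC holds no unique data. A sketch, assuming a pool named "tank" and sticks that show up as da1 through da3:

```shell
# Attach three USB sticks as L2ARC cache devices
# (pool name and device names are examples).
zpool add tank cache da1 da2 da3

# Cache devices can be removed at any time without data loss;
# the cache simply rewarms on the remaining devices.
zpool remove tank da1
```

Requires root and an imported pool; substitute your own pool and device names.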

I would like to strongly second the statement Phoenix made: "use glabel"! This has saved my day several times, and I now routinely label any storage media. I was wondering whether this breaks the ZFS philosophy of "use entire disk devices for better performance", but I haven't seen any degradation so far.
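The glabel workflow is roughly as follows (the label names and ad? device numbers here are just examples):

```shell
# Write a label to the whole disk; glabel stores its metadata
# in the disk's last sector.
glabel label disk03 /dev/ad6
glabel label disk04 /dev/ad8
glabel label disk05 /dev/ad10

# Build the pool from the stable label nodes instead of ad?,
# so the pool survives disks being renumbered or moved to
# different controller ports.
zpool create tank raidz label/disk03 label/disk04 label/disk05
```

Requires root; labeling overwrites the last sector, so do it before putting data on the disk.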

My boot devices of late are... USB sticks in a gmirror configuration, for several reasons:
- recent motherboards don't even have PATA anymore, especially the desktop models, which makes CF cards useless.
- I happen to use lots of Supermicro systems recently, and these almost always have two internal USB slots. Just find small USB sticks that fit there; there is no danger of disconnecting them, as there is when they hang out of the case.
- they are disposable in principle, although I have yet to see one fail on me.
There are also USB flash modules that attach directly to the motherboard pins (for the front-panel USB headers, for example). These too stay inside the case, are somewhat faster, and of course more expensive.
The only trouble with USB flash for boot devices is that FreeBSD 8.0 doesn't play nicely with them at boot time. There is, of course, an easy fix.


----------



## phoenix (Apr 26, 2010)

danbi said:

> In my opinion, the only significant benefit of using battery backed RAID controller for ZFS is the added layer of management. You may, for example verify each drive separately.



With ZFS on top, you want to disable the auto-verify features of the controller.  Let ZFS "scrub" handle that.  Scrub checks the actual data on the drives, and can detect (and repair) these errors without impacting disk I/O as much.  You really don't want the controller doing a verify while ZFS is doing a scrub while you are doing normal disk I/O.  
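Running a scrub is a one-liner, so it is easy to schedule from cron instead of the controller's verify (pool name "tank" is an example):

```shell
# Walk every block in the pool, verify checksums, and repair
# from redundancy where possible.
zpool scrub tank

# Progress and any checksum/repair counts show up in status output.
zpool status tank
```

Requires root; scrubs run in the background and deprioritize themselves against normal I/O.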



> Trouble is, you cannot remove the ZIL after you start using it.



ZFSv19 introduced the ability to remove log devices.  Prior to ZFSv19, though, you need to make sure to use mirrored log devices.  If a non-mirrored log device dies, it's possible to lose the entire pool.
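Both halves of that advice are single commands (pool and device names below are assumptions, not from the thread):

```shell
# Pre-v19 insurance: add the log as a mirrored vdev from the start,
# so one dead SSD cannot take the pool with it.
zpool add tank log mirror ada1 ada2

# From ZFSv19 on, a log device can simply be removed from the pool.
zpool remove tank ada3
```

Requires root; check `zpool upgrade -v` to see which ZFS version your pool is at before relying on log removal.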



> I would like to strongly second the statement that Phoenix made "use glabel"! This has saved my day several times and I now routinely label any storage media. I was wondering, if this breaks the ZFS philosophy "use entire disk devices for better performance" -- but haven't seen any degradation so far.



If you label the disk device (/dev/ad0) and not a slice (/dev/ad0s1), then you are still using the entire disk.  glabel uses 1 sector of the disk for metadata.


----------



## wonslung (Apr 29, 2010)

my CF cards didn't use PATA.

I used a cheap SATA-to-CF adapter.


----------

