# FreeBSD, ZFS, SSDs, and proper sector alignment.



## jnr (Sep 14, 2009)

Hi all, I have an Intel X25-M G2 SSD on the way that I'd like to use as my boot drive on FreeBSD amd64. I will likely wait for the 8.0 release to set this up.

Research into the subject says performance improves dramatically when partitions are properly aligned to the drive's sectors. Here is a post detailing the steps for Linux. Is the procedure similar for BSD? I'd like to use the following partition layout:

80GB Intel SSD divided into an 8GiB FreeBSD-swap partition with the rest dedicated to a ZFS pool for everything but /home.

Two 1.5TB Seagate drives mirrored in a pool for /home. I already have this set up, so it's not a problem.

Thanks in advance for any help you can provide. SSDs and pooling are both very new to me.


----------



## aragon (Sep 14, 2009)

It is possible to override the geometry auto detection with fdisk(8) and that should let you create partitions as you want, but if you want to boot off the partitions you create like that you might run into problems.  Feel free to experiment with fdisk but I have a better suggestion for you to consider.

Instead of overriding the sectors/track setting to 56, calculate the track boundaries at 63 sectors/track that also fall on 4096 byte boundaries (there is plenty of overlap).  I wrote a quick script to calculate this for you (forum post 75444).

Plug in a byte offset and it'll correct it to the two nearest offsets that fall on 4096 byte boundaries _and_ 63 sectors/track boundaries.  You can then use the offsets it produces in fdisk without overriding the 63 sectors/track geometry setting.  This should achieve precisely the same thing while remaining 100% compatible with boot managers, your BIOS, and any other disk partitioning tools.
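
The linked script is not reproduced in this thread, so the following is only a minimal sketch of the idea, not aragon's actual code. It assumes 512-byte logical sectors and treats a "track boundary" as any sector number divisible by 63; the function name `nearest_aligned_sectors` is made up for illustration. An offset satisfies both constraints when it is a multiple of lcm(63, 8) = 504 sectors.

```python
from math import gcd

SECTOR_BYTES = 512        # logical sector size (assumption)
SECTORS_PER_TRACK = 63    # legacy geometry reported by fdisk
ALIGN_BYTES = 4096        # alignment target discussed above


def nearest_aligned_sectors(byte_offset):
    """Return the aligned sector offsets at or below and at or above byte_offset.

    "Aligned" here means divisible by 63 (track boundary) and by 8
    (4096-byte boundary), i.e. a multiple of 504 sectors.
    """
    align_sectors = ALIGN_BYTES // SECTOR_BYTES  # 8 sectors
    step = SECTORS_PER_TRACK * align_sectors // gcd(SECTORS_PER_TRACK, align_sectors)
    step_bytes = step * SECTOR_BYTES
    below = (byte_offset // step_bytes) * step
    above = -(-byte_offset // step_bytes) * step  # ceiling division
    return below, above


if __name__ == "__main__":
    # 1 GiB is an arbitrary example offset.
    low, high = nearest_aligned_sectors(1024 ** 3)
    print("candidate start sectors:", low, high)
```

Either returned value could be entered as a start sector in fdisk; whether a given boot manager also wants cylinder alignment is outside what this sketch checks.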


----------



## aragon (Sep 14, 2009)

BTW, you should also do the same when creating partitions with bsdlabel(8) or your efforts in fdisk will be canceled out.


----------



## jnr (Sep 14, 2009)

Sounds perfect. I'll give that a shot in a week or so with an 8.0 RC or RELEASE if it's out.


----------



## jnr (Sep 14, 2009)

I've done some more reading, and it sounds as though I can make a third partition on my SSD to use as the ZIL for my 1.5TB drive pool.

According to solarisinternals,


> The size of the separate log device may be quite small. A rule of thumb is that you should size the separate log to be able to handle 10 seconds of your expected synchronous write workload. It would be rare to need more than 100 MBytes in a separate log device, but the separate log must be at least 64 MBytes.


So I can still keep the majority of my 80GB SSD for the OS while also speeding up my HDDs.
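
As a rough illustration of the rule of thumb quoted above, here is a small sketch that sizes a log partition from a sustained synchronous write rate. The write rates used are hypothetical, "MBytes" is read as MiB, and the helper name `slog_size_bytes` is invented for this example.

```python
MIB = 1024 * 1024
MIN_LOG_BYTES = 64 * MIB  # minimum size stated in the quote (read as MiB, an assumption)


def slog_size_bytes(sync_write_bytes_per_sec, seconds=10):
    """Size the log for `seconds` of synchronous writes, never below the minimum."""
    return max(MIN_LOG_BYTES, sync_write_bytes_per_sec * seconds)


if __name__ == "__main__":
    # Hypothetical workloads: 5 MiB/s and 20 MiB/s of synchronous writes.
    for rate_mib in (5, 20):
        size = slog_size_bytes(rate_mib * MIB)
        print(rate_mib, "MiB/s ->", size // MIB, "MiB log")
```

Even the larger of these hypothetical workloads needs only a few hundred MiB, which is why a small slice of an 80GB SSD would be enough.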

If anyone knows of any caveats to such a setup I'd love to hear them before I commit to it.


----------



## jnr (Sep 19, 2009)

jnr said:

> If anyone knows of any caveats to such a setup I'd love to hear them before I commit to it



To answer my own question, the pool will apparently fail to import if the log device ever goes missing.


----------



## tobe (Nov 12, 2009)

Hi,

First, many thanks for the information. I'm going to replace my HDD with an SSD in a few days (I'm waiting for the postman).

While I was reading the technical details, I started thinking about the default fragment size for UFS file systems. Currently it's 2048 bytes, half the size of a physical sector on an SSD.

Should I change it to 4096, or is it OK to use the default size?
_Edit: and change the block size to 32k to keep the 8:1 ratio._

Thanks,
Tobé


----------



## aragon (Nov 12, 2009)

Hard to say without testing and benchmarking.  The file system will be less space efficient, and not sure if there'll be an appreciable speed up to make it worthwhile.  Remember fragment size is only a minimum.


----------



## tobe (Nov 13, 2009)

I'm not really interested in maximizing performance; all I want is to keep my new, expensive SSD running as long as possible, and it seems that avoiding read-modify-write cycles is something to consider. But hell, it's too much work. I think I'll just use it like any HDD, in the hope that prices will drop before it breaks.

Btw, after googling a bit, it seems that the block size to consider is the erase block size, and it's bigger than 4 KB.


----------



## tobe (Nov 17, 2009)

So finally, I decided to align the second partition (ad0s1b), not the first one (ad0s1a). I don't care about sector alignment on the root file system because it is almost read-only in usage.

Aligning the second partition is safe and needs only one computation: the size in blocks of the root partition. For example, a 128M partition is made of 262144 = 128*1024*1024/512 blocks. Then subtract 63 blocks (giving 262081), because the slice, and with it the first partition, starts 63 blocks into the disk. Use this value to set the first partition size in blocks and create all other partitions with a size rounded to M or G.

fdisk:

```
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 63, size 125045361 (61057 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 1023/ head 15/ sector 63
```

bsdlabel:

```
#       size   offset    fstype   [fsize bsize bps/cpg]
a:    128961        0    4.2BSD     2048 16384  8064 
b:   2097152   128961      swap                    
c: 125045361        0    unused        0     0
d:   4194304  2226113    4.2BSD        0     0     0 
e:   4194304  6420417    4.2BSD        0     0     0 
f:   8388608 10614721    4.2BSD        0     0     0 
g: 106042032 19003329    4.2BSD        0     0     0
```

My root partition is 63M here.
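
The computation described above can be sketched as follows. This assumes 512-byte blocks, a slice starting at block 63 (as in the fdisk output), and bsdlabel offsets that are relative to the start of the slice; the helper names are hypothetical.

```python
SECTOR_BYTES = 512
SLICE_START = 63      # slice start taken from the fdisk output
ALIGN_SECTORS = 8     # 4096 bytes / 512 bytes


def first_partition_blocks(end_mib, slice_start=SLICE_START):
    """Size of partition "a" so that it ends end_mib MiB from the start of the disk."""
    return end_mib * 1024 * 1024 // SECTOR_BYTES - slice_start


def is_aligned(label_offset, slice_start=SLICE_START):
    """True if a slice-relative offset lands on a 4096-byte boundary of the disk."""
    return (label_offset + slice_start) % ALIGN_SECTORS == 0


# Offsets copied from the bsdlabel listing above.
LABEL_OFFSETS = {"b": 128961, "d": 2226113, "e": 6420417, "f": 10614721, "g": 19003329}

if __name__ == "__main__":
    print("128M example:", first_partition_blocks(128))
    print("63M root:    ", first_partition_blocks(63))
    for name, offset in LABEL_OFFSETS.items():
        print(name, "aligned" if is_aligned(offset) else "NOT aligned")
```

With a 63M end point the helper yields 128961, the size of "a" in the listing, and every later partition then starts on a multiple of 8 blocks from the beginning of the disk.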


----------



## aragon (Nov 18, 2009)

Not a bad way to go either.  There's just one possible problem with your setup - you created "a" with an offset of 0.  One should leave 16 blocks at the beginning unallocated as that is reserved for metadata according to bsdlabel(8).

I see your "a" ends on a 128 KB boundary.  I need to revise my script sometime.


----------



## jem (Nov 18, 2009)

Might be worth looking at GPT partitioning.  Using that, it's simple to start your partitions exactly where you want them to start and it eliminates the two layers of 'container' you get with MBR slices containing bsdlabel partitions.

I've switched all my FreeBSD machines to GPT partitioning, including a six-year-old Athlon XP box, and have had no compatibility problems at all, even with legacy BIOSes.

Assuming your Intel SSD is exactly 80GiB, it'll have 167772160 512-byte blocks.

With GPT the first 34 sectors and the last 33 sectors of the device are used by the protective MBR, GPT headers and partition tables.

A partition table similar to your requirements would be as follows:


```
# gpart show ad0
=>          34  167772093  ad0	gpt  (80GB)
            34        128    1  freebsd-boot  (64K)
           162          6       - free -  (3K)
           168    4194304    2  freebsd-ufs  (2G)
      41904472    4194304    3  freebsd-ufs  (2G)
      46098776    8388608    4  freebsd-ufs  (4G)
      54487384  113284704    5  freebsd-ufs  (54G)
     167772088          5       - free -  (3K)
```

ad0p1 is your bootcode partition and will be read only once at boot, so it's probably not important to align it.  ad0p2-ad0p5 are your UFS partitions; each one starts on an 8-block (4096-byte) boundary and is a multiple of 8 blocks in size.  I omitted swap as I imagine you wouldn't want to put that on an SSD.
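
For anyone adapting the table, a quick sanity check of the numbers might look like the sketch below. It assumes 512-byte blocks and copies the start/size pairs from the listing as posted; it only checks alignment and the usable range, not whether the partitions are contiguous. The function names are illustrative.

```python
SECTOR_BYTES = 512
ALIGN_SECTORS = 4096 // SECTOR_BYTES   # 8 blocks

GPT_HEAD_SECTORS = 34   # protective MBR, GPT header, partition table
GPT_TAIL_SECTORS = 33   # backup table and header


def usable_range(total_blocks):
    """Return (first usable block, number of usable blocks) on a GPT disk."""
    return GPT_HEAD_SECTORS, total_blocks - GPT_HEAD_SECTORS - GPT_TAIL_SECTORS


def is_aligned(start, size):
    """True if a partition starts on, and spans a multiple of, 4096 bytes."""
    return start % ALIGN_SECTORS == 0 and size % ALIGN_SECTORS == 0


# (start, size) pairs copied from the gpart listing above.
UFS_PARTITIONS = {
    "ad0p2": (168, 4194304),
    "ad0p3": (41904472, 4194304),
    "ad0p4": (46098776, 8388608),
    "ad0p5": (54487384, 113284704),
}

if __name__ == "__main__":
    total = 80 * 1024 ** 3 // SECTOR_BYTES
    print("total blocks:", total, "usable:", usable_range(total))
    for name, (start, size) in UFS_PARTITIONS.items():
        print(name, "aligned" if is_aligned(start, size) else "NOT aligned")
```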


----------



## tobe (Nov 18, 2009)

> you created "a" with an offset of 0. One should leave 16 blocks at the beginning unallocated as that is reserved for metadata according to bsdlabel(8).


Oops, I need to fix that offset (same thing on my laptop).



> I see your "a" ends on a 128 KB boundary.


It ends on a 1M boundary: (128961+63)*512/(1024*1024) = 63.


----------

