# ZFS: 4k alignment for SSD (M4 Crucial)



## hedgehog (Oct 1, 2013)

Hi everyone.

I have seen a lot of differing information on this question, so I'll try to sum it up here. I'm going to create an additional ZFS pool on the SSD and migrate the system there, keeping the data files and home directory on the old HDD (also ZFS). I will be using FreeBSD 9.2-RELEASE and would like to have TRIM enabled and working. As far as I have learned, I can do the alignment in the following way (let's assume the SSD is going to be ada1):


 Create partitions (not sure if the freebsd-boot partition needs to be aligned too):
`# gpart create -s gpt ada1`
`# gpart add -b 64 -s 128k -t freebsd-boot -a 4k ada1`
`# gpart add -t freebsd-zfs -b 2048 -a 4k -l ssd0 ada1`
 Create a temporary 4k aligned layer for ZFS, create pool and remove the gnop layer:
`# gnop create -S 4096 /dev/gpt/ssd0`
`# zpool create ssd /dev/gpt/ssd0.nop`
`# zpool export ssd`
`# gnop destroy /dev/gpt/ssd0.nop`
`# zpool import ssd`
 Create ZFS data sets, migrate the data, write bootcode etc.

So, the questions are:

Will that work? Is that a good way to do this?
Will there be any problems if there is another pool on the old HDD that is not 4k aligned (it has a 512-byte sector size)?

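For illustration, the rounding that `-a 4k` is expected to apply to a partition start can be sketched in plain sh. The helper name `align_up` and the 512-byte logical sector size are assumptions made for this sketch; it is not how gpart(8) is implemented, only the arithmetic behind the idea:

```sh
# Hypothetical helper: round a sector number up to the next multiple of
# the alignment. Assumes 512-byte logical sectors, so 4K = 8 sectors.
align_up() {
    sector=$1
    align_sectors=${2:-8}
    echo $(( (sector + align_sectors - 1) / align_sectors * align_sectors ))
}

align_up 34     # first usable sector on a typical 512-byte-sector GPT disk
align_up 64     # the boot partition start used above
align_up 2048   # the ZFS partition start used above
```

Sector 34 rounds up to 40, while 64 and 2048 come back unchanged because they are already multiples of 8.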

----------



## wblock@ (Oct 1, 2013)

Might as well give the boot partition 512K of space.  That space is otherwise unused, and making it as large as possible could be useful later if the bootcode grows beyond 128K.


----------



## J65nko (Oct 1, 2013)

It looks good to me, but be aware that creating a pool with your method will automagically mount it and possibly overlay your existing filesystem tree if you are not careful.

<shameless_plug>
My Makefile for Vermaden's FreeBSD ZFS root install adapted for 4K sector disks automates the creation of a 4K aligned zpool(8). It also allows you to verify the 4K (2^12) alignment of the resulting pool. 

The usage of variables allows you to specify the disk device(s), the ZFS pool name etc.

You can also easily test your modifications to that Makefile by using an md(4) memory disk instead of a real disk. There are targets to create and destroy the `md` memory disks and their existing partitioning.

</shameless_plug>


----------



## Sebulon (Oct 1, 2013)

hedgehog said:

> `# zpool import ssd`



That command has made the pool import with the raw device instead of the GPT-label, at least for me. Remember to specify in which directory ZFS should start looking:
`# zpool import -d /dev/gpt ssd`

/Sebulon


----------



## hedgehog (Oct 2, 2013)

wblock@ said:

> Might as well give the boot partition 512K of space.  That space is otherwise unused, and making it as large as possible could be useful later if the bootcode grows beyond 128K.


Good point, I was thinking of that too.



			
J65nko said:

> It looks good to me, but be aware that creating a pool with your method will automagically mount it and possibly overlay your existing filesystem tree if you are not careful
> 
> <shameless_plug>
> My Makefile for Vermaden's FreeBSD ZFS root install adapted for 4K sector disks automates the creation of a 4K aligned zpool(8). It also allows you to verify the 4K (2^12) alignment of the resulting pool.
> ...


Thank you, that sounds like a great tool! However, I would like to learn what is happening and how it works before using any automated tools.



			
Sebulon said:

> That command has made the pool import with the raw device instead of the GPT-label, at least for me. Remember to specify in which directory ZFS should start looking:
> `# zpool import -d /dev/gpt ssd`
> 
> /Sebulon


Hmm, thank you, I'll take that into account. Will `$ zpool get all POOL` show me which device the pool is bound to? I won't be able to check that myself for the next 8 hours.

Never mind the last question, I just remembered that `$ zpool status` shows that.


----------



## hedgehog (Oct 4, 2013)

J65nko said:

> It looks good to me, but be aware that creating a pool with your method will automagically mount it and possibly overlay your existing filesystem tree if you are not careful


It didn't overlay my existing filesystem because the SSD's root dataset was mounted at /ssd. I assume that if I had anything in /ssd, it would have been overridden when the pool was created?



			
Sebulon said:

> That command has made the pool import with the raw device instead of the GPT-label, at least for me. Remember to specify in which directory ZFS should start looking:
> `# zpool import -d /dev/gpt ssd`
> 
> /Sebulon


I just tried to import the pool without specifying the device directory and it automatically picked up the GPT label:

```
$ zpool status ssd
  pool: ssd
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ssd         ONLINE       0     0     0
          gpt/ssd0  ONLINE       0     0     0

errors: No known data errors
```

So, I believe, this is not an issue anymore.

Also, regarding ashift:

```
$ zdb
ssd:
    version: 5000
    name: 'ssd'
    state: 0
    txg: 30
    pool_guid: 978916073
    hostid: 2716527942
    hostname: 'lair'
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 978916073
        children[0]:
            type: 'disk'
            id: 0
            guid: 770219244
            path: '/dev/gpt/ssd0'
            phys_path: '/dev/gpt/ssd0'
            whole_disk: 1
            metaslab_array: 33
            metaslab_shift: 30
            ashift: 12
            asize: 128030343168
            is_log: 0
            create_txg: 4
    features_for_read:
```

And GPT partitioning:

```
$ gpart show ada1
=>       34  250069613  ada1  GPT  (119G)
         34          6        - free -  (3.0k)
         40       1024     1  freebsd-boot  (512k)
       1064  250068576     2  freebsd-zfs  (119G)
  250069640          7        - free -  (3.5k)
```


```
$ diskinfo -v ada1
ada1
        512             # sectorsize
        128035676160    # mediasize in bytes (119G)
        250069680       # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        248085          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        0000000013150936835B    # Disk ident.
```

Could you please tell me if the partitions and pool are properly aligned to 4k sector size? I'm thinking that they are, but I don't feel like I'm an expert here.


----------



## J65nko (Oct 4, 2013)

Yes, the partitions and pool are aligned to a sector size of 4K.


Partition start and sizes

4K, or 4096 bytes, equals 8 sectors of 512 bytes. Sector numbering starts at 0, so the first chunk of 4096 bytes consists of sectors 0 through 7. That means every following 4K chunk has to start at sector 8 or a multiple of 8.
You can easily verify the numbers using the '%' modulo operator of bc(1):

```
$ echo '40 % 8' | bc
0
$ echo '1024 % 8' | bc
0
$ echo '250069640 % 8' | bc
0
```
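The same check can be wrapped in a small sh function if you want to test several start sectors at once. This is only a sketch: the name `is_4k_aligned` is made up for illustration, and the 512-byte logical sector size is an assumption taken from the diskinfo output in this thread.

```sh
# Hypothetical helper: report whether a partition start, given in
# logical sectors, falls on a 4096-byte boundary.
is_4k_aligned() {
    start_sector=$1
    sector_size=${2:-512}   # logical sector size in bytes (assumed 512)
    if [ $(( start_sector * sector_size % 4096 )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

is_4k_aligned 40      # start of the freebsd-boot partition
is_4k_aligned 1064    # start of the freebsd-zfs partition
is_4k_aligned 34      # first usable GPT sector, for comparison
```

The two partition starts report "aligned"; sector 34 reports "misaligned", which is why the first 6 sectors are left free.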

The ashift value, according to the zdb output, is 12, and 2 to the power of 12 gives the block size in bytes:

```
$ echo '2 ^ 12' | bc
4096
```
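If you would rather not pick the value out of the zdb output by eye, a filter along these lines could do it. It is a sketch only: the function name `ashift_sizes` is invented here, and it assumes the `ashift: N` line format shown in the zdb output earlier in this thread.

```sh
# Hypothetical helper: read zdb-style text on stdin and print each
# ashift value together with the block size it implies (2^ashift bytes).
ashift_sizes() {
    awk '$1 == "ashift:" { printf "ashift=%d block=%d bytes\n", $2, 2 ^ $2 }'
}

# Sample input copied from the zdb output above.
printf '            ashift: 12\n            asize: 128030343168\n' | ashift_sizes
```

On a live system the input would presumably come from `zdb | ashift_sizes` instead of the sample text.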

The diskinfo output shows a sector size of 512 and a stripesize of 4096.
That indicates that your SSD works with 'real' 4K sectors internally and presents itself as having 512-byte sectors only so as not to confuse non-4K-aware operating systems.


----------

