# Performance in different partitions



## Young (Apr 10, 2013)

Hi,

I'm experiencing an issue with performance tests in different UFS partitions with the same tunefs options.

System information:


```
# uname -a
FreeBSD cache.local 9.1-RELEASE FreeBSD 9.1-RELEASE #3: Tue Apr  9 11:21:57 BRT 2013     root@cache.local:/usr/obj/usr/src/sys/CUSTOM  amd64
```


```
# cat /etc/fstab
# Device        Mountpoint      FStype  Options         Dump    Pass#
/dev/ada0p2     /               ufs     rw              1       1
/dev/ada0p3     none            swap    sw              0       0
/dev/ada0p4     /cache          ufs     rw,noatime      2       2
```


```
# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ada0p2     38G    4.4G     31G    12%    /
devfs          1.0k    1.0k      0B   100%    /dev
/dev/ada0p4    101G    4.8G     88G     5%    /cache
devfs          1.0k    1.0k      0B   100%    /var/named/dev
```


```
# tunefs -p /
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)
```


```
# tunefs -p /cache
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       enabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         disabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)
```

Test procedures:


```
# dd bs=1m of=/dev/null if=/dev/ada0p2
```


```
# dd bs=1m of=/dev/null if=/dev/ada0p4
```
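
As written, each `dd` run reads the entire partition and only reports a transfer rate when it finishes (on FreeBSD you can press Ctrl-T for an in-progress rate). A bounded variant with `count=` gives repeatable, comparable numbers per partition. The sketch below runs against a scratch file so it is safe to try anywhere; for the real measurement, point `if=` at the raw partitions from the fstab above.

```shell
# Bounded sequential-read test. bs=1048576 is spelled out for portability
# (FreeBSD dd also accepts bs=1m). For the real measurement, read the raw
# partition instead, e.g.: dd if=/dev/ada0p2 of=/dev/null bs=1m count=4096
dd if=/dev/zero of=/tmp/ddtest bs=1048576 count=64 2>/dev/null
dd if=/tmp/ddtest of=/dev/null bs=1048576    # prints bytes/sec when done
```

Note that a scratch file on the root partition will be cached by the OS after the first read, so only the partition-device form measures the disk itself.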

Test results:


```
# gstat
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| cd0
    1    652    652  83501    1.5      0      0    0.0   96.6| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p1
    1    652    652  83501    1.5      0      0    0.0   97.7| ada0p2
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p3
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p4
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/8d571973-a061-11e2-a919-00188be1ed87
```


```
# gstat
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| cd0
    1    552    552  70713    1.8      0      0    0.0   97.1| ada0
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p1
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p2
    0      0      0      0    0.0      0      0    0.0    0.0| ada0p3
    1    552    552  70713    1.8      0      0    0.0   98.0| ada0p4
    0      0      0      0    0.0      0      0    0.0    0.0| gptid/8d571973-a061-11e2-a919-00188be1ed87
```

As we can see, the p2 partition performs about 99 Mbps faster than p4. Why is that?

Thank you very much.


----------



## wblock@ (Apr 10, 2013)

The p2 partition is at the start of the disk, where more sectors pass under the heads in the same amount of time.  Sometimes the start of the disk is nearly twice as fast as the end.  The first partition is smaller, too, so when the drive seeks, it does not have as far to go as in the other partition.


----------



## Young (Apr 10, 2013)

wblock@ said:

> The p2 partition is at the start of the disk, where more sectors pass under the heads in the same amount of time.  Sometimes the start of the disk is nearly twice as fast as the end.  The first partition is smaller, too, so when the drive seeks, it does not have as far to go as in the other partition.



Thanks for the fast reply. Do you think I should put this partition at the start of the disk and leave the system partition at the end? I'm using a Gigabit NIC to get better Squid throughput, since I use this box for caching Windows Updates (big files).

My squid.conf:


```
acl localnet src 10.0.0.0/8     # RFC1918 possible internal network
acl localnet src 172.16.0.0/12  # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl localnet src fc00::/7       # RFC 4193 local private network range
acl localnet src fe80::/10      # RFC 4291 link-local (directly plugged) machines

acl SSL_ports port 443
acl Safe_ports port 80          # http
acl Safe_ports port 21          # ftp
acl Safe_ports port 443         # https
acl Safe_ports port 70          # gopher
acl Safe_ports port 210         # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280         # http-mgmt
acl Safe_ports port 488         # gss-http
acl Safe_ports port 591         # filemaker
acl Safe_ports port 777         # multiling http
acl CONNECT method CONNECT

http_access allow localhost manager
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports

http_access allow localnet
http_access allow localhost
http_access deny all

http_port 3128

maximum_object_size 2048 MB

cache_dir ufs /cache 80000 16 256
coredump_dir /cache

range_offset_limit -1
quick_abort_min -1

refresh_pattern -i microsoft.com/.*\.(cab|exe|dll|msi|psf) 259200 100% 259200 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-store ignore-must-revalidate ignore-private ignore-auth
refresh_pattern -i windowsupdate.com/.*\.(cab|exe|dll|msi|psf) 259200 100% 259200 override-expire override-lastmod reload-into-ims ignore-reload ignore-no-store ignore-must-revalidate ignore-private ignore-auth

refresh_pattern ^ftp:           1440    20%     10080
refresh_pattern ^gopher:        1440    0%      1440
refresh_pattern -i (/cgi-bin/|\?) 0     0%      0
refresh_pattern .               0       20%     4320

visible_hostname cache.local
```

Thank you.


----------



## wblock@ (Apr 10, 2013)

Disk throughput is likely so much faster than available network bandwidth that it won't matter.


----------



## Young (Apr 10, 2013)

wblock@ said:

> Disk throughput is likely so much faster than available network bandwidth that it won't matter.



I have to disagree with that. The bottleneck for Gigabit Ethernet today is disk read/write performance. In the tests, the best my disk reached was 83501 kBps, or 652 Mbps. That's 65% of a Gigabit connection.

More info: http://www.tomshardware.com/reviews/gigabit-ethernet-bandwidth,2321-9.html
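
For what it's worth, the unit conversion checks out if you read gstat's kBps in binary units (1 kB = 1024 bytes, 1 Mbit = 2^20 bits). A quick awk sanity check:

```shell
# 83501 kB/s * 1024 B/kB * 8 bit/B / 2^20 bit/Mbit ~= 652 Mbit/s
awk 'BEGIN { printf "%.0f Mbit/s\n", 83501 * 1024 * 8 / (1024 * 1024) }'
# prints: 652 Mbit/s
```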


----------



## wblock@ (Apr 10, 2013)

Generally, gigabit network transfers are, at best, around 400-600Mbytes/second.  To keep up with that, you'll need a RAID, or at least an SSD.

But last I knew, Microsoft made sure that Windows updates could not be buffered without running their own special WSUS.  Which, of course, only runs on a Windows server.


----------



## Young (Apr 10, 2013)

wblock@ said:

> Do you actually get gigabit speeds on the incoming data meant to be buffered by Squid?  You say this is for Windows updates, which will almost certainly be coming in from your ISP network connection.  The bottleneck will be the actual bandwidth available on your WAN connection.  If it's truly gigabit transfers, a RAID array or SSD will be the best to keep up with it.
> 
> But it's probably academic anyway.  Last I knew, Microsoft made sure that Windows updates could not be buffered without running their own special WSUS.



I got up to 300 Mbps (HIT) on large Windows Update files (Service Packs, nVidia drivers, etc.). The WAN connection is only the bottleneck on the first download of the files (MISS). Unfortunately, I don't have RAID-compatible hardware or SSDs.

I haven't had any problems with Windows Update caching. I just needed to add a few rules like the ones I showed above.


----------



## wblock@ (Apr 10, 2013)

Forgot to mention that diskinfo(8)'s benchmark gives a nice comparison of the speeds that can be gained from "short-stroking" the disk:
`% diskinfo -tv ada0`

It can be run on partitions also.


----------



## Terry_Kennedy (Apr 11, 2013)

wblock@ said:

> Generally, gigabit network transfers are, at best, around 400-600Mbytes/second.  To keep up with that, you'll need a RAID, or at least an SSD.


Did you slip up and write bytes instead of bits? Or am I missing something obvious?

Gigabit = 1,000,000,000 bits/sec. Divide by 8 to get bytes/sec, giving 125,000,000 bytes/sec, or 125MBytes/sec. This is well within the performance envelope of some single disks, and definitely within the envelope for a stripeset of 2 modern disks.

This ignores Ethernet header overhead, TCP/IP overhead, etc. so actual performance will be somewhat less. I think the maximum I've seen is about 108MByte/sec.


----------



## Terry_Kennedy (Apr 11, 2013)

wblock@ said:

> The p2 partition is at the start of the disk, where more sectors pass under the heads in the same amount of time.  Sometimes the start of the disk is nearly twice as fast as the end.  The first partition is smaller, too, so when the drive seeks, it does not have as far to go as in the other partition.


It could also be an Advanced Format drive (4 KB sectors, lying to the OS and claiming 512-byte sectors) where there is a different amount of misalignment on the two partitions. I don't see that the original poster told us his drive model, nor the raw `bsdlabel` output showing the offsets.
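
A quick way to check: a partition is 4K-aligned only if its starting offset, in 512-byte sectors, is a multiple of 8. The offsets below are made-up examples; substitute the real starting blocks reported by `gpart show ada0`.

```shell
# 4K-alignment check on partition start offsets (in 512-byte sectors).
# 34 and 2048 are hypothetical values; use the ones gpart reports.
for start in 34 2048; do
    awk -v s="$start" 'BEGIN {
        printf "offset %d: %s\n", s, (s % 8 == 0) ? "4K-aligned" : "misaligned"
    }'
done
```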


----------



## wblock@ (Apr 11, 2013)

Terry_Kennedy said:

> Did you slip up and write bytes instead of bits? Or am I missing something obvious?
> 
> Gigabit = 1,000,000,000 bits/sec. Divide by 8 to get bytes/sec, giving 125,000,000 bytes/sec, or 125MBytes/sec. This is well within the performance envelope of some single disks, and definitely within the envelope for a stripeset of 2 modern disks.
> 
> This ignores Ethernet header overhead, TCP/IP overhead, etc. so actual performance will be somewhat less. I think the maximum I've seen is about 108MByte/sec.



Looking at it now, I don't know what I was thinking; disk speeds, maybe.  You're right, though.



			
Terry_Kennedy said:

> It could also be an Advanced Format drive (4 KB sectors, lying to the OS and claiming 512-byte sectors) where there is a different amount of misalignment on the two partitions. I don't see that the original poster told us his drive model, nor the raw `bsdlabel` output showing the offsets.



Now that I did consider.  But the original speed numbers were just reads from a raw device, where a misalignment would only matter at the start and end and would not have the huge impact misalignment has on writes.


----------



## throAU (Apr 15, 2013)

Tangent: 

If you're trying to cache Windows updates, use the correct tool for the job: a copy of WSUS (free, and from memory it will run on a Windows XP machine if need be).

This will allow you to stagger approvals (i.e., you can set up a group for beta testing updates before greater roll-out), monitor roll-out progress, set deadlines for updates (to ensure they are deployed), etc.


I used to cache updates with Squid; I wish I had started running WSUS sooner.


----------

