# Incredible disk I/O claims. Can ~4 GB/s be realistic?



## BonHomme (Apr 17, 2019)

I met a guy who says he builds his own servers and claims to reach incredible disk I/O speeds.
As proof he sent me the output of two bonnie++ runs.

The top one is supposedly from a FreeBSD server with 12 disks (50 TB) and 64 GB of memory.
The bottom one is from a Nextcloud VM with 1 GB of memory and 1 TB of storage.

Can these results be realistic? How can the VM be faster than the server itself? And what about the high CPU usage and the enormous differences in latency between the server and the VM?


----------



## D-FENS (Apr 17, 2019)

This is completely realistic. If you mirror drives or use RAID with redundancy, read speeds add up because two or more drives can serve the same data in parallel.
Also, with NVMe drives you have comparable speeds in a single drive anyway. See what one of my drives can do:

```
Transfer rates:
        outside:       102400 kbytes in   0.043894 sec =  2332893 kbytes/sec
        middle:        102400 kbytes in   0.043879 sec =  2333690 kbytes/sec
        inside:        102400 kbytes in   0.042000 sec =  2438095 kbytes/sec
```
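Those figures are in kbytes/sec (this looks like FreeBSD `diskinfo -t` output). Converted, assuming 1 kbyte = 1024 bytes as diskinfo reports:

```python
# Convert the diskinfo-style kbytes/sec figures above to GB/s.
# Assumption: "kbytes" means 1024 bytes, as FreeBSD's diskinfo uses.
rates_kbps = {"outside": 2332893, "middle": 2333690, "inside": 2438095}
for zone, kbps in rates_kbps.items():
    gb_s = kbps * 1024 / 1e9
    print(f"{zone}: {gb_s:.2f} GB/s")
# outside: 2.39 GB/s ... inside: 2.50 GB/s
```

So roughly 2.4 GB/s of sequential read from a single NVMe drive.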

"Can a VM be faster than the server itself?"
Well, yes and no. It depends on what you're measuring. If the VM has a dedicated storage pool with faster throughput than the storage of the host, it could happen. This is surely possible depending on how you configure the VM.
But strictly speaking, the VM is only as fast as the server, because ultimately the VM runs *on* the server.


----------



## Phishfry (Apr 17, 2019)

What NVMe drive is that?


----------



## D-FENS (Apr 18, 2019)

Samsung 970 EVO.


----------



## Ordoban (Apr 18, 2019)

I think the faster VM is a caching effect. The host is caching the disk IO from the VM.
Maybe you should force Bonnie++ to use a 128 GB test file in the VM too.


----------



## ralphbsz (Apr 18, 2019)

What are we trying to measure here?  If it is the actual backend (disk drive or virtual disk drive) bandwidth, then one has to be more careful.

A physical disk drive (spinning rust) can do about 100 - 200 MByte/s; the first number is for realistic large block random IO (which you would get from a good file system when reading/writing large files), the second number is for streaming IO (very large files, with little or no parallelism).  Flash drives are much faster, and there is no simple estimate for them.

So with 12 drives, long-term sustained 4 GByte/s is not realistic, but about 1-2 GByte/s is. On the other hand, since the server has 64 GByte of memory and the test file was only 128 GByte, it is quite possible that about half of the test file was already in the memory cache, which roughly doubles the number. To do the measurement correctly, one either has to use a working set that is much larger than RAM, or clear the cache first and then account for cache filling.
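The cache-doubling effect is easy to sketch with illustrative numbers (assumptions, not measurements: 12 drives at 150 MB/s each, and half the 128 GB working set already sitting in the 64 GB RAM cache):

```python
# Back-of-the-envelope model: aggregate disk bandwidth plus cache effect.
# All numbers are illustrative assumptions, not measurements.
drives = 12
per_drive_mb_s = 150                        # mid-range for large-block I/O on spinning disks
disk_bw = drives * per_drive_mb_s / 1000    # aggregate in GB/s -> 1.8

working_set_gb = 128
cached_gb = 64                              # roughly the server's RAM
hit_fraction = cached_gb / working_set_gb   # 0.5

# If cache hits are (nearly) free, the run only spends time on the misses,
# so the apparent bandwidth is disk bandwidth divided by the miss fraction.
apparent_bw = disk_bw / (1 - hit_fraction)
print(f"aggregate disk bandwidth: {disk_bw:.1f} GB/s")
print(f"apparent bandwidth at {hit_fraction:.0%} cache hits: {apparent_bw:.1f} GB/s")
# -> 1.8 GB/s from the disks, 3.6 GB/s apparent
```

Which shows how a cached benchmark can report numbers close to the claimed 4 GB/s even when the disks themselves top out around 2 GB/s.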

For a VM, the sky is the limit: You have no idea how the virtual disk drive is implemented.  It might for example consist purely of fast flash storage, connected to a really fast network (big cloud providers today install 100gig Ethernet).  It might also consist of much less.


----------



## BonHomme (Apr 18, 2019)

Hello everybody, thank you for your reactions. The thing is, your guess is as good as mine, because I don't know how this server was set up.

I hardly know the guy and I guess he was trying to impress me with these figures. But he did not want to explain how he does it; that is why he sent me the bonnie++ outputs. He claims he uses ZFS and that he can sustain the 4 GB/s for a very long time, which I believe could be interesting because it is more than three times what is needed to saturate a 10-gigabit Ethernet connection.

Since he also uses ZFS, much of the most frequently used data can be served very quickly from cache, while the missed hits are loaded in the background from his much slower hard disks. So for a web server, even with heavy data I/O, the result would be that the client very seldom has to wait, because in general the data will be delivered faster than a 10-gigabit connection can ever handle. Or am I seeing this wrong?
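A quick check of the 10-gigabit arithmetic (using the raw line rate as an upper bound; real payload throughput is a bit lower):

```python
# 10 GbE moves 10 gigabits per second; divide by 8 for bytes.
line_rate_gbit = 10
line_rate_gbyte = line_rate_gbit / 8    # 1.25 GB/s
claimed_gbyte = 4                       # the claimed sustained throughput
print(f"10 GbE line rate: {line_rate_gbyte:.2f} GB/s")
print(f"claimed throughput is {claimed_gbyte / line_rate_gbyte:.1f}x the link capacity")
# -> 1.25 GB/s, so 4 GB/s is 3.2x what the link can carry
```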

Anyway, as ralphbsz suggests, I will ask him to repeat the bonnie++ test with a much bigger test file, say 512 GByte. If anybody has other suggestions, please let me know. I will keep you posted.


----------

