# Poor performance while routing to and from VNET jail



## raitech (Apr 7, 2019)

Hello, folks!

I am using FreeBSD 12 under KVM for vnet jail and PF testing, and I am seeing very poor performance for any kind of traffic (TCP or UDP) between the host and the guest's vnet jail.

My setup:

- Ubuntu 18.04 as host.
- FreeBSD 12 as guest, with one VirtIO network card talking to the host and the Internet (in fact, the problem occurs with Intel em too), gateway_enable=YES in rc.conf.
- Two interconnected netgraph interfaces, one for the FreeBSD guest, the other for the vnet jail.

There are no bridges, because my goal is to test FreeBSD guest as a router between the outside world and the vnet jail.

My interpretation is that the FreeBSD guest would simply route traffic that comes in from the VirtIO interface to the netgraph, when the destination is the other side of the interconnected ng_eifaces. Indeed, that is what happens, given that the routing tables are correctly configured - host can ping guest's vnet jail and vice-versa.

But when I do something like:

`nc -l 500 < /dev/zero`

inside the guest's vnet jail, and from the host I issue:

`nc 10.0.0.2 500 > /dev/null` (assuming that, on the host, 10.0.0.0/24 is routed via the address of the guest's VirtIO interface and that the vnet jail has the return route set up too)

I cannot get much beyond 10 Mbps. I was expecting at least 1000 Mbps.

But if I issue the same command inside the guest instead, I can achieve 12000 Mbps. Yes, 12 Gbps! That is just amazing! Or, if I set up two nc "servers" inside the jail, I can get somewhere near 7 Gbps of downstream and upstream.
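The post does not say how these rates were read off, and nc itself prints no throughput figure. A minimal sketch, assuming the byte count and the elapsed time are already known from some other tool: the helper below only does the unit conversion. The name `mbps` is hypothetical and is not part of nc or FreeBSD.

```sh
#!/bin/sh
# Convert a byte count and an elapsed time in seconds to Mbit/s
# (decimal megabits, i.e. 1 Mbit = 1000000 bits).
# mbps is a hypothetical helper name used only for this illustration.
mbps() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes * 8 / secs / 1000000 }'
}

# Example: 125000000 bytes moved in 1 second
# mbps 125000000 1
```

At 10 Mbps only about 1.25 MB crosses the link per second, against 125 MB per second at the expected 1000 Mbps.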

Any clue on what can go wrong with this setup?

Thank you all!


----------



## Hiroo Ono (Apr 7, 2019)

If you just want to connect the vnet jail and the FreeBSD guest, use epair(4).
See Thread jail-vnet-slaac-ipfw.69690. It is not exactly what you want, but it should be useful.
If you still want to use netgraph, how about using ng_eiface? There is an example in /usr/share/examples/netgraph/virtual.chain.


----------



## Hiroo Ono (Apr 7, 2019)

Sorry, you were already using ng_eiface(4). I misread it as ng_iface(4).


----------



## raitech (Apr 7, 2019)

Yeah, using ng_eiface.

This is the script that I use to bring them to life:


```
$ cat /root/ngethers.sh
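if [ "x$1" = "xcreate" ]; then
    # Each mkpeer creates one eiface node (ngeth0 and ngeth1 on a fresh system).
    ngctl mkpeer eiface ether ether
    ngctl mkpeer eiface ether ether
    # Wire the two nodes back to back through their ether hooks.
    ngctl connect ngeth0: ngeth1: ether ether
    ifconfig ngeth0 link aa:bb:cc:00:00:00
    ifconfig ngeth1 link aa:bb:cc:00:00:01
fi

if [ "x$1" = "xdelete" ]; then
    # ngctl's command for destroying a node is "shutdown".
    ngctl shutdown ngeth0:
    ngctl shutdown ngeth1:
fi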
```
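The script only creates the interface pair; the thread does not show how ngeth1 reaches the jail. A rough sketch of that step, assuming a running VNET jail: the function prints the commands rather than running them, so they can be reviewed first. The jail name `testjail` and the address 10.0.0.1 for ngeth0 are placeholders; only 10.0.0.2 appears in the original post.

```sh
#!/bin/sh
# Print the commands that would hand ngeth1 to a VNET jail and set up
# addressing. attach_plan is a hypothetical helper; the jail name and
# the addresses passed to it are placeholders, not taken from the thread.
attach_plan() {
    jail="$1"       # name of an already running VNET jail
    host_ip="$2"    # address for ngeth0 on the FreeBSD guest
    jail_ip="$3"    # address for ngeth1 inside the jail
    echo "ifconfig ngeth0 inet ${host_ip}/24 up"
    echo "ifconfig ngeth1 vnet ${jail}"
    echo "jexec ${jail} ifconfig ngeth1 inet ${jail_ip}/24 up"
    echo "jexec ${jail} route add default ${host_ip}"
}

# Review the output, then pipe it to sh as root to apply it:
# attach_plan testjail 10.0.0.1 10.0.0.2 | sh
```

Printing the plan instead of executing it keeps the sketch harmless on a machine that has no such jail.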

But I spent my first several hours on this problem with epair interfaces.

I am really amused by the fact that every piece of text I came across on this subject talks about bridging. Well, maybe that is the problem: the lack of a bridge. Strange, to say the least.

Two tests that I still need to run:

- run on bare metal
- insert a bridge


----------



## D-FENS (Apr 7, 2019)

raitech said:


> Any clue on what can go wrong with this setup?


I think you have fallen into the trap of using too many virtualization layers. In such situations you have to be really careful when doing performance measurements. In my personal opinion, performance measurements in such cases are bogus and unreliable.
So you have an Ubuntu host, then a FreeBSD VM: that's one layer, the virtualized CPU. Then you use VirtIO networking, a second layer. Then jails, a third layer. And netgraph, a fourth layer.
With so many layers it's quite easy to stumble upon one or more bottlenecks, in software or in (virtual) hardware.

Actually, I run a similar setup myself for development. I have a GNU/Linux host with a FreeBSD VM and VNET jails inside. I have measured speeds of ~150 Mbit/s, which is a real-world figure because I connect a physical machine from the outside into the jail and then run iperf. In my case it's probably the CPU that's the bottleneck, because I can see the VM's CPUs working at ~50%.
The packets in this case pass through a number of other layers on the way: the first machine's firewall, the second machine's firewall, the VM's firewall, the jail's firewall, an OpenVPN SSL connection AND an SSH tunnel. All of this consumes quite a lot of CPU time, and the speed is nevertheless 150 Mbit/s, which is not so bad.


raitech said:


> But when I do something like:
> 
> nc -l 500 < /dev/zero
> 
> ...


You have to be careful here and compare apples to apples. Be aware of what's happening under the hood.
If you measure speeds like 12 Gbps, this is not actual network throughput. The packets are obviously short-circuited through the FreeBSD kernel.
However, the 10 Mbps traffic could be going through slower layers: a real or virtualized network, a packet filter (aka firewall), etc.

In general, to measure realistic speeds, try to be as close to the metal as possible and try to remove as many layers as possible. Reduce the number of moving parts and vary only one thing at a time. Just install FreeBSD on a physical machine with a 1 Gbit/s or faster network card. Create a VNET jail in it. Then probably create an epair, put one side into the jail, and either bridge the other side to the host's physical network interface or use routing.
Then start benchmarks/iperf in the jail and on an external physical machine and measure the speed. Do not use netgraph or firewalls. This would make the measurement as realistic as possible.
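A rough sketch of the routed variant of that test, under several assumptions that are not from the thread: the first epair created is named epair0a/epair0b, the jail is already running with VNET enabled, the host has forwarding enabled, and the jail name and addresses are placeholders. The function only prints the commands so they can be checked before use.

```sh
#!/bin/sh
# Print the commands for a bare-metal epair + iperf test (routed, no bridge).
# epair_plan is a hypothetical helper. Assumptions: no other epair exists,
# so the new pair is epair0a/epair0b; jail name and addresses are placeholders.
epair_plan() {
    jail="$1"       # name of an already running VNET jail
    host_ip="$2"    # address for the host side (epair0a)
    jail_ip="$3"    # address for the jail side (epair0b)
    echo "ifconfig epair create"
    echo "ifconfig epair0a inet ${host_ip}/24 up"
    echo "ifconfig epair0b vnet ${jail}"
    echo "jexec ${jail} ifconfig epair0b inet ${jail_ip}/24 up"
    echo "jexec ${jail} route add default ${host_ip}"
    echo "jexec ${jail} iperf -s"
}

# Review the output, then pipe it to sh as root to apply it:
# epair_plan testjail 10.0.0.1 10.0.0.2 | sh
# On the external machine, which needs a route to the jail's subnet:
# iperf -c 10.0.0.2
```

The last printed command starts the iperf server in the foreground, so the client side has to be run from the second machine while it is listening.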


----------



## pos (Apr 8, 2019)

I have a FreeBSD 12 host running on top of a KVM hypervisor, and inside the test jail I can measure:

```
[root@prometheus ~]#
[root@prometheus ~]#
[root@prometheus ~]# iocage console TEST2
Last login: Mon Apr  8 23:22:35 on pts/0
FreeBSD 12.0-RELEASE-p3 GENERIC

Welcome to FreeBSD!

root@TEST2:~ # ./bbk_cli_freebsd_amd64-1.0
Start: 2019-04-08 23:23:04
Network operator: Bahnhof AB
Support ID: sth632cb8645
Latency:       1.277 ms
Download:  6,559.140 Mbit/s
Upload:      416.369 Mbit/s
Measurement ID: 295851127
root@TEST2:~ #
```


Around 6.5 Gbit/s (I can also get 7 and more; the above was a random test) to an online test host 6 router hops away from my jail. So KVM with FreeBSD+jail can deliver. The jail makes me lose approx. 0.5-1 Gbit/s. I get around 7-7.5 Gbit/s in the FreeBSD host, and around 8+ Gbit/s in the KVM hypervisor host. The 6.5 Gbit/s traffic comes in through my physical FreeBSD 12 firewall (PF with 520 rules), gets VLAN tagged and sent to my Linux KVM host holding the FreeBSD 12 VM with the jail on top. (And yes, I have a 10/10 Gbit/s symmetric Internet connection at home...)

The KVM hypervisor network card has a bunch of tweaks including using SR-IOV for the FreeBSD guest. But the FreeBSD guest has none...

raitech, what network card do you use in your KVM hypervisor? If it supports SR-IOV and you have no requirement to move guests live between hosts, use it!


----------



## Phishfry (Apr 8, 2019)

The layers analogy is a great way of looking at it. All the numbers I am seeing here seem very true.

Even when messing with NVMe I discover a "layer" on motherboards with bifurcation (PCIe multiplex).
Straight connected versus bifurcation. You take a hit. Maybe 10%. Right from the BIOS.

Enabling NFS serving maybe 25% network hit with no tuning. That layer hurts my NVMe's feelings.


----------



## pos (Apr 8, 2019)

Addition to the last post...

raitech, you could try this Swedish online test tool: http://www.bredbandskollen.se/en/bredbandskollen-cli/

Sweden has a very good backbone and there are also 10G test servers. I just tested from a VM in NY and got 600 Mbit/s, and 950 Mbit/s from a server in NL.


----------



## D-FENS (Apr 9, 2019)

I can also confirm pos's observation that going through the virtualization layers causes a performance hit.
For example, I did a speed test in my browser. I compared two cases:
(1) using a jail as a SOCKS5 proxy, the connection goes like this: browser -> VM port forwarding -> jail A port forwarding -> jail B OpenVPN server -> jail C SOCKS5 proxy -> jail C NAT -> VM NAT -> Physical Host NAT -> Router NAT -> Internet
(2) using direct Internet connection through the router: browser -> Router NAT -> Internet.

In case (1) the speed topped out at about 58 Mbit/s. In case (2) the result was 96 Mbit/s.
Ping more than doubled: (1) 48 ms, (2) 22 ms.

So virtualization definitely slows things down a bit. And a VPN is slower than a direct connection; this is common knowledge.
So a rule of thumb: the more layers, the bigger the performance hit.
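To put the two cases side by side, here is a small sketch of the arithmetic, using only the figures quoted in this post. The helper names `loss_pct` and `ratio` are made up for the illustration.

```sh
#!/bin/sh
# Percentage of throughput lost relative to a baseline.
# loss_pct and ratio are hypothetical helper names.
loss_pct() {
    awk -v base="$1" -v got="$2" \
        'BEGIN { printf "%.1f\n", (base - got) / base * 100 }'
}

# Ratio of two values, e.g. two ping times.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Figures from the post: 96 vs 58 Mbit/s, 48 vs 22 ms.
# loss_pct 96 58
# ratio 48 22
```

With the numbers above, the layered path loses roughly 40% of the throughput and has a bit over twice the latency of the direct one.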


----------

