# VNET ARP replies are lost



## roeeb (Jun 11, 2019)

My setup is as follows:

*freenas* ---- *switch* ---- *station*

*freenas* is running FreeBSD 11.2 with an iocage *jail* using the VNET/VIMAGE network stack (though the same behaviour is observed on warden jails and previous FreeBSD releases).

It all started when I noticed that *station* loses its connection to *jail*. Further investigation revealed that *station* does not receive ARP replies to "long" ARP requests. By "long" ARP I mean ARP messages padded with more than 18 zero bytes (the minimum Ethernet payload size is 46 bytes and an ARP message is 28 bytes, so ARP messages are normally padded with at least 18 bytes).
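
The padding arithmetic above can be sketched in a few lines of Python. This is only an illustration: the names `arp_padding` and `is_long_arp` are made up for this example, and frame lengths are assumed to include the 14-byte Ethernet header but not the 4-byte FCS, which is how tcpdump hex dumps show them.

```python
ETH_HEADER_LEN = 14    # destination MAC + source MAC + EtherType
ETH_MIN_PAYLOAD = 46   # minimum Ethernet payload (FCS not counted)
ARP_IPV4_LEN = 28      # ARP message for Ethernet/IPv4

# Padding needed to bring a bare ARP message up to the minimum payload.
MIN_PADDING = ETH_MIN_PAYLOAD - ARP_IPV4_LEN  # 18


def arp_padding(frame_len):
    """Padding bytes in an ARP frame of frame_len bytes (header included, FCS excluded)."""
    padding = frame_len - ETH_HEADER_LEN - ARP_IPV4_LEN
    if padding < 0:
        raise ValueError("frame too short to hold an ARP message")
    return padding


def is_long_arp(frame_len):
    """True if the frame carries more than the minimum 18 bytes of padding."""
    return arp_padding(frame_len) > MIN_PADDING
```

For example, a 60-byte frame has exactly 18 bytes of padding and is not "long", while a 74-byte frame has 32 bytes of padding and is.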

*Apparently, the Apple wireless driver pads ARP requests with more than 18 bytes, so Apple devices cannot see VNET jails* (verified on macOS and tvOS). This padding is only visible when sniffing the packet with tcpdump on *switch* or *freenas*. The tcpdump on *station* shows no padding whatsoever, not even the minimum 18 bytes, probably because tcpdump captures the packet before the padding takes place (FreeBSD tcpdump behaves the same).

Here is how the ARP reply is lost:
1. *station* sends ARP request (padding is not visible here yet):

```
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.240 tell 192.168.1.105, length 28
    0x0000:  ffff ffff ffff 20c9 f123 6023 0806 0001  ..........`#....
    0x0010:  0800 0604 0001 20c9 f123 6023 c0a8 0169  ..........`#...i
    0x0020:  0000 0000 0000 c0a8 01f0                 ..........
```
*switch* sees the "long" request (note the 32 bytes of padding):

```
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.240 tell 192.168.1.105, length 60
    0x0000:  ffff ffff ffff 20c9 f123 6023 0806 0001  ..........`#....
    0x0010:  0800 0604 0001 20c9 f123 6023 c0a8 0169  ..........`#...i
    0x0020:  0000 0000 0000 c0a8 01f0 0000 0000 0000  ................
    0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0040:  0000 0000 0000 0000 0000                 ..........
```
2. *jail* receives the request and sends a reply. tcpdump on vnet0 shows the reply already padded with 14 bytes before it leaves vnet0, which is not expected (see the notes below):

```
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.240 tell 192.168.1.105, length 60
        0x0000:  ffff ffff ffff 20c9 f123 6023 0806 0001  ..........`#....
        0x0010:  0800 0604 0001 20c9 f123 6023 c0a8 0169  ..........`#...i
        0x0020:  0000 0000 0000 c0a8 01f0 0000 0000 0000  ................
        0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
        0x0040:  0000 0000 0000 0000 0000                 ..........
ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.240 is-at d1:23:99:f9:6f:6c, length 42
        0x0000:  20c9 f123 6023 d123 99f9 6f6c 0806 0001  ....`#.P..ol....
        0x0010:  0800 0604 0002 d123 99f9 6f6c c0a8 01f0  .......P..ol....
        0x0020:  20c9 f123 6023 c0a8 0169 0000 0000 0000  ....`#...i......
        0x0030:  0000 0000 0000 0000                      ........
```
3. *freenas* attempts to bridge that reply. tcpdump of the physical interface:

```
ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.240 is-at d1:23:99:f9:6f:6c, length 42
        0x0000:  20c9 f123 6023 d123 99f9 6f6c 0806 0001  ....`#.P..ol....
        0x0010:  0800 0604 0002 d123 99f9 6f6c c0a8 01f0  .......P..ol....
        0x0020:  20c9 f123 6023 c0a8 0169 0000 0000 0000  ....`#...i......
        0x0030:  0000 0000 0000 0000                      ........
```
4. *switch* doesn't show the ARP reply. *Packet is lost*!

I can easily replicate this using scapy inside another jail (say *jail2*), where *jail2* sends a request on behalf of *station*. *station* will then receive or not receive a reply, depending on the padding length. _station_mac_ and _station_ip_ are the MAC and IP addresses of *station*, _jail_ip_ is the IP address of *jail*, and _padding_ is the padding length in bytes:

```
# Run inside jail2 (needs scapy and root privileges to send raw frames)
from scapy.all import ARP, Ether, Padding, sendp

station_mac = '20:c9:f1:23:60:23'
station_ip = '192.168.1.105'
jail_ip = '192.168.1.240'
padding = 19  # padding length in bytes

# ARP who-has request that claims to come from station
arp = ARP(hwsrc=station_mac, psrc=station_ip, pdst=jail_ip)
ether = Ether(dst='ff:ff:ff:ff:ff:ff', type=0x0806)
sendp(ether / arp / Padding(load=b'\x00' * padding))
```

On *station* I run:

```
tcpdump -XXvi en1 arp host 192.168.1.240
```
There I see only the request sent from *jail2* (the reply from *jail* is lost). When changing to *padding=18*, I can see the reply from *jail*.


Worth noting:

- The *freenas* physical interface doesn't have this problem. It replies to "long" ARP requests.
- On step #2 the reply is mistakenly padded with 14 bytes, which is exactly the number of bytes beyond the 18 bytes in the request (the request was padded with 32 bytes). I bet this is part of the bug. Looking at the FreeBSD ARP reply code, it creates the reply by editing the request bytes in place. For some reason it removes only 18 bytes from the request padding. However, this happens only on the VNET interface, as noted above.
- I tried it with different switches (from different manufacturers) and they all behave the same, so the packet is lost in *freenas*.
- There is no packet filtering enabled that I am aware of.
- I suspect this post from Dec 2017 is having the same issue.
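
The 14-byte figure can be checked directly against the reply dump from vnet0. The following is a minimal sketch, assuming the hex groups are copied from the tcpdump output with the ASCII column removed; `dump_to_bytes` is a hypothetical helper written for this illustration.

```python
# Hex dump of the ARP reply as captured on vnet0 (ASCII column dropped).
reply_dump = """
0x0000:  20c9 f123 6023 d123 99f9 6f6c 0806 0001
0x0010:  0800 0604 0002 d123 99f9 6f6c c0a8 01f0
0x0020:  20c9 f123 6023 c0a8 0169 0000 0000 0000
0x0030:  0000 0000 0000 0000
"""


def dump_to_bytes(dump):
    """Turn tcpdump-style hex lines into raw bytes."""
    hex_digits = []
    for line in dump.strip().splitlines():
        # Drop the "0x0000:" offset column and keep the hex groups.
        _, _, groups = line.partition(":")
        hex_digits.append("".join(groups.split()))
    return bytes.fromhex("".join(hex_digits))


frame = dump_to_bytes(reply_dump)
padding = len(frame) - 14 - 28  # minus Ethernet header and ARP message
print(len(frame), padding)
```

The frame is 56 bytes long with 14 bytes of trailing zeros, which matches 32 - 18 = 14. That is also 4 bytes short of the 60-byte frame implied by the 46-byte minimum payload; whether this plays a role in the loss is not established here.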


----------



## openjohn (Aug 24, 2019)

I believe I am also seeing this issue, running FreeBSD 12.0 with a jail using VNET.


----------



## openjohn (Aug 25, 2019)

I've got a test setup with two directly attached hosts, each running two jails. The problem comes and goes on both hosts.

In an affected jail, I also can't add static arp entries:

```
# arp -s 10.0.0.1 XX:XX:XX:XX:XX:XX
arp: writing to routing socket: Cannot allocate memory
```

Whereas, in an unaffected jail the command succeeds.


----------



## openjohn (Aug 25, 2019)

Filed a bug report.


----------



## Matej Ondrusek (Aug 22, 2020)

I was debugging a similar problem and found a solution that seems to apply to the issue here.

In my opinion, the problem is visible on this line:
_ARP, Ethernet (len 6), IPv4 (len 4), Reply 192.168.1.240 is-at d1:23:99:f9:6f:6c, length 42_

*d1:23:99:f9:6f:6c* is not a valid MAC address for a host because it is a multicast address, not a unicast one. The least significant bit of the first octet indicates this (d1 is an odd number, not even); for details, see the Wikipedia description.

I believe that changing the MAC address to a valid unicast one will solve the issue described.
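
The odd/even check on the first octet can be written as a short Python sketch. The function name `is_multicast_mac` is made up for this illustration, and it assumes colon-separated hex notation.

```python
def is_multicast_mac(mac):
    """True if the group (multicast) bit of a colon-separated MAC address is set."""
    first_octet = int(mac.split(":")[0], 16)
    # The least significant bit of the first octet marks a group address.
    return bool(first_octet & 0x01)


print(is_multicast_mac("d1:23:99:f9:6f:6c"))  # MAC from the ARP reply
print(is_multicast_mac("20:c9:f1:23:60:23"))  # MAC of station
```

The first call returns `True` (0xd1 is odd) and the second returns `False` (0x20 is even).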


----------

