# VLANs with bhyve guests not getting DHCP



## Drizzt321 (Nov 10, 2020)

I suspect my issue better belongs here than Emulation and Virtualization.

So I'm trying to get a Home-Assistant VM in bhyve (via vm-bhyve) to have a general LAN + Internet interface, plus an IoT/no-internet VLAN to segment off the untrusted crapola that makes up a lot of (admittedly useful) IoT gear. I've got OPNsense on the router and some Ubiquiti switches. The router and the FreeBSD NAS/host are on trunk ports. When I plug my laptop into a switch port set to the VLAN (30), it picks up an IP from the router just fine, within the range I'd expect.

When I set a static IP in the proper range + gateway on the guest, I can ping back and forth fine. 

When I try DHCP, using `tcpdump -i em0.30 port 67 or port 68 -e -n -vv` on the host, I see the DHCP request, but no DHCP offer. On the router on the VLAN30 interface, I see both DHCP request, and DHCP offer. 

Can't quite figure out what is wrong. Seems like it should be firewall, but then again my laptop pulls DHCP just fine. Do I need some additional firewall rules for the host VLAN interface that vm-bhyve created?

My configuration of the vm-bhyve switch set with the VLAN


```
# vm switch list
NAME    TYPE      IFACE      ADDRESS  PRIVATE  MTU  VLAN  PORTS
public  standard  vm-public  -        no       -    -     em0
ha-iot  standard  vm-ha-iot  -        no       -    30    em0
```

`ha-iot` is the switch name, public is the standard one for all the VMs (up until now the only VMs) that also need Internet access and full LAN access.

The VM configuration


```
loader="uefi"
graphics="yes"
xhci_mouse="yes"
graphics_listen="192.168.2.5"
graphics_port="5900"
graphics_wait="no"
graphics_res="800x600"

cpu="4"
memory="4GB"
network0_type="virtio-net"
network0_switch="public"
disk0_type="ahci-hd"
disk0_name="disk0"
disk0_dev="sparse-zvol"
uuid="9edcc9da-1b35-11eb-8576-0015170027d2"
network0_mac="58:9c:fc:06:f4:a0"

network1_type="virtio-net"
network1_switch="ha-iot"
network1_mac="58:9c:fc:5f:71:50"
```

The host interfaces

```
# ifconfig
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=812099<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER>
        ether 00:15:17:00:27:d2
        inet 192.168.2.5 netmask 0xffffff00 broadcast 192.168.2.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
        ether 70:85:c2:fc:1f:58
        media: Ethernet autoselect (none)
        status: no carrier
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lo1: flags=8008<LOOPBACK,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        groups: lo
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
vm-public: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 2e:36:b0:9f:36:7f
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 10 priority 128 path cost 2000000
        member: tap1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 9 priority 128 path cost 2000000
        member: tap0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 8 priority 128 path cost 2000000
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 20000
        groups: bridge vm-switch viid-4c918@
        nd6 options=1<PERFORMNUD>
vm-ha-iot: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 6e:19:b8:bb:03:25
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: tap3 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 11 priority 128 path cost 2000000
        member: em0.30 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 55
        groups: bridge vm-switch viid-f27b7@
        nd6 options=1<PERFORMNUD>
em0.30: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vm-vlan-ha-iot-em0.30
        options=1<RXCSUM>
        ether 00:15:17:00:27:d2
        inet6 fe80::215:17ff:fe00:27d2%em0.30 prefixlen 64 scopeid 0x7
        groups: vlan vm-vlan viid-ccc4e@
        vlan: 30 vlanpcp: 0 parent interface: em0
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vmnet-unifi-controller-0-public
        options=80000<LINKSTATE>
        ether 58:9c:fc:10:ff:87
        groups: tap vm-port
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 2153
tap1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vmnet-pihole-0-public
        options=80000<LINKSTATE>
        ether 58:9c:fc:10:ff:a9
        inet6 fe80::5a9c:fcff:fe10:ffa9%tap1 prefixlen 64 tentative scopeid 0x9
        groups: tap vm-port
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 2454
tap2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vmnet-home-assistant-0-public
        options=80000<LINKSTATE>
        ether 58:9c:fc:10:f2:21
        inet6 fe80::5a9c:fcff:fe10:f221%tap2 prefixlen 64 tentative scopeid 0xa
        groups: tap vm-port
        media: Ethernet autoselect
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        Opened by PID 2385
tap3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vmnet-home-assistant-1-ha-iot
        options=80000<LINKSTATE>
        ether 58:9c:fc:10:60:08
        inet6 fe80::5a9c:fcff:fe10:6008%tap3 prefixlen 64 scopeid 0xb
        groups: tap vm-port
        media: Ethernet autoselect
        status: active
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
        Opened by PID 2385
```

And finally, the `ipfw` configuration, which I think is wide open


```
# ipfw list
00100 allow ip from any to any via lo0
00200 deny ip from any to 127.0.0.0/8
00300 deny ip from 127.0.0.0/8 to any
00400 deny ip from any to ::1
00500 deny ip from ::1 to any
00600 allow ipv6-icmp from :: to ff02::/16
00700 allow ipv6-icmp from fe80::/10 to fe80::/10
00800 allow ipv6-icmp from fe80::/10 to ff02::/16
00900 allow ipv6-icmp from any to any icmp6types 1
01000 allow ipv6-icmp from any to any icmp6types 2,135,136
65000 allow ip from any to any
65535 deny ip from any to any
```


----------



## VladiBG (Nov 10, 2020)

If you want to test whether it's ipfw, disable it. If it works with ipfw disabled, check whether ipfw is filtering your bridge packets:
`net.link.bridge.ipfw=1`


----------



## Drizzt321 (Nov 10, 2020)

Good thought VladiBG, tried that, doesn't seem to have helped.

It's odd, though, I can use dhclient on em0.30 and it gets a DHCP IP in the range for the VLAN fine, so the host machine/interface is on the VLAN correctly. 

I also created a quick little Debian VM on the ha-iot switch, and during the install/configuration process it can't get DHCP either, although I also see, on the router, the DHCP request/offer messages. Likewise with the HA VM, I see the requests on em0.30, but no responses. 

So the request is making it to the router, the router is responding, but apparently not getting back to the DHCP client. Could be something on the host is blocking it, could be something on the router is blocking the outgoing. Although that'd be very strange, since both my laptop, when plugged into a VLAN tagged port, and my NAS via the VLAN iface, both properly get the IP from the VLAN DHCP block. 

I even tried rebooting with `firewall_enable="no"` in my /etc/rc.conf. Still no good. When it's enabled, here's the bridge.ipfw:


```
# sysctl net | grep ipfw
net.link.ether.ipfw: 0
net.link.bridge.ipfw: 0
net.link.bridge.ipfw_arp: 0
```

I wonder if maybe I should try mirroring the trunk switch port to my NAS to see if the offer is making it on the wire. I suspect so, it should be. I'll have to give that a shot tomorrow, bed time for me now.


----------



## VladiBG (Nov 10, 2020)

You don't have to reboot. You can enable/disable ipfw with
`ipfw enable firewall`
`ipfw disable firewall`
or
`/etc/rc.d/ipfw stop`
`/etc/rc.d/ipfw start`
or
`/etc/rc.d/ipfw disable`
`/etc/rc.d/ipfw enable`

Anyway, it's not ipfw that is blocking your DHCP offer.

Do you get an IP address via DHCP when you leave em0.30 on the host only (removed from the bridge)?
Note that you can't have multiple interfaces with DHCP, as they will conflict when setting the default gateway and DNS. So, only for the test, you can put em0.30 on the host only and check whether it gets an IP address via DHCP, to confirm that 802.1q is working.

`dhclient` edits the same /etc/resolv.conf for every interface, which is why running it on more than one host interface causes a conflict. It will also fail to insert the default gateway, because that route will already be in the routing table.

Once you get it working and set up your ipfw rules, you may need:


```
allow udp from 0.0.0.0 68 to 255.255.255.255 67 out
allow udp from any 67 to any 68 in
allow udp from any 67 to 255.255.255.255 68 in
```


----------



## Drizzt321 (Nov 10, 2020)

I know I didn't really have to reboot, I was just being excessively paranoid.

I got the IP via DHCP on em0.30 while it was part of the bridge. The VM had already tried to get an IP via DHCP before that; I merely ran `dhclient em0.30` on the host as a test to see whether it would get a lease.

Didn't realize you shouldn't run dhclient on multiple interfaces, although https://forums.freebsd.org/threads/dhcp-client-on-several-interfaces.15341/#post-89430 implies I can with a bit of configuration work to ensure default route doesn't get messed up.

Leaving DHCP on the em0.30 interface alone, I added those firewall rules:


```
ipfw add 02000 allow udp from 0.0.0.0 68 to 255.255.255.255 67 out
ipfw add 02001 allow udp from any 67 to any 68 in
ipfw add 02002 allow udp from any 67 to 255.255.255.255 68 in
```

It didn't make any difference at all. I think the next step, when I have a chance later today after work, is to set up the port mirroring to make sure the issue isn't upstream somehow.

After a few quick attempts to get an accurate VLAN ID out of tcpdump (see https://serverfault.com/a/545637), I think I'm just going to mirror the port and capture everything, but that'll take me longer. So later. *sigh* Well, I'm sure I'll come out of this with a greater understanding of VLANs and all the network internals; it's just frustrating at the moment.
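For anyone who wants to check the VLAN ID in a captured frame by hand, here is a minimal Python sketch of reading the 802.1Q tag out of a raw Ethernet header. The two frames are built by hand purely for illustration (they are not from my capture), and the helper name `vlan_id` is made up for this example.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id(frame: bytes):
    """Return the VLAN ID of an Ethernet frame, or None if it is untagged."""
    if len(frame) < 16:
        return None  # too short to contain dst MAC, src MAC and a full tag
    # Bytes 0-11 are the dst/src MACs; bytes 12-13 hold the TPID
    # (or the plain EtherType when the frame is untagged).
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF  # the low 12 bits of the TCI are the VLAN ID


# Hypothetical frames for illustration: broadcast dst, an example src MAC.
dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("589cfc5f7150")
payload = b"\x00" * 46
tagged = dst + src + struct.pack("!HHH", TPID_8021Q, 30, 0x0800) + payload
untagged = dst + src + struct.pack("!H", 0x0800) + payload

print(vlan_id(tagged))    # 30
print(vlan_id(untagged))  # None
```

The mask matters because the upper bits of the tag carry the priority (PCP) and DEI fields, so the raw 16-bit value only equals the VLAN ID when those bits are zero.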


----------



## Drizzt321 (Nov 15, 2020)

OK, so update. I'm doing port-mirroring on the NAS trunk port. Viewing the dump, I'm seeing the DHCP Request & Offer both, for the VM client. 

I also configured the FreeBSD host's VLAN interface for DHCP via rc.conf. I see that exchange on the mirror port _and_ in the tcpdump on the host VLAN interface. All good.

When I try again with the VM's VLAN interface, on the host I again see the DHCP Request but no DHCP Offer, although on the mirror port dump I _do_ see the Offer response. I'm getting very, very confused by this issue.


----------



## VladiBG (Nov 15, 2020)

Are you tagging the traffic inside your VM?


----------



## Drizzt321 (Nov 15, 2020)

VladiBG said:


> Are you tagging the traffic inside your VM?


No, I'm letting the host interface deal with that, which is why using the bridge should work just fine. In my searches I've seen other people say to use this sort of setup with VMs. 

The other thing I've seen, which is not what's in use, is epair interfaces on the host for the VM. Doesn't seem like that should be the issue, since they're still tied to the bridge like how I'm doing it.


----------



## Drizzt321 (Nov 17, 2020)

Ah ha! Looks like the response is ending up at em0, not em0.30, for some reason. I added a logging firewall rule to match the destination IP the router is trying to give out. OK, now I have a particular point I can really dig into. Oddly, tcpdump wasn't showing any output for `tcpdump -i em0 port 67 or port 68 -e -n -vv`.


```
ipfw add 64000 allow log logamount 9999999 ip from any to 10.30.122.30

Nov 17 10:32:14 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 in via em0
Nov 17 10:32:14 darkserver syslogd: last message repeated 1 times
Nov 17 10:32:14 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via vm-public
Nov 17 10:32:14 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap3
Nov 17 10:32:14 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap2
Nov 17 10:32:14 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap1
Nov 17 10:32:14 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap0
```


----------



## Drizzt321 (Nov 18, 2020)

A further example showing both the send and receive sides, to confirm that the DHCP Request is going out the VLAN port:


```
64000 allow log logamount 9999999 ip from any to any 68
64001 allow log logamount 9999999 ip from any to any 67
```


```
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 in via tap3
Nov 18 12:30:30 darkserver syslogd: last message repeated 1 times
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 out via vm-ha-iot
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 out via em0.30
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 in via vm-ha-iot
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 in via tap3
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 in via em0
Nov 18 12:30:31 darkserver syslogd: last message repeated 1 times
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via vm-public
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap2
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap1
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap0
```
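To make the asymmetry easier to see, here is a rough Python sketch that groups ipfw log lines by flow and lists the interfaces each flow touched. The regex is an assumption based only on the log format shown above, the function name `interfaces_by_flow` is made up, and the sample is a subset of the lines from this post.

```python
import re
from collections import defaultdict

# Matches the ipfw log lines quoted in this thread, e.g.
# "... ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 in via em0"
LINE = re.compile(
    r"ipfw: (?P<rule>\d+) \w+ UDP (?P<src>\S+) (?P<dst>\S+) "
    r"(?P<dir>in|out) via (?P<iface>\S+)"
)


def interfaces_by_flow(log_text):
    """Map (src, dst) to the ordered list of (direction, interface) hits."""
    flows = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:  # lines such as "last message repeated" are skipped
            flows[(m["src"], m["dst"])].append((m["dir"], m["iface"]))
    return dict(flows)


SAMPLE = """\
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 in via tap3
Nov 18 12:30:30 darkserver syslogd: last message repeated 1 times
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 out via vm-ha-iot
Nov 18 12:30:30 darkserver kernel: ipfw: 64001 Accept UDP 0.0.0.0:68 255.255.255.255:67 out via em0.30
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 in via em0
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via vm-public
Nov 18 12:30:31 darkserver kernel: ipfw: 64000 Accept UDP 10.30.0.1:67 10.30.122.30:68 out via tap2
"""

flows = interfaces_by_flow(SAMPLE)
for (src, dst), hits in flows.items():
    path = ", ".join(f"{direction} {iface}" for direction, iface in hits)
    print(f"{src} -> {dst}: {path}")
```

On this sample the request flow goes in via tap3 and out via vm-ha-iot and em0.30, while the offer flow comes in via em0 and only goes out via vm-public and its taps, never touching em0.30 or tap3.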


----------



## sol289 (Nov 20, 2020)

First of all, draw a diagram of your setup, because right now it is not clear who's doing what and where.
Next, provide the full switching and trunking configuration of your L2 links.


----------

