# ARP Issue: Bridging, Routing, and FreeBSD LAGGs



## jasonvp (Nov 27, 2015)

Since converting my router over to FreeBSD, I've been running into a fun (FSVO: fun) challenge with ARP entries on it with respect to one of my servers.  I've used Graffle to chicken-scratch a little diagram as seen here:





The Verizon router is a Juniper of some variety; I only know that via its MAC address.  The ONT is connected to one of my router's 3 Ethernet interfaces.  Per the diagram, my router is then connected via 2 GigE interfaces to my managed switch, each connection to a different VLAN.  Also connected to that switch: a FreeBSD server that has a 2xGigE LACP LAGG.

Because of Verizon's stupidity when it comes to business class IP addressing, my public VLAN (noted with the red line) and the Verizon external network (black line) are actually the same broadcast domain.  Which means the router is actually bridging between interfaces em0 and re0.  And it has an IP address on that bridge interface with a default route pointing to VZ's Juniper.

The server in question also has a bunch of jails on it, which are each taking up an IP address on that same public VLAN.  The VLAN is a /24, even though I only own 13 (not 14... idiots!) IP addresses on it.

That's the setup.  Now the problem: the router invariably decides that the IP addresses of the server and the jails on it map to the Verizon router's MAC address, _not_ to the LAGG interface on the server.  And I can't quite figure out why.  When that happens, I have a lot of difficulty getting to my public-facing jails and server from the private VLAN (noted with the green line).

My work-around for the time being is to set a series of static ARP entries on the router.  For each IP address on that server, set the ARP to the LAGG's MAC.
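As a rough sketch of that work-around, the loop below pins each address to the LAGG's MAC using the same `arp -S` form shown later in the thread. The MAC is lagg0's address on the server; the `JAIL_IPS` list and the `pin_arp` and `APPLY` names are placeholders made up for illustration. It only prints the commands unless `APPLY=1` is set.

```shell
#!/bin/sh
# Sketch: pin each server/jail address to the LAGG's MAC on the router.
LAGG_MAC="0c:c4:7a:31:e3:d8"   # lagg0 on the server
JAIL_IPS="1.2.3.4"             # placeholder list; add one entry per owned IP

# Dry run by default: print each command. Set APPLY=1 (as root) to run them.
pin_arp() {
    for ip in $JAIL_IPS; do
        if [ "${APPLY:-0}" = "1" ]; then
            arp -S "$ip" "$LAGG_MAC"
        else
            echo "arp -S $ip $LAGG_MAC"
        fi
    done
}

pin_arp
```

These entries live in the kernel's ARP table, so they would need to be reapplied after a reboot.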

Of note, there are other devices in that same public (red) VLAN, such as a Windows PC.  The ARP entry on the router for the Windows box stays latched to that box's MAC address.  It just seems to be the FreeBSD server with the LAGG interface that has the problem.

Lots of words, but I figured verbosity would help.  Any ideas?


----------



## jasonvp (Nov 28, 2015)

My gut tells me that the solution to this lies somewhere in ARP filtering.  But not all ARPs.  I need to _somehow_ block any ARP inbound on em0 (facing Verizon) for any IP address that I own.  I'm not quite sure if that's even possible.


----------



## jasonvp (Nov 28, 2015)

A little more data.  The router has a permanent ARP entry that I put in:

```
lateapex-gw# arp -n 1.2.3.4
? (1.2.3.4) at 0c:c4:7a:31:e3:d8 on bridge0 permanent [bridge]
```

That MAC is the lagg0 interface on the server in the diagram:

```
joker$ ifconfig lagg0 | grep ether
    ether 0c:c4:7a:31:e3:d8
```

If I nuke the static entry on the router for testing purposes:

```
lateapex-gw# arp -d 1.2.3.4
1.2.3.4 (1.2.3.4) deleted
```

It's gone.  Now, from a machine on the private VLAN (note the green network in the diagram), I telnet to 1.2.3.4 port 80, because it's a jail running Apache.  The first attempt succeeds:

```
deadshot$ telnet 1.2.3.4 80
Trying 1.2.3.4...
Connected to somehostname.
Escape character is '^]'.
```

And if I look on the router, the ARP entry is incorrect:

```
lateapex-gw# arp -n 1.2.3.4
? (1.2.3.4) at 54:e0:32:be:cf:c1 on bridge0 expires in 1197 seconds [bridge]
```

That MAC belongs to the Verizon router on the same broadcast domain.  Now that my router has the incorrect ARP entry, further telnets to 1.2.3.4 port 80 fail:

```
deadshot$ telnet 1.2.3.4 80
Trying 1.2.3.4...
```
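One way to check which device is answering ARP for 1.2.3.4 would be to capture ARP on each bridge member while the static entry is removed. The sketch below only prints the `tcpdump` commands (em0 and re0 are the bridge members named earlier; `capture_cmd` is a helper name made up here). Each one would be run as root in its own terminal.

```shell
#!/bin/sh
# Sketch: build ARP capture commands for one address on each bridge member.
TARGET_IP="1.2.3.4"

# Print the tcpdump command for one interface.
#   -n  skip name resolution
#   -e  show the Ethernet source/destination MACs on each packet
capture_cmd() {
    echo "tcpdump -n -e -i $1 arp host $TARGET_IP"
}

capture_cmd em0   # Verizon-facing member
capture_cmd re0   # switch-facing member (public VLAN)
```

If replies carrying 54:e0:32:be:cf:c1 as the source MAC show up on em0, that would suggest the Verizon router is answering for the address; that is an assumption to test, not a confirmed diagnosis.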

If I hard-set the ARP entry on the router for IP 1.2.3.4 -> 0c:c4:7a:31:e3:d8, it's good and stays that way:

```
lateapex-gw# arp -S 1.2.3.4 0c:c4:7a:31:e3:d8
1.2.3.4 (1.2.3.4) deleted
lateapex-gw# arp -n 1.2.3.4
? (1.2.3.4) at 0c:c4:7a:31:e3:d8 on bridge0 permanent [bridge]
```

After that, I can repeatedly get to 1.2.3.4's port 80 from the private VLAN:

```
deadshot$ telnet 1.2.3.4 80
Trying 1.2.3.4...
Connected to somehostname.
Escape character is '^]'.
^]
telnet> Connection closed.
deadshot$ telnet 1.2.3.4 80
Trying 1.2.3.4...
Connected to somehostname.
Escape character is '^]'.
^]
telnet> Connection closed.
```


----------

