# bridge losing packets



## STL (Oct 28, 2009)

I have a transparent bridge set up (bridge0) connecting two Broadcom fiber devices.

There is a firewall, but right now it is completely disabled and ignores all relevant interfaces.

The strange occurrence I am seeing is that occasionally (about half of the time) 50% of the pings get no response. Stranger still, it is exactly every other one: the machine behind our firewall only gets responses to the even-numbered pings (or the odd-numbered ones).

After running tcpdump on both the outside and inside interfaces (bge0 and bge1 respectively), it seems bge0 sees all packets going out and coming in, but bge1 only sees the packets that make it to the computer behind it.
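The comparison described above can be sketched in a few lines of Python. This is only an illustration: the sequence numbers are made up, as if they had been read by hand from the tcpdump output on each interface, and the helper names `dropped_between` and `is_every_other` are hypothetical.

```python
def dropped_between(outside_seqs, inside_seqs):
    """Return sequence numbers seen on the outside interface but not the inside one."""
    return sorted(set(outside_seqs) - set(inside_seqs))


def is_every_other(dropped):
    """True if the dropped sequence numbers are evenly spaced two apart."""
    if len(dropped) < 2:
        return False
    return all(b - a == 2 for a, b in zip(dropped, dropped[1:]))


# Hypothetical ICMP echo-reply sequence numbers (not real capture data).
seen_on_bge0 = list(range(10))   # outside interface sees every reply
seen_on_bge1 = [0, 2, 4, 6, 8]   # inside interface sees only the even ones

dropped = dropped_between(seen_on_bge0, seen_on_bge1)
print(dropped, is_every_other(dropped))
```

If the dropped replies really are every other one, the second value printed is `True`, which points at something periodic rather than random loss.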

What this means is the bridge is stealing half my ping responses, and I would like them back.

What I don't understand is what the bridge is doing with them and why.  Any theories are greatly appreciated.

Here is my ifconfig output. Let me know if any other info would be helpful.

PS. If you know what "role state learning" means at the end of the ifconfig output, I'm kinda curious about that as well.


```
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
	ether 00:30:48:b9:40:b0
	inet 130.207.197.79 netmask 0xffffff00 broadcast 130.207.197.255
	media: Ethernet autoselect (1000baseTX <full-duplex>)
	status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
	ether 00:30:48:b9:40:b1
	inet 10.0.0.2 netmask 0xff000000 broadcast 10.255.255.255
	media: Ethernet autoselect (10baseT/UTP <full-duplex>)
	status: active
bge0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
	ether 00:15:77:4f:3d:ce
	media: Ethernet autoselect
	status: active
bge1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
	ether 00:15:77:4f:3d:cf
	media: Ethernet autoselect
	status: active
pfsync0: flags=41<UP,RUNNING> metric 0 mtu 1460
	pfsync: syncdev: em1 syncpeer: 224.0.0.240 maxupd: 128
pflog0: flags=141<UP,RUNNING,PROMISC> metric 0 mtu 33160
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x7 
	inet6 ::1 prefixlen 128 
	inet 127.0.0.1 netmask 0xff000000 
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 9a:a2:ca:9d:7d:ee
	id 00:15:77:4f:3d:ce priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
	root id 00:0f:8f:b4:fb:01 priority 32768 ifcost 63 port 3
	member: bge1 flags=147<LEARNING,DISCOVER,STP,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 4 priority 128 path cost 55 proto rstp
	        role designated state forwarding
	member: bge0 flags=147<LEARNING,DISCOVER,STP,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 3 priority 128 path cost 55 proto rstp
	        role root state learning
```


----------



## DutchDaemon (Oct 28, 2009)

Role root / Role designated:
http://en.wikipedia.org/wiki/Rapid_Spanning_Tree_Protocol#Rapid_Spanning_Tree_Protocol_.28RSTP.29

I have a transparent bridge with two fiber interfaces too (Intel though), and it looks like this:


```
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:15:17:80:22:92
        media: Ethernet autoselect (1000baseSX <full-duplex>)
        status: active
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:15:17:80:22:93
        [some ip aliases snipped]
        media: Ethernet autoselect (1000baseSX <full-duplex>)
        status: active
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether c6:48:87:bd:73:9a
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: em1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 2 priority 128 path cost 20000
        member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 1 priority 128 path cost 20000
```

The only difference appears to be the absence of stp / rstp roles. Do you need stp / rstp?

BTW: this bridge handles 120+ Mbit/sec both ways, with over 50,000 states (with PF). No packet loss of any kind.


----------



## STL (Oct 28, 2009)

I do need RSTP. Once I am confident that this bridge is forwarding packets appropriately, I will bring up its clone and partner; they will run in tandem, thus providing redundancy.

However, after reading through that wiki page, it seems the problem may be that it is stuck in the learning state. When I get back to the office tomorrow I will attempt to force it to consider itself an edge and see if that fixes the issue. If it does, at least I will be a step closer.

From what I understand from the wiki, it should only be in learning mode while it is attempting to determine the topology of the network. But that would mean it would not be forwarding anything anywhere. Instead, 50% of the time it is forwarding 50% of the packets. So I would conclude it might be flipping between learning and forwarding? Not sure if that's even possible.
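One way to check the flipping theory would be to record each member's role and state over time. Below is a minimal sketch that pulls those fields out of text shaped like the `ifconfig` output posted earlier. The function name `member_states` is hypothetical, the sample text is trimmed from that earlier output, and actually polling `ifconfig bridge0` in a loop is left out as an assumption about how one would use it.

```python
import re

MEMBER_RE = re.compile(r"^\s*member:\s+(\S+)")
ROLE_RE = re.compile(r"^\s*role\s+(\S+)\s+state\s+(\S+)")


def member_states(ifconfig_text):
    """Map each bridge member to its (role, state) pair."""
    states = {}
    current = None
    for line in ifconfig_text.splitlines():
        m = MEMBER_RE.match(line)
        if m:
            current = m.group(1)  # remember which member the next lines describe
            continue
        r = ROLE_RE.match(line)
        if r and current is not None:
            states[current] = (r.group(1), r.group(2))
    return states


# Trimmed copy of the bridge0 section from the ifconfig output above.
SAMPLE = """\
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tmember: bge1 flags=147<LEARNING,DISCOVER,STP,AUTOEDGE,AUTOPTP>
\t        ifmaxaddr 0 port 4 priority 128 path cost 55 proto rstp
\t        role designated state forwarding
\tmember: bge0 flags=147<LEARNING,DISCOVER,STP,AUTOEDGE,AUTOPTP>
\t        ifmaxaddr 0 port 3 priority 128 path cost 55 proto rstp
\t        role root state learning
"""

print(member_states(SAMPLE))
```

Comparing the result from one sample to the next would show whether a port is moving between `learning` and `forwarding` or is simply stuck.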

Here is a basic view of how it's plugged in (if it's helpful):


```
<outside network>
     |
     |
   |trans|  (fiber->copper transceiver)
     |
     |bge0
|------------|-----em0-----used only for ssh
|  Firewall  |
|------------|-----em1-----nothing 
     |bge1
     |
  |switch|  (using it as a fiber->copper transceiver)
     |
     |
   Laptop   (only device on this side)
```


----------



## STL (Nov 12, 2009)

I figured out my problem. Apparently the default priority of my STP bridge was the same as that of another bridge on the network, and this was causing the two to routinely fight over the root status. Once I lowered the priority of mine, all was good in the world yet again.
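For reference, here is a rough sketch of how the standard spanning tree comparison picks a root: the bridge with the numerically lowest ID wins, where the ID is the priority followed by the MAC address. The two MAC addresses are the `id` and `root id` values from the ifconfig output earlier in the thread; treating the second one as the competing bridge, and the value 4096, are assumptions made purely for illustration. The helper names are hypothetical.

```python
def bridge_id(priority, mac):
    """Build a comparable bridge ID: priority first, MAC address as the tiebreaker."""
    return (priority, bytes.fromhex(mac.replace(":", "")))


def elect_root(bridges):
    """Return the name of the bridge with the numerically lowest bridge ID."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Both bridges on the default priority of 32768: only the MAC decides.
bridges = {
    "this_bridge": (32768, "00:15:77:4f:3d:ce"),
    "other_bridge": (32768, "00:0f:8f:b4:fb:01"),
}
print(elect_root(bridges))

# Hypothetical change: give this bridge a numerically lower priority.
bridges["this_bridge"] = (4096, "00:15:77:4f:3d:ce")
print(elect_root(bridges))
```

With equal priorities the choice falls through to the MAC address, so giving the bridges distinct priorities makes the intended root explicit. This sketch only shows the comparison itself; it does not model the BPDU exchange or port state changes.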


----------

