# lagg(4) alternatives



## Beeblebrox (Mar 22, 2012)

I have some problems implementing lagg(4) as loadbalance on dual NICs. It may be because my switch is a cheap, unmanaged Gbit device, or because I have missed something about loading the kernel module (custom kernel, kldloaded if_lagg but no other modules loaded). For example, does lagg depend on NAT/PF (not from what I can tell)?

I would like to either find an alternative to loadbalanced lagg or, if there is no such thing, figure out how to debug what is going wrong. All I can say about my lagg setup so far is that as soon as lagg kicks in, the NIC pool no longer responds to pings and gives an "I'm busy now" message.


----------



## phoenix (Mar 22, 2012)

Define "load-balance".  It's a very loaded term, where everyone means something different.  A description (or image) of your network layout would be handy as well.


----------



## Beeblebrox (Mar 22, 2012)

Q3 and follow-up. Diskless clients with NIC0: 10/100 + NIC1: Gbit. The mobo's embedded NIC0 is needed for boot; after that the two should pool for better transfer speed. More details: http://forums.freebsd.org/showthread.php?t=30616

To complicate things further, I can't get lagg working on the master, never mind the nodes...
Master is AthlonII-X3, nodes vary from AII-X3 to Athlon-sock754. Have I left out anything?


----------



## throAU (Mar 23, 2012)

Not 100% sure what is going on, but I can tell you (as per your other thread) that you do NOT want to run LACP unless your switch supports it: LACP requires the switch ports to be configured into a port channel that also runs LACP.

Also - you don't have any other IPs bound to the network interfaces directly in rc.conf?

I have the following working fine in /etc/rc.conf on one of my boxes (although using LACP):


```
ifconfig_bce1="DHCP"
ifconfig_bce0="up"
#ifconfig_bce1="up"
ifconfig_igb0="up"
ifconfig_igb1="up"
cloned_interfaces="lagg0"
#ifconfig_lagg0="laggproto lacp laggport bce0 laggport bce1 laggport igb0 laggport igb1"
ifconfig_lagg0="laggproto lacp laggport bce0 laggport igb0 laggport igb1"
ipv4_addrs_lagg0="10.3.1.29/24"
```

I notice you were specifying the IP for the lagg0 interface on the ifconfig_lagg0 line. I did it the above way because some online documentation (sorry, I forget where) said to do it that way.

Edit:
Also not sure if you can load-balance across a 10/100 and gigabit NIC pairing?  Sure you don't want to use the "failover" option?  Set the gigabit NIC as the master, and failover to 10/100 if it is down?
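
A minimal /etc/rc.conf sketch of that failover idea, assuming re1 is the gigabit NIC and re0 the 10/100 one. The interface names and the 192.168.2.2 address are placeholders, not a tested configuration. With `laggproto failover`, the first laggport listed becomes the master port, so the gigabit NIC goes first:

```shell
# /etc/rc.conf fragment (sketch; interface names and address are assumptions)
ifconfig_re1="up"            # gigabit NIC: no IP of its own, just bring it up
ifconfig_re0="up"            # 10/100 NIC: same
cloned_interfaces="lagg0"    # create lagg0 at boot
# The first laggport listed (re1) is the master; re0 only carries
# traffic if re1 loses link.
ifconfig_lagg0="laggproto failover laggport re1 laggport re0 192.168.2.2 netmask 255.255.255.0"
```

Failover needs no cooperation from the switch, which is why it is the usual suggestion for an unmanaged one.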


----------



## Beeblebrox (Mar 23, 2012)

@throAU


> I notice you were specifying the IP for the lagg0 interface on the ifconfig_lagg0 line,


Both methods work, AFAIK.
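
For reference, the two forms side by side, reusing the interface names and address from throAU's config above. This is only a sketch of the rc.conf syntax; use one form or the other, not both:

```shell
# Form 1: address in a separate ipv4_addrs_<interface> variable
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport bce0 laggport igb0 laggport igb1"
ipv4_addrs_lagg0="10.3.1.29/24"

# Form 2: address inline at the end of the ifconfig_lagg0 line
# (commented out here so that only one form is active)
#ifconfig_lagg0="laggproto lacp laggport bce0 laggport igb0 laggport igb1 10.3.1.29/24"
```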


> Also not sure if you can load-balance across a 10/100 and gigabit NIC pairing


Normally after the diskless client is done booting from the BIOS-recognized 10/100 NIC, it should switch traffic to the Gbit NIC and 10/100 would go "down". My thought was: Can I squeeze 110% (1.1G) out of two pooled NICs?

Meanwhile, testing some things today showed that:
1. On the master system, after lagg0 gets created as loadbalance, the master stops responding to pings. However, tcpdump on the master shows that ping requests are being received; it just does not respond. (No firewall running.)
2. On a node, setting lagg up as failover makes no difference and results in a system freeze (no ping response either). Here re1 is the Gbit NIC and is set as the primary NIC.

```
ifconfig_lagg0="laggproto failover laggport re1 laggport re0 192.168.2.2 netmask 255.255.255.0"
```
Curiously, under this setup tcpdump shows an NFS response while the system is frozen. The second entry in the snippet below shows up when I unplug the Ethernet cable on re1 and lagg0 needs to fall back on re0.

```
12:49:12.237477 IP 192.168.2.1.nfs > node2.76575838: reply ok 112

12:59:54.323462 IP 192.168.2.1.nfsd > node2.740: Flags [R.], seq 4209474782, ack 2247908211,
win 29127, options [nop,nop,TS val 2033479589 ecr 11647], length 0
```
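
One way to narrow this down might be to watch each member port separately rather than lagg0 as a whole, to see on which physical NIC requests arrive and whether replies leave at all. The helper below is a hypothetical sketch: it only prints the commands to run (as root, on the affected box), and the names lagg0, re1, and re0 are assumptions taken from the node config above.

```shell
#!/bin/sh
# Print (not run) the commands for checking a lagg from the host side.
# Usage: lagg_debug_cmds <lagg-interface> <member-port>...
lagg_debug_cmds() {
    lagg="$1"; shift
    # Shows the active laggproto plus per-port flags such as MASTER/ACTIVE
    echo "ifconfig $lagg"
    for port in "$@"; do
        # -n: no name lookups, -e: print link-level headers (MAC addresses),
        # -i: capture on this one physical port only
        echo "tcpdump -n -e -i $port icmp"
    done
}

lagg_debug_cmds lagg0 re1 re0
```

Running the printed tcpdump commands in separate terminals while pinging from another host should show whether echo requests and replies use the same port.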


----------



## Beeblebrox (Mar 25, 2012)

Any input re how to interpret this tcpdump info?


----------



## RusDyr (Apr 3, 2012)

I would recommend using CARP with net.inet.carp.arpbalance enabled, if your hosts and server are in the same broadcast domain.


----------



## Beeblebrox (Apr 4, 2012)

That won't work in this setup in my opinion because:
carp provides load-balancing or failover between two (or more) hosts.
lagg provides load-balancing or failover between two (or more) NICs on one host.
(Thanks to Peter Jeremy)


----------



## RusDyr (Apr 5, 2012)

Oh, excuse me, you're right.


----------



## nickrad (Apr 5, 2012)

I haven't tried with one, but won't your switch need to know what LACP/lagg is to do this? Chances are your unmanaged switch has no clue what is being sent to it, and the lagg never sets up so it freezes.

I'm not a LACP/lagg guru or anything, these are just my thoughts.


----------



## Beeblebrox (Apr 6, 2012)

Yes, that is correct. From Peter Jeremy:


> FEC & LACP require support at both ends and so won't work.  loadbalance & roundrobin should work but aren't really appropriate unless your interfaces are identical.


Which means that the only choice left is failover, as the NICs are not even remotely similar (Gbit vs 10/100).
Unfortunately, failover is not working either: the script at the end of this page is unable to flip the interface and fails with an error:

```
ifconfig: interface re0 cannot change link addresses!
```


----------



## RusDyr (Apr 6, 2012)

So my recommendation is not to reinvent the wheel, and instead to buy a new switch with gigabit ports and/or LACP support: D-Link, for example, or even MikroTik.


----------

