# MPD crashes FreeBSD 7.2 and 8.1 after update to version 5.6 and patch



## Egor (May 22, 2012)

Hello everybody!

I have a few servers running FreeBSD 7.2 or 8.1 with MPD 5.5 for PPPoE connections. I updated MPD to version 5.6 (I use ports, plus this patch and a patch to support the CoA RAD_CLASS attribute):

```
--- ../mpd-5.6/src/radsrv.c     2011-12-21 23:58:49.000000000 +0900
+++ ./src/radsrv.c      2012-04-02 19:02:26.106800017 +0900
@@ -94,6 +94,7 @@
     Bund       B;
     Link       L;
     char        *tmpval;
+       u_char  *rad_class = NULL;
     char       *username = NULL, *called = NULL, *calling = NULL, *sesid = NULL;
     char       *msesid = NULL, *link = NULL, *bundle = NULL, *iface = NULL;
     int                nasport = -1, serv_type = 0, ifindex = -1, i;
@@ -163,6 +164,13 @@
                Log(LG_RADIUS2, ("radsrv: Got RAD_USER_NAME: %s",
                    username));
                break;
+               case RAD_CLASS:
+               tmpval = Bin2Hex(data, len);
+               Log(LG_RADIUS2, ("radsrv: Got RAD_CLASS: %s",
+                       tmpval));
+               Freee(tmpval);
+               rad_class = Mdup(MB_AUTH, data, len);
+               break;
            case RAD_NAS_IP_ADDRESS:
                nas_ip = rad_cvt_addr(data);
                Log(LG_RADIUS2, ("radsrv: Got RAD_NAS_IP_ADDRESS: %s ",
@@ -509,6 +517,8 @@
                ACLCopy(acl_queue, &L->lcp.auth.params.acl_queue);
                ACLCopy(acl_table, &L->lcp.auth.params.acl_table);
 #endif /* USE_IPFW */
+               if (rad_class)
+                       L->lcp.auth.params.class=rad_class;
 #ifdef USE_NG_BPF
                for (i = 0; i < ACL_FILTERS; i++) {
                    ACLDestroy(L->lcp.auth.params.acl_filters[i]);
```

Since this update the servers panic and reboot periodically, about once a week. The reported cause varies, but usually it is:

```
kgdb /boot/kernel/kernel /var/crash/vmcore.2 
...
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer	= 0x20:0xc4de1d73
stack pointer	        = 0x28:0xc3f92670
frame pointer	        = 0x28:0xc3f926c0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 26 (em1 taskq)
trap number		= 18
...

(kgdb) list *0xc4de1d73
0xc4de1d73 is in bpf_filter (/usr/src/sys/modules/netgraph/bpf/../../../net/bpf_filter.c:461).
456			case BPF_ALU|BPF_MUL|BPF_K:
457				A *= pc->k;
458				continue;
459	
460			case BPF_ALU|BPF_DIV|BPF_K:
461				A /= pc->k;
462				continue;
463	
464			case BPF_ALU|BPF_AND|BPF_K:
465				A &= pc->k;
```

Sometimes there are other errors, but bpf_filter always appears in the output of the "where" command in gdb. All my kernels are built with these additional options:

```
options         IPFIREWALL
options         IPDIVERT
options         IPFIREWALL_FORWARD
options         NETGRAPH
options         NETGRAPH_IPFW
options         NETGRAPH_PPPOE
options         NETGRAPH_IFACE
options         DEVICE_POLLING
options         HZ=1000
```

I have also changed these sysctl variables:

```
net.inet.icmp.icmplim=800
net.inet.flowtable.enable=0
net.isr.direct=1
kern.random.sys.harvest.ethernet=0
kern.random.sys.harvest.point_to_point=0
kern.random.sys.harvest.interrupt=0
net.inet.ip.fastforwarding=1
vm.pmap.shpgperproc=2048
net.isr.maxthreads=2
net.isr.bindthreads=1
```

There are about 200 users on every server, and pppoe-delay is set to 3 or 4 (see this patch).

What could be causing the kernel panic?


----------



## ecazamir (May 24, 2012)

Is this happening only with the patch applied? I see an 'Mdup' but no 'Freee' for rad_class. This may eat memory, especially in the long run.
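To illustrate the ownership concern, here is a minimal standalone sketch, not mpd code. `struct auth_params`, `set_class` and `clear_class` are hypothetical names, and plain `malloc`/`free` stand in for mpd's `Mdup`/`Freee`. The idea is only that the previously stored copy is released before the pointer is overwritten, and that there is one place that releases it when the session ends.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the part of mpd's auth params the patch touches. */
struct auth_params {
    unsigned char *class_attr;  /* owned copy of the RADIUS Class value, or NULL */
    size_t         class_len;
};

/*
 * Store a private copy of data in p, releasing any previously stored copy.
 * Returns 0 on success, -1 if the allocation fails (the old value is kept).
 */
int
set_class(struct auth_params *p, const unsigned char *data, size_t len)
{
    unsigned char *copy;

    assert(p != NULL);
    copy = malloc(len > 0 ? len : 1);
    if (copy == NULL)
        return -1;
    if (len > 0)
        memcpy(copy, data, len);

    free(p->class_attr);    /* without this, each update leaks the old copy */
    p->class_attr = copy;
    p->class_len = len;
    return 0;
}

/* Release the stored copy, for example when the session goes away. */
void
clear_class(struct auth_params *p)
{
    assert(p != NULL);
    free(p->class_attr);
    p->class_attr = NULL;
    p->class_len = 0;
}
```

In terms of the patch, this would mean freeing the old `L->lcp.auth.params.class` before assigning `rad_class`, and freeing `rad_class` on the paths where no link takes it. Whether mpd already frees `params.class` elsewhere is something to check in the source, I am not asserting it here.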


----------



## Egor (May 25, 2012)

ecazamir said:

> Is this happening only with the patch applied? I see an 'Mdup' but no 'Freee' for rad_class. This may eat memory, especially in the long run.



Thank you for the answer! I can't say right now whether it happens only with this patch, because it happens once a week or less often. I ran the *show mem* command in the MPD console; here is its output:

```
[] show mem
   Type                              Count      Total
   ----                              -----      -----
   AUTH                              32636     872735
   BUND                                150    1103680
   CMD                                   4         46
   CMDL                                528       9387
   COMP                                  1         36
   CONSOLE                               4       9936
   CONSOLE.buckets                       1        124
   CONSOLE.gent                          1          8
   CRYPT                                 1         28
   EVENT                               992      63836
   IFACE                              2524      93856
   LINK                                595    1252528
   PHYS                                689     228242
   PHYS.buckets                          3        372
   PHYS.gent                             1          8
   RADIUS                                4         74
   RADSRV                                3         45
   REP                                   1         16
   WEB                                   3        272
   http_server                           1         60
   http_server.server_name               1         15
   http_server.vhosts                    1        168
   http_server.vhosts.buck               1        124
   http_server.vhosts.gent               1          8
   http_servlet_hook                     1         40
   http_virthost                         1         12
   http_virthost.host                    1          1
   paction                               2         64
   typed_mem_stats                       1        928
                                     -----      -----
   Totals                            38152    3636649
```
There are 150 users on this server right now. I will watch whether the AUTH numbers keep increasing.


----------



## Egor (May 28, 2012)

ecazamir said:

> Is this happening only with the patch applied? I see an 'Mdup' but no 'Freee' for rad_class. This may eat memory, especially in the long run.



There was a memory leak! But it doesn't look like the cause of the kernel panic, because there is more than 1.5G of free RAM and an empty 2G swap right before the crash.

kgdb shows that call_trap with the exception is always reached from bpf_filter.c, and the problem is always with the pc->k variable (for example, pc->k is 0 while pc->code says to divide A by pc->k). How can I check where pc->k gets its value from and why it is wrong?
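One way to narrow this down is to check the filter program itself before it reaches the kernel, since `A /= pc->k` with `pc->k == 0` is exactly what raises the divide fault. Below is a minimal userspace sketch that scans a classic BPF program for a divide by a constant zero. The `x_`/`X_` names are my own. The constants mirror the classic BPF encoding so the sketch builds without `<net/bpf.h>`. It assumes you can obtain the instructions, for example by logging what mpd sends to the ng_bpf node or by dumping the program from the node if your ng_bpf version supports that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the classic BPF instruction encoding. */
#define X_BPF_CLASS(code) ((code) & 0x07)
#define X_BPF_OP(code)    ((code) & 0xf0)
#define X_BPF_SRC(code)   ((code) & 0x08)
#define X_BPF_ALU 0x04
#define X_BPF_DIV 0x30
#define X_BPF_K   0x00

/* Same layout as a classic BPF instruction: opcode, jump offsets, operand. */
struct x_insn {
    uint16_t code;
    uint8_t  jt;
    uint8_t  jf;
    uint32_t k;
};

/*
 * Return the index of the first "divide accumulator by constant 0"
 * instruction in prog, or -1 if there is none.
 */
int
x_find_div_by_zero(const struct x_insn *prog, size_t len)
{
    size_t i;

    assert(prog != NULL || len == 0);
    for (i = 0; i < len; i++) {
        uint16_t c = prog[i].code;

        if (X_BPF_CLASS(c) == X_BPF_ALU &&
            X_BPF_OP(c) == X_BPF_DIV &&
            X_BPF_SRC(c) == X_BPF_K &&
            prog[i].k == 0)
            return (int)i;
    }
    return -1;
}
```

If a program that mpd generates trips this check, the bug is in how the program is built. If the program is clean in userland but `pc->k` is 0 in the crash dump, that points more toward the program memory being overwritten or freed while the filter is still running.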


----------



## ecazamir (May 28, 2012)

I can't help you debug further. Still, it would be good to know whether stock net/mpd5 crashes too, or only the patched version.


----------

