# kernel panic after upgrade from 9.2 to 10.3 release



## DM07 (Jan 4, 2017)

Hello everyone.
I encountered a problem after upgrading a server from FreeBSD 9.2-RELEASE to 10.3-RELEASE-p15. A kernel panic occurs when network traffic increases. For example, I use rsync to pull backups from the server to a local machine. After a while, I get a panic that reports:

```
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x59
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80dae53c
stack pointer           = 0x28:0xfffffe04e7117570
frame pointer           = 0x28:0xfffffe04e7117580
code segment            = base 0x0, limit 0xfffff, type 0x1b
                   = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq256: re0)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809dc210 at kdb_backtrace+0x60
#1 0xffffffff8099eee6 at vpanic+0x126
#2 0xffffffff8099edb3 at panic+0x43
#3 0xffffffff80db091b at trap_fatal+0x36b
#4 0xffffffff80db0c1d at trap_pfault+0x2ed
#5 0xffffffff80db029a at trap+0x47a
#6 0xffffffff80d96262 at calltrap+0x8
#7 0xffffffff80369771 at ipf_frag_lookup+0x111
#8 0xffffffff803699e4 at ipf_frag_known+0x54
#9 0xffffffff8035ad3d at ipf_check+0x2fd
#10 0xffffffff80a72dd4 at pfil_run_hooks+0x84
#11 0xffffffff80adf3ae at ip_input+0x2fe
#12 0xffffffff80a71f12 at netisr_dispatch_src+0x62
#13 0xffffffff80a692d6 at ether_demux+0x126
#14 0xffffffff80a69f7e at ether_nh_input+0x35e
#15 0xffffffff80a71f12 at netisr_dispatch_src+0x62
#16 0xffffffff807144ee at re_rxeof+0x4ce
#17 0xffffffff8071572b at re_intr_msi+0x10b
```

More detailed trace:

```
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
#1  0xffffffff8099eb42 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#2  0xffffffff8099ef25 in vpanic (fmt=<value optimized out>, ap=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:889
#3  0xffffffff8099edb3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:818
#4  0xffffffff80db091b in trap_fatal (frame=<value optimized out>, eva=<value optimized out>)
at /usr/src/sys/amd64/amd64/trap.c:858
#5  0xffffffff80db0c1d in trap_pfault (frame=0xfffffe04e71174c0, usermode=<value optimized out>)
at /usr/src/sys/amd64/amd64/trap.c:681
#6  0xffffffff80db029a in trap (frame=0xfffffe04e71174c0) at /usr/src/sys/amd64/amd64/trap.c:447
#7  0xffffffff80d96262 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#8  0xffffffff80dae53c in bcmp () at /usr/src/sys/amd64/amd64/support.S:87
#9  0xffffffff80369771 in ipf_frag_lookup () at /usr/src/sys/contrib/ipfilter/netinet/ip_frag.c:697
#10 0xffffffff803699e4 in ipf_frag_known (fin=0xfffffe04e71176d8, passp=0xfffffe04e71176d4)
at /usr/src/sys/contrib/ipfilter/netinet/ip_frag.c:895
#11 0xffffffff8035ad3d in ipf_check (ctx=0xffffffff81e85828, ip=<value optimized out>,
hlen=<value optimized out>, ifp=<value optimized out>, out=0, mp=0xfffffe04e7117838)
at /usr/src/sys/contrib/ipfilter/netinet/fil.c:3025
#12 0xffffffff80a72dd4 in pfil_run_hooks (ph=0xffffffff81e9f918, mp=0xfffffe04e71178c0, ifp=0xfffff80005a36000,
dir=1, inp=0x0) at /usr/src/sys/net/pfil.c:82
#13 0xffffffff80adf3ae in ip_input (m=0xfffff8006c59a500) at /usr/src/sys/netinet/ip_input.c:488
#14 0xffffffff80a71f12 in netisr_dispatch_src (proto=<value optimized out>, source=<value optimized out>, m=0x1)
at /usr/src/sys/net/netisr.c:976
#15 0xffffffff80a692d6 in ether_demux (ifp=<value optimized out>, m=0xfffff8006c59a500)
at /usr/src/sys/net/if_ethersubr.c:851
#16 0xffffffff80a69f7e in ether_nh_input (m=<value optimized out>) at /usr/src/sys/net/if_ethersubr.c:646
#17 0xffffffff80a71f12 in netisr_dispatch_src (proto=<value optimized out>, source=<value optimized out>, m=0x1)
at /usr/src/sys/net/netisr.c:976
#18 0xffffffff807144ee in re_rxeof (sc=0xfffffe0000b98000, rx_npktsp=0x0) at /usr/src/sys/dev/re/if_re.c:2369
#19 0xffffffff8071572b in re_intr_msi (xsc=0xfffffe0000b98000) at /usr/src/sys/dev/re/if_re.c:2665
#20 0xffffffff80969b2b in intr_event_execute_handlers (p=<value optimized out>, ie=0xfffff80005a64e00)
at /usr/src/sys/kern/kern_intr.c:1264
#21 0xffffffff80969f76 in ithread_loop (arg=0xfffff80005a072a0) at /usr/src/sys/kern/kern_intr.c:1277
#22 0xffffffff8096767a in fork_exit (callout=0xffffffff80969ee0 <ithread_loop>, arg=0xfffff80005a072a0,
frame=0xfffffe04e7117c00) at /usr/src/sys/kern/kern_fork.c:1027
#23 0xffffffff80d9679e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#24 0x0000000000000000 in ?? ()
```

I suppose the problem is in my network card driver:

```
re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xe800-0xe8ff mem 0xfbeff000-0xfbefffff,0xf6ff0000-0xf6ffffff irq 16 at device 0.0 on pci6
```

or somewhere in the network stack. Should I try rebuilding world from 10-STABLE? Or should I try disabling all the network-related sysctl.conf settings, such as

```
net.inet.ip.rtminexpire=2
net.inet.ip.rtmaxcache=1024
net.inet.tcp.sack.enable=1
net.inet.tcp.finwait2_timeout=10000
net.inet.ip.intr_queue_maxlen=4096
kern.ipc.somaxconn=32768
net.inet.tcp.maxtcptw=200000
```

Thanks for any answers!


----------



## SirDice (Jan 4, 2017)

It looks like it's IPFilter that crashes the machine. How did you do the upgrade?


----------



## ASX (Jan 4, 2017)

Looks closely related to PR 212872


----------



## DM07 (Jan 4, 2017)

SirDice said:


> It looks like it's IPFilter that crashes the machine. How did you do the upgrade?


I checked out the source from the releng/10 branch with svn, then ran make world / make buildkernel / make installkernel / make installworld -> reboot -> recompiled all ports.
I built a custom kernel with these options:

```
# BDS 20121212
options         IPFILTER                # support IPFILTER
options         IPFILTER_LOG
options         IPFILTER_DEFAULT_BLOCK

# BDS 20121213
options         SC_HISTORY_SIZE=8192    # allow scrolling far back through console history
options         ACCEPT_FILTER_DATA        # accept filters for nginx
options         ACCEPT_FILTER_HTTP        # ...
options         HZ=1000
options         DEVICE_POLLING
```


----------



## SirDice (Jan 4, 2017)

ASX said:


> Looks closely related to PR 212872


That looks very similar indeed.


----------



## DM07 (Jan 4, 2017)

ASX said:


> Looks closely related to PR 212872


in the above post: "...In my case, the error caused by garbage traffic IpV6"
In my case, when I run rsync on the local machine to download backup files from the server, I get a kernel panic after about 60 seconds. There is no IPv6 traffic.


----------



## SirDice (Jan 4, 2017)

Is it possible to disable IPFilter temporarily? Only to rule it out as a possible cause.


----------



## ASX (Jan 4, 2017)

DM07 said:


> in the above post: "...In my case, the error caused by garbage traffic IpV6"


I would read that as "In my case, the error is *triggered* by garbage IPv6 traffic" ... just like yours seems to be triggered by rsync traffic ...

Note also that while the bug report was filed for a system running ipfw.ko, the second poster was instead running pf.ko, and they are both using an "igp" device. Either those are separate issues or the problem is a bit deeper.

Both your case and the initial bug report hit the error in:

```
ipf_frag_lookup () at /usr/src/sys/contrib/ipfilter/netinet/ip_frag.c:697
```
in a function that does:

```
/* Check the fragment cache to see if there is already a record of this     */
/* packet with its filter result known.                                     */
```

As I read it, something is going wrong while trying to access that "cache", and I suspect that may happen regardless of the specific filtering in use, ...

I would suggest adding your info to that PR.
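
To make the "cache lookup" idea concrete, here is a toy userspace sketch of a hashed cache whose entries are matched by a byte-wise key comparison, which is the general shape the backtrace suggests (`ipf_frag_lookup` calling `bcmp`). It is not the IPFilter source: every `toy_*` name, the struct layout, and the hash are hypothetical, and `memcmp` stands in for the kernel's `bcmp`. The last helper illustrates one possible reading of the very low fault address (`0x59`): if the entry pointer being compared were NULL or otherwise stale, the comparison would read from a small address equal to the offset of the key inside the entry. That interpretation is an assumption, not something the trace proves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 8

/* Hypothetical lookup key; 12 bytes with no implicit padding, so a
 * byte-wise comparison of two keys is well defined. */
struct toy_key {
    uint32_t src;
    uint32_t dst;
    uint16_t id;
    uint8_t  proto;
    uint8_t  pad;
};

/* Hypothetical cache entry: a chain link, a cached verdict, and the key. */
struct toy_frag {
    struct toy_frag *next;
    int              pass;
    struct toy_key   key;
};

struct toy_cache {
    struct toy_frag *bucket[TOY_BUCKETS];
};

static unsigned toy_hash(const struct toy_key *k)
{
    return (k->src ^ k->dst ^ k->id ^ k->proto) % TOY_BUCKETS;
}

/* Push an entry onto the front of its hash chain. */
static void toy_insert(struct toy_cache *c, struct toy_frag *f)
{
    assert(c != NULL && f != NULL);
    unsigned h = toy_hash(&f->key);
    f->next = c->bucket[h];
    c->bucket[h] = f;
}

/* Walk the chain and compare keys byte by byte. If any pointer on the
 * chain were stale, the memcmp below is where the bad read would occur. */
static struct toy_frag *toy_lookup(const struct toy_cache *c,
                                   const struct toy_key *k)
{
    for (struct toy_frag *f = c->bucket[toy_hash(k)]; f != NULL; f = f->next) {
        if (memcmp(&f->key, k, sizeof(*k)) == 0)
            return f;
    }
    return NULL;
}

/* Address the comparison would start reading from if the entry pointer
 * were NULL: simply the offset of the key within the entry. */
static size_t toy_fault_addr_if_null(void)
{
    return offsetof(struct toy_frag, key);
}
```

In this sketch a correct lookup never touches a bad pointer, so a fault inside the comparison would point at the chain itself being corrupted or modified underneath the reader, rather than at the filter rules.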


----------

