# 64GB system not saving crash dump - how large should swap be?



## rowan194 (Aug 18, 2020)

One of my servers crashed shortly after logging a read error on an NVMe drive used for ZFS L2 ARC. I presume the read error caused a panic, although I don't quite understand why that would happen (L2 ARC is expendable data, so ZFS should just fault the device and continue?)

I have *dumpdev="AUTO"* set in rc.conf but there are no vmcore files in /var/crash.

I thought at first it was because it wanted to dump the entire 64GB physical RAM to swap (which is only a 2GB partition), but the FreeBSD handbook says Minidumps ... "Hold only memory pages _in use by the kernel_ (FreeBSD 6.2 and higher)" ... "Minidumps are the default dump type as of FreeBSD 7.0"

I'm having some trouble determining how much memory "in use by the kernel" might be under amd64. What should I increase the swap size to in order to dump reliably?

FreeBSD 12.1-RELEASE r354358 (amd64)

Thanks.

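A quick way to see which pieces are in place, sketched under the assumption of a stock FreeBSD 12.x install where `dumpdir` defaults to /var/crash. The function name `show_dump_config` is made up for illustration, and the commands only read state:

```shell
#!/bin/sh
# Sketch: list the settings that have to line up for a crash dump to be saved.
# Read-only; intended to be run as root on the FreeBSD host.

show_dump_config() {
    sysrc dumpdev dumpdir     # rc.conf settings for the dump device and save directory
    dumpon -l                 # device the kernel is currently configured to dump to
    swapinfo -h               # sizes of the active swap devices
    sysctl -n hw.physmem      # physical RAM in bytes
    ls -l /var/crash          # where savecore(8) writes vmcore.N and info.N
}

# Only call it where the FreeBSD tools exist.
if command -v dumpon >/dev/null 2>&1; then
    show_dump_config
fi
```

If `dumpon -l` does not name the swap partition, the kernel had no dump device configured and no dump would have been attempted at all.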

----------



## Mjölnir (Aug 18, 2020)

`sysctl -d vm.kmem_size_max`
_vm.kmem_size_max: Maximum size of kernel memory_ (1.25 TB on my 12 GB RAM laptop)
Remember that buffers & caches add most to kernel memory in use, but I don't know if these are included in a minidump.  If you don't want to repartition (dump device>=physical RAM), consider enabling a textdump(4) (sysctl knob `debug.ddb.textdump.pending`) and setting `dumpon_flags` (check the current value with `sysrc -v dumpon_flags`) to include compression.
EDIT: I have a small swap/dump device of 4 GB on this machine (plus an additional 12 GB of swap on a ZVOL), and one minidump has succeeded since I've had this machine.  It was ~1/2 GB compressed.  This is not a server use case, though.

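As an illustration of the compression suggestion, a minimal rc.conf fragment might look like the following. It assumes the running kernel was built with gzip dump support (`options GZIO`); if it was not, dumpon(8) should be expected to reject the flag.

```shell
# /etc/rc.conf fragment (sketch)
dumpdev="AUTO"        # use the first suitable swap device as the dump device
dumpon_flags="-z"     # extra flags passed to dumpon(8); -z requests gzip compression
```

The same setting can be written with `sysrc dumpon_flags="-z"`. dumpon(8) also documents `-Z` for zstd, which likewise depends on kernel support.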

----------



## rowan194 (Aug 18, 2020)

If minidumps include caches, then on any busy server that's going to be almost the same as dumping all memory. According to `sysctl vm.kmem_map_size`, I'm already up to ~39GB, and that will climb as the server warms up the cache (L2ARC already has 27GB of data after 3 hours uptime). To further complicate things, in a few days I'll be fitting larger sticks, which will increase the total physical memory to 128GB.

I have plenty of disk space so I can certainly allocate 8GB or 16GB to swap (by doing some repartitioning), but I'd rather be more certain about a size estimate...

From a read of the man page, it looks like textdump requires a custom kernel?

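The sizing comparison discussed above can be sketched as simple arithmetic. This assumes, pessimistically, that the whole of `vm.kmem_map_size` ends up in the minidump, which the thread does not establish. The 50% headroom and the function names are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: compare an estimated dump size with the dump device size.
# All sizes are in bytes; the 50% headroom is an illustrative margin, not a rule.

GIB=$((1024 * 1024 * 1024))

# Round a byte count up to whole GiB.
bytes_to_gib() {
    echo $(( ($1 + GIB - 1) / GIB ))
}

# Print "ok" if the device can hold the estimate plus headroom, else "too small".
dump_fits() {
    estimate=$1
    device=$2
    needed=$(( estimate + estimate / 2 ))
    if [ "$device" -ge "$needed" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}

# Figures from this thread: ~39 GiB of kernel memory, 2 GiB swap partition.
dump_fits $((39 * GIB)) $((2 * GIB))
```

On a live system the first argument could come from `sysctl -n vm.kmem_map_size`. With the figures above the result is "too small", and under the same pessimistic assumption an 8GB or 16GB partition would not be enough either.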

----------



## rootbert (Aug 18, 2020)

is NETDUMP(4) an option?


----------



## Mjölnir (Aug 18, 2020)

Consider asking on the mailing list <freebsd-current@freebsd.org>, and also ask whether it would be reasonable to add a sysctl(8) knob to exclude buffers & caches from a dump (at least L2 ARC & other expendable data).
My understanding is that setting `debug.ddb.textdump.pending="1"` in loader.conf(5) enables textdump(4)s.  But ultimately, this is just my interpretation.  Better ask the wizards.


----------



## Mjölnir (Aug 18, 2020)

`sysctl -da | grep dump`
_[...]
kern.coredump_pack_vmmapinfo: Enable file path packing in 'procstat -v' coredump notes
kern.coredump_pack_fileinfo: Enable file path packing in 'procstat -f' coredump notes
vfs.zfs.zio.exclude_metadata: Exclude metadata buffers from dumps as well
[...]_

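The wording of the `vfs.zfs.zio.exclude_metadata` description ("as well") suggests that ZFS data buffers may already be left out of dumps, with this knob extending that to metadata. That reading is an inference from the description string and is not confirmed anywhere in this thread. A minimal sketch for inspecting the knob on the affected host (the function name is hypothetical):

```shell
#!/bin/sh
# Sketch: inspect the ZFS dump-exclusion knob listed above (read-only).

show_zfs_dump_knob() {
    sysctl -d vfs.zfs.zio.exclude_metadata   # description string
    sysctl vfs.zfs.zio.exclude_metadata      # current value
}

# Only run where the OID exists (for example FreeBSD 12.x with ZFS loaded).
if sysctl -n vfs.zfs.zio.exclude_metadata >/dev/null 2>&1; then
    show_zfs_dump_knob
fi
```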

----------



## Crivens (Aug 18, 2020)

If I understand it all correctly, it would only dump what the TLB contains (at least that would make minidumps small enough). But I have not checked the code for several years.


----------



## rowan194 (Aug 19, 2020)

Does the dump process blindly write as much as possible and abort if it runs out of space, or does it pre-check?

Wondering whether the complete absence of any crash files means that it failed because of insufficient space (latter case), or for some reason (config etc) it didn't attempt it at all.

I'll be able to do some quick testing when I install the new RAM, but I'd rather go in prepared.  In the meantime, mysql is rebuilding a 1TB+ MyISAM database (don't ask...)


----------



## Mjölnir (Aug 19, 2020)

rowan194 said:


> Does the dump process blindly write as much as possible and abort if it runs out of space, or does it pre-check?


I did not RTSL for you, but I had `debug.ddb.textdump.pending="1"` in loader.conf(5) and produced a kernel dump with `sysctl debug.kdb.panic=1` as described in dumpon(8).  It did a normal minidump, not a textdump(4), and the knob is not visible, thus not in effect.  So to get a textdump(4), a custom kernel is required (`options TEXTDUMP_PREFERRED` & `TEXTDUMP_VERBOSE`).  To be safe, size the dump device as large as physical RAM.  Having some swap, even on a machine with very large RAM that will likely never swap, is a safety net anyway. EDIT: I wouldn't recommend restricting `vm.kmem_size_max` to the size of the dump device, which is relatively small in your case.

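The test described above can be sketched as three steps. The function names and the `dump_device` argument are placeholders for illustration, and `panic_now` really does crash the machine, so it belongs in a maintenance window only.

```shell
#!/bin/sh
# Sketch of a deliberate crash-dump test.
# DANGER: panic_now crashes the machine immediately.

# Before the test: confirm that a dump device is configured.
pre_check() {
    dumpon -l
}

# Trigger the panic (the same knob used in the post above).
panic_now() {
    sysctl debug.kdb.panic=1
}

# After the reboot: look for a dump.
# $1 is a placeholder for the dump device path, e.g. the swap partition.
post_check() {
    dump_device=$1
    savecore -C "$dump_device"    # reports whether the device holds an unsaved dump
    ls -l /var/crash              # vmcore.N and info.N appear here once savecore has run
}
```

Nothing is invoked automatically; each function is meant to be called by hand. If the boot-time savecore(8) run has already copied the dump into /var/crash, `savecore -C` will report that no dump is present, so the directory listing is the more telling check.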

----------

