# HELP NEEDED: kernel: swap_pager: indefinite wait buffer



## peterpakos (Oct 7, 2018)

I'm running FreeBSD 11.2-RELEASE-p4 as a storage server (iSCSI) for 2 VMware hosts.

Specs:
- Supermicro X11SSH-LN4F
- Xeon E3-1220 v6
- 64GB DDR4 ECC
- 8 x 3TB HDDs + 240GB SSD
- 2 x IBM M1015 HBAs
- Root on ZFS


```
[root@stor01 ~]# zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 0 in 5h39m with 0 errors on Tue Oct  2 16:48:14 2018
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da2p3   ONLINE       0     0     0
            da7p3   ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            da0p3   ONLINE       0     0     0
            da5p3   ONLINE       0     0     0
          mirror-2  ONLINE       0     0     0
            da1p3   ONLINE       0     0     0
            da6p3   ONLINE       0     0     0
          mirror-3  ONLINE       0     0     0
            da3p3   ONLINE       0     0     0
            da4p3   ONLINE       0     0     0
        cache
          da8       ONLINE       0     0     0

errors: No known data errors
[root@stor01 ~]# gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  da0p2 (ACTIVE)
                       da1p2 (ACTIVE)
                       da2p2 (ACTIVE)
                       da3p2 (ACTIVE)
                       da4p2 (ACTIVE)
                       da5p2 (ACTIVE)
                       da6p2 (ACTIVE)
                       da7p2 (ACTIVE)
```

This morning the host became unresponsive and had to be hard rebooted.

This is what I found in /var/log/messages:

```
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 38628, size: 32768
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 282715, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37724, size: 24576
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37423, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 38536, size: 12288
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 12136, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 136691, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 101, size: 20480
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 38544, size: 8192
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 38578, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 40832, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 17781, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 334391, size: 4096
Oct  7 07:07:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 16786, size: 4096
Oct  7 07:07:29 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 17763, size: 12288
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 136691, size: 4096
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 38837, size: 32768
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 16786, size: 4096
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 36574, size: 4096
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37724, size: 24576
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37423, size: 4096
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37462, size: 24576
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 40832, size: 4096
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 38827, size: 40960
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 334391, size: 4096
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37949, size: 8192
Oct  7 07:07:48 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 37911, size: 4096
Oct  7 07:08:38 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:08:43 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:09:21 stor01 kernel: WARNING: 10.69.102.2 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:09:30 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:10:12 stor01 last message repeated 2 times
Oct  7 07:10:12 stor01 last message repeated 2 times
Oct  7 07:10:12 stor01 kernel: WARNING: 10.69.102.2 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:10:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 51635, size: 4096
Oct  7 07:10:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 53877, size: 4096
Oct  7 07:10:12 stor01 kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 53728, size: 8192
Oct  7 07:10:12 stor01 kernel: pid 917 (telegraf), uid 0, was killed: out of swap space
Oct  7 07:10:28 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:10:49 stor01 last message repeated 2 times
Oct  7 07:11:00 stor01 kernel: WARNING: 10.69.102.2 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:11:00 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:11:00 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:11:00 stor01 ctld[28691]: child process 41534 terminated with signal 13
Oct  7 07:11:01 stor01 ctld[28691]: child process 41535 terminated with signal 13
Oct  7 07:11:01 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): connection error; dropping connection
Oct  7 07:11:08 stor01 kernel: WARNING: 10.69.102.2 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:11:55 stor01 kernel: WARNING: 10.69.102.2 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:11:55 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
Oct  7 07:12:12 stor01 kernel: WARNING: 10.69.101.11 (iqn.1998-01.com.vmware:esxi01-5c2577bd): no ping reply (NOP-Out) after 5 seconds; dropping connection
```

The logs say "kernel: pid 917 (telegraf), uid 0, was killed: out of swap space" but according to monitoring graphs, the swap space was barely used at the time:






Any idea what could have caused this?


----------



## trev (Oct 8, 2018)

https://www.freebsd.org/doc/en/books/faq/troubleshoot.html#idp59131080

> What does the error swap_pager: indefinite wait buffer: mean?
>
> This means that a process is trying to page memory to disk, and the page attempt has hung trying to access the disk for more than 20 seconds. It might be caused by bad blocks on the disk drive, disk wiring, cables, or any other disk I/O-related hardware. If the drive itself is bad, disk errors will appear in /var/log/messages and in the output of dmesg. Otherwise, check the cables and connections.


----------



## peterpakos (Oct 8, 2018)

Same thing happened this morning:





Sadly the remote console is not responding to keystrokes and SSH is not responding at all. Interestingly, ctld is running OK and iSCSI is still serving datastores to my ESXi hosts.

How is this even possible considering the swap space is spread across the same drives, connected with the same cables, HBAs etc.?

BTW, I've already swapped the mini SAS cables between the backplane and the HBA; I'm going to swap the HBA next, and if there's no joy, the PSU.


----------



## peterpakos (Oct 9, 2018)

PSU replaced over 24 hours ago, so far so good...

Fingers crossed!


----------



## peterpakos (Oct 18, 2018)

Sadly, the problem returned after 9 days of normal operation:





Is it a good practice to have swap mirrored across 8 drives?


```
[root@stor01 ~]# gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  da0p2 (ACTIVE)
                       da1p2 (ACTIVE)
                       da2p2 (ACTIVE)
                       da3p2 (ACTIVE)
                       da4p2 (ACTIVE)
                       da5p2 (ACTIVE)
                       da6p2 (ACTIVE)
                       da7p2 (ACTIVE)
```

If not, what else would you recommend here?

I'm wondering if gmirror gets into a weird state or something; it's interesting that it works fine after a hard reboot. Surely that rules out any problems with physical connections etc., no?

I have already replaced the PSU and the mini SAS cables between the HBA controller and the SAS backplane.

A short SMART test shows no issues with the drives; I've now initiated long tests on all drives.

Any advice on what else I can do to diagnose this further?

TIA


----------



## kpa (Oct 18, 2018)

How much swap is that in total? You would probably do fine with a maximum of 16 GB of swap; if your system ends up needing that much, you have other, more serious problems, and the amount of swap is not going to do a thing to help the situation.


----------



## peterpakos (Oct 18, 2018)

A 2 GB partition on each drive in an 8-way mirror, totalling 2 GB of swap space which, according to the attached graphs, is not being utilised much (bottom line):





You can tell from the above graph when the server crashed, as snmpd wasn't running and no stats were collected.

Interestingly, "Shared real memory", "Physical memory" and "Real memory" climbed up to 100% just before the crash.

Perhaps there is something wrong with my swap setup and the system cannot use it for some reason?


```
[root@stor01 ~]# swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/mirror/swap   2097148        0  2097148     0%
[root@stor01 ~]# cat /etc/fstab
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/mirror/swap                none    swap    sw              0       0
[root@stor01 ~]# gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  da0p2 (ACTIVE)
                       da1p2 (ACTIVE)
                       da2p2 (ACTIVE)
                       da3p2 (ACTIVE)
                       da4p2 (ACTIVE)
                       da5p2 (ACTIVE)
                       da6p2 (ACTIVE)
                       da7p2 (ACTIVE)
```


----------



## peterpakos (Oct 19, 2018)

Long SMART tests have now completed, showing no errors.

I have just disabled powerd, just in case.

Below is system memory information:


```
[root@stor01 ~]# freecolor -mt
Physical  : [...................................] 0%    (569/63643)
Swap      : [##################################.] 98%   (2024/2047)
Total     : [#%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%] (65691=2594+63097)

[root@stor01 ~]# sysctl hw | egrep 'hw.(phys|user|real)'
hw.physmem: 68493099008
hw.usermem: 2371002368
hw.realmem: 68719476736

SYSTEM MEMORY INFORMATION:
mem_wire:       66122309632 (  63059MB) [ 99%] Wired: disabled for paging out
mem_active:  +      8253440 (      7MB) [  0%] Active: recently referenced
mem_inactive:+      3682304 (      3MB) [  0%] Inactive: recently not referenced
mem_cache:   +            0 (      0MB) [  0%] Cached: almost avail. for allocation
mem_free:    +    601030656 (    573MB) [  0%] Free: fully available for allocation
mem_gap_vm:  +      -217088 (      0MB) [  0%] Memory gap: UNKNOWN
-------------- ------------ ----------- ------
mem_all:     =  66735058944 (  63643MB) [100%] Total real memory managed
mem_gap_sys: +   1758040064 (   1676MB)        Memory gap: Kernel?!
-------------- ------------ -----------
mem_phys:    =  68493099008 (  65320MB)        Total real memory available
mem_gap_hw:  +    226377728 (    215MB)        Memory gap: Segment Mappings?!
-------------- ------------ -----------
mem_hw:      =  68719476736 (  65536MB)        Total real memory installed

SYSTEM MEMORY SUMMARY:
mem_used:       68114763776 (  64959MB) [ 99%] Logically used memory
mem_avail:   +    604712960 (    576MB) [  0%] Logically available memory
-------------- ------------ ----------- ------
mem_total:   =  68719476736 (  65536MB) [100%] Logically total memory
```

I'm guessing the majority of RAM has been "consumed" by the ZFS cache:

```
[root@stor01 ~]# sysctl -a | grep vfs.zfs.arc_
vfs.zfs.arc_meta_limit: 16415329280
vfs.zfs.arc_free_target: 113014
vfs.zfs.arc_grow_retry: 60
vfs.zfs.arc_shrink_shift: 7
vfs.zfs.arc_average_blocksize: 8192
vfs.zfs.arc_no_grow_shift: 5
vfs.zfs.arc_min: 8207664640
vfs.zfs.arc_max: 65661317120
```

Is it possible that the system is not releasing ARC memory quickly enough and starts killing other processes?

Considering the system only acts as iSCSI server (ctld), is it worth tweaking any memory/ZFS tunables?


----------



## xtaz (Oct 19, 2018)

ZFS is supposed to consume up to 75% of the available free memory and then release it in preference to anything else as soon as additional memory is needed by the system. But I have never seen this work in practice on any of my systems that run ZFS; I always get the same problems where swap is used up to 100% and you get errors.

So for me setting those sysctls to limit the arc size is a must. I usually limit it to 50% of the memory rather than 75% and that works better for me.
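
For a 64 GB box like the OP's, a 50% cap would mean something like this in /boot/loader.conf (the value is 32 GiB in bytes; a sketch, adjust to taste):

```
# /boot/loader.conf
# Cap the ZFS ARC at 32 GiB (50% of 64 GiB RAM)
vfs.zfs.arc_max="34359738368"
```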


----------



## SirDice (Oct 19, 2018)

xtaz said:


> ZFS is supposed to consume up to 75% of the available free memory


Actually, by default it uses the total (RAM) memory minus 1GB. Which, if you have 4GB, would indeed be 75%. But if you have 96GB it would use 98% (95GB).
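
That default rule (all RAM minus 1 GiB, or half of RAM, whichever is larger) can be sanity-checked with a little shell arithmetic. The 64 GiB figure below matches the OP's box; the value the kernel actually reports will differ slightly depending on what it subtracts first:

```
# Hedged sketch of the default arc_max rule:
# all RAM minus 1 GiB, or half of RAM, whichever is larger.
gib=1073741824
ram=$((64 * gib))            # 64 GiB machine, like the OP's
half=$((ram / 2))
minus_one=$((ram - gib))
if [ "$minus_one" -gt "$half" ]; then
    arc_max=$minus_one
else
    arc_max=$half
fi
echo "default arc_max: $arc_max bytes"   # 67645734912 (~63 GiB)
```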


----------



## Phishfry (Oct 19, 2018)

To me the gmirror swap arrangement seems awkward. You are using ZFS on root but GEOM for swap.
Have you thought about throwing a whole drive at swap?
For instance, throw in a small SSD as a swap drive for testing and eliminate partitions/gmirror as a culprit.
There was a mailing list post about swap interleaving that makes me think that multi-drive swap is bad.
Note this comment in the code (line 493 of swap_pager.c):
"Also be aware that swap ops are constrained by the swap device interleave stripe size."


----------



## peterpakos (Oct 19, 2018)

xtaz, do you happen to know which setting limits it to 75%? What tunables do you put in /boot/loader.conf and/or /etc/sysctl.conf?

According to official docs (https://www.freebsd.org/doc/handbook/zfs-advanced.html), the max size of ARC (vfs.zfs.arc_max) by default is set to all RAM less 1 GB or half of RAM, whichever is more. In my case it's 65661317120 bytes.


----------



## peterpakos (Oct 19, 2018)

Phishfry, this sort of mirrored swap arrangement is offered by the installer. I had 4 drives in the gmirror swap running absolutely fine until last month, when I built this server, moved the drives over and added 4 new ones (I was previously using an HP Microserver Gen8).

It did cross my mind to get rid of gmirror and stick to a single swap partition/drive. I'm assuming that if said swap drive died, the system could potentially crash, right?


----------



## peterpakos (Oct 19, 2018)

Right, I got rid of the gmirror swap and instead added the same 8 partitions as separate swap devices:


```
[root@stor01 ~]# swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/da0p2        2097152        0  2097152     0%
/dev/da1p2        2097152        0  2097152     0%
/dev/da2p2        2097152        0  2097152     0%
/dev/da3p2        2097152        0  2097152     0%
/dev/da4p2        2097152        0  2097152     0%
/dev/da5p2        2097152        0  2097152     0%
/dev/da6p2        2097152        0  2097152     0%
/dev/da7p2        2097152        0  2097152     0%
Total            16777216        0 16777216     0%
[root@stor01 ~]# cat /etc/fstab
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/da0p2              none    swap    sw              0       0
/dev/da1p2              none    swap    sw              0       0
/dev/da2p2              none    swap    sw              0       0
/dev/da3p2              none    swap    sw              0       0
/dev/da4p2              none    swap    sw              0       0
/dev/da5p2              none    swap    sw              0       0
/dev/da6p2              none    swap    sw              0       0
/dev/da7p2              none    swap    sw              0       0
[root@stor01 ~]# freecolor -mt
Physical  : [###########........................] 34%   (21664/63643)
Swap      : [###################################] 100%  (16384/16384)
Total     : [################%%%%%%%%%%%%%%%%%%%] (80027=38048+41979)
```

Let's see how this goes...
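
For reference, the conversion itself was only a few steps (roughly; gmirror destroy assumes FreeBSD 11's gmirror(8)):

```
# Release the mirrored swap and tear down the gmirror,
# then activate the per-drive partitions from the new fstab.
swapoff /dev/mirror/swap
gmirror destroy swap
swapon -a
```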


----------



## xtaz (Oct 19, 2018)

Sorry guys, yes of course. When I said 75%, that's because I have 4GB in my laptop and 4GB - 1GB is 75%.

I believe the actual algorithm is 1GB less than total RAM, or 50% of total RAM, whichever is higher. But in my opinion leaving 1GB free isn't enough, and the system quite often ends up using swap space. When I've limited it to leave 2GB free I've never seen that problem.


----------



## Phishfry (Oct 19, 2018)

I am very underqualified in storage principles. Not sure I should have even commented.
Upon reading more I see many different views on the subject.
https://forums.freebsd.org/threads/the-use-of-swap.56799/

The thing with swap is you probably don't need it with 64GB of RAM, but if you do use it and a swap disk fails, it can crash the system.
At the very least, it will fail on the next boot-up: fstab will list a drive that is broken and FreeBSD will barf.


----------



## peterpakos (Oct 23, 2018)

What do you recommend then? Get rid of swap completely?


----------



## Phishfry (Oct 24, 2018)

I don't want to drive you off a cliff, but if an application is gobbling up memory, all swap is going to do is delay the problem.
Have you considered your monitoring program? Above you show:


peterpakos said:


> "kernel: pid 917 (telegraf), uid 0, was killed: out of swap space"


----------



## Phishfry (Oct 24, 2018)

Personally I run with no swap.
But I feel like you are still interleaving the swap by spreading it over many drives (although not mirrored).
How about one swap partition on one drive only? That would eliminate the interleaving as a problem.
But back to redundancy: there is none with that arrangement.
Back to my above point, you shouldn't need much swap anyway.
Supposedly you can have too much swap space too.


----------



## leebrown66 (Oct 24, 2018)

Phishfry said:


> Supposedly you can have too much swap space too.


Indeed.  I'm re-purposing an old Juniper box, and got this with 2G of swap:

```
warning: total configured swap (466033 pages) exceeds maximum recommended amount (111776 pages).
warning: increase kern.maxswzone or reduce amount of swap.
```
Upon further inspection I realized this has 256MB of memory, not the 2GB I misread it to be!
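
Those page counts line up with 4 KiB pages (the page size is an assumption, but it matches the usual amd64/i386 default):

```
# Convert the warning's page counts to bytes at 4 KiB per page.
configured=$((466033 * 4096))
recommended=$((111776 * 4096))
echo "$configured"    # 1908871168  (~1.8 GiB configured)
echo "$recommended"   # 457834496   (~437 MiB recommended)
```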

Personally I prefer to put swap onto a GMIRROR on top of two authenticating GELI's.


----------



## Phishfry (Oct 24, 2018)

leebrown66 said:


> Personally I prefer to put swap onto a GMIRROR on top of two authenticating GELI's.


To me spreading it out over 8 drives like the OP seems like a problem waiting to happen.

Just for my info how many drives are you spreading your gmirror swap over?

I think ShelLuser covered good points in the post from 2011 that I linked to.
If I was using only-ZFS I would consider adding a zpool (or zvol ??) for swap.
To me that seems logical.

I am still back to my first thought. Are these filesystem/swap issues, or is the real problem a program running amok?
My true thought is the latter.

Maybe telegraf is the issue. I don't hear many people speak about it. A monitoring tool is the last thing you might suspect.

I just built a rig with UFS geom mirror/stripe over 4 NVMe on two SuperMicro M.2 paddle cards and 24 disk ZFS array using a RAID60 arrangement in a Chenbro RM23524 on SM X10DRi /2608LV3 / 3x LSI 3008 HBA. 64GB RAM booting off a SATADOM.


----------



## Phishfry (Oct 24, 2018)

peterpakos 
I was reviewing your graphs and this sticks out to me. 
Physical Memory = 33% used. Not bad, but the amount seems concerning: 22GB of RAM consumed.
How much of that is allocated to your two VM's?
Reason I ask is that 22GB seems like a huge number compared to what I am seeing in use.

Perhaps you need to investigate memory usage with `top` and see what processes are doing with `ps -ax`


----------



## Phishfry (Oct 24, 2018)

I have to ask this question too: why two M1015 controllers? With an 8-drive arrangement one card would do.
Then put your SSD on the motherboard SATA3.
If you were running an array of SSD drives I would understand. Dual-path backplane too.
The M1015 is only PCIe 2.0 with an x8 interface. 8 SSDs will soak the interface. Been there, done that.
With today's rotating disks you cannot saturate that interface.

To me it seems sacrilegious to put a PCIe 2.0 card on a SM X11 board. The SAS3008s are not much more.


----------



## peterpakos (Oct 24, 2018)

Phishfry said:


> I don't want to drive you off a cliff but if a application is gobbling up memory all swap is going to do is delay the problem.
> Have you considered your monitoring program? Above you show:



The same thing also happened after disabling telegraf, so it's not that.


----------



## peterpakos (Oct 24, 2018)

Phishfry said:


> Personally I run with no swap.
> But I feel like you are still interleaving the swap by spreading it over many drives.(Although not mirrored)
> How about one swap partition on one drive only. That would eliminate the interleaving as a problem.
> But back to redundancy, There is none with that arrangement.
> ...



Touch wood, the system's been very stable since I got rid of gmirror swap. Time will tell if it stays this way...


----------



## CyberCr33p (Oct 24, 2018)

I use gmirror swap (with 2 drives) and I don't see this issue.

Let us know if disabling swap fix your issue.


----------



## PacketMan (Oct 24, 2018)

peterpakos said:


> Touch wood, the system's been very stable since I got rid of gmirror swap. Time will tell if it stays this way...



You will have to wait at least 9 days before feeling even a little hopeful. Best wishes!


----------



## peterpakos (Oct 24, 2018)

Phishfry said:


> To me spreading it out over 8 drives like the OP seems like a problem waiting to happen.



Why is that? Kernel not handling multiple swap devices correctly or what? Any docs to back your worries up?



Phishfry said:


> If I was using only-ZFS I would consider adding a zpool (or zvol ??) for swap.
> To me that seems logical.



I remember reading somewhere (quick search found https://forums.freebsd.org/threads/swap-and-zfs.30298/ as one of many examples) that swap on ZFS is not the best idea. Unless something changed recently.


----------



## peterpakos (Oct 24, 2018)

Phishfry said:


> I was reviewing your graphs and this sticks out to me.
> Physical Memory =33% used. Not bad but the amount seems concerning. 22GB of RAM consumed.
> How much of that is allocated to your two VM's?



Sorry, what VMs are you talking about?

I'm only running ctld to serve iSCSI storage at the minute, everything else is disabled.


----------



## peterpakos (Oct 24, 2018)

Phishfry said:


> I have to ask this question too. Why two M1015 controllers. With an 8 drive arrangement one card would do.
> Then put your SSD on the motherboard SATA3.
> If you were running an array of SSD drives I would understand. Dual Path backplane too.
> The M1015 is only PCIe 2 with x8 interface. 8 SSDs will soak the interface. Been there done that.
> ...



My 2U chassis has 12 disk bays and 3 mini SAS connectors on the backplane. Even though I currently have only 8 disks populated, I'm planning to add 4 more in the near future.

The system board comes with only SATA ports, hence I'm unable to wire them to the SAS backplane.


----------



## peterpakos (Oct 24, 2018)

PacketMan said:


> You will have to wait at least 9 days, before feeling at least feeling hopeful.    Best wishes!



True dat!

Subconsciously I can even feel the system is more responsive now without gmirror.


----------



## leebrown66 (Oct 24, 2018)

Phishfry said:


> Just for my info how many drives are you spreading your gmirror swap over?


The GELI partitions are authenticated only, no encryption.  I don't care about disk performance.  I do care about data integrity.  I have 3 variants:
1. A single piece of swap on top of GMIRROR containing 2 GELI partitions on 2 disks, ie RAID1.
2. A single piece of swap on top of GSTRIPE + GMIRROR + 4 GELI on 4 disks, ie RAID10.
3. A single piece of swap on top of GELI.

My reasons:
1 & 2 because a GELI failure due to bit rot degrades the mirror, triggering a devd event, which sends me an email, which means I replace that disk, but the box keeps operating.
3 because I want to know if any swap is corrupted, although that requires scraping /var/log/messages.

I never use ZFS because I've always had bad experiences with it, contrary to most other folk it seems.

I tried running InfluxDB, which telegraf is part of, on a box months ago and found it ate memory with no regard for limits. Large queries just ate all memory and all swap, and then it crashed. As it's a monolithic application, it lost the ability to store data at that point.


----------



## Phishfry (Oct 29, 2018)

Dan from FreshPorts has a nice writeup on his blog regarding gmirror swap. Would be nice to hear his opinions.
https://dan.langille.org/2017/08/21/replaced-a-drive-what-about-that-gmirror-swap/
I can honestly say Freshports is never down and always responsive.
Great Job Dan


----------



## Phishfry (Nov 1, 2018)

peterpakos said:


> Any docs to back your worries up?



https://www.freebsd.org/doc/handbook/bsdinstall-partitioning.html


> On larger systems with multiple SCSI disks or multiple IDE disks operating on different controllers, it is recommended that swap be configured on each drive, up to four drives. The swap partitions should be approximately the same size. The kernel can handle arbitrary sizes but internal data structures scale to 4 times the largest swap partition. Keeping the swap partitions near the same size will allow the kernel to optimally stripe swap space across disks. Large swap sizes are fine, even if swap is not used much. It might be easier to recover from a runaway program before being forced to reboot.
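
Following that guidance, swap spread over four equally-sized partitions (device names illustrative) would look like this in fstab:

```
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/da0p2              none            swap    sw              0       0
/dev/da1p2              none            swap    sw              0       0
/dev/da2p2              none            swap    sw              0       0
/dev/da3p2              none            swap    sw              0       0
```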


----------



## peterpakos (Nov 6, 2018)

Phishfry said:


> https://www.freebsd.org/doc/handbook/bsdinstall-partitioning.html



Thanks, I must have missed this bit of the handbook.

This was clearly causing the problem as the system has been rock solid ever since.


----------



## Phishfry (Nov 7, 2018)

I only found that passage in the manual while helping someone else here. We are all learning around here.
Thank you for the follow-up.


----------

