# 60+ simultaneous mount requests causing listen queue overflows



## ron_post (Oct 30, 2018)

I've got a FreeNAS box running FreeBSD 11.1-STABLE (although this happened on earlier versions as well) serving user directories over NFSv3 to a cluster of Linux machines (which are used as a compute cluster). Directories on the cluster are automounted, including home directories. When someone starts a job on many of the cluster machines at the same time (a fairly common occurrence), a number of the jobs will fail to start, saying they failed to mount the user's home directory. Checking on the server, `dmesg` will have an error like

`sonewconn: pcb 0xfffff801e5278570: Listen queue overflow: 193 already in queue awaiting acceptance (1 occurrences)`

The address is close to mountd's, but doesn't exactly match anything in `lsof -iTCP -sTCP:LISTEN -P` when I check.  `netstat -s` says there were listen queue overflows in tcp, and watching with `netstat -Lan` I can watch the listen queue for mountd go up to 183.  (I'm forcing the nfs mount requests to be tcp, rather than udp, with proto=tcp on the clients.)
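As a rough cross-check, the 193 in the log line is consistent with a listener whose backlog is 128, assuming `sonewconn()` reports an overflow once the queue length exceeds 3 * maxqlen / 2. That factor is an assumption here, not something stated in this thread, and the helper name `overflow_threshold` is made up for illustration.

```shell
# Assumption: the kernel logs "Listen queue overflow" once the queue
# length exceeds 3 * maxqlen / 2 (integer division). Verify against the
# kernel source for your release before relying on it.
overflow_threshold() {
    # $1 = maxqlen, the third field of netstat -Lan's qlen/incqlen/maxqlen
    echo $(( 3 * $1 / 2 ))
}

overflow_threshold 128    # prints 192, so the first overflow is logged at 193
overflow_threshold 1024   # prints 1536
```

If that assumption holds, any listener still showing a maxqlen of 128 could have produced the message, which fits mountd but does not prove it.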

I've tried increasing the size of the accept queue (kern.ipc.soacceptqueue) from 128 to 1024 and rebooted the machine, but `netstat -Lan` shows that the limit is still 128 for mountd:


```
tcp4  0/0/128                          *.763                
tcp6  0/0/128                          *.763
```

(`rpcinfo -p` shows mountd is at port 763).

How do I get mountd to either have a larger queue or otherwise handle the request spikes from the cluster?


----------



## Bobi B. (Oct 31, 2018)

To find out which program the `sonewconn` message relates to, run `netstat -aA | grep fffff801e5278570`.
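If the message shows up repeatedly, the address can be pulled out of the log line instead of being copied by hand. This is only a sketch: the helper name `pcb_from_log` is hypothetical, and it assumes the log line keeps the `sonewconn: pcb 0x...:` layout quoted in the first post.

```shell
# Extract the pcb address from a sonewconn log line, dropping the 0x
# prefix so it matches the form used in the grep above.
pcb_from_log() {
    sed -n 's/.*sonewconn: pcb 0x\([0-9a-fA-F]*\):.*/\1/p'
}

# Example using the line from the first post:
echo 'sonewconn: pcb 0xfffff801e5278570: Listen queue overflow: 193 already in queue awaiting acceptance (1 occurrences)' | pcb_from_log

# On the server (not run here):
#   dmesg | pcb_from_log | sort -u | while read -r addr; do
#       netstat -aA | grep "$addr"
#   done
```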


----------



## SirDice (Oct 31, 2018)

ron_post said:


> I've got a freenas


PC-BSD, FreeNAS, XigmaNAS, and all other FreeBSD Derivatives

11.11.1.2. kern.ipc.soacceptqueue


----------



## ron_post (Oct 31, 2018)

Sorry, Bobi, but that doesn't return anything, just like it didn't match anything in lsof.  However, after watching the listen queue for mountd climb to 183 right before getting the message, I'm pretty sure it was from mountd.

SirDice, as I mentioned above, I've already set kern.ipc.soacceptqueue to 1024. This affected roughly half the listeners (after a reboot), so I can only presume the other half are passing their own values to listen() rather than using the kernel limit. I'm asking here because I don't know how to get mountd to increase the value.
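One way to see which listeners did not pick up the new limit is to filter the `netstat -Lan` output on its third queue field (maxqlen). The sketch below assumes the `qlen/incqlen/maxqlen` column layout shown in the first post. The function name `small_queues` and the second sample line (port 22 at 1024) are made up for illustration.

```shell
# Print protocol, local address and maxqlen for every listener whose
# maxqlen is below the given limit. Reads netstat -Lan output on stdin.
small_queues() {
    awk -v limit="$1" '
        # Only consider lines whose second field looks like qlen/incqlen/maxqlen
        $2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
            split($2, q, "/")
            if (q[3] + 0 < limit + 0) print $1, $3, q[3]
        }'
}

# Example with sample input (the *.22 line is hypothetical):
printf 'tcp4  0/0/128   *.763\ntcp4  0/0/1024  *.22\n' | small_queues 1024

# On the server (not run here):
#   netstat -Lan | small_queues "$(sysctl -n kern.ipc.soacceptqueue)"
```

Anything this prints is a listener with a smaller backlog than the sysctl allows, which would be consistent with the daemon requesting a fixed value of its own.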


----------

