# nfsd hangs on shutdown with nfsv4_server_only="YES"



## mickey (Apr 16, 2021)

As I use NFSv4 exclusively and noticed the new rc.conf setting _nfsv4_server_only_, I enabled it on two machines while upgrading from 12.2 to 13.0. Now, when shutting down or rebooting, the following message appears in the log of both machines:

```
nfsd[1169]: rpcb_unset failed
```
One of the machines, however, hangs for roughly 30-90 seconds while it is stopping the nfsd processes, before rc.shutdown terminates unexpectedly and the machine reboots. Given the above error message, I suspect nfsd is trying to contact rpcbind (which is not running when _nfsv4_server_only_ is enabled), and because this particular machine has TCP/UDP blackhole enabled, the request takes a long time to time out.
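For reference, the setup described above would look roughly like the following rc.conf fragment. Only _nfsv4_server_only_ is taken from this post; the other two lines are assumptions about a typical NFSv4 server configuration and may differ on your system.

```shell
# /etc/rc.conf (fragment, a sketch rather than the poster's actual file)
nfs_server_enable="YES"      # assumption: NFS server enabled
nfsv4_server_enable="YES"    # assumption: NFSv4 support enabled
nfsv4_server_only="YES"      # from this post: serve NFSv4 only, rpcbind is not started
```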

Is this supposed to be happening with _nfsv4_server_only_ enabled?


----------



## T-Daemon (Apr 16, 2021)

mickey said:


> nfsd[1169]: rpcb_unset failed



That is "harmless noise" according to the commit log message "nfsd: silence rpcb_unset noise for NFSv4 only servers" (MFC after 2 weeks, committed 2021-04-01).

The machine hanging must have another cause.


----------



## mickey (Apr 17, 2021)

T-Daemon said:


> That is "harmless noise" according to the commit log message "nfsd: silence rpcb_unset noise for NFSv4 only servers" (MFC after 2 weeks, committed 2021-04-01).
> 
> The machine hanging must have another cause.


That patch is not yet in releng/13.0, but it will probably fix the issue by not calling rpcb_unset() when the server runs NFSv4 only.

I just ran a test on the machine that was not experiencing the hang on shutdown. Before rebooting it, I manually enabled TCP4/UDP4 blackhole: `sysctl net.inet.udp.blackhole=1 && sysctl net.inet.tcp.blackhole=2`. Then I rebooted the machine, and it showed the same hang as the other one:

```
Stopping nfsd.
Waiting for PIDS: 1159 1170
```
At that point it hangs for some time, then the _rpcb_unset failed_ message appears, followed by a message that a 90-second watchdog expired, and rc.shutdown is terminated.

So I guess it's pretty safe to say that the hang is caused by the combination of:

- _nfsv4_server_only="YES"_, which causes rpcbind to not start.
- Having UDP/TCP blackhole enabled.
- nfsd still trying to contact rpcbind, which yields a timeout.
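
To check whether another machine matches this combination, something like the following sketch could be used. The function name `blackhole_hang_risk` is hypothetical, and treating any non-zero blackhole value as "enabled" is an assumption; I only tested with udp=1 and tcp=2.

```shell
#!/bin/sh
# Hypothetical helper: prints "yes" when the conditions from this thread coincide.
# Arguments: value of nfsv4_server_only, net.inet.udp.blackhole, net.inet.tcp.blackhole
blackhole_hang_risk() {
    v4only=$1
    udp_bh=$2
    tcp_bh=$3

    # Accept the usual boolean spellings for an enabled rc.conf knob.
    case "$v4only" in
        [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) ;;
        *) echo "no"; return 0 ;;
    esac

    # Assumption: any non-zero blackhole setting counts as enabled.
    if [ "$udp_bh" -ne 0 ] || [ "$tcp_bh" -ne 0 ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# On a FreeBSD host the inputs could be gathered like this:
#   blackhole_hang_risk "$(sysrc -n nfsv4_server_only)" \
#       "$(sysctl -n net.inet.udp.blackhole)" \
#       "$(sysctl -n net.inet.tcp.blackhole)"
```

A "yes" only means the machine has the same settings as my two test machines; it does not prove that the shutdown will actually hang.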


----------

