# Using jail nullfs mounts with autofs



## himay (Dec 1, 2021)

I've been trying to implement an NFS-based mounting system for some shared data between jails, and have found a _mostly_ working solution that comes with a problem.

Goal:
Share an NFS-mounted directory between two jails: $jail0 and $jail1.
NFS share: $ip:/path/to/share, containing some common files for both jails plus the per-jail directories $ip:/path/to/share/<$jail0_dir> and $ip:/path/to/share/<$jail1_dir> (neither jail needs to be locked out of the other's directory).

Current solution:
Using `autofs` and /etc/auto_master to automount $ip:/path/to/share to /net/$ip/path/to/share. This local mapping of /net/$ip/path/to/share is then presented to the jails as `nullfs` mounts via their respective $jail{0,1}.fstab files.
This works if `autofs` has already mounted the NFS share, and both jails operate and traverse the `nullfs`-mounted NFS share without issue.
However, if /net/$ip/path/to/share has yet to be automounted (either during boot or if the share is autounmounted after shutting down the jails), starting _either_ jail will end up with unresponsive jail processes trying to traverse the jail mount. Trying to `jexec` into the jails is fine, but as soon as I try to `ls` within the jail's `nullfs` mount, `ls` similarly spins up to a full core's worth of CPU activity and becomes unresponsive (to either SIGHUP or SIGKILL).
Looking at the output of `mount` shows that the jail's `nullfs` mount has been created before `autofs` has mounted the NFS share…

```
# mount
[…]
/net/$ip/path/to/share on /srv/jails/$jail0/mnt/share (nfs, nosuid, read-only)
devfs on /srv/jails/$jail0/dev (devfs)
$ip:/path/to/share on /net/$ip/path/to/share (nfs, nosuid, automounted)
```
…so I presume this has something to do with how `nullfs` and `autofs` interact, with the former trying to call into the latter. To kill these processes so the jail can be shut down, I have to run `umount -f /srv/jails/$jail{0,1}/mnt/share` to forcibly detach the mount; only then do the processes become responsive again.
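
A minimal sketch of how that ordering could be checked from a script, assuming (as the output above suggests) that `mount` lists entries in the order they were created. The function name and the paths in the usage line are hypothetical, not something from my setup.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount` output on stdin.
#   $1 = host-side source path managed by autofs
#   $2 = nullfs mount point inside the jail tree
# Succeeds (exit 0) when the jail mount exists and was listed before the
# source mount, or when the source is not mounted at all.
nullfs_mounted_too_early() {
    awk -v src="$1" -v dst="$2" '
        $2 == "on" && $3 == src { src_line = NR }
        $2 == "on" && $3 == dst { dst_line = NR }
        END {
            if (!dst_line) exit 1    # no jail mount, nothing to flag
            if (!src_line) exit 0    # jail mount exists, source never mounted
            exit (dst_line < src_line) ? 0 : 1
        }
    '
}
```

Usage would be along the lines of `mount | nullfs_mounted_too_early /net/192.0.2.10/path/to/share /srv/jails/jail0/mnt/share && echo "nullfs was layered too early"`. Paths containing spaces are not handled.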

Issue remaining:
How do I prime `autofs` to mount the NFS share _prior_ to the jails' fstab directing the creation of the `nullfs` mount? Do I need to abandon `autofs` and just put this mount into /etc/fstab for the host?

I have been trying to use some Google-fu to find even a related bug, let alone a solution to this issue, and I have come up empty-handed. I would appreciate any advice or insight any of you may be able to provide.


----------



## luoqi (Dec 5, 2021)

I think this is because nullfs doesn't cross any mount point in the lower fs. autofs works by retrying the lookup after an automount is triggered, and the second lookup would cross the new mount point.

Here's a proof-of-concept patch against releng/13.0 at https://people.freebsd.org/~luoqi/nullxmnt.diff. It works, but I didn't spend much time getting all the details right.
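
Without a kernel change, the explanation above suggests a userland workaround: detach the stale nullfs mount, trigger the automount from the host, then layer nullfs again on top of the real NFS mount. The sketch below is only an illustration of that sequence. The function names and the `DRY_RUN` switch are hypothetical, and the read-only option is assumed from the `mount` output earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch above.
# run CMD...: execute CMD quietly, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@" >/dev/null
    fi
}

# relayer_nullfs SRC DST: re-create the nullfs mount DST on top of SRC.
relayer_nullfs() {
    src=$1
    dst=$2
    # Detach the stale nullfs mount; ignore failure if it was not mounted.
    run umount -f "$dst" || true
    # Listing the source on the host triggers the automount.
    run ls "$src" || return 1
    # Layer nullfs over the now-mounted NFS directory (read-only assumed).
    run mount -t nullfs -o ro "$src" "$dst"
}
```

Setting `DRY_RUN=1` before calling `relayer_nullfs /net/192.0.2.10/path/to/share /srv/jails/jail0/mnt/share` prints the three commands without running them, which is a cheap way to check the paths first.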


----------



## D-FENS (Dec 5, 2021)

Do you provide the $jail{0,1}.fstab as "fstab" inside your jail.conf?
Drop that and use your own shell mounting code in the prestart and poststop hooks. I had similar issues with nullfs mounts and discovered that jail sometimes mounted them in the wrong order or did not handle unmount failures properly. So I implemented my own logic: it checks for and mounts a directory when starting the jail if necessary, and uses extra-robust unmount logic (a couple of tries, ending with -f) to make sure my jail does not leave garbage mounts behind.

prestart.sh

```
#!/bin/sh -x

name=$1
mntDir=$2

# You can put your NFS mount checking code here instead ....
isMounted=$(zfs get -H -o value mounted "zroot/jails/myjail/mnt")
if [ "$isMounted" = "no" ]; then
        zfs mount "zroot/jails/myjail/mnt"
fi

/sbin/mount -a -F "$mntDir/../fstab"

/sbin/mount -t devfs -oruleset=4 . "$mntDir/dev"
/sbin/mount -t fdescfs . "$mntDir/dev/fd"
```

poststop.sh

```
#!/bin/sh -x

jail=$1
mntDir=$2

[ -n "$mntDir" ] || { echo "No mntDir in poststop." 1>&2; exit 3; }

echo "Unmounting all under $mntDir"

# unmount fdescfs
/sbin/umount "$mntDir/dev/fd"  || /sbin/umount -f "$mntDir/dev/fd"    || true

# unmount devfs
/sbin/umount "$mntDir/dev"    || /sbin/umount -f "$mntDir/dev"      || true

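# Unmount everything listed in the jail's fstab in reverse sort order,
# retrying each entry with -f if the plain umount fails.
grep -v '^[[:space:]]*#' "$mntDir/../fstab" \
    | sort -r -k 6 \
    | awk 'NF { print "/sbin/umount -t " $3 " " $2 " || /sbin/umount -f -t " $3 " " $2 " || true"; }' \
    | /bin/sh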
```
Of course, in jail.conf do not use the "fstab" directive.

In your case you could modify the prestart and poststop hooks and handle your cases accordingly. First check if the NFS mount is available, if so - mount it via nullfs. If not - either try to mount it at jail start (but then you might have hanging if NFS is not available), or better yet, simply ignore it - it will not be mounted. Then whenever you automount the NFS share, you have to somehow externally create the nullfs mount (maybe a cron job, or a hook after NFS was mounted) so the jails can see it.

At jail stop - unmount the nullfs. It is not needed. (You can use my script directly.)
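
The "check first, otherwise skip" idea could look roughly like the following in a prestart hook. This is a sketch under assumptions, not what I actually run: the function names are made up, the share is assumed to be mounted read-only, and paths with spaces are not handled.

```shell
#!/bin/sh
# Hypothetical prestart fragment: layer the share into the jail only when the
# host already has it mounted; otherwise skip it so jail start cannot hang.

# share_is_mounted PATH: true if `mount` lists something mounted at PATH.
share_is_mounted() {
    mount | awk -v target="$1" '
        $2 == "on" && $3 == target { found = 1 }
        END { exit !found }
    '
}

# mount_share_if_ready SRC DST: nullfs-mount SRC at DST if SRC is mounted.
mount_share_if_ready() {
    src=$1
    dst=$2
    if share_is_mounted "$src"; then
        mount -t nullfs -o ro "$src" "$dst"
    else
        echo "skipping $dst: $src is not mounted yet" >&2
    fi
}
```

In prestart.sh this would be called as something like `mount_share_if_ready /net/192.0.2.10/path/to/share "$mntDir/mnt/share"`. Skipping returns success on purpose, so a missing share does not abort the jail start.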


----------



## himay (Dec 5, 2021)

roccobaroccoSC said:


> Do you provide the $jail{0,1}.fstab as "fstab" inside your jail.conf?


Yes, exactly. Not sure that I need the poststop.sh script (shutdown and autounmounting is working fine), but I like the idea you've got going on with prestart.sh.

Reading a bit more into jail(8), I think I'm going to give it a (f)stab (hah!) using the `exec.prepare` pseudo-parameter.


> These commands are executed *before* assigning IP addresses and *mounting filesystems*, so they may be used to create a new jail filesystem if it does not already exist.


Sounds like exactly what I'm looking for, with something like your checking code. Thank you for the lead and ideas!


----------



## himay (Dec 11, 2021)

In case anyone is curious, this is the script I wrote to use with the `exec.prepare` parameter for jails in jail.conf:

```
#!/bin/sh -x

HOST=$1
MNTDIR=$2

echo "Checking for availability of ${MNTDIR} export located at ${HOST}"

# Check for ${MNTDIR} among exports of ${HOST} (showmount -E prints one export per line)
/usr/bin/showmount -E "${HOST}" | grep -qxF -- "${MNTDIR}" || echo "${HOST} does not provide ${MNTDIR} in its exports"

# Mount ${MNTDIR} using autofs by querying mount contents
if /bin/ls "/net/${HOST}${MNTDIR}" > /dev/null
then
        echo "Successfully mounted ${MNTDIR} on ${HOST} to /net/${HOST}${MNTDIR}"
else
        echo "Unable to mount ${MNTDIR} on ${HOST}"
fi
```
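
For context, wiring a script like this into jail.conf might look roughly as follows. Treat it as a sketch only: the script location, the fstab path, and the 192.0.2.10 address are placeholders, and the jail name and path just follow the $jail0 naming from earlier in the thread.

```
# Hypothetical jail.conf entry; every path and address is a placeholder.
jail0 {
    path = "/srv/jails/jail0";
    # Runs on the host before the filesystems in mount.fstab are mounted.
    exec.prepare = "/usr/local/sbin/prime_share.sh 192.0.2.10 /path/to/share";
    mount.fstab = "/etc/jail.jail0.fstab";
}
```

The two arguments map to `HOST` and `MNTDIR` in the script, so the nullfs line in the fstab only gets processed after the `ls` has had a chance to trigger the automount.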


----------

