# lockf: /var/run/periodic.daily.lock: already locked



## Understudy (Aug 22, 2014)

Hi,

I recently received this in my email from my FreeBSD 10 server. 

```
lockf: /var/run/periodic.daily.lock: already locked
```

Normally I would get the daily run but I got this instead. 

Sincerely,

Brendhan


----------



## miks (Aug 24, 2014)

Same here. Started to happen after upgrade from 9.2 to 10.


----------



## Understudy (Aug 25, 2014)

I had to do a reboot on the box. It went away after I did the reboot. So I am uncertain as to the cause or a better solution. I will leave this open so perhaps better information will arrive.


----------



## eduardo (Oct 28, 2015)

Ok, I received the same message as the original poster in my daily reports.

Related to this, my "daily run output" email message gave the following report: "HOSTNAME daily prior run still in progress".

So, I think it's safe to say that some process is hanging. Is there an easy way to determine what is hanging so that I can fix it?


----------



## tingo (Oct 28, 2015)

You don't see anything suspicious with `ps ax`?
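If the plain listing is too noisy, you can narrow it down to the daily-run machinery. A minimal sketch (the bracketed character classes are a common trick so the `egrep` process doesn't match its own command line):

```
# Narrow the process list down to periodic/lockf-related processes;
# print a note instead of exiting non-zero when nothing matches
ps ax | egrep '[p]eriodic|[l]ockf' || echo "no periodic/lockf processes running"
```

On FreeBSD, `fstat /var/run/periodic.daily.lock` should also list any process that still has the lock file from the error message open.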


----------



## eduardo (Oct 28, 2015)

*tingo*, thanks for pointing me to `ps ax`.

I looked through the output of `ps ax` - nothing jumped out at me. I decided to take a closer look at time 03:00 - when my jobs start.

Here is a snippet of what is going on around that time:


```
56635  -  D  0:00.01 zfs snapshot -r n54lapool@zfs-auto-snap_frequent-2015-10-25-02h45
56653  -  I  0:00.00 cron: running job (cron)
56659  -  Is  0:00.07 ruby /usr/local/sbin/zfs-auto-snapshot hourly 24 (ruby21)
56680  -  D  0:00.01 zfs snapshot -r n54lapool@zfs-auto-snap_hourly-2015-10-25-03h00
56683  -  I  0:00.00 cron: running job (cron)
56685  -  Is  0:00.00 /bin/sh - /usr/sbin/periodic daily
56687  -  I  0:00.00 lockf -t 0 /var/run/periodic.daily.lock /bin/sh /usr/sbin/periodic LOCKED daily
56688  -  I  0:00.00 /bin/sh /usr/sbin/periodic LOCKED daily
56695  -  I  0:00.01 /bin/sh /usr/sbin/periodic LOCKED daily
56696  -  I  0:00.00 mail -E -s n54la daily run output root
56962  -  I  0:00.00 /bin/sh /etc/periodic/daily/800.scrub-zfs
56968  -  I  0:00.00 /bin/sh /etc/periodic/daily/800.scrub-zfs
56969  -  D  0:00.00 zpool history n54lapool
56970  -  I  0:00.00 egrep ^[0-9\\.\\:\\-]{19} zpool scrub n54lapool$
56971  -  I  0:00.00 tail -1
56972  -  I  0:00.00 cut -d  -f 1
56989  -  I  0:00.00 cron: running job (cron)
56991  -  Is  0:00.07 ruby /usr/local/sbin/zfs-auto-snapshot frequent 4 (ruby21)
56996  -  D  0:00.01 zfs snapshot -r n54lapool@zfs-auto-snap_frequent-2015-10-25-03h15
57013  -  I  0:00.00 cron: running job (cron)
57015  -  Is  0:00.07 ruby /usr/local/sbin/zfs-auto-snapshot frequent 4 (ruby21)
57020  -  D  0:00.01 zfs snapshot -r n54lapool@zfs-auto-snap_frequent-2015-10-25-03h30
57051  -  I  0:00.00 cron: running job (cron)
57053  -  Is  0:00.07 ruby /usr/local/sbin/zfs-auto-snapshot frequent 4 (ruby21)
57058  -  D  0:00.01 zfs snapshot -r n54lapool@zfs-auto-snap_frequent-2015-10-25-03h45
57076  -  I  0:00.00 cron: running job (cron)
```

Any ideas?


----------



## tingo (Oct 29, 2015)

Not sure. But I see that you have quite a few zfs snapshots ongoing (five), and since this is the time for periodic daily, it runs the 800.scrub-zfs script. I wonder if that script is the source of the `zpool history <...>` process? Anyway, I'm not certain that the zfs snapshot processes are affecting your daily run, but I would start looking there. Why are there five snapshots running at the same time? Are they taking a long time to complete? If so, why?


----------



## Kiiski (Oct 31, 2015)

Just a quick tip I find useful, if someone missed it...

You may find it easier to spot processes related to `periodic` by adding the `-d` option to `ps`, e.g.
`ps -adx`


----------



## eduardo (Oct 31, 2015)

tingo said:


> Not sure. But I see that you have quite a few zfs snapshots ongoing (five), and since this is the time for periodic daily, it runs the 800.scrub-zfs script. I wonder if that script is the source of the `zpool history <...>` process? Anyway, I'm not certain that the zfs snapshot processes are affecting your daily run, but I would start looking there. Why are there five snapshots running at the same time? Are they taking a long time to complete? If so, why?



tingo, thanks for taking a look at this.

I have snapshots running per Bryan Drewery's zfstools script once every 15 minutes, once every hour, once every day, once every week, and once every month, with automatic deletion of older snapshots as time goes on.

The `zpool history ...` is run to determine when the last zpool scrub took place. It runs a new scrub every 7 days.
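Pieced together from the `ps` listing above, that check amounts to something like the following (a sketch; the exact regex in 800.scrub-zfs may differ slightly, and the pool name is mine):

```
# Date of the most recent scrub: grep the pool history for scrub
# commands, keep the last one, and cut out the leading timestamp field
zpool history n54lapool \
    | egrep '^[0-9\.\:\-]{19} zpool scrub n54lapool$' \
    | tail -1 \
    | cut -d ' ' -f 1
```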

The weird thing is that this cron job had been running great since the end of August. Only recently has the job started choking.

Getting to your observations, how can I determine how long a snapshot took (is taking)?

Also, the snapshots are not running at the same time. They are running on this schedule.

Kiiski, thanks for the recommendation on using `ps -adx` - I'll give it a shot.


----------



## tingo (Nov 1, 2015)

eduardo said:


> Also, the snapshots are not running at the same time. They are running on this schedule.


That might be, but the fact remains that your `ps ...` output shows five '`zfs snapshot ...`' processes. If each snapshot finished in less than 15 minutes, you should see at most one such process in your ps output. So something is wrong here.
Sorry, I don't know how to figure out how long a snapshot takes, other than running it manually with time(1).
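For example, something along these lines (the snapshot name is made up for the test; destroy it afterwards):

```
# Time a recursive snapshot by hand; '@timing-test' is just an example name
time zfs snapshot -r n54lapool@timing-test

# Clean up the test snapshot when done
zfs destroy -r n54lapool@timing-test
```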


----------



## protocelt (Nov 1, 2015)

What does the server's memory and ZFS ARC stats look like in top(1) when this is happening? If the machine starts to swap, things could potentially slow down to a crawl causing the extended run times. Just a thought.
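A few quick checks, sketched with FreeBSD base-system tools (the sysctl names are from the kstat ZFS ARC statistics tree):

```
# One-shot top(1) in batch mode; the header includes Mem, Swap and ARC lines
top -b | head -n 10

# Current and maximum ARC size, in bytes
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max

# Swap usage; heavy swapping here would explain things slowing to a crawl
swapinfo -h
```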


----------



## molofishy (Oct 11, 2016)

Like Understudy, I also saw this when I ran `periodic daily` manually from the command line, after not having received any daily periodic email when I should have. I rebooted and ran `periodic daily` again. Now it hangs... and a "lockf" process appears in the list of processes in `top`.


----------



## rotor (Oct 11, 2016)

molofishy said:


> Now it hangs... and a "lockf" process appears in the list of processes in  top.



I ran into that as well.   In my case, the process was not hung.

The `/usr/local/etc/periodic/security/410.pkg-audit` script was sitting in a ``sleep `jot -r 1 0 3600` `` command.

When the `sleep` command finished, the processing proceeded as expected.

I also have received emails such as mentioned in the first message of this thread.  I had changed the times that my periodic scripts run, and I did not leave enough time between daily and weekly for the `sleep` command to finish normally, so there was an overlap of scripts and `lockf` rightly complained.

This scenario may not be the one affecting you, but it may give you some things to look at.  Hope this helps.
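For comparison, the stock FreeBSD `/etc/crontab` leaves over an hour between the daily and weekly runs, which is what keeps them from overlapping even with the random sleep:

```
# Default periodic entries in FreeBSD's /etc/crontab
1	3	*	*	*	root	periodic daily
15	4	*	*	6	root	periodic weekly
30	5	1	*	*	root	periodic monthly
```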

[edit: fix a typo in the last paragraph]


----------



## molofishy (Oct 11, 2016)

I think something similar may have happened in my case... because after about 90 minutes the email came through. Maybe it was scanning through my 4 TB of disk space or something like that...


----------



## rotor (Oct 11, 2016)

molofishy said:


> I think something similar may have happened in my case... because after about 90 minutes the email came through. Maybe it was scanning through my 4 TB of disk space or something like that...




If you look at the 410.pkg-audit script I mentioned, you'll see a comment just before that sleep command.  The comment says that it sleeps for a random number of seconds (anywhere up to an hour) so that the pkg servers don't get hammered when everyone's periodic scripts start processing on the hour.

So it isn't scanning the disks, it is just sitting there in the sleep command.
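You can see the mechanism by running the same command the script uses; `jot -r 1 0 3600` prints one random number in the range 0..3600, which becomes the sleep time in seconds:

```
# Print one (-r: random) value between 0 and 3600, as 410.pkg-audit does
jot -r 1 0 3600
```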


----------

