Watchdog: watchmedo shell-command: drop events during execution of command

Created on 28 Nov 2012 · 20 Comments · Source: gorakhargosh/watchdog

(original title): watchmedo shell-command --wait seems to fire more often than necessary

Version: watchdog 0.6.0

When I execute the watchmedo script with "--wait --command=...", the command seems to be triggered too often. Judging by the generated command output, it is executed once for each detected event. This seems unnecessary and inefficient. Normally, I would expect the following behavior:

# -- PSEUDO-CODE:
loop forever:
    Wait until event occurs.
    Remove all pending events (from the scheduler queue).
    Execute the command (once for all removed/pending events)
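
The pseudo-code above could be sketched in Python roughly as follows. This is only an illustration of the requested behavior, not watchmedo's implementation; the names `drain` and `run_once_per_batch` and the `max_batches` parameter are made up for the example, and events are assumed to arrive on a standard `queue.Queue`.

```python
import queue
import subprocess


def drain(events):
    """Remove and return every event currently pending in the queue."""
    drained = []
    while True:
        try:
            drained.append(events.get_nowait())
        except queue.Empty:
            return drained


def run_once_per_batch(events, command, max_batches=None):
    """Wait for an event, discard the backlog, then run the command once.

    `max_batches` is a hypothetical knob so the loop can stop in a demo;
    leave it as None to loop forever.
    """
    batches = 0
    while max_batches is None or batches < max_batches:
        first = events.get()                  # wait until an event occurs
        batch = [first] + drain(events)       # remove all pending events
        subprocess.call(command, shell=True)  # one run for the whole batch
        batches += 1
    return batches
```

Events that arrive while the command is running stay on the queue, so they trigger exactly one more run afterwards instead of one run each.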

REASON:
When the command is executed it should rebuild whatever and be responsible for all accumulated events.

The efficient behaviour could also be enabled by a new, additional command-line option.

NOTE:
If you do not use the "--wait" option you can still use the nice debug/diag feature:

watchmedo shell-command ... --command='echo "${watch_src_path}"'

to detect which filesystem events occurred.

feature watchmedo

jenisys

All 20 comments

Please don't change it; --wait works as I expect and need it to. Consider watching a directory for tarballs and untarring them as they arrive, and not wanting to hit the disk with too many simultaneous operations:

watchmedo shell-command --wait --pattern '*.tar' --command '[ "${watch_event_type}" == "created" ] && tar xf "${watch_src_path}"'

ryandesign on 23 May 2013

@ryandesign
In my opinion you do not need the --wait option for your use case (but I didn't try it out).

The command-line help states which behavior the --wait option provides
(or should provide but the implemented behavior does not match the description).

$ watchmedo shell-command -h
...
  -w, --wait            wait for process to finish to avoid multiple
                        simultaneous instances

jenisys on 23 May 2013

👍1

Can you explain again what you wanted or what the bug is? --wait means it will wait for the shell command (a.k.a. "the process") to finish before continuing. Otherwise it will be executed asynchronously and you may get multiple simultaneous instances of it.

tamland on 24 May 2013

As described in the first message.
The command is executed too often in my opinion.

EXAMPLE:
watchmedo waits for filesystem events with a timeout (before the command is triggered).
3 filesystem events occur.
Then the command is triggered (and basically serves all 3 events).
When the command is done, it will be triggered twice again.

jenisys on 24 May 2013

Still don't get it. It's executed exactly once per event. Just tested by triggering a batch of events:

./watchmedo.py shell-command --command='sleep 5; echo "${watch_src_path}"' test
echoes all of them at once.

./watchmedo.py shell-command --wait --command='sleep 5; echo "${watch_src_path}"' test
echoes one by one in a 5 sec interval

As expected.

tamland on 25 May 2013

This is an issue for us as well. We're using watchdog to watch our source directory to trigger an automatic build of the software. When changing git branches, events are triggered for each file and directory change. Ideally, all of those events would trigger just one build.

What we'd like is to have the functionality of the --wait option, but ensure there can only be a maximum of one event on the queue.
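
A rough sketch of "at most one event on the queue", assuming events are handed to a worker through a bounded `queue.Queue`. The helper name `offer` is invented for the example; nothing here is an existing watchmedo option.

```python
import queue


def offer(events, event):
    """Queue the event unless one is already waiting; return True if queued."""
    try:
        events.put_nowait(event)
        return True
    except queue.Full:
        return False  # an event is already pending, so this one is dropped


# A queue that can hold a single pending event.
pending = queue.Queue(maxsize=1)
```

If the worker takes an event off the queue before starting the build, then any number of changes during that build collapse into a single follow-up build.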

chrisconley on 17 Jul 2013

To accommodate both @ryandesign and @chrisconley perhaps --wait could remain unmodified, and one of the following additional options could be implemented:

--max-wait-queue : max number of items to add to the queue (would be 1 in @chrisconley 's case)
--event-delay : number of ms to delay before dispatching the event. Assuming a delay value of 100ms, Event A occurs. It waits 100ms. If Event B occurs within 100ms, Event A is killed and Event B waits 100ms. If no events occur within 100ms, Event B is dispatched. This would solve the issue of changing git branches (which currently adds an event to the queue for every file and directory modified... this can easily be hundreds of queued executions), or adding/removing files (which triggers both a 'file changed' and 'directory changed' event), while ensuring that the _last_ event in a rapid series is the one that gets dispatched (i.e. when the last file in a git checkout is modified).
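
The --event-delay idea could be sketched with `threading.Timer`, as below. This is an assumption about how such an option might behave, not code from watchdog; the class name `Debouncer` and its parameters are hypothetical.

```python
import threading


class Debouncer:
    """Call `action` once, `delay` seconds after the most recent trigger."""

    def __init__(self, delay, action):
        self.delay = delay
        self.action = action
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self, event):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # a newer event replaces the older one
            self._timer = threading.Timer(self.delay, self.action, args=(event,))
            self._timer.daemon = True
            self._timer.start()
```

With a 0.1 second delay, a burst of events closer together than 0.1 seconds results in a single call to `action`, made with the last event of the burst. In a watchdog script, `trigger` could be called from a handler's `on_any_event` method.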

jonaldinger on 17 Jul 2013

To work around this I wrapped my test runner command with a lockfile. From my tricks.yaml:

 shell_command: if [ -e 'trick-lock' ]; then exit 0; else touch trick-lock; django-admin.py test --noinput people; rm trick-lock; fi
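
For comparison, the same lockfile idea can be sketched in Python. The shell version checks for the file and then creates it in two separate steps, so two near-simultaneous events can both pass the check; opening the file with `O_CREAT | O_EXCL` makes check-and-create a single step. The function name `run_exclusively` is made up for this example.

```python
import os
import subprocess


def run_exclusively(command, lock_path="trick-lock"):
    """Run command unless the lock file already exists; return True if it ran."""
    try:
        # O_CREAT | O_EXCL fails if the file exists, so the existence check
        # and the creation happen atomically.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another run holds the lock, so skip this event
    try:
        os.close(fd)
        subprocess.call(command, shell=True)
        return True
    finally:
        os.remove(lock_path)  # release the lock even if the command raised
```

Like the shell one-liner, this drops events that arrive during a run rather than queuing them.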

CharString on 27 Aug 2013

Using --command 'fab test' always runs the tests twice for me, and using -w doesn't make a difference. Is this a problem with fabfiles?

brainwarped on 5 Nov 2013

Reading this issue's comments as a newcomer, it seems people are talking about different functionality.

@jonaldinger's suggestions seem the most sensible and make watchmedo a lot more useful.

Even copying a file into a directory will trigger multiple events:

1 created event
(file size divided by buffer size) modified events

My specific use case is that I want to uncompress an archive once it has been copied, but watchdog will spam the command for every event. I would only like to uncompress the archive when it has finished copying.

While I can work around it, @jonaldinger's --event-delay would make this use case trivial.
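
One possible workaround, sketched under the assumption that "finished copying" can be approximated by the file size no longer changing. This is a heuristic, not something watchdog provides, and the name `wait_until_stable` and its parameters are hypothetical.

```python
import os
import time


def wait_until_stable(path, interval=0.5, checks=2, timeout=60.0):
    """Return True once the file size stayed the same for `checks` polls."""
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        size = os.path.getsize(path)
        # Count consecutive polls with an unchanged size; reset on any change.
        stable = stable + 1 if size == last_size else 0
        if stable >= checks:
            return True
        last_size = size
        time.sleep(interval)
    return False  # gave up: the file was still changing at the timeout
```

A copy that stalls for longer than `interval * checks` seconds would fool this check, so the values need to be chosen with the slowest expected transfer in mind.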

ferrouswheel on 14 Jan 2014

+1, --event-delay would be great! Currently watchdog with --wait kicks off 6 builds in a row when I save one file in vim, which is particularly overwhelming when there's a build error.

I would have thought --event-delay implies --max-wait-queue=1 though, so I don't know that both flags would be necessary. If you're going to be dropping events at all, it doesn't seem like there's a use case for keeping multiple events that accumulate during a command's execution.

timbertson on 1 Feb 2014

@gfxmonk I've added a --smoothing=<seconds> parameter to run after the triggering event is detected and ignore all of the events that gather in those <seconds>.

It does cause the command to run after <seconds> delay, but that's somewhat unavoidable.

The modifications are in Pull #231 or in nuket/watchdog@11c64d6.

nuket on 10 Apr 2014

Merged as --drop. Will drop events occurring while command is being executed. Use as an _alternative_ to --wait, not in combination.
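
The behavior described for --drop can be illustrated roughly like this. It is a sketch of the semantics as stated in this comment, not a copy of the merged code; the class name `DroppingRunner` is invented for the example.

```python
import subprocess


class DroppingRunner:
    """Start `command` on an event unless the previous run is still going."""

    def __init__(self, command):
        self.command = command
        self.process = None

    def on_event(self, event):
        # poll() returns None while the child process is still running.
        if self.process is not None and self.process.poll() is None:
            return False  # command still running: drop the event
        self.process = subprocess.Popen(self.command)
        return True
```

Note that the command starts on the first event of a burst, so changes made later in the same burst are not followed by another run.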

tamland on 10 Apr 2014

Excellent. Thanks for the quick turnaround.

nuket on 10 Apr 2014

I love the --drop option, thanks!

craftgear on 25 Dec 2014

Great little utility, thanks. The watchmedo feature I was hoping for (and I think others too from rummaging around in feature requests), may be different than --drop, which is to say, if there are a batch of changes happening, I don't want the shell-command to be run until _after_ the changes have all happened, and only to happen once, as they are all the result of the same cascade of events.

In other words I would like the system to wait for X milliseconds after each event to see if there's other events to be batched together with it, and then finally trigger the command after the first X milliseconds monitored without changes. You might call it --idletrigger

--drop seems to trigger the shell command immediately when the first change happens, and then ignore subsequent changes, which is quite a different feature set, highly likely to trigger race conditions, and a bit surprising that it's what people want, assuming they're using it for the kind of cascade commands I'm working with - recursively monitoring whole directories where a recompile has extensive knock-on changes which influence a lot of files, and where those knock-on effects may take quite a few milliseconds to fully propagate. It's not just the first change which should be noticed, or the ones which have raced to be completed in time, and the next pipeline should only be triggered _after_ all the changes are complete

By all means tell me if my testing is wrong and I've misunderstood --drop. Also happy to file a new issue if others subscribed to this one really do want --drop and don't want --idletrigger. I may also have missed the full help or manual file, as I can't seem to find a complete list of the commands available in my copy of watchmedo, so it may be hidden in another command-line flag which I haven't examined yet.

watchmedo --help

...doesn't list the --drop flag, for example, although it seems to be honored and...

man watchmedo

...says the utility is undocumented in man.

cefn on 22 Mar 2015

👍1

@cefn That seems reasonable to me, what would make sense IMO/for me:

1. wait a bit of time to start (X milliseconds)
2. run command(s)
3. if event happens while running, cancel previous command (^C), go back to 1.
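
The cancel-and-restart step in the list above might look like the following sketch. It leaves out the initial delay and uses `Popen.terminate()`, which sends SIGTERM on POSIX rather than the Ctrl-C (SIGINT) mentioned above; the class name `RestartingRunner` is hypothetical and this is not the code from any watchdog pull request.

```python
import subprocess


class RestartingRunner:
    """Terminate a still-running command and start it again on each event."""

    def __init__(self, command):
        self.command = command
        self.process = None

    def on_event(self, event):
        if self.process is not None and self.process.poll() is None:
            self.process.terminate()  # SIGTERM on POSIX, not Ctrl-C (SIGINT)
            self.process.wait()       # reap it before starting the next run
        self.process = subprocess.Popen(self.command)
        return self.process
```

Only the process started directly is terminated here; a command that spawns its own children (for example through a shell) may need process-group handling on top of this.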

(doing watchmedo shell-command --help gets the specific help, with drop and wait.)

hayd on 2 Jun 2015

@cefn I have a PR which does this, adding a --terminate argument. #319

hayd on 2 Jun 2015

@nuket I know it's been a long time, but is there any room to resubmit that PR? What @cefn described is exactly what I want.

1. git checkout branch
2. Git changes files one at a time
3. On the first file, watchmedo shell-command starts executing

I don't want that. I want watchmedo to start executing on the _last_ file change. Obviously that's not possible to detect, but a smoothing of a couple milliseconds would perfectly solve the issue. This is a well-known concept in programming too, for instance, lodash has a function called debounce.

Asday on 22 Nov 2017

👍1

Agree with @Asday and others here. Anyone finding this, an open issue on the topic exists here: https://github.com/gorakhargosh/watchdog/issues/315