When you start "consul lock", it attaches signal handlers so that it can forward signals to the child process. However, until the lock is acquired, every signal is discarded, so there is no graceful way to stop it.
Run "consul lock" on two nodes. On the node that didn't acquire the lock, press CTRL+C to try to kill it. It won't exit.
Ubuntu 18 LTS, nothing fancy
I'm opening this issue because I'm using "consul lock" in a systemd service. If Consul has the lock, everything works fine. However, if I want to stop the service and Consul doesn't currently hold the lock, systemd will time out and forcibly kill it.
@JohnKiller Thanks for the report. It sounds perfectly reasonable to expect consul lock to still handle signals and exit even when the lock isn't held.
I don't know Go very well, but I've noticed that the watch command has this line:
https://github.com/hashicorp/consul/blob/8cdba9611d4cda22691fb5715872e2fa538bb389/command/watch/watch.go#L20-L24
which in lock is missing:
https://github.com/hashicorp/consul/blob/8cdba9611d4cda22691fb5715872e2fa538bb389/command/lock/lock.go#L64-L68
However that variable is referenced here:
https://github.com/hashicorp/consul/blob/8cdba9611d4cda22691fb5715872e2fa538bb389/command/lock/lock.go#L207-L212
So maybe it's just missing that. Thanks
@JohnKiller Maybe you know more Go than you thought. That is a good catch and certainly looks suspect.
So I looked into it a bit, and that is part of the issue. Right now, not having a shutdown chan means that it will just keep retrying the lock acquisition until it gets the lock.
There is a second piece to note: the blocking query issued to Consul to gain the lock can block for up to 15 seconds.
@JohnKiller Do you know how long systemd waits before it times out and kills the process?
The default is 90s. Until the timeout, it just keeps trying to get the lock, so I tried another thing:
This means that there is in fact a signal handler that just discards everything.
Forked the repo, made the changes to have ShutdownCh with MakeShutdownCh() and now I observe two things:
Any suggestion on where to look?
We are having the same problem.
Any idea when this will be fixed? (Unfortunately, I'm not fluent in Go myself.)
Sorry, I did not dig further since it's beyond my abilities. My workaround is a SIGKILL after a timeout.
Fixed by #5909
Hi @freddygv, is this backported to 1.6, or should I wait for 1.7? Thanks
Hi @JohnKiller I just saw that it didn't get backported to 1.6.3. That means it will be in 1.7, which is coming very soon.
OK, just upgraded to 1.7.0 and the fix is working.
However, this is the output:
Setting up lock at path: test/.lock
Attempting lock acquisition
^CShutdown triggered or timeout during lock acquisition
Is there a way to abort immediately instead of waiting for the lock timeout? It still took about 10 seconds to quit.