Users should be able to configure the shell for individual script based checks. Currently they default to /bin/sh or $SHELL for all the script checks on a node.
Is this currently included in any released version?
Had problems with this while playing with consul 0.6.0 on ubuntu 15.10, using /bin/false as default shell for the consul user.
Took me a while to figure out. Thing is, the docs mention that:
Shell is currently only supported for Docker checks.
Hi @skiold not yet - there's an open PR for this that we will review when we get a chance.
https://www.consul.io/docs/agent/checks.html would imply that the script check is useful places other then docker. But without being able to set the shell away from /bin/false I don't see how this is true. In the meantime until this is fixed the documentation should be changed to note that script checks are only supported on docker.
@tahoemph you are correct. We only show it in the Docker example, but it isn't called out very specifically. We can update the docs to reflect this.
This becomes especially important as systemd in newer versions enforces SHELL with the User= parameter: https://github.com/systemd/systemd/blob/master/NEWS#L3413-L3414
And as system users (like a consul user) are usually setup with /sbin/nologin as a shell, this makes consul now picking up nologin as the shell. Resulting in failure of every check-cmd.
So either you setup the system user with /bin/bash as a shell or you should be able to set it to something different within consul (globally would also be nice). Or consul should ignore SHELL's like nologin.
Right now Consul cannot run as a pseudo-user with a bogus shell (e.g. /bin/false) because of this. This increases Consul's attack window somewhat as the user requires a valid shell.
It's not clear to me why the check command shouldn't be passed directly as args to execve(). Consider using https://github.com/mattn/go-shellwords to parse the arguments.
ended up setting the SHELL variable in /etc/default/consul
@slackpad seems like #1430 is dead, should we create a new one or is there something that you're not satisfied with?
And we should keep it in mind that the ExecScript function is used not only by checks but by some other components:
command/agent/remote_exec.go:168: cmd, err := ExecScript(script)
consul/command/agent/watch_handler.go:48: cmd, err := ExecScript(script)
consul/command/agent/check.go:160: cmd, err := ExecScript(c.Script)
consul/command/lock.go:349: cmd, err := agent.ExecScript(script)
consul/command/watch.go:167: cmd, err := agent.ExecScript(script)
The issue mentioned by @duritong led to long troubleshooting for us when script checks that were working on all existing machines did not work on new ones (which had a slightly newer systemd).
When determining how to fix it, I'd just like to mention that we use a fair amount of shellism such as subshells and pipes (eg for Zookeeper result=$(echo ruok | nc localhost 2181); echo $result; test "zk-$result" = "zk-imok"). I couldn't determine from the documentation if go-shellwords supported this? Requiring this to be deployed as a file to the agent would be a bit impractical, and can hopefully avoided.
@carlpett you can always wrap it in /bin/sh -c. You would not need to put it in a separate script.
Ah. Of course. Bit slow today :)
Still, if it can be avoided without much compromise, I would prefer not to, for the sake of ease-of-use.
Why not just make consul execute the script as god intended so the kernel could properly use shebang? Passing everything through shell is what made the bash vulnerability (shell shock) so dangerous.
As some have suggested here I'm thinking we should drop the shell entirely and just exec the command. It's easy enough to run /bin/sh -c "" and this is more secure / less magic.
@slackpad your comment aligns with a recent breaking change in consul-template v0.18.0
A shell is no longer assumed for Template commands. Previous versions of Consul Template assumed /bin/sh (cmd on Windows) as the parent process for the template command. Due to user requests and a desire to customize the shell, Consul Template no longer wraps the command in a shell. For most commands, this change will be transparent. If you were utilizing shell-specific functions like &&, ||, or conditionals, you will need to wrap you command in a shell, for example:)
Closing this as a duplicate of https://github.com/hashicorp/consul/issues/2999.
I couldn't run script health check because of the nologin shell. I have a simple health check of reading a file for status. However I am getting error saying "This account is currently not available" which is a linux level error. We have version 0.6.4. If I change the shell to bash its working perfectly. I tried few options, but would be great if someone can provide me a hack. Appreciate your help
Most helpful comment
As some have suggested here I'm thinking we should drop the shell entirely and just exec the command. It's easy enough to run
/bin/sh -c ""and this is more secure / less magic.