I've edited a config file in the -config-dir with an editor via sudo. This leaves a backup copy with a suffix of ~, owned by root. When restarting the server, it stops with
==> config: ReadFile failed on /etc/consul.d/config.json~: open /etc/consul.d/config.json~: permission denied
I would expect the server to skip any files that it can't access, especially if they aren't matching the *.json or *.hcl pattern for config files.
-config-dir
Server info
agent:
check_monitors = 0
check_ttls = 0
checks = 0
services = 0
build:
prerelease =
revision = 39f93f01
version = 1.2.1
consul:
bootstrap = false
known_datacenters = 1
leader = true
leader_addr = 10.8.0.1:8300
server = true
raft:
applied_index = 81860
commit_index = 81860
fsm_pending = 0
last_contact = 0
last_log_index = 81860
last_log_term = 22
last_snapshot_index = 65556
last_snapshot_term = 10
latest_configuration = [{Suffrage:Voter ID:47e7f178-ca40-1043-39c0-c3ce9b3abdd7 Address:10.8.0.1:8300}]
latest_configuration_index = 1
num_peers = 0
protocol_version = 3
protocol_version_max = 3
protocol_version_min = 0
snapshot_version_max = 1
snapshot_version_min = 0
state = Leader
term = 22
runtime:
arch = amd64
cpu_count = 2
goroutines = 109
max_procs = 2
os = linux
version = go1.10.1
serf_lan:
coordinate_resets = 0
encrypted = true
event_queue = 0
event_time = 8
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 125
members = 7
query_queue = 0
query_time = 1
serf_wan:
coordinate_resets = 0
encrypted = true
event_queue = 0
event_time = 1
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 15
members = 1
query_queue = 0
query_time = 1
Devuan jessie (debian w/o systemd), x64
==> config: ReadFile failed on /etc/consul.d/config.json~: open /etc/consul.d/config.json~: permission denied
Hey @femaref, I understand that this can be surprising behavior as it happens but I think checking permissions here to avoid this issue is relatively out of scope for Consul. I'd expect that anything in a directory such as consul.d/ is strictly for Consul, and thus should be accessible to Consul.
With that being said I'm going to close this but feel free to comment if there are other scenarios you think I'm missing.
I get your point, but I think it would add to the usability. Another thing to consider is that the error is easily swallowed if you are daemonizing consul (e.g. via start-stop-daemon) as is it not logged to syslog, this error is hard to follow, as the process simply won't start.
Checking should not be necessary. According to the docs:
Consul will load all files in this directory with the suffix ".json" or ".hcl".
consul should not care about files with another suffix in the first place (except if -config-format is set). Going by the code in agent/config/builder.go, this part of the docs is actually not followed, as there is no check for the extension in the ReadPath method (line 140-195). The check happens only after the files have been read, in Build (l 230ff). Considering the default behaviour is SKIP(l 258), there should be no difference in moving that check to the loading of the file(s) instead of the parsing. Why read files and then skip them anyway?
You're right the behavior is not as specified in the docs, and it could definitely do as the docs say and at the same time potentially manage this original issue/permissions in a better way. I'm going to re-open this, thanks for digging into it!
Hello, can I take a stab at this?
@shubheksha go for it! Not entirely sure what the right approach here is but I think at minimum it'd be great for us to reproduce the issue of loading files with non-matching extensions, and fix it to reflect the behavior specified in the docs.
@pearkes, I was able to think of a fix for this. It involves moving all of the code related to checking the extension of the format to the ReadPath() function from Build() and populate the Format field of the source there as well.
However, I tried reproducing this and failed. I created a file with the json~ extension owned by root but it just skips it for me, as is the expected behavior. I don't get a permission error as specified in the issue description. I'm am MacOS, probably that's why.
make sure the group/other read bit isn't set. If you create the file with your user and chown it, it will still be readable. Also, with a restrictive umask set, files could be created without those bits set (which was the case when I first encountered the error).
@femaref, thanks for the pointer! Created the file using root but my group bit was set. Resetting that let me reproduce the issue.
I'll open a PR with a possible fix.
@pearkes, I couldn't find any unit tests for builder.go in the repo. However, I could see some of the functionality being tested in runtime_test.go. Should I add a test for this case there itself?
Going to reference https://github.com/hashicorp/consul/issues/4506 which is similar to this. I believe those issues would be resolved by #4504 as well, or could be incorporated there.
Most helpful comment
I get your point, but I think it would add to the usability. Another thing to consider is that the error is easily swallowed if you are daemonizing consul (e.g. via
start-stop-daemon) as is it not logged to syslog, this error is hard to follow, as the process simply won't start.Checking should not be necessary. According to the docs:
consul should not care about files with another suffix in the first place (except if
-config-formatis set). Going by the code inagent/config/builder.go, this part of the docs is actually not followed, as there is no check for the extension in theReadPathmethod (line 140-195). The check happens only after the files have been read, inBuild(l 230ff). Considering the default behaviour isSKIP(l 258), there should be no difference in moving that check to the loading of the file(s) instead of the parsing. Why read files and then skip them anyway?