Nomad v0.10.4 (f750636ca68e17dcd2445c1ab9c5a34f9ac69345)
Fedora 31, with Docker 18.09.8
The Envoy health check in Consul stays red, and /var/log/audit/audit.log contains denials:
type=AVC msg=audit(1583672022.178:2020): avc: denied { write } for pid=70868 comm="envoy" name="consul_grpc.sock" dev="tmpfs" ino=676989 scontext=system_u:system_r:container_t:s0:c121,c146 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=sock_file permissive=0
After setenforce 0 the health check turns green.
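For anyone triaging similar denials: the interesting bits of an AVC line are the source and target SELinux types. A quick shell sketch to pull them out of the denial shown above (field positions assume the standard audit.log format; not an official audit tool, just grep/cut):

```shell
# Extract the SELinux source/target types from an AVC denial line.
line='type=AVC msg=audit(1583672022.178:2020): avc: denied { write } for pid=70868 comm="envoy" name="consul_grpc.sock" dev="tmpfs" ino=676989 scontext=system_u:system_r:container_t:s0:c121,c146 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=sock_file permissive=0'

# scontext/tcontext have the form user:role:type:level; the type is field 3.
src=$(printf '%s\n' "$line" | grep -o 'scontext=[^ ]*' | cut -d: -f3)
tgt=$(printf '%s\n' "$line" | grep -o 'tcontext=[^ ]*' | cut -d: -f3)
echo "$src cannot write to $tgt"   # container_t cannot write to user_tmp_t
```

The mismatch it prints (container_t vs user_tmp_t) is exactly why the confined Envoy process cannot touch the socket.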
Run nomad agent -dev and consul agent -dev, then deploy the job file below:
job "example" {
  datacenters = ["dc1"]
  type        = "service"

  update {
    max_parallel = 1
  }

  group "http1" {
    network {
      mode = "bridge"
      port "http" {
        to = 80
      }
    }

    service {
      port = "http"
      name = "http1"

      connect {
        sidecar_service {}
      }
    }

    task "http1" {
      driver = "docker"
      config {
        image = "nginx"
      }
    }
  }

  group "http2" {
    network {
      mode = "bridge"
      port "http" {
        to = 80
      }
    }

    service {
      port = "http"
      name = "http2"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "http1"
              local_bind_port  = 8080
            }
          }
        }
      }
    }

    task "http2" {
      driver = "docker"
      config {
        image = "nginx"
      }
    }
  }
}
The Consul logs show:
2020-03-08T14:02:56.962+0100 [WARN] agent: Check socket connection failed: check=service:_nomad-task-6d05e4c5-b5d8-2941-c6a4-dc9bb1e675c6-group-http2-http2-http-sidecar-proxy:1 error="dial tcp 127.0.0.1:30124: connect: connection refused"
2020-03-08T14:02:56.963+0100 [WARN] agent: Check is now critical: check=service:_nomad-task-6d05e4c5-b5d8-2941-c6a4-dc9bb1e675c6-group-http2-http2-http-sidecar-proxy:1
2020-03-08T14:03:00.781+0100 [WARN] agent: Check socket connection failed: check=service:_nomad-task-a6ae689e-6b3c-206d-5b58-0562248a595c-group-http1-http1-http-sidecar-proxy:1 error="dial tcp 127.0.0.1:20423: connect: connection refused"
2020-03-08T14:03:00.781+0100 [WARN] agent: Check is now critical: check=service:_nomad-task-a6ae689e-6b3c-206d-5b58-0562248a595c-group-http1-http1-http-sidecar-proxy:1
Sadly, the other logs don't contain anything interesting.
This went away after setting:
plugin "docker" {
  config {
    volumes {
      enabled      = true
      selinuxlabel = "z"
    }
  }
}
in the Nomad config.
There still seems to be an SELinux issue, because now I get:
type=AVC msg=audit(1583680590.931:2730): avc: denied { connectto } for pid=88248 comm="envoy" path="/opt/nomad/alloc/016ddeb3-3253-e6aa-7795-1b8dea187224/alloc/tmp/consul_grpc.sock" scontext=system_u:system_r:container_t:s0:c764,c777 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=1
type=AVC msg=audit(1583680590.974:2731): avc: denied { connectto } for pid=88250 comm="envoy" path="/opt/nomad/alloc/cb3eb8f7-f82c-ece8-d3a8-5fc310771358/alloc/tmp/consul_grpc.sock" scontext=system_u:system_r:container_t:s0:c179,c881 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=1
The blog post https://danwalsh.livejournal.com/81143.html has a good explanation of why this doesn't work. The best thing to do here is probably --security-opt label=disable for the Envoy container. Would this be a possibility?
I was able to fix the sidecars manually by adding:
sidecar_task { config { security_opt = ["label=disable"] } }
to the connect stanza :)
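In context, the workaround sits next to sidecar_service inside each group's connect stanza, roughly like this (a sketch based on the one-liner above, shown for the http1 group; untested beyond my setup):

```hcl
service {
  port = "http"
  name = "http1"

  connect {
    sidecar_service {}

    # Disable SELinux labeling for the Envoy sidecar container so it
    # can connect to Consul's gRPC unix socket in the alloc dir.
    sidecar_task {
      config {
        security_opt = ["label=disable"]
      }
    }
  }
}
```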
@shoenig I'm not sure if this is something we could explore improving in the documentation / guide?
@tgross For what it's worth, even a simple "beware: this does not work well with default SELinux rules" would probably go far. I guess the main question is: is running with SELinux enforcing a supported mode of operation for Nomad? The same question could probably be asked for AppArmor.
If the answer is yes, then the next question becomes: to what extent do you want to support it.
Hey there
Since this issue hasn't had any activity in a while - we're going to automatically close it in 30 days. If you're still seeing this issue with the latest version of Nomad, please respond here and we'll keep this open and take another look at this.
Thanks!
bump
selinuxlabel is troublesome; see https://github.com/hashicorp/nomad/pull/7094