Nomad = 0.11.0
CNI Plugins = 0.8.4
Docker = 19.03.7, build 7141c199a2
Docker API = 1.40
OS = Ubuntu 18.04 (4.15.0-91-generic)
If you attempt to run a job that uses extra_hosts while using bridged networking, you will receive the following error.
config {
image = "bash"
extra_hosts = [
"foobar.example.com:127.0.0.1"
]
}
failed to create container: API error (400): conflicting options:
custom host-to-IP mapping and the network mode
This is a major problem because it means that no Consul Connect-enabled job can use custom host options.
I did find one related issue in Docker where --net=host and --add-host were mutually exclusive before Docker API version 1.12. I'm not sure which Docker API version Nomad is using, but 1.40 is the latest:
curl --unix-socket /var/run/docker.sock http://localhost/version | jq .ApiVersion
"1.40"
Submit the following job
job "bash" {
datacenters = ["dc1"]
group "api" {
network {
mode = "bridge"
}
task "bash" {
driver = "docker"
config {
image = "bash"
args = ["/bin/sleep", "100000000"]
extra_hosts = [
"foobar.example.com:127.0.0.1"
]
}
}
}
}
Possibly Related:
I can confirm that this problem is not limited to the extra_hosts attribute; it also affects dns_servers and other DNS options.
I'm running Nomad v0.10.5 and Docker v18.06.1 with API version 1.38 and minimum version 1.12.
It looks like the problem was fixed in Docker API version v1.12.0, see:
Nomad is developed against Docker version 1.8.2 and 1.9 (Official docs), meaning API version 1.20 and above (See Docker version matrix).
For the time being I am unable to run Connect-enabled jobs with custom DNS servers because of this problem. The error I get is: conflicting options: dns and the network mode.
I tried to run a docker container using the command line and I am able to use --net=bridge with --dns=<ip> on the same machine where Nomad throws an error:
docker run --rm -it --net bridge --dns 1.1.1.1 ubuntu:20.04 cat /etc/resolv.conf
search us-east-1.compute.internal
nameserver 1.1.1.1
options timeout:2 attempts:5
docker run --rm -it --net bridge --dns 8.8.8.8 ubuntu:20.04 cat /etc/resolv.conf
search us-east-1.compute.internal
nameserver 8.8.8.8
options timeout:2 attempts:5
docker run --rm -it --net bridge --dns 8.8.8.8 --dns 1.1.1.1 ubuntu:20.04 cat /etc/resolv.conf
search us-east-1.compute.internal
nameserver 8.8.8.8
nameserver 1.1.1.1
options timeout:2 attempts:5
So I captured the traffic between Nomad and the Docker socket, and it turns out that the network mode is container, not bridge.
I don't understand everything yet, but I suspect it is because Nomad uses CNI plugins to set up networking, with an intermediate container acting as the network bridge and gateway.
The new information for me is that the network mode specified in the jobspec is used for some other purpose. I tried to run a container with this new configuration, e.g. --net container:container-id --dns 1.1.1.1, and it failed with the same error: docker: Error response from daemon: conflicting options: dns and the network mode.
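To make the observed behavior concrete, here is a minimal Go sketch of the kind of check that seems to be rejecting these jobs. It is only an approximation inferred from the two error messages quoted in this thread, not Docker's actual code; `containerOpts` and `checkNetworkConflicts` are hypothetical names invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Error strings copied from the messages quoted in this thread.
var (
	errConflictHosts = errors.New("conflicting options: custom host-to-IP mapping and the network mode")
	errConflictDNS   = errors.New("conflicting options: dns and the network mode")
)

// containerOpts is a hypothetical, trimmed-down stand-in for the options
// sent to the Docker daemon when a container is created.
type containerOpts struct {
	NetworkMode string   // e.g. "bridge" or "container:<id>"
	DNS         []string // e.g. {"1.1.1.1"}
	ExtraHosts  []string // e.g. {"foobar.example.com:127.0.0.1"}
}

// checkNetworkConflicts approximates the behavior seen above: custom DNS
// servers and extra hosts are refused only when the container joins
// another container's network (mode "container:<id>").
func checkNetworkConflicts(o containerOpts) error {
	if !strings.HasPrefix(o.NetworkMode, "container:") {
		return nil // plain bridge mode accepts both options
	}
	if len(o.ExtraHosts) > 0 {
		return errConflictHosts
	}
	if len(o.DNS) > 0 {
		return errConflictDNS
	}
	return nil
}

func main() {
	dns := []string{"1.1.1.1"}
	// What the docker CLI test above used: works.
	fmt.Println(checkNetworkConflicts(containerOpts{NetworkMode: "bridge", DNS: dns}))
	// What Nomad actually sends under bridge networking: rejected.
	fmt.Println(checkNetworkConflicts(containerOpts{NetworkMode: "container:container-id", DNS: dns}))
}
```

This would explain why the same options work from the command line with --net bridge but fail when Nomad creates the task container.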
Did some more digging. Now I'm certain that this is because of the CNI based network setup.
Here is the call trace of network setup before the allocation is started:
client/allocrunner/allocRunner.Run(): client/allocrunner/alloc_runner.go#L298
client/allocrunner/allocRunner.prerun(): client/allocrunner/alloc_runner_hooks.go#L201
client/allocrunner/networkHook.PreRun(): client/allocrunner/network_hook.go#L76
client/allocrunner/bridgeNetworkConfigurator.Setup(): client/allocrunner/networking_bridge_linux.go#L161
In my opinion the final call to cni.Setup() should also be given the DNS configuration if specified in the jobspec. Something like
dnsConfig := cni.DNS{
Servers: []string{"1.1.1.1"},
Searches: []string{},
Options: []string{},
}
b.cni.Setup(ctx,
alloc.ID,
spec.Path,
cni.WithCapabilityPortMap(getPortMapping(alloc)),
cni.WithCapabilityDNS(dnsConfig))
should do the job just fine.
I can try this change locally in a while, but it'd be great if someone who knows the codebase can verify the correctness of this patch in the meantime.
@nickethier could you offer some insight here please?
Hey all, I missed this one when linking issues, but the DNS part of this issue is merged and will be in the next major release. See: #7661
We're still evaluating the extra_hosts option, as it's not something CNI supports directly. Under bridge mode, the docker tasks are using network-mode=container:
With regard to the extra_hosts option, would using a template block to write out an /etc/hosts file work? It's definitely not an ideal solution, but it might be a workaround in the interim.
That's a great idea. I think that may be a viable workaround.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
task "app" {
driver = "docker"
config {
image = "<%= ENV['CI_REGISTRY_IMAGE'] %>:<%= ENV['CI_COMMIT_SHA'] %>"
volumes = [
"local/etc/hosts:/etc/hosts",
.....
template {
data = <<EOH
127.0.0.1 localhost
127.0.1.1 dev-vault.example.com
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
EOH
destination = "local/etc/hosts"
}
Maybe my problem is somewhat similar, so I hope I can post it here.
When running nomad with consul connect the /etc/hosts may look like this:
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
But I expected something like:
127.0.0.1 localhost
**172.0.0.2 abcxyzaaa**
::1 localhost ip6-localhost ip6-loopback
The bold line is the entry for the Docker container's own hostname. My application runs on Java, and many libraries rely on the hostname, which causes errors when they try to resolve abcxyzaaa.