Nomad v0.4.0
Ubuntu 14.04, Kernel 4.2.0-42-generic
When running a Nomad service over Docker 1.12 with a Docker driver setting like `network_mode = "some_overlay_network"`, Nomad won't schedule the job's tasks. Watching docker events shows that the containers are created and immediately destroyed afterwards. The alloc-status displays a misleading error message that the image could not be found.
Create the overlay network:

```sh
docker network create --driver overlay --subnet=10.0.9.0/24 myoverlaynet
```

Create `my_job.nomad` and put this config in for the Docker driver:

```hcl
driver = "docker"

config {
  image        = "registry.mycomp.com:5000/myservice:latest"
  network_mode = "myoverlaynet"

  port_map {
    goodservice = 10080
  }
}
```
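For reference, the snippet above sits inside a task stanza; a minimal job file to reproduce the setup might look roughly like this (the job, group, and resource values are placeholders chosen to match the alloc output below, not copied from the reporter's actual file):

```hcl
# Hypothetical minimal job file for reproducing the overlay network_mode setup.
job "myservice" {
  datacenters = ["dc1"]

  group "myservice" {
    task "myservice" {
      driver = "docker"

      config {
        image        = "registry.mycomp.com:5000/myservice:latest"
        network_mode = "myoverlaynet"   # name of the pre-created overlay network

        port_map {
          goodservice = 10080
        }
      }

      resources {
        cpu    = 500
        memory = 256

        network {
          mbits = 10
          port "goodservice" {}
        }
      }
    }
  }
}
```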
```
$ nomad alloc-status 508cd057
couldn't retrieve stats (HINT: ensure Client.Advertise.HTTP is set): Unexpected response code: 500 (unknown allocation ID "508cd057-9938-559b-44e0-775a6669a5ce")
ID            = 508cd057
Eval ID       = 9241cc47
Name          = myservice.myservice[0]
Node ID       = d9e7c2ee
Job ID        = myservice
Client Status = failed

Task "myservice" is "dead"

Task Resources
CPU  Memory   Disk     IOPS  Addresses
500  256 MiB  300 MiB  0     goodservice: 192.4.7.153:10080

Recent Events:
Time                     Type            Description
08/13/16 19:26:16 CEST   Not Restarting  Error was unrecoverable
08/13/16 19:26:16 CEST   Driver Failure  Failed to create container from image registry.mycomp.com:5000/myservice:latest: no such image
08/13/16 19:26:16 CEST   Received        Task received by client
```
The documentation does not mention whether overlay networks are a valid network_mode:
https://www.nomadproject.io/docs/drivers/docker.html#network_mode
What is the advantage of running Nomad on top of Docker Swarm?
@itatabitovski it works with Docker 1.11.2. The documentation talks about "valid values pre-docker 1.9".
The advantage is having a network that spans multiple hosts?
Ah, I've misread the docs, sorry.
I have not run Swarm before 1.12, so I can't tell how different the networking is.
Is there maybe something relevant in the nomad agent logs?
No. With Docker 1.12 networking works differently in Swarm Mode. I haven't looked into details yet, but my guess is that overlay networks cannot be accessed in a container context anymore, but only through the Service API.
The rationale behind swarm mode is:
Hey, sorry for not having a better response, but we will need to investigate whether this is a use case we want to support. It is not very high on the priority list, to be frank, as we can only tackle so many things at a time.
I'd like to see nomad on top of swarm as well!
@taemon1337 Why?
Since I'm already using Docker Swarm, it makes sense to want to deploy any services on that hardware using swarm. If I can just deploy Nomad as a service across the whole swarm, I can use it to run jobs and still install Nomad on other platforms independently.
While Docker Swarm is a job scheduler, its main use case is clustering Docker nodes, not running scheduled jobs. I know that sounds weird to say, but Swarm does not yet have the ability to run a task once (see the batch-job sketch below), whereas Nomad's sole purpose looks to be job processing, and it couldn't care less about clustering Docker nodes.
I'm trying to build a Windows and Docker processing engine that I think Nomad would be great for, but I don't want to manage it independently of all my other applications (i.e. not on Swarm).
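For context on the "run a task once" point: in Nomad that maps to the batch scheduler. A rough sketch of a one-off job follows; the job name, image, and command are entirely made up for illustration:

```hcl
# Hypothetical one-off job; the batch scheduler runs it to completion once.
job "one-off-import" {
  datacenters = ["dc1"]
  type        = "batch"

  group "import" {
    task "import" {
      driver = "docker"

      config {
        image   = "registry.mycomp.com:5000/importer:latest"
        command = "/usr/local/bin/run-import.sh"
      }
    }
  }
}
```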
@taemon1337 I think you will create more work for yourself with that approach. Both Docker Swarm and Nomad assume they are supervisors of the host machine, so stacking them is going to break their assumptions.
I would install Nomad agents on the hosts and run Docker jobs through Nomad. This is my two cents! Of course everyone has different deployment needs, so do what's best for you!
That makes sense, I didn't realize Nomad needed that, thanks
@dadgar so will setting up the Nomad agent on a Docker host work with the overlay network scenario? Thanks!
I am sure you could make it work but I wouldn't pursue that route
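One note from later Docker releases, not verified anywhere in this thread: since Docker 1.13 an overlay network can be created with `--attachable`, which allows standalone containers (and therefore containers started by Nomad's docker driver) to join it. Assuming such a network exists, the driver config would just reference it by name:

```hcl
# Sketch only: assumes the overlay was created on a Swarm manager with
#   docker network create --driver overlay --attachable myoverlaynet
# so that standalone (non-service) containers are allowed to attach.
config {
  image        = "registry.mycomp.com:5000/myservice:latest"
  network_mode = "myoverlaynet"
}
```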
@dadgar, what about taking advantage of the routing mesh of Docker Swarm?
Scheduling jobs with Nomad on your Docker Swarm cluster would let you simply point to a load balancer that fronts all the Swarm nodes (where the Nomad jobs are being scheduled), and the routing mesh would do the rest.
Instead of having to set up nginx+consul-template or traefik or something like that to route to the proper ip:port of where your job is being run.
Would you consider, in that case, something like being able to leverage the Docker Swarm overlay network when creating a Nomad job of the service type?
@sebamontini I guess you would gain that stuff if this issue was resolved?
it could be a good way to help people that are already running in swarm to make the switch over to nomad :)
@verges-io has your approach changed in the meantime? I was actually evaluating just using Swarm as a scheduler with host networking because the mesh network looks very fragile, but then realized Swarm service discovery needs the mesh. Nomad integrating with Consul, which in turn integrates with fabiolb, seems much more reliable.
Hey there
Since this issue hasn't had any activity in a while - we're going to automatically close it in 30 days. If you're still seeing this issue with the latest version of Nomad, please respond here and we'll keep this open and take another look at this.
Thanks!
This issue will be auto-closed because there hasn't been any activity for a few months. Feel free to open a new one if you still experience this problem :+1:
Just wanted to comment here that it seems like a good idea to allow Nomad to choose or even create the network on a Docker Swarm. Currently I am evaluating Nomad because setting up k8s is very heavy and setting up Swarm is too light (it cannot run single tasks, cron, or privileged containers). Ideally I could use Nomad to schedule containers on my Swarm, and it would integrate with Consul for service discovery, but the IP addresses and ports could be non-exposed bridge, overlay, or ingress network IPs.
@dariusj18 you might want to take a look at the recent networking namespace improvements and Consul Connect integration we shipped in 0.10.
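For anyone landing here later, the 0.10-era setup mentioned above looks roughly like the sketch below (the service name, port, and image are placeholders): the group gets a bridge-mode network stanza and the service registers a Consul Connect sidecar.

```hcl
# Hypothetical Nomad 0.10 job using group networking and Consul Connect.
job "myservice" {
  datacenters = ["dc1"]

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "goodservice"
      port = "10080"

      connect {
        sidecar_service {}
      }
    }

    task "myservice" {
      driver = "docker"

      config {
        image = "registry.mycomp.com:5000/myservice:latest"
      }
    }
  }
}
```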
@tgross I'm using overlay networking on Docker Swarm now and want to migrate to the Consul Connect service mesh, but I have no idea how to monitor Nomad tasks with Prometheus in Consul Connect. I need to scrape metrics from every instance of a service, and there's no option for how to do that yet.
@AndrewChubatiuk would you be willing to open a new GitHub issue with that question and give some context? There might be a bit to unpack there and I want to make sure it gets the attention it needs.
Hi @AndrewChubatiuk, could you possibly share how you use a Swarm overlay together with Consul Connect and Nomad? I am trying similar things, but I have not yet succeeded; see https://bitbucket.org/jpsecher/nomad-consul-docker