I'm new to Nomad and was interested to see whether it would work with the new Docker support in Windows Server 2016 TP4. Since this environment is still a preview rather than a production release, the problem may lie somewhere other than Nomad (I'm not sure). In general, though, Nomad supporting Docker on Windows once Server 2016 is released would be great!
Nomad v0.3.0
Windows Server 2016 TP4 (on Azure)
Task with docker driver fails
Apply this fix to Windows Server 2016 TP4
Run nomad in -dev mode
Generate example nomad file with nomad init
Remove constraint section and replace image redis:latest with microsoft/redis:latest
Run nomad run example.nomad
Relevant errors:
2016/03/11 21:05:39 [DEBUG] driver.docker: using 268435456 bytes memory for microsoft/redis:latest
2016/03/11 21:05:39 [DEBUG] driver.docker: using 500 cpu shares for microsoft/redis:latest
2016/03/11 21:05:39 [DEBUG] driver.docker: binding directories []string{"C:\\Users\\paul\\AppData\\Local\\Temp\\2\\NomadClient708162966\\71030d43-8622-8c2c-2bec-b5ff91e4c320\\alloc:/alloc:rw,z", "C:\\Users\\paul\\AppData\\Local\\Temp\\2\\NomadClient708162966\\71030d43-8622-8c2c-2bec-b5ff91e4c320\\redis:/local:rw,Z"} for microsoft/redis:latest
2016/03/11 21:05:39 [DEBUG] driver.docker: networking mode not specified; defaulting to bridge
2016/03/11 21:05:39 [DEBUG] driver.docker: allocated port 10.0.0.4:37417 -> 6379 (mapped)
2016/03/11 21:05:39 [DEBUG] driver.docker: exposed port 6379
2016/03/11 21:05:39 [DEBUG] driver.docker: setting container name to: redis-71030d43-8622-8c2c-2bec-b5ff91e4c320
2016/03/11 21:05:39 [ERR] driver.docker: failed to create container from image microsoft/redis:latest: API error (500): Invalid bind mount spec "C:\\Users\\paul\\AppData\\Local\\Temp\\2\\NomadClient708162966\\71030d43-8622-8c2c-2bec-b5ff91e4c320\\alloc:/alloc:rw,z": volumeinvalid: Invalid volume specification: 'C:\Users\paul\AppData\Local\Temp\2\NomadClient708162966\71030d43-8622-8c2c-2bec-b5ff91e4c320\alloc:/alloc:rw,z'
2016/03/11 21:05:39 [DEBUG] plugin: C:\Users\paul\nomad.exe: plugin process exited
2016/03/11 21:05:39 [ERR] client: failed to start task 'redis' for alloc '71030d43-8622-8c2c-2bec-b5ff91e4c320': Failed to create container from image microsoft/redis:latest: API error (500): Invalid bind mount spec "C:\\Users\\paul\\AppData\\Local\\Temp\\2\\NomadClient708162966\\71030d43-8622-8c2c-2bec-b5ff91e4c320\\alloc:/alloc:rw,z": volumeinvalid: Invalid volume specification: 'C:\Users\paul\AppData\Local\Temp\2\NomadClient708162966\71030d43-8622-8c2c-2bec-b5ff91e4c320\alloc:/alloc:rw,z'
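The bind strings in the log have the form host:container:mode, and the host part here begins with a drive letter (C:), so it contains a colon of its own. The thread does not confirm what exactly the Docker daemon rejects, so the following is only a minimal Go sketch of one plausible complication, assuming a parser that splits on every colon. Both helper names are hypothetical and are not part of Nomad or Docker, and the path is shortened for readability.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBindNaive splits a bind spec on every colon, the way a parser
// written only with Unix paths in mind might do. (Hypothetical helper.)
func splitBindNaive(spec string) []string {
	return strings.Split(spec, ":")
}

// isLetter reports whether b is an ASCII letter.
func isLetter(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// splitBindDriveAware keeps a leading Windows drive letter ("C:")
// attached to the host path before splitting the rest on colons.
// (Hypothetical helper, for illustration only.)
func splitBindDriveAware(spec string) []string {
	if len(spec) >= 2 && isLetter(spec[0]) && spec[1] == ':' {
		parts := strings.Split(spec[2:], ":")
		parts[0] = spec[:2] + parts[0]
		return parts
	}
	return strings.Split(spec, ":")
}

func main() {
	// Shortened stand-in for the alloc bind shown in the log above.
	spec := `C:\Temp\alloc:/alloc:rw,z`

	fmt.Printf("naive:       %d fields %q\n", len(splitBindNaive(spec)), splitBindNaive(spec))
	fmt.Printf("drive-aware: %d fields %q\n", len(splitBindDriveAware(spec)), splitBindDriveAware(spec))
}
```

The naive split yields four fields instead of the expected three because the drive letter is cut off from the rest of the host path. Whether the container-side path (/alloc) and the rw,z mode flags are accepted by the Windows daemon is a separate question that this sketch does not address.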
@pofallon We are definitely going to support Windows! We will look into this and get this fixed when we have some bandwidth.
We're doing some experimentation with this, too. We have made some local changes that now allow us to actually start a Docker container, but Nomad panics on nil data shortly afterwards and then seems to kill the container.
The required changes we've identified so far:
The errors we're seeing now seem to happen during service registration (e.g. exec.SyncServices(consulContext(d.config, container.ID))), or at least commenting this part out keeps the panic from happening immediately. We're running with nomad agent -dev and consul agent -dev -bind 127.0.0.1.
If we manage to get it working somewhat reliably, we'll try to tidy up our changes into a PR, but currently we've mostly just changed "stuff that only works on Linux" to "stuff that only works on Windows"... :P
So, after a fair bit of digging, it seems this is the line of code that crashes:
cs, err := consul.NewConsulService(ctx.ConsulConfig, e.logger, e.ctx.AllocID)
(https://github.com/hashicorp/nomad/blob/master/client/driver/executor/executor.go#L426)
e.ctx is nil here, at least when called via the RPC executor (the same code is called as part of startup, and at that point there is no issue). My Go-fu is pretty primitive, so I haven't yet understood why this happens, but I will continue looking tomorrow unless it is trivial for someone else to figure out.
We're basing our edits on the v0.3.1-rc2 tag, btw.
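The line quoted above reads ConsulConfig from the ctx argument but AllocID from the e.ctx field, so it panics whenever the field was never populated, even if the argument is valid. A minimal Go sketch of that failure mode follows. The types and method names are hypothetical stand-ins and not Nomad's real definitions, and the guarded variant is just one possible way to avoid the panic, not the fix that was eventually applied.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecutorContext and Executor are hypothetical stand-ins for the
// real Nomad types; only the fields needed for the sketch exist.
type ExecutorContext struct {
	AllocID      string
	ConsulConfig string
}

type Executor struct {
	ctx *ExecutorContext // assumed to be set on one code path only
}

// syncUnsafe mirrors the shape of the crashing line: it takes ctx as
// an argument but dereferences the e.ctx field for AllocID.
func (e *Executor) syncUnsafe(ctx *ExecutorContext) string {
	return ctx.ConsulConfig + "/" + e.ctx.AllocID // panics if e.ctx is nil
}

// syncGuarded uses the argument consistently and rejects a nil one.
func (e *Executor) syncGuarded(ctx *ExecutorContext) (string, error) {
	if ctx == nil {
		return "", errors.New("nil executor context")
	}
	return ctx.ConsulConfig + "/" + ctx.AllocID, nil
}

// panics reports whether calling f triggers a panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	e := &Executor{} // e.ctx left nil, as in the RPC case described above
	ctx := &ExecutorContext{AllocID: "71030d43", ConsulConfig: "consul"}

	fmt.Println("unsafe call panics:", panics(func() { e.syncUnsafe(ctx) }))

	id, err := e.syncGuarded(ctx)
	fmt.Println("guarded call:", id, err)
}
```

Reading everything from the argument sidesteps the question of which code path was supposed to initialize e.ctx, but it does not explain why the field is unset when the call arrives over RPC.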
I'm also interested in running Nomad on Windows Server 2016.
@carlpett - did you manage to get it running in the end? If not, can you share the code for your partial solution?
I think this issue can be closed. The problems described by @carlpett and @pofallon are resolved, and the current status is tracked in https://github.com/hashicorp/nomad/issues/1488
Sounds good! Good work @mwieczorek