Amazon-ecs-agent: Feature Request: Add support for --runtime parameter for docker run

Created on 16 Nov 2017 · 8 Comments · Source: aws/amazon-ecs-agent

nvidia-docker 2.0 was released a few days ago and it simplifies the way Docker containers can interact with GPUs. nvidia-docker CLI calls and mounting Nvidia drivers as container volumes can now be replaced with docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES='0,1'. It is also possible to access the GPU from images which do not have Nvidia libraries installed:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES='0,1' debian:stretch nvidia-smi
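
For anyone scripting this, here is a minimal Python sketch that assembles the same command line. The helper name nvidia_run_args is hypothetical and not part of Docker or nvidia-docker; it only builds the argument list, and actually running the result assumes nvidia-docker 2.0 is installed on the host.

```python
import shlex


def nvidia_run_args(image, command, devices=(0, 1)):
    """Build a `docker run` argv that selects the nvidia runtime.

    Hypothetical helper: it only assembles arguments and never calls Docker.
    """
    # NVIDIA_VISIBLE_DEVICES takes a comma-separated list of GPU indices.
    visible = ",".join(str(d) for d in devices)
    return [
        "docker", "run",
        "--runtime=nvidia",
        "-e", f"NVIDIA_VISIBLE_DEVICES={visible}",
        image,
        *command,
    ]


args = nvidia_run_args("debian:stretch", ["nvidia-smi"])
# shlex.join (Python 3.8+) renders the list as a shell-safe string.
print(shlex.join(args))
```

Keeping the command as a list means it can be passed straight to subprocess.run without going through a shell.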

I had a look at already opened issues and from this one
https://github.com/aws/amazon-ecs-agent/issues/502
I learnt that ECS only supports some docker run options.

I use AWS Batch (which uses ECS) with GPU workloads. The Nvidia libraries that I need to install in the images add significantly to the total image size and extend the time the ECS task takes to start.

Would you consider adding that option to the ECS agent configuration?

More about nvidia-docker 2.0:
https://github.com/NVIDIA/nvidia-docker/wiki/CUDA
https://aetros.com/blog/Machine%20Learning/14-11-2017-nvidia-docker2-released

Labels: kind/feature request, scope/ECS Service, workaround available

All 8 comments

@szalansky Thanks for the detailed use case. I've marked this as a feature request.

@szalansky Have you tried adding a /etc/docker/daemon.json file with the default runtime set to nvidia? I'm looking to try something similar if ECS won't have this support for now.

If the daemon.json file doesn't work, there might be a way to change the docker daemon command line config via a sysconfig file supported by ECS. http://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-debug-mode.html

@vinayan3 @szalansky , just tested updating the /etc/docker/daemon.json and it works well:

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

Thx for the tip.
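
If the instance already has a daemon.json with other settings, overwriting the whole file would drop them. Below is a rough Python sketch of merging the snippet above into an existing config instead. The function name with_nvidia_default and the "debug" key in the example are assumptions for illustration; the runtime path is the one from the comment above.

```python
import json

# Runtime entry copied from the daemon.json shown above.
NVIDIA_RUNTIME = {
    "path": "/usr/bin/nvidia-container-runtime",
    "runtimeArgs": [],
}


def with_nvidia_default(existing_json):
    """Return daemon.json text with nvidia registered as the default runtime.

    Hypothetical helper: settings already present in the input are kept.
    """
    # Treat an empty or missing file as an empty config.
    config = json.loads(existing_json) if existing_json.strip() else {}
    config["default-runtime"] = "nvidia"
    # Add the nvidia entry without discarding other registered runtimes.
    config.setdefault("runtimes", {})["nvidia"] = dict(NVIDIA_RUNTIME)
    return json.dumps(config, indent=4)


# Example input with one unrelated setting (an assumption for illustration).
print(with_nvidia_default('{"debug": true}'))
```

This only produces the file contents. Writing them to /etc/docker/daemon.json needs root, and Docker has to be restarted before the new default runtime applies.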

@akofman @vinayan3 Where would you make the /etc/docker/daemon.json change? Do I have to make the change in a custom AMI?

I have the same question as Vignesh. Does Batch / ECS support modifying /etc/docker/daemon.json? If yes, how?

Also, when modifying that file we need to restart docker to apply the changes. How can we make sure that happens with Batch / ECS?

@vighneshbirodkar @amjd You will have to make your own AMI based on the ECS AMI. I use Packer with Ansible to make modifications like this. https://www.packer.io/docs/builders/amazon.html

Hopefully, ECS gets the runtime parameter so AMI baking won't be needed for stuff like this.

@akofman I tried it and it works great!

The original request appears to be trying to solve Nvidia GPU support, which is now a released ECS feature. I'm tagging this with "workaround available" and closing this issue for now, since it relates to Nvidia support and doesn't cover more general runtime flag use cases.

Please open a new issue with specific use cases tied to runtime if you need. Thanks!
