Nvidia-docker: new container runtime support for containerd or CRI-O

Created on 3 Dec 2020 · 4 comments · Source: NVIDIA/nvidia-docker

All 4 comments

The nvidia-container-toolkit already supports both containerd and cri-o (though the documentation around this is a little sparse). We plan to have improved documentation and automated tooling around this well before Kubernetes 1.21 comes out.

The easiest way to enable this today is to:

  1. Follow the instructions here, but install the package nvidia-container-runtime instead of nvidia-docker2
    https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#id3
  2. Update your containerd config or set of cri-o hooks appropriately.
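For step (1) on a Debian/Ubuntu machine, the commands from the linked install guide look roughly like this, with nvidia-container-runtime swapped in for nvidia-docker2 (a sketch only; check the guide for your distribution's repo setup):

```shell
# Add the NVIDIA package repository for this distribution, then install
# nvidia-container-runtime instead of nvidia-docker2.
distribution=$(. /etc/os-release; echo "$ID$VERSION_ID")
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L "https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list" | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-runtime
```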

To accomplish (2), you can run the following.

For containerd:

$ sudo ctr image pull nvcr.io/nvidia/k8s/container-toolkit:1.4.0

$ sudo ctr run \
    --with-ns pid:/proc/1/ns/pid \
    --mount type=bind,options=rbind:rw,src=/run/containerd,dst=/run/containerd \
    --mount type=bind,options=rbind:rw,src=/etc/containerd,dst=/etc/containerd \
    nvcr.io/nvidia/k8s/container-toolkit:1.4.0 \
    toolkit \
    containerd setup \
        --socket=/run/containerd/containerd.sock \
        --config=/etc/containerd/config.toml \
        --runtime-class=nvidia \
        --set-as-default=true \
        /usr/bin

$ sudo ctr container rm toolkit

Under the hood, this is just doing the dirty work of updating your containerd config.toml file appropriately and then restarting the containerd service for you.
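For reference, the runtime-class entry the toolkit writes into config.toml looks roughly like this (a sketch for containerd's v2 config format; exact keys can vary by containerd version):

```toml
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd]
  # written because of --set-as-default=true
  default_runtime_name = "nvidia"

  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
    # the name comes from --runtime-class=nvidia
    runtime_type = "io.containerd.runc.v2"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
      # installed under the /usr/bin prefix passed as the last argument
      BinaryName = "/usr/bin/nvidia-container-runtime"
```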

You can pass different values for these flags depending on where your containerd instance is running (for example, a different socket path or config file location).

For cri-o:

$ sudo mkdir -p /usr/share/containers/oci/hooks.d

$ sudo bash -c '
cat > /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json << EOF
{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/bin/nvidia-container-toolkit",
        "args": ["nvidia-container-toolkit", "prestart"]
    },
    "when": {
        "always": true,
        "commands": [".*"]
    },
    "stages": ["prestart"]
}
EOF
'
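As a quick sanity check (my own suggestion, not from the cri-o docs), you can confirm the hook file parses as valid JSON before relying on it. The snippet below writes the same hook into a temp directory and validates it; on a real host, point it at /usr/share/containers/oci/hooks.d instead:

```shell
# Write the same hook JSON to a temp dir and validate it with python3.
hooks_d=$(mktemp -d)
cat > "$hooks_d/oci-nvidia-hook.json" << 'EOF'
{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/bin/nvidia-container-toolkit",
        "args": ["nvidia-container-toolkit", "prestart"]
    },
    "when": {
        "always": true,
        "commands": [".*"]
    },
    "stages": ["prestart"]
}
EOF
python3 -m json.tool "$hooks_d/oci-nvidia-hook.json" > /dev/null \
    && echo "oci-nvidia-hook.json is valid JSON"
```

A malformed hook file may simply be ignored rather than reported, so this check can save a debugging session. You may also need to restart cri-o (e.g. sudo systemctl restart crio) for it to pick up new hooks.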

I tried running the ctr run command above and got the following:

 sudo ctr run \
>     --with-ns pid:/proc/1/ns/pid \
>     --mount type=bind,options=rbind:rw,src=/run/containerd,dst=/run/containerd \
>     --mount type=bind,options=rbind:rw,src=/etc/containerd,dst=/etc/containerd \
>     nvcr.io/nvidia/k8s/container-toolkit:1.4.0 \
>     toolkit \
>     containerd setup \
>         --socket=/run/containerd/containerd.sock \
>         --config=/etc/containerd/config.toml \
>         --runtime-class=nvidia \
>         --set-as-default=true \
>         /usr/bin
ctr: image "nvcr.io/nvidia/k8s/container-toolkit:1.4.0": not found

Any ideas, @klueska?

With ctr, you need to explicitly pull the image before running it.

$ docker image ls | grep nvcr
nvcr.io/nvidia/k8s/container-toolkit   1.4.0    b13fa3054839   11 days ago   203MB
nvcr.io/nvidia/k8s/container-toolkit   latest   b13fa3054839   11 days ago   203MB

root@homelab-a:/home/evan# ctr run \
    --with-ns pid:/proc/1/ns/pid \
    --mount type=bind,options=rbind:rw,src=/run/containerd,dst=/run/containerd \
    --mount type=bind,options=rbind:rw,src=/etc/containerd,dst=/etc/containerd \
    nvcr.io/nvidia/k8s/container-toolkit:1.4.0 \
    toolkit \
    containerd setup \
        --socket=/run/containerd/containerd.sock \
        --config=/etc/containerd/config.toml \
        --runtime-class=nvidia \
        --set-as-default=true \
        /usr/bin
ctr: image "nvcr.io/nvidia/k8s/container-toolkit:1.4.0": not found

hmm still doesn't seem to work...

Edit: nvm, I'm dumb. I was using docker to pull the image, not ctr :( lol
