_The template below is mostly useful for bug reports and support questions. Feel free to remove anything which doesn't apply to you and add more information where it makes sense._
_Also, before reporting a new issue, please make sure that:_
The previous steps are the same as in the tutorial. After installing nvidia-container-toolkit (sudo apt-get install -y nvidia-container-toolkit), I always get an error when running the test examples.
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=10.0, please update your driver to a newer version, or use an earlier cuda container\\n\\"\"": unknown.
ERRO[0018] error waiting for container: context canceled
I also tried docker run --gpus 1 nvidia/cuda nvidia-smi and the error is similar:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=10.1, please update your driver to a newer version, or use an earlier cuda container\\n\\"\"": unknown.
ERRO[0124] error waiting for container: context canceled
nvidia-container-cli -k -d /dev/tty info
uname -a
dmesg
nvidia-smi -a
docker version
dpkg -l '*nvidia*' _or_ rpm -qa '*nvidia*'
nvidia-container-cli -V
Hi @tytcc - what NVIDIA driver version are you running on your Linux system? You should have at least r410 to run CUDA 10.0 containers and r418 to run CUDA 10.1 containers.
Please provide the output of nvidia-smi
Thanks for your answer, @dualvtable.
Now I know my NVIDIA driver version is too old.
@tytcc I also faced the same problem, on an Ubuntu 16.04 machine. I have the latest driver 440.64.00 installed, and when I run the example
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
I get this error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: cuda error: unknown error\\\\n\\\"\"": unknown.
ERRO[0001] error waiting for container: context canceled
Getting the same output after installing nvidia-container-toolkit.
***@pop-os:~$ docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: detection error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0000] error waiting for container: context canceled
Tried the steps mentioned in #1114 but still no luck.
nvidia-smi output:
NVIDIA-SMI 440.64 Driver Version: 440.64 CUDA Version: 10.2
0 Quadro M2000M Off
OS details:
***@pop-os:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Pop!_OS 18.04 LTS
Release: 18.04
Codename: bionic
I am seeing the same with driver version 440.82:
# docker run \
--rm \
--runtime=nvidia \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=all \
nvidia/cuda nvidia-smi
/run/torcx/bin/docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"process_linux.go:385: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli.real: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
# uname -a
Linux core1 4.19.106-coreos #1 SMP Wed Feb 26 21:43:18 -00 2020 x86_64 Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz GenuineIntel GNU/Linux
# dockerd --version
Docker version 18.06.3-ce, build d7080c1
# /opt/drivers/nvidia/bin/nvidia-smi
Fri Apr 17 12:54:34 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2060 Off | 00000000:02:00.0 Off | N/A |
| 0% 45C P8 9W / 160W | 0MiB / 5932MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Getting the same error:
keo7@home-desktop:~$ uname -a
Linux home-desktop 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1+deb10u1 (2020-04-27) x86_64 GNU/Linux
keo7@home-desktop:~$ dockerd --version
Docker version 19.03.8, build afacb8b7f0
keo7@home-desktop:~$ nvidia-smi
Fri May 8 22:39:56 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82 Driver Version: 440.82 CUDA Version: N/A |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 TITAN V On | 00000000:26:00.0 On | N/A |
| 28% 43C P2 38W / 250W | 678MiB / 12066MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
@KeironO I wouldn't bother using the nvidia runtime, in my opinion. It's disruptive to the setup of your distribution's runc (or whatever OCI runtime you have), it clearly has some issues, and all it does is wrap runc with some helpers controlled by environment variables (at least from what I can tell).
If you can find out what your application needs you should be able to expose the devices and libraries from the host manually without having to have an extra binary to manage.
If there are other benefits I'd be interested to know as I have my GPU accelerated workloads running without needing to change my host's Docker setup.
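The manual approach described above can be sketched as a plain docker run that exposes the device nodes and bind-mounts the driver libraries by hand. Everything below is an assumption for illustration (device nodes, library paths, image, and driver version all vary per host), not a supported setup:

```shell
# Hypothetical sketch: GPU access without the NVIDIA runtime, by exposing
# devices and driver libraries manually. Adjust paths to your host.
docker run --rm \
  --device /dev/nvidiactl \
  --device /dev/nvidia-uvm \
  --device /dev/nvidia0 \
  -v /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.440.33.01:/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1:ro \
  -v /usr/lib/x86_64-linux-gnu/libcuda.so.440.33.01:/usr/lib/x86_64-linux-gnu/libcuda.so.1:ro \
  -v /usr/bin/nvidia-smi:/usr/bin/nvidia-smi:ro \
  ubuntu:18.04 nvidia-smi
```

Note that the driver version baked into the library paths (440.33.01 here) must match the host driver and be kept in sync across driver updates.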
I am also getting the same error with the same setup as @KeironO.
Hi there!
nvidia-container-cli.real: initialization error: driver error: failed to process request\\n\\"\"": unknown.
@billwhiteley @KeironO
Most of the time this issue is linked to an incorrect driver installation or incorrect driver loading. We can usually figure out which one it is when the issue template is filled in :)
Unfortunately, being able to run nvidia-smi doesn't mean that your driver is fully loaded, and you'll see issues later down the line (such as when running CUDA code or TensorFlow).
> I wouldn't bother using the nvidia runtime in my opinion, it's disruptive to the setup of your distribution's runc (or whatever OCI runtime you have), clearly it has some issues and all it does is wrap runc with some helpers controlled by environment variables (at least from what I can tell).
> If you can find out what your application needs you should be able to expose the devices and libraries from the host manually without having to have an extra binary to manage.
The NVIDIA runtime is only expected to be installed in a Kubernetes environment. For Docker alone, only the nvidia-container-toolkit is required (see the README).
As for implementing what the NVIDIA Container Toolkit does yourself, you can certainly do that; however, it would probably have a high upfront cost for you to understand the details of the NVIDIA driver and userland architecture, and I'm not sure you want to be maintaining such a piece of software :) You would also miss out on new driver features as they come out, and if the CUDA or NVIDIA driver model changes you'd have to rewrite that software.
Without bringing up enterprise or general support, if your use case is narrow enough and you don't mind paying that maintenance cost, that's definitely an option :)
For a Kubernetes environment the NVIDIA runtime provides even less benefit: all you need are the NVIDIA drivers/libraries on the host and this DaemonSet, and then GPUs can be requested in the normal Kubernetes way:
resources:
limits:
nvidia.com/gpu: 1
Relevant Kubernetes documentation is here.
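As a fuller illustration of the resources snippet above, a minimal pod spec requesting one GPU might look like this (the pod name and image are placeholders; the resource key nvidia.com/gpu is the standard one advertised by the device plugin):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test            # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:10.0-base   # placeholder image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1          # scheduled onto a node advertising a GPU
```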
If your libraries aren't in the default location (/home/kubernetes/bin/nvidia for some reason) you can specify the location manually using the -host-path flag. You may need to add an NVIDIA entry to your container's /etc/ld.so.conf.d and run ldconfig so that the libraries can be found by your application.
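The ld.so.conf.d step just mentioned can be sketched as a Dockerfile fragment; the library path below is an assumption and should match whatever you pass as -container-path:

```dockerfile
# Hypothetical fragment: register the mounted NVIDIA libraries with the
# dynamic linker so the application can resolve them at startup.
RUN echo "/usr/local/nvidia/lib64" > /etc/ld.so.conf.d/nvidia.conf \
    && ldconfig
```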
Here's the full usage:
Usage of /usr/bin/nvidia-gpu-device-plugin:
-alsologtostderr
log to standard error as well as files
-container-path string
Path on the container that mounts '-host-path' (default "/usr/local/nvidia")
-container-vulkan-icd-path string
Path on the container that mounts '-host-vulkan-icd-path' (default "/etc/vulkan/icd.d")
-host-path string
Path on the host that contains nvidia libraries. This will be mounted inside the container as '-container-path' (default "/home/kubernetes/bin/nvidia")
-host-vulkan-icd-path string
Path on the host that contains the Nvidia Vulkan installable client driver. This will be mounted inside the container as '-container-vulkan-icd-path' (default "/home/kubernetes/bin/nvidia/vulkan/icd.d")
-log_backtrace_at value
when logging hits line file:N, emit a stack trace
-log_dir string
If non-empty, write log files in this directory
-logtostderr
log to standard error instead of files
-plugin-directory string
The directory path to create plugin socket (default "/device-plugin")
-stderrthreshold value
logs at or above this threshold go to stderr
-v value
log level for V logs
-vmodule value
comma-separated list of pattern=N settings for file-filtered logging
I'm having the same issue.
I0511 15:53:14.054294 27377 nvc.c:281] initializing library context (version=1.0.7, build=b71f87c04b8eca8a16bf60995506c35c937347d9)
I0511 15:53:14.054490 27377 nvc.c:255] using root /
I0511 15:53:14.054525 27377 nvc.c:256] using ldcache /etc/ld.so.cache
I0511 15:53:14.054595 27377 nvc.c:257] using unprivileged user 1000:1000
W0511 15:53:14.056714 27378 nvc.c:186] failed to set inheritable capabilities
W0511 15:53:14.056939 27378 nvc.c:187] skipping kernel modules load due to failure
I0511 15:53:14.058134 27379 driver.c:133] starting driver service
I0511 15:53:14.107994 27377 nvc_info.c:438] requesting driver information with ''
I0511 15:53:14.109434 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.440.33.01
I0511 15:53:14.109515 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.440.33.01
I0511 15:53:14.109800 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.440.33.01 over /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.440.33.01
I0511 15:53:14.110348 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.440.33.01
I0511 15:53:14.111277 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.440.33.01
I0511 15:53:14.112608 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.440.33.01
I0511 15:53:14.114313 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.440.33.01
I0511 15:53:14.114387 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.440.33.01
I0511 15:53:14.115208 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.440.33.01
I0511 15:53:14.115956 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.440.33.01
I0511 15:53:14.116012 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.440.33.01
I0511 15:53:14.116075 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.440.33.01
I0511 15:53:14.116886 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.440.33.01
I0511 15:53:14.117984 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.440.33.01
I0511 15:53:14.118698 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.440.33.01
I0511 15:53:14.118783 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.440.33.01
I0511 15:53:14.119561 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.440.33.01
I0511 15:53:14.119626 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.440.33.01
I0511 15:53:14.120347 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.440.33.01
I0511 15:53:14.121159 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.440.33.01
I0511 15:53:14.121611 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.440.33.01
I0511 15:53:14.121935 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.440.33.01
I0511 15:53:14.122775 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.440.33.01
I0511 15:53:14.123599 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.440.33.01
I0511 15:53:14.123773 27377 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.440.33.01
I0511 15:53:14.125503 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.440.48.02
I0511 15:53:14.126273 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.440.48.02
I0511 15:53:14.127346 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-opencl.so.440.48.02
I0511 15:53:14.128794 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-ml.so.440.48.02
I0511 15:53:14.130323 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-fbc.so.440.48.02
I0511 15:53:14.132006 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-fatbinaryloader.so.440.48.02
I0511 15:53:14.133270 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-encode.so.440.48.02
I0511 15:53:14.135013 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvidia-compiler.so.440.48.02
I0511 15:53:14.136295 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libnvcuvid.so.440.48.02
I0511 15:53:14.137890 27377 nvc_info.c:154] skipping /usr/lib/i386-linux-gnu/libcuda.so.440.48.02
W0511 15:53:14.138059 27377 nvc_info.c:303] missing library libvdpau_nvidia.so
W0511 15:53:14.138076 27377 nvc_info.c:307] missing compat32 library libnvidia-ml.so
W0511 15:53:14.138088 27377 nvc_info.c:307] missing compat32 library libnvidia-cfg.so
W0511 15:53:14.138098 27377 nvc_info.c:307] missing compat32 library libcuda.so
W0511 15:53:14.138108 27377 nvc_info.c:307] missing compat32 library libnvidia-opencl.so
W0511 15:53:14.138124 27377 nvc_info.c:307] missing compat32 library libnvidia-ptxjitcompiler.so
W0511 15:53:14.138148 27377 nvc_info.c:307] missing compat32 library libnvidia-fatbinaryloader.so
W0511 15:53:14.138166 27377 nvc_info.c:307] missing compat32 library libnvidia-compiler.so
W0511 15:53:14.138185 27377 nvc_info.c:307] missing compat32 library libvdpau_nvidia.so
W0511 15:53:14.138205 27377 nvc_info.c:307] missing compat32 library libnvidia-encode.so
W0511 15:53:14.138227 27377 nvc_info.c:307] missing compat32 library libnvidia-opticalflow.so
W0511 15:53:14.138250 27377 nvc_info.c:307] missing compat32 library libnvcuvid.so
W0511 15:53:14.138267 27377 nvc_info.c:307] missing compat32 library libnvidia-eglcore.so
W0511 15:53:14.138287 27377 nvc_info.c:307] missing compat32 library libnvidia-glcore.so
W0511 15:53:14.138308 27377 nvc_info.c:307] missing compat32 library libnvidia-tls.so
W0511 15:53:14.138328 27377 nvc_info.c:307] missing compat32 library libnvidia-glsi.so
W0511 15:53:14.138349 27377 nvc_info.c:307] missing compat32 library libnvidia-fbc.so
W0511 15:53:14.138367 27377 nvc_info.c:307] missing compat32 library libnvidia-ifr.so
W0511 15:53:14.138384 27377 nvc_info.c:307] missing compat32 library libnvidia-rtcore.so
W0511 15:53:14.138405 27377 nvc_info.c:307] missing compat32 library libnvoptix.so
W0511 15:53:14.138426 27377 nvc_info.c:307] missing compat32 library libGLX_nvidia.so
W0511 15:53:14.138444 27377 nvc_info.c:307] missing compat32 library libEGL_nvidia.so
W0511 15:53:14.138468 27377 nvc_info.c:307] missing compat32 library libGLESv2_nvidia.so
W0511 15:53:14.138491 27377 nvc_info.c:307] missing compat32 library libGLESv1_CM_nvidia.so
W0511 15:53:14.138511 27377 nvc_info.c:307] missing compat32 library libnvidia-glvkspirv.so
W0511 15:53:14.138531 27377 nvc_info.c:307] missing compat32 library libnvidia-cbl.so
I0511 15:53:14.140096 27377 nvc_info.c:233] selecting /usr/bin/nvidia-smi
I0511 15:53:14.140154 27377 nvc_info.c:233] selecting /usr/bin/nvidia-debugdump
I0511 15:53:14.140212 27377 nvc_info.c:233] selecting /usr/bin/nvidia-persistenced
I0511 15:53:14.140269 27377 nvc_info.c:233] selecting /usr/bin/nvidia-cuda-mps-control
I0511 15:53:14.140324 27377 nvc_info.c:233] selecting /usr/bin/nvidia-cuda-mps-server
I0511 15:53:14.140395 27377 nvc_info.c:370] listing device /dev/nvidiactl
I0511 15:53:14.140415 27377 nvc_info.c:370] listing device /dev/nvidia-uvm
I0511 15:53:14.140432 27377 nvc_info.c:370] listing device /dev/nvidia-uvm-tools
I0511 15:53:14.140449 27377 nvc_info.c:370] listing device /dev/nvidia-modeset
I0511 15:53:14.140520 27377 nvc_info.c:274] listing ipc /run/nvidia-persistenced/socket
W0511 15:53:14.140573 27377 nvc_info.c:278] missing ipc /tmp/nvidia-mps
I0511 15:53:14.140594 27377 nvc_info.c:494] requesting device information with ''
I0511 15:53:14.147767 27377 nvc_info.c:524] listing device /dev/nvidia0 (GPU-23fcb2ab-a6c2-b9e3-f455-6bf92a57b371 at 00000000:03:00.0)
NVRM version: 440.33.01
CUDA version: 10.2
Device Index: 0
Device Minor: 0
Model: GeForce 920MX
Brand: GeForce
GPU UUID: GPU-23fcb2ab-a6c2-b9e3-f455-6bf92a57b371
Bus Location: 00000000:03:00.0
Architecture: 5.0
I0511 15:53:14.147861 27377 nvc.c:318] shutting down library context
I0511 15:53:14.148492 27379 driver.c:192] terminating driver service
I0511 15:53:14.234076 27377 driver.c:233] driver service terminated successfully
kernel version
Linux hema 5.3.0-51-generic #44~18.04.2-Ubuntu SMP Thu Apr 23 14:27:18 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Mon May 11 17:55:40 2020
Driver Version : 440.33.01
CUDA Version : 10.2
Attached GPUs : 1
GPU 00000000:03:00.0
Product Name : GeForce 920MX
Product Brand : GeForce
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-23fcb2ab-a6c2-b9e3-f455-6bf92a57b371
Minor Number : 0
VBIOS Version : 82.08.5A.00.0D
MultiGPU Board : No
Board ID : 0x300
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x03
Device : 0x00
Domain : 0x0000
Device Id : 0x134F10DE
Bus Id : 00000000:03:00.0
Sub System Id : 0x39F117AA
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 4x
Current : 4x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 493000 KB/s
Rx Throughput : 3000 KB/s
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : N/A
HW Power Brake Slowdown : N/A
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 2004 MiB
Used : 870 MiB
Free : 1134 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 3 MiB
Free : 253 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Temperature
GPU Current Temp : 41 C
GPU Shutdown Temp : 99 C
GPU Slowdown Temp : 94 C
GPU Max Operating Temp : 98 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 993 MHz
SM : 993 MHz
Memory : 900 MHz
Video : 973 MHz
Applications Clocks
Graphics : 967 MHz
Memory : 900 MHz
Default Applications Clocks
Graphics : 965 MHz
Memory : 900 MHz
Max Clocks
Graphics : 993 MHz
SM : 993 MHz
Memory : 900 MHz
Video : 973 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes
Process ID : 1347
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 34 MiB
Process ID : 1655
Type : G
Name : /usr/bin/gnome-shell
Used GPU Memory : 76 MiB
Process ID : 2765
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 184 MiB
Process ID : 2951
Type : G
Name : /usr/bin/gnome-shell
Used GPU Memory : 273 MiB
Process ID : 3830
Type : G
Name : /opt/google/chrome/chrome --type=gpu-process --field-trial-handle=17903442744480519122,5081937925041455948,131072 --gpu-preferences=MAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAAABgAAAAAAAQAAAAAAAAAAAAAAAAAAAACAAAAAAAAAA= --shared-files
Used GPU Memory : 292 MiB
docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:25:46 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:24:19 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
nvidia packages
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============================================-============================-============================-=================================================================================================
un libgldispatch0-nvidia <none> <none> (no description available)
ii libnvidia-cfg1-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA binary OpenGL/GLX configuration library
un libnvidia-cfg1-any <none> <none> (no description available)
un libnvidia-common <none> <none> (no description available)
ii libnvidia-common-440 440.82-0ubuntu0~0.18.04.1 all Shared files used by the NVIDIA libraries
rc libnvidia-compute-435:amd64 435.21-0ubuntu0.18.04.2 amd64 NVIDIA libcompute package
ii libnvidia-compute-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-container-tools 1.0.7-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.0.7-1 amd64 NVIDIA container runtime library
un libnvidia-decode <none> <none> (no description available)
ii libnvidia-decode-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA Video Decoding runtime libraries
un libnvidia-encode <none> <none> (no description available)
ii libnvidia-encode-440:amd64 440.33.01-0ubuntu1 amd64 NVENC Video Encoding runtime library
un libnvidia-fbc1 <none> <none> (no description available)
ii libnvidia-fbc1-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
un libnvidia-gl <none> <none> (no description available)
ii libnvidia-gl-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un libnvidia-ifr1 <none> <none> (no description available)
ii libnvidia-ifr1-440:amd64 440.33.01-0ubuntu1 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
un libnvidia-ml1 <none> <none> (no description available)
un nvidia-304 <none> <none> (no description available)
un nvidia-340 <none> <none> (no description available)
un nvidia-384 <none> <none> (no description available)
un nvidia-390 <none> <none> (no description available)
un nvidia-common <none> <none> (no description available)
rc nvidia-compute-utils-435 435.21-0ubuntu0.18.04.2 amd64 NVIDIA compute utilities
ii nvidia-compute-utils-440 440.33.01-0ubuntu1 amd64 NVIDIA compute utilities
ii nvidia-container-runtime 3.1.4-1 amd64 NVIDIA container runtime
un nvidia-container-runtime-hook <none> <none> (no description available)
ii nvidia-container-toolkit 1.0.5-1 amd64 NVIDIA container runtime hook
rc nvidia-dkms-435 435.21-0ubuntu0.18.04.2 amd64 NVIDIA DKMS package
ii nvidia-dkms-440 440.33.01-0ubuntu1 amd64 NVIDIA DKMS package
un nvidia-dkms-kernel <none> <none> (no description available)
un nvidia-docker <none> <none> (no description available)
rc nvidia-docker2 2.2.2-1 all nvidia-docker CLI wrapper
ii nvidia-driver-440 440.33.01-0ubuntu1 amd64 NVIDIA driver metapackage
un nvidia-driver-binary <none> <none> (no description available)
un nvidia-kernel-common <none> <none> (no description available)
rc nvidia-kernel-common-435 435.21-0ubuntu0.18.04.2 amd64 Shared files used with the kernel module
ii nvidia-kernel-common-440 440.33.01-0ubuntu1 amd64 Shared files used with the kernel module
un nvidia-kernel-source <none> <none> (no description available)
un nvidia-kernel-source-435 <none> <none> (no description available)
ii nvidia-kernel-source-440 440.33.01-0ubuntu1 amd64 NVIDIA kernel source package
un nvidia-legacy-304xx-vdpau-driver <none> <none> (no description available)
un nvidia-legacy-340xx-vdpau-driver <none> <none> (no description available)
un nvidia-libopencl1-dev <none> <none> (no description available)
ii nvidia-modprobe 440.33.01-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
un nvidia-opencl-icd <none> <none> (no description available)
un nvidia-persistenced <none> <none> (no description available)
ii nvidia-prime 0.8.8.2 all Tools to enable NVIDIA's Prime
ii nvidia-settings 440.64-0ubuntu0~0.18.04.1 amd64 Tool for configuring the NVIDIA graphics driver
un nvidia-settings-binary <none> <none> (no description available)
un nvidia-smi <none> <none> (no description available)
un nvidia-utils <none> <none> (no description available)
ii nvidia-utils-440 440.33.01-0ubuntu1 amd64 NVIDIA driver support binaries
un nvidia-vdpau-driver <none> <none> (no description available)
ii xserver-xorg-video-nvidia-440 440.33.01-0ubuntu1 amd64 NVIDIA binary Xorg driver
Meeting the same problem; any solutions?
@elliothe I uninstalled CUDA, the NVIDIA drivers, nvidia-docker, and Docker, then installed everything again from scratch. This solved the problem for me.
@HemaZ Thanks for the solution. I may do the same if I find no alternative.
Got the same problem; fixed it by running sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit.
OS/docker info:
$ dockerd --version
Docker version 19.03.8, build afacb8b7f0
$ uname -a
Linux x 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
@albin3 That didn't fix it for me.
I followed all instructions in https://developer.nvidia.com/blog/announcing-cuda-on-windows-subsystem-for-linux-2 yet still seeing:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:297: applying cgroup configuration for process caused \"mountpoint for devices not found\"": unknown.
Same issue here.
Trying to run this repository's demo, I got the following error:
$ docker-compose up
ERROR: for vehicle_counting Cannot start service vehicle_counting: OCI runtime create failed: container_linux.go:349: starting
container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 cause
\\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown
ERROR: for vehicle_counting Cannot start service vehicle_counting: OCI runtime create failed: container_linux.go:349: starting
container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused
\\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process
request\\\\n\\\"\"": unknown
Tried to install nvidia-container-toolkit as suggested here, but it's still not working.
Here's my $ docker info output
Client:
Debug Mode: false
Server:
Containers: 3
Running: 0
Paused: 0
Stopped: 3
Images: 7
Server Version: 19.03.12
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: nvidia runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.0-42-generic
Operating System: Ubuntu 20.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.844GiB
Name: geo-vbox
ID: PLLH:2H5F:NGLW:52TT:2Q77:AUHV:S3PX:3THU:XIEA:NYMX:FEYD:E2AT
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Any idea how to solve it?
Same issue. I'm working with Docker version 19.03.12, build 48a66213fe inside WSL 2 on Windows 10.
I also have the same problem, working with Docker version 19.03.12inside WSL 2 emulation for Windows 10. Kernal Version: 4.19.121-microsoft-standard.
Having same issue with AGX Xavier:
https://github.com/NVIDIA/nvidia-docker/issues/1203#issuecomment-670640220
Exact same issue here. Followed the NVIDIA guide.
Windows 10 version 1909 build 18363.1049
Docker version 19.03.12
WSL2
Ubuntu 18.04 and 20.04
Kernel Version: 4.19.121-microsoft-standard
Windows NVIDIA driver 455.41
CUDA 11.1
The output of
nvidia-container-cli -k -d /dev/tty info
I0821 16:21:57.950311 5686 nvc.c:282] initializing library context (version=1.3.0, build=af0220ff5c503d9ac6a1b5a491918229edbb37a4)
I0821 16:21:57.950354 5686 nvc.c:256] using root /
I0821 16:21:57.950358 5686 nvc.c:257] using ldcache /etc/ld.so.cache
I0821 16:21:57.950376 5686 nvc.c:258] using unprivileged user 1000:1000
I0821 16:21:57.950389 5686 nvc.c:299] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0821 16:21:57.950454 5686 nvc.c:301] dxcore initialization failed, continuing assuming a non-WSL environment
W0821 16:21:57.950514 5686 nvc.c:172] failed to detect NVIDIA devices
W0821 16:21:57.950641 5687 nvc.c:187] failed to set inheritable capabilities
W0821 16:21:57.950680 5687 nvc.c:188] skipping kernel modules load due to failure
I0821 16:21:57.950836 5688 driver.c:101] starting driver service
E0821 16:21:57.950966 5688 driver.c:161] could not start driver service: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory
I0821 16:21:57.951083 5686 driver.c:196] driver service terminated successfully
nvidia-container-cli: initialization error: driver error: failed to process request
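The load failure in the log above can be checked directly. This is my own suggested check, not part of any NVIDIA tool: libnvidia-ml.so.1 is installed by the NVIDIA driver itself, not by CUDA or docker, so if the loader cannot find it, the driver install (or, under WSL, the Windows-side driver mount) is the problem.

```shell
# libnvidia-ml.so.1 ships with the NVIDIA driver package; nvidia-container-cli
# loads it at startup. If it is not in the loader cache, fix the driver first.
ldconfig -p | grep libnvidia-ml.so.1 \
    || echo "libnvidia-ml.so.1 not found in ld cache"
```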
Same here, stuck.
Same here, stuck.
The original issue described here, which has an error of:
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=10.1, please update your driver to a newer version, or use an earlier cuda container
is due to the fact that the original poster had an NVIDIA driver that was too old to run CUDA 10.1.
The poster acknowledged this and closed the issue on March 21st.
https://github.com/NVIDIA/nvidia-docker/issues/1225#issuecomment-601990042
Since that time, this issue has been reopened and commented on many times with unrelated error messages.
Since the original issue was resolved, I am going to close this issue again, and encourage you to open a new issue if you are still having problems with different errors.
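To make the closing explanation concrete: the requirement error fires when the host driver version is below the minimum that the CUDA release inside the image needs. A rough sketch of that comparison, with minimum-driver values taken from NVIDIA's Linux x86_64 CUDA compatibility table (treat the exact numbers as illustrative):

```python
# Sketch of the check behind "unsatisfied condition: cuda>=10.1".
# Minimum Linux x86_64 driver per CUDA release (illustrative values).
MIN_DRIVER = {"10.0": (410, 48), "10.1": (418, 39), "11.0": (450, 36)}

def driver_satisfies(driver_version: str, cuda_version: str) -> bool:
    """Return True if the installed driver can run the given CUDA release."""
    installed = tuple(int(part) for part in driver_version.split(".")[:2])
    return installed >= MIN_DRIVER[cuda_version]

print(driver_satisfies("418.39", "10.1"))  # new enough for cuda>=10.1
print(driver_satisfies("410.48", "10.1"))  # too old: triggers the error above
```

If the driver cannot be upgraded, the other way out (as the error itself suggests) is to pull an image built for an older CUDA release.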
https://ngc.nvidia.com/catalog/containers/nvidia:l4t-base
try using this base image. It solved all of my Jetson Tegra arm64 architecture issues, and now I can seamlessly docker pull and use my Docker images across Jetson Tegra devices.
Anytime nvidia docker fails you will see an error that begins with:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: ...
This part of the error message is output by docker itself, and is out of our control.
It's the part after stderr: that is relevant to nvidia-docker.
In the original post, this error was:
```
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=10.0, please update your driver to a newer version, or use an earlier cuda container
```
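Since the useful message is buried under several levels of escaping, a small helper can pull it out. This is just a convenience sketch of the "look after stderr:" advice above, not part of any tool:

```python
# Pull the nvidia-container-cli message out of docker's wrapped OCI error.
# Everything up to "stderr: " is docker/runc plumbing; the tail (minus the
# trailing escape soup) is the part worth reading.
def extract_nvidia_error(oci_error: str) -> str:
    _, _, tail = oci_error.partition("stderr: ")
    # Cut at the escaped newline that closes the hook's stderr capture.
    return tail.split("\\n")[0].strip()

msg = ('docker: Error response from daemon: OCI runtime create failed: '
      'container_linux.go:349: ... running prestart hook 0 caused '
      '\\"error running hook: exit status 1, stdout: , stderr: '
      'nvidia-container-cli: requirement error: unsatisfied condition: '
      'cuda>=10.0, please update your driver to a newer version, or use '
      'an earlier cuda container\\n\\"\"": unknown.')
print(extract_nvidia_error(msg))
```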
@gsss124 is this actually the same error response you were seeing? Given the description of your problem, it seems unlikely.
In any case, I would recommend performing your step 4 using docker's daemon.json file instead of editing the docker service directly:
$ cat /etc/docker/daemon.json
{
    "data-root": "/your/custom/location",
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
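One gotcha with this approach: a JSON syntax error in daemon.json stops dockerd from starting at all, so it is worth validating the file before restarting. A sketch, staged in /tmp here; on a real system the file is /etc/docker/daemon.json and the restart needs root:

```shell
# Stage the recommended config in a scratch file and validate it before
# installing it; dockerd refuses to start on malformed daemon.json.
cat > /tmp/daemon.json <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
python3 -m json.tool /tmp/daemon.json > /dev/null && echo "daemon.json OK"

# On the real system (not run here):
#   sudo cp /tmp/daemon.json /etc/docker/daemon.json
#   sudo systemctl restart docker
#   docker info | grep -i 'default runtime'   # should report: nvidia
```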
Thanks for the reply. This was not the error; it was only related to OCI. Now docker info gives the custom data-root location, but to my surprise it is still using the system drive: I see a reduction in space available on the system drive, while the space available on my custom data-root drive is unchanged. So, I will delete my reply above.
> In any case, I would recommend performing your step 4 using docker's daemon.json file instead of editing the docker service directly: (daemon.json contents as given above)
Thanks for this, but I tried this method and it did not work for me. I will give it another shot, adding a system-restart step. I even tried nvidia-container-runtime separately; that didn't work. After editing docker.service, it gave me data-root as my custom location but still used the /var/lib/docker location to store data! I don't understand what is happening.
To my horror, it has created a new drive taking a part of space of system drive, named it to my custom data-root name and renamed my old drive! It's not using /var/lib/docker, but a part of it renamed to my custom data-root name.
sudo service docker start
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: stat failed: /usr/lib/wsl/lib/libcuda.so.1: no such file or directory\\n\\"\"": unknown.
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: stat failed: /usr/lib/wsl/lib/libcuda.so.1: no such file or directory\\n\\"\"": unknown.
ldconfig -p | grep cuda
libicudata.so.66 (libc6,x86-64) => /lib/x86_64-linux-gnu/libicudata.so.66
libcuda.so.1 (libc6,x86-64) => /usr/lib/wsl/lib/libcuda.so.1
ls -al /usr/lib/wsl/lib
total 70792
dr-xr-xr-x 1 root root 512 Sep 18 15:53 .
drwxr-xr-x 4 root root 4096 Sep 18 12:28 ..
-r--r--r-- 1 root root 124664 Aug 30 09:51 libcuda.so
-r--r--r-- 2 root root 832936 Sep 12 08:44 libd3d12.so
-r--r--r-- 2 root root 5073944 Sep 12 08:44 libd3d12core.so
-r--r--r-- 2 root root 25069816 Sep 12 08:44 libdirectml.so
-r--r--r-- 2 root root 878768 Sep 12 08:44 libdxcore.so
-r--r--r-- 1 root root 40496936 Aug 30 09:51 libnvwgf2umx.so
sudo ln -s /usr/lib/wsl/lib/libcuda.so /usr/lib/wsl/lib/libcuda.so.1
ln: failed to create symbolic link '/usr/lib/wsl/lib/libcuda.so.1': Read-only file system
seems as if it's missing and the video driver is still required, unless there is something that can make it appear at that location
ls: cannot access '/usr/lib/wsl/lib/libcuda.so.1'
any thoughts?
From my understanding, installing the video driver in the Ubuntu guest is no longer required for docker.
Directory of C:\Windows\System32\lxss\lib
09/18/2020 03:53 PM
C:\Windows\System32\lxss\lib>mklink libcuda.so.1 libcuda.so
symbolic link created for libcuda.so.1 <<===>> libcuda.so
still doesn't work, but seems closer
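Since /usr/lib/wsl/lib is mounted read-only inside the distro, the libcuda.so.1 link has to live in a writable directory that the loader is then told about. The idea is sketched below against /tmp; the directory name and the ld.so.conf.d entry are my own invention, and the root-only steps are left as comments:

```shell
# The loader wants libcuda.so.1, but the WSL driver mount is read-only,
# so place the symlink somewhere writable and point ldconfig at it.
mkdir -p /tmp/wsl-cuda
ln -sf /usr/lib/wsl/lib/libcuda.so /tmp/wsl-cuda/libcuda.so.1
ls -l /tmp/wsl-cuda/libcuda.so.1

# Root-only steps on the real system (not run here):
#   echo /tmp/wsl-cuda | sudo tee /etc/ld.so.conf.d/wsl-cuda.conf
#   sudo ldconfig
```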
More info: docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
18 sudo find / -iname /usr/lib/wsl/lib/libcuda.so.1
19 sudo find / -iname libcuda.so.1
20 ldconfig -p | grep cuda
21 ls /usr/lib/wsl/lib/libcuda.so.1
22 #sudo ls -d libcuda.so.1
23 cd /
24 sudo ls -d libcuda.so.1
25 ls -al /usr/lib/wsl
26 ls -al /usr/lib/wsl/drivers
27 ls -al /usr/lib/wsl/drivers | grep -i libcuda*
28 ls -al /usr/lib/wsl/
29 ls -al /usr/lib/wsl/lib
30 sudo ln -s /usr/lib/wsl/lib/libcuda.so /usr/lib/wsl/lib/libcuda.so.1
31 sudo ln -s /usr/lib/wsl/lib/libcuda.so.1 /usr/lib/wsl/lib/libcuda.so
32 sudo ln -s /usr/lib/wsl/lib/libcuda.so /usr/lib/wsl/lib/libcuda.so.1
33 echo $LD_LIBRARY_PATH
34 sudo apt install nvidia-361-dev
35 nvidia-smi
36 sudo apt isntall nvidia-utils-435
37 sudo apt install nvidia-utils-435
38 cd %SYSTEMROOT%\System32lxsslib
39 cd %SYSTEMROOT%\
40 cd %SYSTEMROOT%
41 ls
42 ls /usr/lib/wsl/lib/
43 ls -al /usr/lib/wsl/lib/
44 docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
45 sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
46 sudo apt-remove nvidia-docker2
47 sudo apt-get remove nvidia-docker2
48 sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
49 docker run --rm --privileged nvidia/cuda nvidia-smi
50 nvidia-docker run --rm nvidia/cuda nvidia-smi
51 nvidia-docker run --rm --privileged nvidia/cuda nvidia-smi
52 docker run --rm --privileged nvidia/cuda nvidia-smi
53 nvidia-smi
54 sudo apt-get install nvidia-docker2
55 nvidia-docker run --rm --privileged nvidia/cuda nvidia-smi
56 docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
57 docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark -compare
58 nvcc --version
59 sudo apt-get install nvidia-cuda-toolkit
60 docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark -compare
> In any case, I would recommend performing your step 4 using docker's daemon.json file instead of editing the docker service directly: (daemon.json contents as given above)
I tried this again by editing /etc/docker/daemon.json and got the following stderr:
nvidia-container-cli: ldcache error: process /sbin/ldconfig.real failed with error code: 1\\\\n\\\"\""
Full output:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: ldcache error: process /sbin/ldconfig.real failed with error code: 1\\\\n\\\"\"": unknown
docker info now displays the required custom directory and space is reduced in the right directory. Now it is ldcache error. I checked here but my seccomp output is YES:
$ cat /boot/config-$(uname -r) | grep -i seccomp
Output:
CONFIG_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
_Please suggest what might be the problem._
Are you using a virtual machine? As stated by @klueska, the output after stderr: is what is of interest. Your error says
stderr: nvidia-container-cli: mount error: stat failed: /usr/lib/wsl/lib/libcuda.so.1: no such file or directory
This is something related to the nvidia driver not being available where required.
@wanfuse123 please file a new issue if you need help debugging this. Your issue looks unrelated to the one here (especially since it seems you are running on Windows, and not linux).
@tytcc I also faced the same problem on an Ubuntu 16.04 machine. I have the latest driver 440.64.00 installed, and when I tried to run the example
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
I get this error
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: cuda error: unknown error\\\\n\\\"\"": unknown. ERRO[0001] error waiting for container: context canceled
I also face the problem, did you solve it?
nvidia-smi does not work under WSL 2 as of right now. Use the following test instead:
"medium.com" + "how-to-use-nvidia-gpu-in-docker-to-run-tensorflow"
use their
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/NVIDIA/nvidia-docker/issues/1225#issuecomment-711108195,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/ADDBZWO35LJAFRGXOXF4SXLSLJIY3ANCNFSM4LQL2LDA
.
sorry, got cut off. Use their testing examples and that container. It costs 5 bucks for access but I thought it was worth it for one year of access. (NOTE: I have nothing to do with their site. I just spent the five bucks for it.)
anyway, use their testing examples.
You can't use "nvidia-smi"; it is not working right now in the containers. Apparently NVIDIA and Microsoft are working hard on the problem.
update on the medium link, look at the comments I have made an updated script that runs a simple test.
@elliothe I have uninstalled CUDA, the NVIDIA drivers, nvidia-docker and docker, then installed everything again from scratch. This solved the problem for me.
I think that is right. It worked for me when I uninstalled the NVIDIA driver (version 460):
thanks.
iser@iser:~$ sudo apt-get purge nvidia*
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'nvidia-kernel-common-418-server' for glob 'nvidia*'
Note, selecting 'nvidia-325-updates' for glob 'nvidia*'
Note, selecting 'nvidia-346-updates' for glob 'nvidia*'
Note, selecting 'nvidia-driver-binary' for glob 'nvidia*'
Note, selecting 'nvidia-331-dev' for glob 'nvidia*'
Note, selecting 'nvidia-304-updates-dev' for glob 'nvidia*'
Note, selecting 'nvidia-compute-utils-418-server' for glob 'nvidia*'
Note, selecting 'nvidia-384-dev' for glob 'nvidia*'
Note, selecting 'nvidia-docker2' for glob 'nvidia*'
Note, selecting 'nvidia-libopencl1-346-updates' for glob 'nvidia*'
Note, selecting 'nvidia-driver-440-server' for glob 'nvidia*'
Note, selecting 'nvidia-340-updates-uvm' for glob 'nvidia*'
------- The following is the output of a successful installation ----------
Adding group `iser' (GID 1000) ...
Done.
Adding user `iser' ...
Adding new user `iser' (1000) with group `iser' ...
Creating home directory `/home/iser' ...
Copying files from `/etc/skel' ...
[ OK ] Congratulations! You have successfully finished setting up Apollo Dev Environment.
[ OK ] To login into the newly created apollo_dev_iser container, please run the following command:
[ OK ] bash docker/scripts/dev_into.sh
[ OK ] Enjoy!
I am seeing the same with driver version 440.82.