nvidia-docker: container uses the CUDA version installed on Ubuntu, not the CUDA version in the container image

Created on 2 Apr 2020 · 4 comments · Source: NVIDIA/nvidia-docker

1. Issue or feature description

I've installed the NVIDIA Container Toolkit following the instructions in the README, but when I run a CUDA container, nvidia-smi reports the CUDA version installed on my machine (CUDA 10.2) rather than the CUDA version in the container image (10.0):

$ sudo docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
Thu Apr  2 14:24:47 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:1D:00.0  On |                  N/A |
|  0%   54C    P0    65W / 250W |   1039MiB / 11175MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

2. Steps to reproduce the issue

I have Ubuntu 18.04 with CUDA 10.1 installed, but nvidia-smi reports CUDA 10.2.

3. Information to attach (optional if deemed irrelevant)

  • [ ] Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
I0402 14:28:20.591077 31666 nvc.c:281] initializing library context (version=1.0.7, build=b71f87c04b8eca8a16bf60995506c35c937347d9)
I0402 14:28:20.591128 31666 nvc.c:255] using root /
I0402 14:28:20.591137 31666 nvc.c:256] using ldcache /etc/ld.so.cache
I0402 14:28:20.591145 31666 nvc.c:257] using unprivileged user 1000:1000
W0402 14:28:20.592480 31667 nvc.c:186] failed to set inheritable capabilities
W0402 14:28:20.592531 31667 nvc.c:187] skipping kernel modules load due to failure
I0402 14:28:20.592797 31668 driver.c:133] starting driver service
I0402 14:28:20.619918 31666 nvc_info.c:438] requesting driver information with ''
I0402 14:28:20.620692 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.440.59
I0402 14:28:20.620764 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.440.59
I0402 14:28:20.620812 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.440.59
I0402 14:28:20.620860 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.440.59
I0402 14:28:20.620925 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.440.59
I0402 14:28:20.620987 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.440.59
I0402 14:28:20.621031 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.440.59
I0402 14:28:20.621093 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.440.59
I0402 14:28:20.621155 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.440.59
I0402 14:28:20.621198 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.440.59
I0402 14:28:20.621240 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.440.59
I0402 14:28:20.621282 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.440.59
I0402 14:28:20.621340 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.440.59
I0402 14:28:20.621382 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.440.59
I0402 14:28:20.621440 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.440.59
I0402 14:28:20.621510 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.440.59
I0402 14:28:20.621558 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.440.59
I0402 14:28:20.621616 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.440.59
I0402 14:28:20.621662 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.440.59
I0402 14:28:20.622172 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.440.59
I0402 14:28:20.622424 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.440.59
I0402 14:28:20.622473 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.440.59
I0402 14:28:20.622519 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.440.59
I0402 14:28:20.622565 31666 nvc_info.c:152] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.440.59
I0402 14:28:20.622639 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.440.59
I0402 14:28:20.622684 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.440.59
I0402 14:28:20.622748 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.440.59
I0402 14:28:20.622809 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.440.59
I0402 14:28:20.622854 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.440.59
I0402 14:28:20.622913 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-ifr.so.440.59
I0402 14:28:20.622971 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.440.59
I0402 14:28:20.623014 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.440.59
I0402 14:28:20.623056 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.440.59
I0402 14:28:20.623101 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.440.59
I0402 14:28:20.623158 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-fatbinaryloader.so.440.59
I0402 14:28:20.623198 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.440.59
I0402 14:28:20.623257 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.440.59
I0402 14:28:20.623301 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.440.59
I0402 14:28:20.623348 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.440.59
I0402 14:28:20.623428 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libcuda.so.440.59
I0402 14:28:20.623506 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.440.59
I0402 14:28:20.623549 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.440.59
I0402 14:28:20.623591 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.440.59
I0402 14:28:20.623631 31666 nvc_info.c:152] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.440.59
W0402 14:28:20.623656 31666 nvc_info.c:303] missing library libvdpau_nvidia.so
W0402 14:28:20.623665 31666 nvc_info.c:307] missing compat32 library libnvidia-cfg.so
W0402 14:28:20.623674 31666 nvc_info.c:307] missing compat32 library libvdpau_nvidia.so
W0402 14:28:20.623683 31666 nvc_info.c:307] missing compat32 library libnvidia-rtcore.so
W0402 14:28:20.623694 31666 nvc_info.c:307] missing compat32 library libnvoptix.so
W0402 14:28:20.623704 31666 nvc_info.c:307] missing compat32 library libnvidia-cbl.so
I0402 14:28:20.624064 31666 nvc_info.c:233] selecting /usr/bin/nvidia-smi
I0402 14:28:20.624091 31666 nvc_info.c:233] selecting /usr/bin/nvidia-debugdump
I0402 14:28:20.624114 31666 nvc_info.c:233] selecting /usr/bin/nvidia-persistenced
I0402 14:28:20.624138 31666 nvc_info.c:233] selecting /usr/bin/nvidia-cuda-mps-control
I0402 14:28:20.624163 31666 nvc_info.c:233] selecting /usr/bin/nvidia-cuda-mps-server
I0402 14:28:20.624193 31666 nvc_info.c:370] listing device /dev/nvidiactl
I0402 14:28:20.624202 31666 nvc_info.c:370] listing device /dev/nvidia-uvm
I0402 14:28:20.624212 31666 nvc_info.c:370] listing device /dev/nvidia-uvm-tools
I0402 14:28:20.624224 31666 nvc_info.c:370] listing device /dev/nvidia-modeset
I0402 14:28:20.624256 31666 nvc_info.c:274] listing ipc /run/nvidia-persistenced/socket
W0402 14:28:20.624276 31666 nvc_info.c:278] missing ipc /tmp/nvidia-mps
I0402 14:28:20.624284 31666 nvc_info.c:494] requesting device information with ''
I0402 14:28:20.629847 31666 nvc_info.c:524] listing device /dev/nvidia0 (GPU-86442add-86e3-3748-07cf-1473ed07bf88 at 00000000:1d:00.0)
NVRM version:   440.59
CUDA version:   10.2

Device Index:   0
Device Minor:   0
Model:          GeForce GTX 1080 Ti
Brand:          GeForce
GPU UUID:       GPU-86442add-86e3-3748-07cf-1473ed07bf88
Bus Location:   00000000:1d:00.0
Architecture:   6.1
I0402 14:28:20.629904 31666 nvc.c:318] shutting down library context
I0402 14:28:20.630486 31668 driver.c:192] terminating driver service
I0402 14:28:20.710000 31666 driver.c:233] driver service terminated successfully
  • [ ] Kernel version from uname -a
    Linux b0bby 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • [ ] Any relevant kernel output lines from dmesg
  • [ ] Driver information from nvidia-smi -a
==============NVSMI LOG==============

Timestamp                           : Thu Apr  2 15:31:27 2020
Driver Version                      : 440.59
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:1D:00.0
    Product Name                    : GeForce GTX 1080 Ti
    Product Brand                   : GeForce
    Display Mode                    : Enabled
    Display Active                  : Enabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-86442add-86e3-3748-07cf-1473ed07bf88
    Minor Number                    : 0
    VBIOS Version                   : 86.02.40.00.49
    MultiGPU Board                  : No
    Board ID                        : 0x1d00
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.01.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization Mode         : None
        Host VGPU Mode              : N/A
    IBMNPU
        Relaxed Ordering Mode       : N/A
    PCI
        Bus                         : 0x1D
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1B0610DE
        Bus Id                      : 00000000:1D:00.0
        Sub System Id               : 0x1B0610DE
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 1
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays Since Reset         : 0
        Replay Number Rollovers     : 0
        Tx Throughput               : 1000 KB/s
        Rx Throughput               : 26000 KB/s
    Fan Speed                       : 22 %
    Performance State               : P8
    Clocks Throttle Reasons
        Idle                        : Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Not Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Not Active
        Display Clock Setting       : Not Active
    FB Memory Usage
        Total                       : 11175 MiB
        Used                        : 1031 MiB
        Free                        : 10144 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 7 MiB
        Free                        : 249 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 11 %
        Memory                      : 4 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    FBC Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            Single Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
        Aggregate
            Single Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
            Double Bit            
                Device Memory       : N/A
                Register File       : N/A
                L1 Cache            : N/A
                L2 Cache            : N/A
                Texture Memory      : N/A
                Texture Shared      : N/A
                CBU                 : N/A
                Total               : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending Page Blacklist      : N/A
    Temperature
        GPU Current Temp            : 61 C
        GPU Shutdown Temp           : 96 C
        GPU Slowdown Temp           : 93 C
        GPU Max Operating Temp      : N/A
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : Supported
        Power Draw                  : 23.11 W
        Power Limit                 : 250.00 W
        Default Power Limit         : 250.00 W
        Enforced Power Limit        : 250.00 W
        Min Power Limit             : 125.00 W
        Max Power Limit             : 350.00 W
    Clocks
        Graphics                    : 265 MHz
        SM                          : 265 MHz
        Memory                      : 405 MHz
        Video                       : 544 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 2025 MHz
        SM                          : 2025 MHz
        Memory                      : 5505 MHz
        Video                       : 1620 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes
        Process ID                  : 744
            Type                    : G
            Name                    : /usr/share/code/code --type=gpu-process --field-trial-handle=419947836154772874,12670331925527841451,131072 --disable-features=LayoutNG,SpareRendererForSitePerProcess --disable-color-correct-rendering --no-sandbox --gpu-preferences=IAAAAAAAAAAgAAAgAAAAAAAAYAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --service-request-channel-token=2505472567094389835
            Used GPU Memory         : 405 MiB
        Process ID                  : 1359
            Type                    : G
            Name                    : /usr/lib/xorg/Xorg
            Used GPU Memory         : 20 MiB
        Process ID                  : 1427
            Type                    : G
            Name                    : /usr/bin/gnome-shell
            Used GPU Memory         : 49 MiB
        Process ID                  : 2570
            Type                    : G
            Name                    : /usr/lib/xorg/Xorg
            Used GPU Memory         : 231 MiB
        Process ID                  : 2703
            Type                    : G
            Name                    : /usr/bin/gnome-shell
            Used GPU Memory         : 179 MiB
        Process ID                  : 3074
            Type                    : G
            Name                    : /opt/google/chrome/chrome --type=gpu-process --field-trial-handle=15021130468580176559,6932094184711606934,131072 --enable-crash-reporter=ab74912c-8899-49ca-8ffd-bf454f8efaff, --gpu-preferences=KAAAAAAAAAAgAAAgAAAAAAAAYAAAAAAAEAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAA --shared-files
            Used GPU Memory         : 139 MiB

  • [ ] Docker version from docker version
Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b7f0
 Built:             Wed Mar 11 01:25:46 2020
 OS/Arch:           linux/amd64
 Experimental:      false
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/version: dial unix /var/run/docker.sock: connect: permission denied

  • [ ] NVIDIA packages version from dpkg -l '*nvidia*' _or_ rpm -qa '*nvidia*'
||/ Name                         Version             Architecture        Description
+++-============================-===================-===================-=============================================================
un  libgldispatch0-nvidia        <none>              <none>              (no description available)
ii  libnvidia-cfg1-440:amd64     440.59-0ubuntu0.18. amd64               NVIDIA binary OpenGL/GLX configuration library
un  libnvidia-cfg1-any           <none>              <none>              (no description available)
un  libnvidia-common             <none>              <none>              (no description available)
ii  libnvidia-common-418         430.64-0ubuntu0~gpu all                 Transitional package for libnvidia-common-430
ii  libnvidia-common-430         440.59-0ubuntu0.18. all                 Transitional package for libnvidia-common-440
ii  libnvidia-common-435         435.21-0ubuntu0.18. all                 Shared files used by the NVIDIA libraries
ii  libnvidia-common-440         440.59-0ubuntu0.18. all                 Shared files used by the NVIDIA libraries
ii  libnvidia-compute-418:amd64  430.64-0ubuntu0~gpu amd64               Transitional package for libnvidia-compute-430
ii  libnvidia-compute-430:amd64  440.59-0ubuntu0.18. amd64               Transitional package for libnvidia-compute-440
rc  libnvidia-compute-435:amd64  435.21-0ubuntu0.18. amd64               NVIDIA libcompute package
ii  libnvidia-compute-440:amd64  440.59-0ubuntu0.18. amd64               NVIDIA libcompute package
ii  libnvidia-compute-440:i386   440.59-0ubuntu0.18. i386                NVIDIA libcompute package
ii  libnvidia-container-tools    1.0.7-1             amd64               NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64   1.0.7-1             amd64               NVIDIA container runtime library
un  libnvidia-decode             <none>              <none>              (no description available)
ii  libnvidia-decode-418:amd64   430.64-0ubuntu0~gpu amd64               Transitional package for libnvidia-decode-430
ii  libnvidia-decode-430:amd64   440.59-0ubuntu0.18. amd64               Transitional package for libnvidia-decode-440
ii  libnvidia-decode-440:amd64   440.59-0ubuntu0.18. amd64               NVIDIA Video Decoding runtime libraries
ii  libnvidia-decode-440:i386    440.59-0ubuntu0.18. i386                NVIDIA Video Decoding runtime libraries
un  libnvidia-encode             <none>              <none>              (no description available)
ii  libnvidia-encode-418:amd64   430.64-0ubuntu0~gpu amd64               Transitional package for libnvidia-encode-430
ii  libnvidia-encode-430:amd64   440.59-0ubuntu0.18. amd64               Transitional package for libnvidia-encode-440
ii  libnvidia-encode-440:amd64   440.59-0ubuntu0.18. amd64               NVENC Video Encoding runtime library
ii  libnvidia-encode-440:i386    440.59-0ubuntu0.18. i386                NVENC Video Encoding runtime library
un  libnvidia-fbc1               <none>              <none>              (no description available)
ii  libnvidia-fbc1-418:amd64     430.64-0ubuntu0~gpu amd64               Transitional package for libnvidia-fbc1-430
ii  libnvidia-fbc1-430:amd64     440.59-0ubuntu0.18. amd64               Transitional package for libnvidia-fbc1-440
ii  libnvidia-fbc1-440:amd64     440.59-0ubuntu0.18. amd64               NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-fbc1-440:i386      440.59-0ubuntu0.18. i386                NVIDIA OpenGL-based Framebuffer Capture runtime library
un  libnvidia-gl                 <none>              <none>              (no description available)
ii  libnvidia-gl-418:amd64       430.64-0ubuntu0~gpu amd64               Transitional package for libnvidia-gl-430
ii  libnvidia-gl-430:amd64       440.59-0ubuntu0.18. amd64               Transitional package for libnvidia-gl-440
un  libnvidia-gl-435             <none>              <none>              (no description available)
ii  libnvidia-gl-440:amd64       440.59-0ubuntu0.18. amd64               NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  libnvidia-gl-440:i386        440.59-0ubuntu0.18. i386                NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un  libnvidia-ifr1               <none>              <none>              (no description available)
ii  libnvidia-ifr1-418:amd64     430.64-0ubuntu0~gpu amd64               Transitional package for libnvidia-ifr1-430
ii  libnvidia-ifr1-430:amd64     440.59-0ubuntu0.18. amd64               Transitional package for libnvidia-ifr1-440
ii  libnvidia-ifr1-440:amd64     440.59-0ubuntu0.18. amd64               NVIDIA OpenGL-based Inband Frame Readback runtime library
ii  libnvidia-ifr1-440:i386      440.59-0ubuntu0.18. i386                NVIDIA OpenGL-based Inband Frame Readback runtime library
un  libnvidia-ml1                <none>              <none>              (no description available)
un  nvidia-304                   <none>              <none>              (no description available)
un  nvidia-340                   <none>              <none>              (no description available)
un  nvidia-384                   <none>              <none>              (no description available)
un  nvidia-390                   <none>              <none>              (no description available)
un  nvidia-common                <none>              <none>              (no description available)
ii  nvidia-compute-utils-418:amd 430.64-0ubuntu0~gpu amd64               Transitional package for nvidia-compute-utils-430
ii  nvidia-compute-utils-430:amd 440.59-0ubuntu0.18. amd64               Transitional package for nvidia-compute-utils-440
rc  nvidia-compute-utils-435     435.21-0ubuntu0.18. amd64               NVIDIA compute utilities
ii  nvidia-compute-utils-440     440.59-0ubuntu0.18. amd64               NVIDIA compute utilities
ii  nvidia-container-runtime     3.1.4-1             amd64               NVIDIA container runtime
un  nvidia-container-runtime-hoo <none>              <none>              (no description available)
ii  nvidia-container-toolkit     1.0.5-1             amd64               NVIDIA container runtime hook
ii  nvidia-dkms-418              430.64-0ubuntu0~gpu amd64               Transitional package for nvidia-dkms-430
ii  nvidia-dkms-430              440.59-0ubuntu0.18. amd64               Transitional package for nvidia-dkms-440
rc  nvidia-dkms-435              435.21-0ubuntu0.18. amd64               NVIDIA DKMS package
ii  nvidia-dkms-440              440.59-0ubuntu0.18. amd64               NVIDIA DKMS package
un  nvidia-dkms-kernel           <none>              <none>              (no description available)
ii  nvidia-driver-418            430.64-0ubuntu0~gpu amd64               Transitional package for nvidia-driver-430
ii  nvidia-driver-430            440.59-0ubuntu0.18. amd64               Transitional package for nvidia-driver-440
ii  nvidia-driver-440            440.59-0ubuntu0.18. amd64               NVIDIA driver metapackage
un  nvidia-driver-binary         <none>              <none>              (no description available)
un  nvidia-kernel-common         <none>              <none>              (no description available)
ii  nvidia-kernel-common-418:amd 430.64-0ubuntu0~gpu amd64               Transitional package for nvidia-kernel-common-430
ii  nvidia-kernel-common-430:amd 440.59-0ubuntu0.18. amd64               Transitional package for nvidia-kernel-common-440
rc  nvidia-kernel-common-435     435.21-0ubuntu0.18. amd64               Shared files used with the kernel module
ii  nvidia-kernel-common-440     440.59-0ubuntu0.18. amd64               Shared files used with the kernel module
un  nvidia-kernel-source         <none>              <none>              (no description available)
ii  nvidia-kernel-source-418     430.64-0ubuntu0~gpu amd64               Transitional package for nvidia-kernel-source-430
ii  nvidia-kernel-source-430     440.59-0ubuntu0.18. amd64               Transitional package for nvidia-kernel-source-440
un  nvidia-kernel-source-435     <none>              <none>              (no description available)
ii  nvidia-kernel-source-440     440.59-0ubuntu0.18. amd64               NVIDIA kernel source package
un  nvidia-legacy-304xx-vdpau-dr <none>              <none>              (no description available)
un  nvidia-legacy-340xx-vdpau-dr <none>              <none>              (no description available)
un  nvidia-libopencl1-dev        <none>              <none>              (no description available)
ii  nvidia-modprobe              418.67-0ubuntu1     amd64               Load the NVIDIA kernel driver and create device files
un  nvidia-opencl-icd            <none>              <none>              (no description available)
un  nvidia-persistenced          <none>              <none>              (no description available)
ii  nvidia-prime                 0.8.8.2             all                 Tools to enable NVIDIA's Prime
ii  nvidia-settings              440.44-0ubuntu0.18. amd64               Tool for configuring the NVIDIA graphics driver
un  nvidia-settings-binary       <none>              <none>              (no description available)
un  nvidia-smi                   <none>              <none>              (no description available)
un  nvidia-utils                 <none>              <none>              (no description available)
ii  nvidia-utils-418:amd64       430.64-0ubuntu0~gpu amd64               Transitional package for nvidia-utils-430
ii  nvidia-utils-430:amd64       440.59-0ubuntu0.18. amd64               Transitional package for nvidia-utils-440
ii  nvidia-utils-440             440.59-0ubuntu0.18. amd64               NVIDIA driver support binaries
un  nvidia-vdpau-driver          <none>              <none>              (no description available)
ii  xserver-xorg-video-nvidia-41 430.64-0ubuntu0~gpu amd64               Transitional package for xserver-xorg-video-nvidia-430
ii  xserver-xorg-video-nvidia-43 440.59-0ubuntu0.18. amd64               Transitional package for xserver-xorg-video-nvidia-440
ii  xserver-xorg-video-nvidia-44 440.59-0ubuntu0.18. amd64               NVIDIA binary Xorg driver
dpkg-query: no packages found matching *nvidia*rpm
dpkg-query: no packages found matching -qa

  • [ ] NVIDIA container library version from nvidia-container-cli -V
version: 1.0.7
build date: 2020-01-21T18:59+00:00
build revision: b71f87c04b8eca8a16bf60995506c35c937347d9
build compiler: x86_64-linux-gnu-gcc-7 7.4.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
  • [ ] NVIDIA container library logs (see troubleshooting)
  • [ ] Docker command, image and tag used
    sudo docker run --gpus all nvidia/cuda:10.0-base nvidia-smi


All 4 comments

I'm having the exact same problem, but CUDA 11.0 is what's being displayed when running sudo docker run --gpus all nvidia/cuda:10.0-base nvidia-smi

Exact same problem here too, with CUDA 11 being displayed.

Yeah, me too.
Main system:
NVIDIA-SMI 455.23.05 Driver Version: 455.23.05 CUDA Version: 11.1
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --gpus all -it --rm nvcr.io/nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04
NVIDIA-SMI 455.23.05 Driver Version: 455.23.05 CUDA Version: 11.1

Seems like an nvidia-docker bug.

This is not a bug.

Although it is not immediately intuitive, it is done this way by design, and actually required because of the way the underlying NVIDIA driver stack works.

The driver itself is split into two components:
(1) A set of kernel modules that can only be communicated with via ioctls from user-space
(2) A set of user-space libraries that provide well defined APIs that talk to the kernel-modules through their ioctls

Examples of the kernel modules include nvidia.ko, nvidia_uvm.ko, etc.

Examples of the user-space libraries include libnvidia-ml.so, libcuda.so, etc.

With this in mind, the set of ioctls (and thus the kernel ABI itself) is not considered stable and can (and does) change across every single minor version of the driver.

On the other hand, the APIs provided by the user-space libraries are considered stable and remain backwards compatible across driver versions.

What this means, however, is that you can't simply bundle the user-space libraries into a container and expect them to be portable across different machines running different NVIDIA driver versions.

Instead, the NVIDIA container toolkit (i.e. nvidia-docker) makes sure to dynamically bind-mount these user-space libraries from your host into your container each time it is run. So no matter what you do, you will always end up running with whatever libcuda.so driver library is installed on your host.
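You can see this bind-mounting for yourself by listing the driver libraries inside a freshly started container. This is a sketch assuming the setup from the issue (Docker with `--gpus all` working and a 440.59 host driver); the library versions you see will match your host driver, not anything baked into the image:

```shell
# List the driver user-space libraries the toolkit injected into the container.
# Their version suffix (e.g. .440.59) comes from the host driver install,
# even though the image itself ships no driver libraries at all.
sudo docker run --rm --gpus all nvidia/cuda:10.0-base \
    sh -c 'ls /usr/lib/x86_64-linux-gnu/ | grep -E "libcuda|libnvidia-ml"'
```

On the host described above this prints entries such as libcuda.so.440.59 and libnvidia-ml.so.440.59.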

However, keep in mind that libcuda.so (the driver's user-space CUDA library) is different from the CUDA toolkit (libcudart.so, nvcc, and friends). The former proxies CUDA commands from user-space into the kernel driver, while the latter is the actual CUDA software stack your application builds against, which makes its underlying calls through libcuda.so.

So even if you are using a driver whose libcuda.so reports CUDA 11.0, if you run, say, nvcr.io/nvidia/cuda:10.2-cudnn8-devel-ubuntu18.04 (which bundles the CUDA 10.2 toolkit inside it), you will actually end up running your code on the CUDA 10.2 software stack, and not the CUDA 11.0 software stack. This works because the 10.2 toolkit bundled in the container is able to successfully call into the stable 11.0 libcuda.so API (which is backwards compatible).

Since nvidia-smi relies only on the driver libraries (via libnvidia-ml.so) and not on the CUDA toolkit, it will always display the CUDA version corresponding to your driver installation, and not the toolkit version you have bundled in your container.
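To check both versions side by side, you can compare what nvidia-smi reports against the toolkit version recorded in the image. This is a sketch assuming the same setup as above; the /usr/local/cuda/version.txt file is present in CUDA 10.x images, while newer images record it in version.json instead:

```shell
# Driver-side CUDA capability (mounted from the host; 10.2 in this issue):
sudo docker run --rm --gpus all nvidia/cuda:10.0-base \
    sh -c 'nvidia-smi | grep -o "CUDA Version: [0-9.]*"'

# Toolkit version bundled in the image (what your code actually runs on):
sudo docker run --rm --gpus all nvidia/cuda:10.0-base \
    sh -c 'cat /usr/local/cuda/version.txt 2>/dev/null || cat /usr/local/cuda/version.json'
```

The two commands reporting different versions is expected, for exactly the reasons explained above.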
