Openpose: Nvidia/Titan RTX Check failed: error == cudaSuccess (48 vs. 0) no kernel image is available for execution on the device

Created on 8 Jul 2019 · 23 Comments · Source: CMU-Perceptual-Computing-Lab/openpose

Hi, I am using openpose-1.5.0 with GPU support. The other system specifications are below.
cuda-10.0
cudnn-7
ubuntu-18.04
GPU: Titan RTX

I am using OpenPose through the Python API. There were no installation errors, but when I ran the code below

import sys

import cv2

# Make the installed OpenPose Python bindings importable.
sys.path.append('/usr/local/python')
from openpose import pyopenpose as op

# OpenPose configuration: model location and number of GPUs to use.
params = dict()
params["model_folder"] = "/openpose/models/"
params["num_gpu"] = 1

# Configure and start the OpenPose wrapper.
opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

image_path = "/openpose/examples/media/COCO_val2014_000000000241.jpg"

# Read the image, queue it for processing, and wait for the result.
datum = op.Datum()
imageToProcess = cv2.imread(image_path)
datum.cvInputData = imageToProcess
opWrapper.waitAndEmplace([datum])
opWrapper.waitAndPop([datum])

print("Body keypoints: \n" + str(datum.poseKeypoints))

I get the following error:

F0708 14:38:22.912272   180 pooling_layer.cu:212] Check failed: error == cudaSuccess (48 vs. 0)  no kernel image is available for execution on the device
*** Check failure stack trace: ***
    @     0x7f0cb40c10cd  google::LogMessage::Fail()
    @     0x7f0cb40c2f33  google::LogMessage::SendToLog()
    @     0x7f0cb40c0c28  google::LogMessage::Flush()
    @     0x7f0cb40c3999  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f0cb370b02b  caffe::PoolingLayer<>::Forward_gpu()
    @     0x7f0cb3658032  caffe::Net<>::ForwardFromTo()
    @     0x7f0cb4b58a27  op::NetCaffe::forwardPass()
    @     0x7f0cb4b80dda  op::PoseExtractorCaffe::forwardPass()
    @     0x7f0cb4b79d55  op::PoseExtractor::forwardPass()
    @     0x7f0cb4b76d4f  op::WPoseExtractor<>::work()
    @     0x7f0cb4bba119  op::Worker<>::checkAndWork()
    @     0x7f0cb4bba2a3  op::SubThread<>::workTWorkers()
    @     0x7f0cb4bc3f08  op::SubThreadQueueInOut<>::work()
    @     0x7f0cb4bbc581  op::Thread<>::threadFunction()
    @     0x7f0cec5f266f  (unknown)
    @     0x7f0cf33916db  start_thread
    @     0x7f0cf36ca88f  clone
Aborted (core dumped)

I have gone through all the closed issues about this error, but I could not find an answer. Please help me figure out the issue.

3rd party (unsupported - might not reply)

All 23 comments

Hi, that means the GPU version has not been compiled for the GPU architecture you are using.

Try compiling OpenPose yourself; that would compile it for your specific GPU architecture.
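
To make that concrete: a binary compiled for sm_XY can generally only run on a GPU whose compute capability has the same major version X and a minor version of at least Y (unless PTX was embedded as well, in which case the driver can JIT-compile it for newer GPUs). The Python sketch below applies a simplified version of that rule. The function name is illustrative and not part of OpenPose or Caffe, and PTX is deliberately ignored.

```python
def has_kernel_image(compiled_archs, device_arch):
    """Return True if any compiled sm_XY binary can run on the device.

    Simplified rule (assumption: no embedded PTX): the major version must
    match and the compiled minor version must not exceed the device's.
    Architectures are two-digit integers, e.g. 75 for compute capability 7.5.
    """
    dev_major, dev_minor = divmod(device_arch, 10)
    for arch in compiled_archs:
        major, minor = divmod(arch, 10)
        if major == dev_major and minor <= dev_minor:
            return True
    return False


# Example architecture list from a build without Volta/Turing support.
caffe_archs = [30, 35, 50, 52, 60, 61]
print(has_kernel_image(caffe_archs, 75))         # Titan RTX / RTX 2080 Ti
print(has_kernel_image(caffe_archs + [75], 75))  # after adding sm_75
```

With these inputs the first call prints False and the second prints True, which matches the "no kernel image is available" symptom on a 7.5 device.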

@gineshidalgo99 I did not understand the last part. Can you please guide me on how I can run OpenPose for a specific GPU architecture?

Please add all your hw config for a more detailed answer.

@KetulParikh Dear Ketul, have you already solved your issue? I am facing the same problem with an RTX 2080 Ti (the same compute capability, 7.5 / sm_75, as your Titan RTX).

@ozangungortuhh Nope. I tried reinstalling it, but I get the same error. By the way, I have a two-GPU system. Does this affect anything, or do I need to add a variable to the pyopenpose config?

I tried the NVIDIA drivers 430 and 410 with a RTX 2080 Titan. Same error (using Docker).

@enric1994 As I have a Linux x86 OS, I have used the 390 and 340 drivers, but the error remains the same.

@gineshidalgo99 To answer your request ("Please add all your hw config for a more detailed answer."):

OS : Linux x86 (Ubuntu 18.04)
GPU: 2 - Titan RTX (24GB each)
RAM: 128GB
NVIDIA driver: 418 (also tried 430 and 410)
CUDA: 10.0
cudnn: 7

Let me know what else you need. I need to solve this problem as soon as possible.

@KetulParikh @enric1994 I think the problem might be that the Caffe used by OpenPose (it's written that they are using their own fork) does not support our GPU architectures. When I run cmake, the Caffe configuration below is displayed. It shows that for the Auto target GPU only archs up to sm_61 are configured (so not sm_75, which we need).
[Screenshot: Caffe Installation Summary]
So we should try to compile OpenPose with a custom Caffe. Since I don't have cmake experience, I am still trying to understand how to do that.

@KetulParikh @enric1994 Also, please share your answer if you manage to compile openpose with a custom caffe, thanks :)

@ozangungortuhh In my case sm_75 is already added. The cmake file picks the ARCH option dynamically based on the CUDA version. Attaching an image for reference.
[Image: Screenshot from 2019-07-31 12-28-59]

@gineshidalgo99 @ozangungortuhh @enric1994 While building Caffe, automatic GPU detection fails. The console logs are below.

-- Boost version: 1.65.1
-- Found the following Boost libraries:
--   system
--   thread
--   filesystem
--   chrono
--   date_time
--   atomic
-- Found gflags  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libgflags.so)
-- Found glog    (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libglog.so)
-- Found PROTOBUF Compiler: /usr/bin/protoc
-- HDF5: Using hdf5 compiler wrapper to determine C configuration
-- HDF5: Using hdf5 compiler wrapper to determine CXX configuration
-- CUDA detected: 10.0
-- Found cuDNN: ver. 7.6.2 found (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Automatic GPU detection failed. Building for all known architectures.
-- Added CUDA NVCC flags for: sm_30 sm_35 sm_50 sm_52 sm_60 sm_61
-- Found Atlas (include: /usr/include/x86_64-linux-gnu library: /usr/lib/x86_64-linux-gnu/libatlas.so lapack: /usr/lib/x86_64-linux-gnu/liblapack.so
-- Python interface is disabled or not all required dependencies found. Building without it...
-- 
-- ******************* Caffe Configuration Summary *******************
-- General:
--   Version           :   1.0.0
--   Git               :   1.0-146-gb5ede488
--   System            :   Linux
--   C++ compiler      :   /usr/bin/c++
--   Release CXX flags :   -O3 -DNDEBUG -fPIC -Wall -std=c++11 -Wno-sign-compare -Wno-uninitialized
--   Debug CXX flags   :   -g -fPIC -Wall -std=c++11 -Wno-sign-compare -Wno-uninitialized
--   Build type        :   Release
-- 
--   BUILD_SHARED_LIBS :   ON
--   BUILD_python      :   OFF
--   BUILD_matlab      :   OFF
--   BUILD_docs        :   OFF
--   CPU_ONLY          :   OFF
--   USE_OPENCV        :   OFF
--   USE_LEVELDB       :   OFF
--   USE_LMDB          :   OFF
--   USE_NCCL          :   OFF
--   ALLOW_LMDB_NOLOCK :   OFF
--   USE_HDF5          :   ON
-- 
-- Dependencies:
--   BLAS              :   Yes (Atlas)
--   Boost             :   Yes (ver. 1.65)
--   glog              :   Yes
--   gflags            :   Yes
--   protobuf          :   Yes (ver. 3.0.0)
--   CUDA              :   Yes (ver. 10.0)
-- 
-- NVIDIA CUDA:
--   Target GPU(s)     :   Auto
--   GPU arch(s)       :   sm_30 sm_35 sm_50 sm_52 sm_60 sm_61
--   cuDNN             :   Yes (ver. 7.6.2)
-- 
-- Install:
--   Install path      :   /openpose/build/caffe

But after the submodule is built, cmake runs again and displays the following logs. In this case sm_75 is included.

Rerunning cmake after building Caffe submodule
-- GCC detected, adding compile flags
-- GCC detected, adding compile flags
-- Building with CUDA.
-- CUDA detected: 10.0
-- Found cuDNN: ver. 7.6.2 found (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Automatic GPU detection failed. Building for all known architectures.
-- Added CUDA NVCC flags for: sm_30 sm_35 sm_37 sm_50 sm_52 sm_53 sm_60 sm_61 sm_62 sm_70 sm_75
-- Found cuDNN: ver. 7.6.2 found (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Found gflags  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libgflags.so)
-- Found glog    (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libglog.so)
-- Caffe will be downloaded from source now. NOTE: This process might take several minutes depending
        on your internet connection.
-- Caffe has already been downloaded.
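
The difference between the two cmake runs can be read straight off the "Added CUDA NVCC flags for:" lines. A small Python sketch, assuming the log format shown above (the helper name is made up for illustration):

```python
import re


def archs_from_cmake_output(text):
    """Collect the sm_XX values from 'Added CUDA NVCC flags for:' lines."""
    archs = set()
    for line in text.splitlines():
        if "Added CUDA NVCC flags for:" in line:
            archs.update(int(n) for n in re.findall(r"sm_(\d+)", line))
    return archs


# The two lines copied from the logs above.
caffe_log = "-- Added CUDA NVCC flags for: sm_30 sm_35 sm_50 sm_52 sm_60 sm_61"
openpose_log = ("-- Added CUDA NVCC flags for: sm_30 sm_35 sm_37 sm_50 sm_52 "
                "sm_53 sm_60 sm_61 sm_62 sm_70 sm_75")

# Architectures OpenPose is built for but the Caffe submodule is not.
missing = sorted(archs_from_cmake_output(openpose_log)
                 - archs_from_cmake_output(caffe_log))
print(missing)
```

This prints [37, 53, 62, 70, 75]: the Caffe submodule is built without sm_70 and sm_75, even though the OpenPose part of the build includes them.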

So can anyone help me solve the error by building Caffe with sm_75? I also tried adding sm_75 manually to the Cuda.cmake file, but I got this warning:

CMake Warning:
  Manually-specified variables were not used by the project:
          CUDA_ARCH_BIN

As there is no other way remaining, I think the Caffe build should inherit the sm_75 arch.

@gineshidalgo99 I also found that 3rdparty/caffe/cmake/Cuda.cmake has no support for the sm_70 and sm_75 architectures. How can we add support for them?

That is exactly our problem, as far as I can understand. The Caffe fork used here is not compiled for sm_75 because of 3rdparty/caffe/cmake/Cuda.cmake. Maybe one of the maintainers could tell us how to use our own Caffe or a fork of it?

Hi, all. Have you already solved your issue? I am facing the same error with an RTX 2080 Ti. The GPU arch(s) list does not include sm_70 and sm_75. Do I need to wait for an update? Or does the custom Caffe work?
OS : Linux x86 (Ubuntu 18.04)
RAM: 16 GB
NVIDIA driver: 418
CUDA: 10.1
cudnn: 7.4

@KenFuijwara We have to wait for an update. I tried manually overwriting the cmake file, but it didn't work. I have asked the developers to update the Docker image for the NVIDIA Titan and RTX 2080 Ti, but I haven't heard back. If you can mail them too, that would be great. Thank you.

Thanks, Ketul. I also tried overwriting the cmake file, which did not work. I will ask the developers.

Sure Thanks!!!

Hi Ketul, after several attempts at manually overwriting the cmake file, I got the demo to run successfully!
I cloned openpose, then cloned caffe into its 3rdparty folder.
After that, I added the GPU architectures directly.

So:
git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose
cd openpose/3rdparty/
git clone https://github.com/CMU-Perceptual-Computing-Lab/caffe
Then open the Cuda.cmake file in the caffe/cmake folder and add the architectures directly, like this:

# Known NVIDIA GPU architectures Caffe can be compiled for.
# This list will be used for CUDA_ARCH_NAME = All option

set(Caffe_known_gpu_archs "30 35 50 52 60 61 70 75")
...

if(${CUDA_ARCH_NAME} STREQUAL "Fermi")
  set(__cuda_arch_bin "20 21(20)")
elseif(${CUDA_ARCH_NAME} STREQUAL "Kepler")
  set(__cuda_arch_bin "30 35")
elseif(${CUDA_ARCH_NAME} STREQUAL "Maxwell")
  set(__cuda_arch_bin "50")
elseif(${CUDA_ARCH_NAME} STREQUAL "Pascal")
  set(__cuda_arch_bin "60 61")

# Added: Volta and Turing
elseif(${CUDA_ARCH_NAME} STREQUAL "Volta")
  set(__cuda_arch_bin "70")
elseif(${CUDA_ARCH_NAME} STREQUAL "Turing")
  set(__cuda_arch_bin "75")

elseif(${CUDA_ARCH_NAME} STREQUAL "All")
  set(__cuda_arch_bin ${Caffe_known_gpu_archs})
elseif(${CUDA_ARCH_NAME} STREQUAL "Auto")
  caffe_detect_installed_gpus(__cuda_arch_bin)
else() # (${CUDA_ARCH_NAME} STREQUAL "Manual")
  set(__cuda_arch_bin ${CUDA_ARCH_BIN})
endif()
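
For context, each entry in __cuda_arch_bin ends up as an nvcc -gencode flag. The Python sketch below mimics that translation, including the "21(20)" form, which (as far as I understand Caffe's Cuda.cmake) means "build sm_21 code from the compute_20 virtual architecture". It is an illustration only, not a copy of Caffe's CMake logic, and the function name is made up.

```python
import re


def gencode_flags(arch_bin):
    """Translate a space-separated arch list into nvcc -gencode flags."""
    flags = []
    for entry in arch_bin.split():
        match = re.fullmatch(r"(\d+)\((\d+)\)", entry)
        if match:
            # "21(20)": real architecture 21, virtual architecture 20.
            real, virtual = match.group(1), match.group(2)
        else:
            # Plain entry: real and virtual architecture are the same.
            real = virtual = entry
        flags.append(f"-gencode arch=compute_{virtual},code=sm_{real}")
    return flags


# The extended list from the patched Cuda.cmake above.
for flag in gencode_flags("30 35 50 52 60 61 70 75"):
    print(flag)
```

The last line printed is "-gencode arch=compute_75,code=sm_75", which is the flag a Turing card such as the Titan RTX or RTX 2080 Ti needs.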

After that, I ran the following from the top-level openpose folder:
sudo bash ./scripts/ubuntu/install_deps.sh
mkdir build
cd build
cmake ..
make -j4
sudo make install

Although I read that manually overwriting the cmake file has not worked for you so far, I hope my case is helpful.

That's great! Can you submit a PR or build a Docker image of that?

Thanks!

Fixed in the latest commit of OpenPose. Could you try it and let me know the result?
VERY IMPORTANT: Because of how GitHub works, you will have to completely remove the OpenPose folder, then re-download and re-compile it so that it uses the right Caffe version.

Sorry for the delay.

@gineshidalgo99 I can confirm that it works on my RTX 2080

Adding onto @KenFuijwara: if you have already run OpenPose once and it worked, but now it's not working (say you're in a pre-made Docker image where it had already been run), you should run make clean before running make -j4.
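
A minimal Python sketch of that clean rebuild, for anyone scripting it. The "build" directory name and both function names are assumptions for illustration; by default it only prints the commands instead of running them.

```python
import subprocess


def rebuild_commands(jobs=4):
    """Return the commands for a clean rebuild, in order."""
    return [["make", "clean"], ["make", f"-j{jobs}"]]


def rebuild(build_dir, jobs=4, dry_run=True):
    """Print each command and, unless dry_run is set, run it in build_dir."""
    for cmd in rebuild_commands(jobs):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(cmd, cwd=build_dir, check=True)


rebuild("build")  # dry run: only prints "make clean" and "make -j4"
```

Calling rebuild("build", dry_run=False) would actually run both commands inside the build folder.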

This fixed it for the Nvidia Tesla V100 GPU as well
