a) build/examples/openpose/openpose.bin --image_dir /home/ubuntu/Dev/openpose/examples/media
(gives the error below)
b) build/examples/openpose/openpose.bin --no_gpu 0 --image_dir /home/ubuntu/Dev/openpose/examples/media
(opens a window and displays the images, but no detections are made.)
Operating system (lsb_release -a on Ubuntu):
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04 LTS
Release: 16.04
Codename: xenial
CUDA version (cat /usr/local/cuda/version.txt in most cases):
CUDA Version 8.0.34
Caffe version:
Default from OpenPose
OpenCV version: 2.4 installed from JetPack 3.0.
build/examples/openpose/openpose.bin --image_dir /home/ubuntu/Dev/openpose/examples/media
Starting pose estimation demo.
Starting thread(s)
F0608 00:53:13.197923 29939 pooling_layer.cu:212] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***
@ 0x7f935c6718 google::LogMessage::Fail()
@ 0x7f935c8614 google::LogMessage::SendToLog()
@ 0x7f935c6290 google::LogMessage::Flush()
@ 0x7f935c8eb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f92b7ef40 caffe::PoolingLayer<>::Forward_gpu()
@ 0x7f92a085b0 caffe::Net<>::ForwardFromTo()
@ 0x7f936873dc op::NetCaffe::forwardPass()
@ 0x7f936ee710 op::PoseExtractorCaffe::forwardPass()
@ 0x7f936fa274 op::WPoseExtractor<>::work()
@ 0x7f93719c2c op::Worker<>::checkAndWork()
@ 0x7f9371ce98 op::SubThread<>::workTWorkers()
@ 0x7f937261e4 op::SubThreadQueueInOut<>::work()
@ 0x7f93721df0 op::Thread<>::threadFunction()
@ 0x7f934b6280 (unknown)
@ 0x7f91fadfb4 start_thread
Aborted
Hi, 2 quick questions:
--no_gpu 0? There is no such option. I guess you meant --num_gpu. For that one, you need at least 1 GPU: --num_gpu 1. Thanks.
Here is the output of a program with the cudnn version....
$ ./mnistCUDNN
cudnnGetVersion() : 5105 , CUDNN_VERSION from cudnn.h : 5105 (5.1.5)
Host compiler version : GCC 4.9.2
There are 1 CUDA capable devices on your machine :
device 0 : sms 2 Capabilities 5.3, SmClock 72.0 Mhz, MemSize (Mb) 3994, MemClock 12.8 Mhz, Ecc=0, boardGroupID=0
Using device 0
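The `Capabilities 5.3` field in the device line above is the GPU's compute capability, which is the number the CUDA_ARCH settings later in this thread target. A minimal Python sketch for pulling it out of such a line follows. The function name `parse_capability` is hypothetical, and the pattern assumes the exact `Capabilities X.Y` wording printed by this sample program.

```python
import re

def parse_capability(device_line):
    """Return (major, minor) from a 'Capabilities X.Y' field, or None if absent."""
    match = re.search(r"Capabilities\s+(\d+)\.(\d+)", device_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Device line copied from the mnistCUDNN output in this thread.
line = ("device 0 : sms 2 Capabilities 5.3, SmClock 72.0 Mhz, "
        "MemSize (Mb) 3994, MemClock 12.8 Mhz, Ecc=0, boardGroupID=0")
print(parse_capability(line))
```

For the line above this yields major 5, minor 3, i.e. the `53` used in `compute_53` / `sm_53`.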
As for --num_gpu 0, I was just experimenting to see whether I could get the program to do anything at all!
I am slightly confused: is it working with --num_gpu 1, so that this issue can be closed? If not, what is the output when --num_gpu 1 is used? Thanks
It does not work with either --num_gpu setting.
With --num_gpu=1, I get the "Check failed: error == cudaSuccess (8 vs. 0) invalid device function"
[I assume this is the correct way to enable gpu]
With --num_gpu=0, the program finishes without any errors but does not detect anything in the sample images.
[I was just playing to see if I could get the program to run at all]
OK got it.
Since you are using a custom Ubuntu (the one from Nvidia), we cannot help much more with the Caffe part (where it is failing), since we do not have that device to test on.
Try to run Caffe and some Caffe demo (maybe the Caffe tests) there first. Once Caffe is working with the GPU, OpenPose only uses C++11, Caffe and Caffe's dependencies.
Let us know your results. Thanks
Ok.
Caffe is working fine for all its tests and at least some demos. But let me run in a debugger to see what is actually failing.
My guess is a version mismatch between Caffe, CUDA, cuDNN and the Jetson.
Do you know if anyone else has got it working on a Jetson?
Yeah please, let me know the exact function where it fails, so I can make more guesses about OpenPose.
No idea about people using OpenPose on Jetson.
I now have "openpose.bin" running. I needed to change some of the CUDA arch params in Makefile.config for the Jetson TX1.
However, I still do not see any useful or interesting output:
Finally have it working on the Jetson TX1. I needed to fix a few issues with the build files for Caffe and OpenPose as follows:
My CUDA_ARCH settings:
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_53,code=sm_53
INCLUDE_DIRS := /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := /usr/local/lib /usr/lib /usr/lib/aarch64-linux-gnu/hdf5/serial
I also forced some flags in the Makefile (this may not be necessary, but it's late and I'm tired, so I'm not investigating further since it's working for me):
-DCUDA_ARCH_NAME="Manual" -DCUDA_ARCH_BIN="53" -DCUDA_ARCH_PTX="53" -DUSE_CUDNN=1
I also built using the latest OpenPose source.
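The CUDA_ARCH entries above follow a fixed pattern: one `-gencode arch=compute_XY,code=sm_XY` pair per compute capability. A small Python sketch that builds such a line from capability strings is given here as an illustration only. The helper names `gencode_flag` and `cuda_arch_line` are hypothetical, and which capabilities to list depends on your board and on what your nvcc accepts.

```python
def gencode_flag(capability):
    """Build one nvcc -gencode flag from a capability string such as '5.3'."""
    major, minor = capability.split(".")
    arch = f"{int(major)}{int(minor)}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"

def cuda_arch_line(capabilities):
    """Join several flags into a Makefile.config style CUDA_ARCH assignment."""
    flags = [gencode_flag(c) for c in capabilities]
    # Backslash-newline continues the Makefile variable onto the next line.
    return "CUDA_ARCH := " + " \\\n        ".join(flags)

# Reproduces the two entries posted above (capabilities 3.0 and 5.3).
print(cuda_arch_line(["3.0", "5.3"]))
```

Passing only the capability your device reports (for example `["5.3"]`) gives a single-entry setting.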
Thank you for posting the solution! So other people can use it too.
In conclusion, the only changes were located in the Makefile and Makefile.config files. This is good, so you and Jetson users will be able to easily update OpenPose at any point.
I am closing this issue then.
I am curious to know whether cortinas finally managed to run OpenPose on the Jetson TK1!
Cortinas, could you please email me at smanismech[at]me[dot]com?
I have a Jetson TX2 and I run out of memory when I run OpenPose on it.
Thank you in advance
@cortinas
Hi, I am trying to run OpenPose on my Jetson TK1.
And I've tried the method you gave above.
I edited the file Makefile.config in 3rdparty/caffe/, changed the CUDA_ARCH settings and added:
INCLUDE_DIRS := /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := /usr/local/lib /usr/lib /usr/lib/aarch64-linux-gnu/hdf5/serial
Then I ran make all -j4 && make distribute -j4 to build.
But I got this error:
NVCC src/caffe/solvers/adadelta_solver.cu
nvcc fatal : Unsupported gpu architecture 'compute_53'
make: *** [.build_release/cuda/src/caffe/solvers/adadelta_solver.o] Error 1
make: *** Waiting for unfinished jobs....
Is there anything I did wrong?
My CUDA version is 6.5
thx.
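The `Unsupported gpu architecture 'compute_53'` message comes from nvcc itself, so the nvcc on that machine does not accept `compute_53` as a target. As far as I know, the TK1's GPU is also compute capability 3.2 rather than 5.3, so the 5.3 entry would not be the one it needs anyway. The following Python sketch lists which architectures a CUDA_ARCH setting requests and which of them fall outside a set you supply. The set `accepted_by_my_nvcc` is a hypothetical placeholder to be filled in by hand from your own toolkit; nothing here queries nvcc.

```python
import re

def requested_archs(cuda_arch_text):
    """Return the compute_XX numbers named in a CUDA_ARCH setting, in order."""
    return re.findall(r"arch=compute_(\d+)", cuda_arch_text)

def unsupported_archs(cuda_arch_text, accepted):
    """Return the requested architectures that are not in the accepted set."""
    return [a for a in requested_archs(cuda_arch_text) if a not in accepted]

# CUDA_ARCH setting as posted earlier in this thread.
cuda_arch = """CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \\
    -gencode arch=compute_53,code=sm_53"""

# Hypothetical example set: replace with the targets your nvcc really accepts.
accepted_by_my_nvcc = {"30", "32", "35"}
print(unsupported_archs(cuda_arch, accepted_by_my_nvcc))
```

Any architecture this reports would have to be removed from CUDA_ARCH (or the toolkit upgraded) before the build can get past nvcc.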
Hello?
Don't even try it.
On a Jetson TX2 with JetPack 3.1 I get 1 FPS, for both prerecorded video and real-time input.
I don't think it is worth running it on a TK1. It is a GPU-hungry model!
@IoaSman1 Have you tested how long OpenPose takes to process one image?
Is it that slow even when using the tips in the FAQ in the doc/installation file (though they will decrease accuracy)?
@IoaSman1 I am doing the same thing here with a Jetson TX2. Is it straightforward to make the whole thing work? I would very much appreciate it if you could share the steps...
If someone wants to share the steps, feel free to make a pull request with the steps for any other OS or embedded board! I'll merge it. Thanks!
awesome. Looking forward to that!
Thanks!
Got it working on TX2 last night, PR incoming. With a heavy reduction of net_resolution (128x96) I got to 10+ fps. I used an external webcam, as it wasn't straightforward with the onboard one. Hands and Face work (256x256 nets), but running both at the same time is too memory intensive and crashes with an out-of-memory error.
After I finish the PR I'll take a look at TensorRT, hoping for better real-time performance.
@IoaSman1 Have you tried reducing the net_resolution? I can push it up to 4-7 fps depending on how low I am willing to go on net_resolution, and the accuracy drop is not significant either.
Hope this helps
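To make the net_resolution suggestion concrete, here is a rough Python sketch that assembles an openpose.bin command line with a reduced `--net_resolution`. It assumes, as I understand the OpenPose flag documentation, that both dimensions should be multiples of 16, so it rounds to the nearest one. The helper names and the `examples/media` path are illustrative only, and the right resolution for a given board has to be found by trial.

```python
def net_resolution_arg(width, height, base=16):
    """Round each dimension to the nearest multiple of base (at least base)."""
    w = max(base, (width + base // 2) // base * base)
    h = max(base, (height + base // 2) // base * base)
    return f"{w}x{h}"

def openpose_command(image_dir, width, height):
    """Return the argument list for openpose.bin with a reduced net resolution."""
    return ["build/examples/openpose/openpose.bin",
            "--image_dir", image_dir,
            "--net_resolution", net_resolution_arg(width, height)]

# 128x96 is the resolution reported above as giving 10+ fps on a TX2.
print(" ".join(openpose_command("examples/media", 128, 96)))
```

The list form can be passed to `subprocess.run` unchanged if you want to launch the demo from a script.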
Hello
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.
F0123 10:55:27.467897 13141 pooling_layer.cu:212] Check failed: error == cudaSuccess (48 vs. 0) no kernel image is available for execution on the device
*** Check failure stack trace: ***
@ 0x7f92b39718 google::LogMessage::Fail()
@ 0x7f92b3b614 google::LogMessage::SendToLog()
@ 0x7f92b39290 google::LogMessage::Flush()
@ 0x7f92b3beb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f92f40bc8 caffe::PoolingLayer<>::Forward_gpu()
@ 0x7f92d66058 caffe::Net<>::ForwardFromTo()
@ 0x7f93e68a2c op::NetCaffe::forwardPass()
@ 0x7f93f9897c op::PoseExtractorCaffe::forwardPass()
@ 0x7f93f8e178 op::PoseExtractor::forwardPass()
@ 0x7f93f9cc18 op::WPoseExtractor<>::work()
@ 0x7f93e96bac op::Worker<>::checkAndWork()
@ 0x7f93e9b528 op::SubThread<>::workTWorkers()
@ 0x7f93ea57cc op::SubThreadQueueInOut<>::work()
@ 0x7f93ea1308 op::Thread<>::threadFunction()
@ 0x7f9394f280 (unknown)
@ 0x7f91f77fc4 start_thread
Aborted
I am facing the above error with a TX1. I tried the changes mentioned above. Please advise.
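Both failures in this thread are raised at the same place (pooling_layer.cu:212) but carry different CUDA error codes and messages, which is easy to miss when comparing logs. A small Python sketch for extracting the code and message from such a line is shown below. The name `parse_cuda_check` is hypothetical, and the pattern assumes the exact `Check failed: error == cudaSuccess (N vs. 0)` wording seen in these two logs.

```python
import re

CHECK_RE = re.compile(
    r"Check failed: error == cudaSuccess \((\d+) vs\. 0\)\s+(.*)$")

def parse_cuda_check(log_line):
    """Return (error_code, message) from a Caffe CUDA check failure, or None."""
    match = CHECK_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()

# The two failure lines quoted in this thread.
first = ("F0608 00:53:13.197923 29939 pooling_layer.cu:212] Check failed: "
         "error == cudaSuccess (8 vs. 0) invalid device function")
second = ("F0123 10:55:27.467897 13141 pooling_layer.cu:212] Check failed: "
          "error == cudaSuccess (48 vs. 0) no kernel image is available "
          "for execution on the device")

for line in (first, second):
    print(parse_cuda_check(line))
```

In my experience both messages usually mean the binary contains no GPU code built for the device's architecture, so the CUDA_ARCH settings used for both Caffe and OpenPose are the first thing to recheck after a clean rebuild.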