Opencv_contrib: CUDA backend for the opencv_dnn

Created on 20 Feb 2017 · 19 Comments · Source: opencv/opencv_contrib

This is not a bug report but a feature request. After some experiments with Caffe and opencv_dnn, I found that Caffe with CUDA currently performs forward propagation about 25 times faster (on average, across different networks) than opencv_dnn with LAPACK or OpenCL. CUDA clearly gives a large speed advantage in this task. Could anybody add a CUDA backend to opencv_dnn?
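A speed comparison like this depends on how the forward pass is timed. Here is a minimal sketch of such a measurement, assuming the forward pass can be wrapped in a zero-argument callable. The helper name `mean_forward_ms` and the `runs` and `warmup` parameters are illustrative and not taken from the issue.

```python
import time


def mean_forward_ms(forward, runs=20, warmup=3):
    """Return the average wall-clock time of forward() in milliseconds.

    forward: any zero-argument callable, for example the forward method
             of a network whose input has already been set.
    """
    # Warm-up calls are excluded so that one-time initialisation
    # (memory allocation, kernel compilation) does not skew the average.
    for _ in range(warmup):
        forward()
    start = time.perf_counter()
    for _ in range(runs):
        forward()
    return (time.perf_counter() - start) * 1000.0 / runs


# Usage with OpenCV (assumes `net` is a cv2.dnn network with its input set):
#   print(mean_forward_ms(net.forward))
```

Running the same helper against each framework on the same input gives numbers that can be compared directly.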

System information (version)

OpenCV => 3.2
Windows 64 Bit
Compiler => Visual Studio 2015 / Visual Studio 2013 / Mingw 5.2

dnn (moved out) feature

pi-null-mezon

👍8

All 19 comments

@pi-null-mezon, we've added a Halide backend since the issue was opened. It lets us choose the OpenCL computational target and run networks on a GPU (even NVIDIA). We'll experiment with the CUDA target and compare efficiency later.
On the other hand, the efficiency of the default CPU backend has improved dramatically recently. You can see an efficiency comparison in the table.

dkurt on 2 Aug 2017

👍1

Hello @dkurt! Thanks for the good news! Am I right that, according to the performance table you provided above, the fastest backend is now DNN C++ rather than DNN Halide?

pi-null-mezon on 2 Aug 2017

@pi-null-mezon, you are right. In most cases the default backend is more efficient on CPU. But the Halide backend is currently the only way to run models on a GPU. So if you have a powerful GPU on board, you can use OpenCV to run networks on it.

dkurt on 3 Aug 2017
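As a minimal sketch of what switching to the Halide backend looks like through the Python bindings: the code below assumes an OpenCV build compiled with Halide support. The helper name `use_halide_opencl` is illustrative, and the model file names in the usage comment are hypothetical.

```python
def use_halide_opencl(net, dnn):
    """Ask OpenCV to run `net` through the Halide backend on an OpenCL device.

    net: a network object, e.g. the result of cv2.dnn.readNetFromCaffe(...)
    dnn: the cv2.dnn module (passed in so this helper needs no import itself)
    """
    net.setPreferableBackend(dnn.DNN_BACKEND_HALIDE)
    net.setPreferableTarget(dnn.DNN_TARGET_OPENCL)
    return net


# Usage (hypothetical file names):
#   import cv2
#   net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "weights.caffemodel")
#   use_halide_opencl(net, cv2.dnn)
#   net.setInput(blob)
#   out = net.forward()
```

Both calls only record a preference. The backend is actually set up when the network first runs.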

@dkurt, how can I switch the opencv_dnn backend from C++ to Halide if I am working on Windows? Am I right that I need to download the Halide binaries and rebuild OpenCV with some kind of USE_HALLIDE flag turned on?

pi-null-mezon on 14 Aug 2017

@pi-null-mezon, unfortunately, the hardest part is LLVM, since there are no pre-compiled LLVM binaries. But you may try a truncated version of it (I downloaded it with svn co on Ubuntu). We have some instructions for Windows in the tutorial "How to enable Halide backend for improve efficiency".

As far as I remember, our testing system does not cover Halide on Windows, only Linux with OpenCL, so we may have missed some bugs there. Anyway, you may create an issue if something doesn't work out.

dkurt on 14 Aug 2017

@dkurt hello! I have finally built OpenCV with Halide on Windows. At least it works, but one thing I cannot find in the tutorials is how to choose which GPU on the machine performs the calculations. For instance, I have two GPUs: Intel HD Graphics and AMD Radeon. How can I force OpenCV to use a particular one?

pi-null-mezon on 30 Aug 2017

@pi-null-mezon, according to the Halide documentation, you can select the device ID with an environment variable: export HL_GPU_DEVICE=1 on Linux or set HL_GPU_DEVICE=1 on Windows. I tested locally that it also switches between CPU and GPU (in short, between the devices listed in the clinfo output on Linux).

dkurt on 30 Aug 2017

👍1
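The same variable can also be set from inside the program instead of the shell. This is a minimal sketch under two assumptions: that the Halide runtime reads HL_GPU_DEVICE when it first initialises a device, and that a value set in-process is visible to it on your platform. The thread only confirms the shell form. The helper name `select_halide_device` is illustrative.

```python
import os


def select_halide_device(index):
    """Set HL_GPU_DEVICE for the current process and return the value written."""
    # Environment values are strings; int() rejects anything that is not a
    # whole number before it reaches the environment.
    os.environ["HL_GPU_DEVICE"] = str(int(index))
    return os.environ["HL_GPU_DEVICE"]


# Call this before the first forward pass (assumption: the Halide runtime
# reads the variable at device initialisation).
select_halide_device(1)
```

If the setting appears to have no effect, setting the variable in the shell before launching the process is the form tested in this thread.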

@dkurt thanks! GPU computations work! But the results of dnn::Net::forward() differ from those of the CPU version. I need to run more tests and may open a new issue. Thanks!

pi-null-mezon on 31 Aug 2017

👍1
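One way to quantify such a mismatch is to compare the flattened outputs of the two runs element by element. This is a minimal sketch. The helper names and the default tolerance of 1e-4 are assumptions for illustration, not values from the thread.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two flat sequences."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        raise ValueError("outputs have different sizes")
    return max((abs(float(x) - float(y)) for x, y in zip(a, b)), default=0.0)


def outputs_match(a, b, tol=1e-4):
    """True when every element differs by at most `tol` (tolerance is an assumption)."""
    return max_abs_diff(a, b) <= tol


# Usage with OpenCV (assumes `net` has its input set and the backend is
# switched between the two forward passes):
#   cpu_out = net.forward().ravel().copy()
#   ... select the GPU backend/target ...
#   gpu_out = net.forward().ravel().copy()
#   print(max_abs_diff(cpu_out, gpu_out), outputs_match(cpu_out, gpu_out))
```

Small differences are expected from floating-point reordering on different devices. A large maximum difference points to a real discrepancy worth reporting.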

@pi-null-mezon how did your tests work out? I'm wondering if I should bother putting in the effort to build the halide back end on Windows.

TechnikEmpire on 21 Dec 2017

@TechnikEmpire you should definitely try it, but watch out for #9530

pi-null-mezon on 22 Dec 2017

@pi-null-mezon Cool thanks, but if it fails completely with GPU backend then that sort of defeats the purpose for me. I get a decent framerate using default backend and CPU with yahoo nsfw model, but I'm looking for a portable way to try and speed that up on the GPU when available. Last time I checked, the halide backend on CPU didn't perform as well.

TechnikEmpire on 22 Dec 2017

@pi-null-mezon wrote: "@dkurt thanks! GPU computations work! But the results of dnn::Net::forward() differ from those of the CPU version. I need to run more tests and may open a new issue. Thanks!"

You got GPU computations to work. Did you call cv::dnn::Net::setHalideScheduler? I skipped the call to setHalideScheduler and it crashes.

baoson202 on 11 Dec 2018

Does it work out of the box? How do we configure the CUDA backend for this?

kaangoksal on 11 Dec 2018

👍1

Everyone here - stop messing about with CUDA and Halide and just use the inference engine, which is now open source.

This is the best possible performance you can squeeze out of DNN and it does not disappoint.

TechnikEmpire on 11 Dec 2018

👎3

@TechnikEmpire, IE cannot run deep learning models on NVIDIA GPUs, and OpenCV has no CUDA backend for now either. One possible option is to test the Halide backend with the CUDA target.

dkurt on 11 Dec 2018

👍2

@dkurt Yeah I know, I was just throwing it out there that the IE is a very good, well optimized back end targeting CPU. Was letting people know because I was blown away by the performance. I realize a GPU accelerated back end can still out-perform a CPU backend.

TechnikEmpire on 11 Dec 2018

We plan on leading a Google Summer of Code project to add a GPU backend for DNN. If you can help, see the idea page for OpenCV GSoC

garybradski on 2 Feb 2019

🎉12 👍2

This issue can now be closed. CUDA support was merged two days ago into master.
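For readers wondering how to select the new backend, here is a minimal sketch using the Python bindings. It assumes an OpenCV build that includes the merged CUDA support. The constants `DNN_BACKEND_CUDA` and `DNN_TARGET_CUDA` are the names used in released OpenCV versions and are not quoted from this thread. The helper name `prefer_cuda` is illustrative.

```python
def prefer_cuda(net, dnn):
    """Select the CUDA backend if this OpenCV build exposes it.

    net: a cv2.dnn network object
    dnn: the cv2.dnn module
    Returns True when the preference was set, False when the constants
    are missing (older build) and the default backend is left in place.
    """
    backend = getattr(dnn, "DNN_BACKEND_CUDA", None)
    target = getattr(dnn, "DNN_TARGET_CUDA", None)
    if backend is None or target is None:
        return False
    net.setPreferableBackend(backend)
    net.setPreferableTarget(target)
    return True


# Usage:
#   import cv2
#   net = cv2.dnn.readNet("model.onnx")  # hypothetical file name
#   if not prefer_cuda(net, cv2.dnn):
#       print("CUDA backend constants not available in this build")
```

The presence of the constants does not guarantee that the library was compiled with CUDA enabled, so the first forward pass is still the real test.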