Meshroom: [FR]: Use OpenCL instead of proprietary alternatives (CUDA, Metal)

Created on 16 Aug 2019 · 43 Comments · Source: alicevision/meshroom

I previously reported that I cannot render with Meshroom, probably because, although I have an Nvidia GPU, _Nvidia does not provide any CUDA package_ for openSUSE 15.1. I use Blender, GIMP and others, and all of them use OpenCL. Meshroom is developed for Linux and Windows, and OpenCL is continuously updated for both platforms. OpenCL performance is only slightly below that of the proprietary Nvidia or AMD APIs, so why not let Meshroom use the OpenCL GPGPU API? Even Intel GPU users could run Meshroom if it used the OpenCL framework.

Please, could you consider this suggestion?

Thank you

CUDA, do not close, feature request, wip

RafaelLinux

👍28 ❤12 🚀9

Most helpful comment

@ShalokShalom from the HiP code we can compile both CUDA and AMD versions. Similar to the parameter _target platform/os_ in the cmake, CUDA or AMD can be defined. So depending on the compiler parameters we can define the versions (OS+cuda/amd).
So once we can compile all supported platforms from our hipified code, we can create a PR to use HiP instead of CUDA code by default in the official repo.

natowi on 19 Nov 2019

❤11 👍11 🎉3

All 43 comments

Read https://github.com/alicevision/AliceVision/issues/439
Here is the Background on why CUDA is used in many applications:
https://www.quora.com/Why-cant-a-deep-learning-framework-like-TensorFlow-support-all-GPUs-like-a-game-does-Many-games-in-the-market-support-almost-all-GPUs-from-AMD-and-Nvidia-Even-older-GPUs-are-supported-Why-cant-these-frameworks

natowi on 16 Aug 2019

👎4

I read the thread. Some comments are from 2018, when OpenCL 2.2 didn't exist, and a lot has changed since then. CUDA is used in many applications, but so is OpenCL (). Darktable, which I use regularly, is in that list too.

Anyway, @fabiencastan wrote:

> Currently, we have neither the interest nor the resources to do another implementation of the CUDA code to another GPU framework.

That's a pity, because a lot of users cannot try Meshroom, even though it's a great piece of software. I'm currently on the PC with the Intel GPU, so there is no way to use Meshroom there, and I have tried alternatives such as Metashape, which doesn't necessarily require an Nvidia GPU.

RafaelLinux on 16 Aug 2019

👍12

That @fabiencastan does not have the time to port an implementation that already works for him does not mean that others cannot implement it in their own time. A big question here is whether you would implement it in OpenCL or in something different. Some good pointers on the wiki about viable alternatives could help people who want to start on this task.

skinkie on 20 Aug 2019

Hi skinkie, I don't have sufficient skills to code in C/C++. I would give it a try if it were Python, PHP or even JS. My point is that "fewer users able to run an application = less interest in the application = less feedback", and in the end a great idea becomes a lost effort. It's true that it's easier to work with the CUDA API, but a lot of users in this forum have posted information about how to migrate, or how to simplify the change, to OpenCL. That could be a good starting point. That's only my opinion, of course.

RafaelLinux on 20 Aug 2019

@RafaelLinux As a user you can use Meshroom without CUDA; the only part of the application that is 'hidden' is the DepthMap stage, and even that allows for a preview without CUDA. For a developer, Meshroom is Python + QML, so the barrier to making an impact is low. The first stage where CUDA _acceleration_ is used is feature extraction. You could just try to get this to work: https://github.com/pierrepaleo/sift_pyocl

Personally, my focus for Meshroom is introducing some heuristics for matching images and supervised learning, as opposed to the current brute-force approach. Not that I am a photogrammetry specialist, but I can surely try to work on this open source project.

skinkie on 20 Aug 2019

❤2

Maybe I'm using Meshroom incorrectly, because if I only get as far as DepthMap, I only see a point cloud, so I can't see the resulting model.

RafaelLinux on 21 Aug 2019

https://github.com/alicevision/meshroom/wiki/Draft-Meshing

skinkie on 21 Aug 2019

Thank you, that is a good workaround. I'll try it. Anyway, remember that users don't mind how long it takes; quality is the priority, so please don't forget this feature request ;)

RafaelLinux on 21 Aug 2019

❤2

One could also use hipify from AMD to convert CUDA code to HIP, which can be built to work on either NVIDIA or AMD cards (with very nice performance; I currently use it for TensorFlow, and it works like a charm!)

aviallon on 26 Aug 2019

@aviallon The last time this was tried (2018), HIP did not support some CUDA functions (https://github.com/alicevision/AliceVision/issues/439#issuecomment-417422887)
and there was no full support for Windows and amdgpu on Linux (https://github.com/alicevision/AliceVision/issues/439#issuecomment-417635336).

You are welcome to try again using hipify.

natowi on 26 Aug 2019

for reference https://github.com/cpc/hipcl

arpu on 18 Sep 2019

👍6

> for reference https://github.com/cpc/hipcl

This is interesting, has anyone tried it?

pppppppp783 on 25 Sep 2019

https://www.computer.org/publications/tech-news/from-cuda-to-opencl-execution/

pppppppp783 on 25 Sep 2019

> _Nvidia does not provide any CUDA package_ for OpenSUSE 15.1.

This is simply a packaging issue, since Arch has CUDA despite not being in the list here.

Have you already reported that issue to both the openSUSE packagers and the Nvidia CUDA team?

And you can probably repackage either the 15.0 variant of openSUSE package or the Arch package, which uses an independent source, as you can see in the link.

ShalokShalom on 27 Sep 2019

@ShalokShalom the problem with Cuda remains that older hardware absolutely does not work with newer CUDA versions. This causes problems for nvidia-drivers and cuda, where one is effectively searching for the 'ideal pair' between them. I would be very interested if opencl could bridge this gap even by choosing the execution pipeline of choice.

skinkie on 27 Sep 2019

❤3

And how is that with HiP? Nvidia hardware runs on it as well?

I consider using a Geforce GT 610 for CUDA, can you tell me how to choose the suitable CUDA version?

Thanks a lot

ShalokShalom on 27 Sep 2019

@ShalokShalom

> And how is that with HiP? Nvidia hardware runs on it as well?

"HIP allows developers to convert CUDA code to portable C++. The same source code can be compiled to run on NVIDIA or AMD GPUs"

> I consider using a Geforce GT 610 for CUDA, can you tell me how to choose the suitable CUDA version?

On Windows, install the latest version; on Linux this might depend on your distro. The GT 610 supports CUDA compute capability 2.1; MR requires 2.0 or higher.
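
The comparison in that answer can be sketched in a few lines of Python. This is only an illustration, assuming the "2.1" and "2+" figures are CUDA compute capability versions; `MIN_COMPUTE_CAPABILITY`, `parse_capability` and `meets_requirement` are hypothetical names and not part of Meshroom or AliceVision.

```python
# Hypothetical helper, not Meshroom code. Assumes the "2.1" and "2+" figures
# quoted in the thread are CUDA compute capability versions.
MIN_COMPUTE_CAPABILITY = (2, 0)


def parse_capability(text):
    """Turn a string such as "2.1" into a comparable tuple such as (2, 1)."""
    major, _, minor = text.partition(".")
    return int(major), int(minor or 0)


def meets_requirement(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare as tuples so the major version is checked before the minor one."""
    return parse_capability(capability) >= minimum


# Value for the GT 610 as stated in the thread.
print(meets_requirement("2.1"))
```

Comparing tuples rather than floats avoids surprises with versions such as "2.10", which would otherwise sort below "2.9".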

natowi on 27 Sep 2019

I am on Linux, what decides which version is optimal? I am on KaOS, that is a rolling distribution.

So, does HiP make the version differences between CUDA and the different Nvidia hardware negligible?

Could or should we replace CUDA entirely with it, or is the overhead too big?

ShalokShalom on 27 Sep 2019

@ShalokShalom With HiP we can compile two versions of Meshroom: one for CUDA and one for AMD GPUs. For CUDA users nothing changes. (https://kaosx.us/docs/nvidia/ But you won't get far with a 1GB GT 610)

natowi on 27 Sep 2019

❤3

We have to wait for HiP to support cudaMemcpy2DFromArray. Then we can add AMD support for AV/MR and try HiPCL.

natowi on 27 Sep 2019

👍6 🎉2

@natowi

> But you won't get far with a 1GB GT 610

Meshroom could allow parallel computation for nodes where both the CPU and the GPU can do the work, for example feature extraction. Any additional computing resource could help. It depends on how much overhead the GPU adds compared to a decent (faster) CPU, but I still see potential for independent computation tasks.
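
A rough sketch of that idea, assuming each image can be processed independently: two workers pull from one shared queue, so the faster device simply ends up taking more tasks. Nothing here is Meshroom code; `run_shared` and `fake_extract` are hypothetical stand-ins for a scheduler and for the real CPU and GPU feature extractors.

```python
import queue
import threading


def run_shared(tasks, workers):
    """Let several workers drain one task queue.

    workers maps a label (e.g. "cpu", "gpu") to a callable that processes
    one task. Returns {task: (label, output)}.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = {}
    lock = threading.Lock()

    def worker_loop(label, process):
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            output = process(task)
            with lock:
                results[task] = (label, output)

    threads = [
        threading.Thread(target=worker_loop, args=(label, process))
        for label, process in workers.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def fake_extract(image_path):
    """Stand-in for a real extractor; returns a dummy value."""
    return len(image_path)


images = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
done = run_shared(images, {"cpu": fake_extract, "gpu": fake_extract})
print(sorted(done))
```

Which worker handles which image is not deterministic; only the set of completed tasks is.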

skinkie on 27 Sep 2019

👍4

Looks like HIP now supports cudaMemcpy2DFromArray. Any progress on this?

arpu on 16 Nov 2019

@skinkie see https://github.com/alicevision/meshroom/issues/175

@arpu Yes, all CUDA functions are now supported by HiP and I was able to convert the code to HiP using the conversion tool (read here for details). The only thing left is to write a new cmake file that includes HiP and supports both CUDA and AMD compilation and the different platforms. Here is the Meshroom PopSift plugin I used for testing. At the moment I don't have the time to figure out how to rewrite the cmake file, but I think @ShalokShalom wanted to look into this.
You are welcome to do so as well.

natowi on 16 Nov 2019

❤7

One question is very critical, I think: Will we ship two versions?

Linux distributions do their packaging themselves and we could benefit enormously by finding someone who is willing to maintain Alice for their userbase since that could result in new developers and funding.

2 versions, one for CUDA and one for HIP is something they will never do.

ShalokShalom on 19 Nov 2019

👎1

natowi on 19 Nov 2019

❤11 👍11 🎉3

Any idea how long that approximately takes? I feel like a child just before Christmas eve :D

PickUpYaAmmo on 25 Nov 2019

❤2

@PickUpYaAmmo I will take another look at this over the winter holidays.

natowi on 25 Nov 2019

👍13 🎉9 ❤2

Yay, I'm excited for this! I've been following these threads for a while now, and I'm excited this is finally happening! Thank you guys so much! Any idea of a "guesstimate" for when we may see the first release?

BootySmack on 15 Dec 2019

👀7

Is ROCm not an AMD alternative for CUDA? Someone else mentioned using HIP to convert CUDA for TensorFlow work, but AMD has been supporting TensorFlow with ROCm, albeit, ROCm support is still a bit rough atm and perhaps not as accessible as OpenCL.

There's also third party libs like ArrayFire which are a bit more limited in functionality afaik, but abstract OpenCL/CUDA under a single API and creates JIT compiled kernels, not sure how appropriate it is for this project but it's meant to do a pretty good job at compute workloads and optimizing them, is written in C++ too like this project, so it may be preferable to maintaining/developing OpenCL/CUDA code directly?

polarathene on 23 Dec 2019

👎1

It might need some refactoring, but shouldn't it be possible to simply support both code paths and use some logic to decide which one to use based on some condition (availability of hardware or configuration of some kind)? This would not require a second package for different hardware.

Keridos on 18 Apr 2020

@Keridos What about checking the performance of the converted CUDA code running on Nvidia hardware, and then deciding whether the original CUDA code should be retained?

skinkie on 18 Apr 2020

Most systems today probably have either Nvidia card(s) or AMD card(s) installed, not both. Some may also have iGPUs from Intel. Defaulting to the non-Intel ones should be straightforward. For systems with mixed AMD/Nvidia GPUs, I'd stick with notifying the user (if possible) to select a card manually by overriding the selection in the config. Blender has a UI dialogue where you can select the hardware/API for GPU-accelerated rendering. See screenshot below.

~~It doesn't list CUDA since iirc Cycles Rendering doesnt support CUDA.~~ A configuration like that could also double for forcing CPU processing if users wanted that for some reason.

Since I do not have a (modern) Nvidia GPU available, I cannot test the performance difference properly. It might be interesting to compare the performance of the CUDA code and the generated OpenCL-compatible code on Nvidia GPUs, though I suspect the CUDA code will run measurably faster.
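
The selection logic proposed in this comment could look roughly like the sketch below. It is an illustration only: the `(name, vendor)` device tuples, the `override` argument and the function name `choose_gpu` are assumptions, not an existing Meshroom setting or API.

```python
# Hypothetical sketch of the proposal above; not Meshroom code.
def choose_gpu(devices, override=None):
    """Pick a device from a list of (name, vendor) pairs.

    Returns (device_or_None, ask_user). ask_user is True when AMD and
    Nvidia cards are mixed and the user should confirm the choice.
    """
    if override == "cpu":
        return None, False  # user forces CPU processing
    if override is not None:
        for device in devices:
            if device[0] == override:
                return device, False
        raise ValueError(f"unknown device: {override}")

    # Default: prefer anything that is not an Intel GPU.
    non_intel = [d for d in devices if d[1].lower() != "intel"]
    if not non_intel:
        return (devices[0] if devices else None), False

    # More than one vendor left: pick the first, but flag it for the user.
    vendors = {d[1].lower() for d in non_intel}
    return non_intel[0], len(vendors) > 1


print(choose_gpu([("UHD 630", "Intel"), ("RX 580", "AMD")]))
```

Keeping the override first means a config entry always wins over the heuristic, which also covers the case of forcing CPU processing.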

Keridos on 18 Apr 2020

@Keridos A bit off-topic, but Cycles most definitely supports CUDA. See https://docs.blender.org/manual/en/latest/render/cycles/gpu_rendering.html

@polarathene ROCm is supported only on GFX8 GPUs, meaning GCN3 (Polaris) and higher.

So even though ROCm would make things easier, it would limit the usage to only newer cards, whereas native OpenCL code could run even on the pre-GCN3 GPUs (which support OpenCL 2.1, just not ROCm).

From my understanding, the support is a decision made by AMD, where porting parts of ROCm to GCN1 and GCN2 would be considerably more work as those older generations have bigger underlying architectural differences, and not necessarily a limitation of the hardware.

Ristovski on 18 Apr 2020

👍4

@Ristovski indeed, but porting from CUDA to ROCm is much more straightforward...

aviallon on 4 May 2020

Heads up: Intel is going to release PCIe GPUs, so I would not override Intel chips by default.

ShalokShalom on 5 May 2020

👍2

> Heads up: Intel is going to release PCIe GPUs, so I would not override Intel chips by default.

There could soon be a combination of a Ryzen with a low-end Vega and a more powerful dedicated Intel PCIe card, so maybe check whether the card is attached via PCIe. But even then, some people might have an old Nvidia card installed just for software that absolutely requires it, or to attach more monitors, while the integrated GPU is more powerful.

elypter on 10 May 2020

👍2

Good day everyone, any updates about rendering on non-CUDA GPUs?

Alemusica on 23 May 2020

👎1

https://github.com/alicevision/meshroom/wiki/Draft-Meshing

kalidem on 25 May 2020

👍6

Hi guys, has there been any progress on this?

Mhowser on 30 Aug 2020

👀13 👍5 🚀1

As a thought experiment, if the functionality used by MeshRoom were rewritten using the CPU instead of GPU (if that is possible), how much slower would it be? My (little) understanding is that GPGPU basically lets you do massive parallel computation (and of course offload stuff from the CPU itself). If this were rewritten with say, loops, what would the slow-down be?

Jimw338 on 22 Sep 2020

My not well informed estimation is that the slowdown could be enormous. I'm pretty sure that many types of computations can run hundreds of times slower on CPU and it seems that Meshroom really makes full use of my GPU when doing what it does.

I think it's pretty much the perfect kind of work to run on the GPU because it can be massively parallel, which means that losing that massive parallelism would slow it down a lot.

But anybody feel free to correct me if I'm wrong, some of that is kind of guess/impression.

zicklag on 22 Sep 2020

You could eventually use both, as Blender does.

ShalokShalom on 22 Sep 2020

👍3

We are looking into an alternative to the DepthMap node that runs on the CPU (it is not yet ready to use). It is not as good as the native DepthMap node quality wise, but better than DraftMeshing.
You will be informed once it is ready. Asking for updates does not speed up the process ;)

I did some research and the best solution for porting is still HIP. I'll continue to see if I am able to build a test version, but it will definitely take time, as I am learning by trial and error and success is not guaranteed.
There is the _long term option_ to try to crowdfund the development, but at the moment it is hard to guess how many people would actually support this. So let's find out (short survey)

natowi on 10 Nov 2020

👍11 ❤8 🎉5
