Dlib: CUDA cuDNN CMake try_compile tests fail w/ Xcode clang=8.0.0

Created on 2 Jul 2017  路  18Comments  路  Source: davisking/dlib

I'm walking through dlib DNN CUDA and cuDNN setup for a Mac OS X 10.12 with a preferred Xcode 8.1 (or newer) toolchain.

clang --version:

Apple LLVM version 8.0.0 (clang-800.0.42.1)

Apparently, Xcode/clang doesn't like the CUDA and cuDNN CMake try_compile tests. In particular, the -std=c++11 CUDA_NVCC_FLAGS flag triggers an error. Removing the flag makes it happy.

An earlier post ( https://github.com/davisking/dlib/issues/356 ) suggests a clang-omp compiler can be used as a workaround on OS X, but the latest Xcode toolchain seems to work fine with a few mods. I'd prefer to work this into the current CMake if possible.

Here is a pointer to the relevant code in the WIP fork:
https://github.com/headupinclouds/dlib/commit/1b8bb49ec7527817e4b43d88f3949544ff9246f1

#list(APPEND CUDA_NVCC_FLAGS "-arch=sm_30;-std=c++11;-D__STRICT_ANSI__;-D_MWAITXINTRIN_H_INCLUDED;-D_FORCE_INLINES")
# -std=c++11 fails on OSX/Xcode clang=8.0.0
list(APPEND CUDA_NVCC_FLAGS "-arch=sm_30;-D__STRICT_ANSI__;-D_MWAITXINTRIN_H_INCLUDED;-D_FORCE_INLINES")

I think we can probably exclude the flag for detected XCODE builds:

if(NOT XCODE) # possibly more specific (AND CLANG_SOMETHING)
   list(APPEND CUDA_NVCC_FLAGS "-std=c++11")
endif()

After that, I ran into an issue related to a previously documented CUDA + OpenMP requirement, which doesn't seem to be the case for Xcode/clang. An initial workaround in my test fork is here:

https://github.com/headupinclouds/dlib/commit/72b5464ae083dfed8660ca5c587108d6f330bb71

Essentially adding NOT XCODE to omit openmp_libraries linking seems to be sufficient:

if (NOT openmp_libraries AND NOT MSVC AND NOT XCODE) 

If such changes would be merged upstream, I can dig a little deeper and put a revise pr.xcode.cuda.fix PR together for review and further discussion. Let me know if that makes sense. Thanks!

Most helpful comment

I'll investigate further. I did confirm that -std=c++11 doesn't already exist in CUDA_NVCC_FLAGS at the point cuda_add_library(cuda_test STATIC cuda_test.cu) is called.

All 18 comments

Sure, submit a PR if you can. However, definitely look deeper into the -std=c++11 thing since that sounds very wrong. Maybe the switch is appearing twice and removing it makes it appear once or something like that. But not giving -std=c++11 at all is crazy.

I'll investigate further. I did confirm that -std=c++11 doesn't already exist in CUDA_NVCC_FLAGS at the point cuda_add_library(cuda_test STATIC cuda_test.cu) is called.

I just wanted to confirm that the second commit mentioned here (which omits OpenMP libraries) just saved me from a hassle of building dlib with Cuda enabled. Was struggling with llvm, clang and OpenMP installations for 3 days when I found this post and tried the second commit. dlib built fine and dnn example worked without any problems. It seems at least on my OS and Cuda, the Xcode/clang toolchain has omitted the need for OpenMP check.

OS: macOS 10.12.6
Xcode: Version 8.3.3 (8E3004b)
Cuda Version: 9.0.197
cudnn: cudnn-9.0-osx-x64-v7

Cool. Want to submit a PR for this? :)

Yeah sure, maybe @headupinclouds should do it because he did all the work, if he can't I'll do it gladly. Also I have not tested on other versions of macOS, Is it possible that this PR would ruin it for others with other OS versions?

It's definitely possible that it might mess something up, which is why I'm
not doing it, since I can't test it for your systems. I'm also not totally
sure what the exact change is without seeing a diff or patch so I can't
really say. But in any case, any testing you can do beyond your own system
is super.

The change I made was the exact copy of @headupinclouds 's commit here. it changes the OpenMP check for Xcode by changing if (NOT openmp_libarires AND NOT MSVC) to if (NOT openmp_libraries AND NOT MSVC AND NOT XCODE). The fact that this was the way it was, makes me think that maybe on previous macOS/Xcode versions this was necessary and the change might mess up builds for people with previous versions. I could be totally wrong though.
I have one other system but unfortunately with the exact same version condition as the one I tested on.

Nevertheless I can make the PR but you can merge it when we'll be sure. Is that OK? or should we first make sure and then make the PR?

It's probably fine. I added that openmp stuff because it was required by the linux version of cuda. Maybe it's not required at all by the macos version of cuda. What happens when you build these things outside xcode with just cmake and make?

I'm already building using cmake in terminal in the report I mentioned. cmake -D DLIB_NO_GUI_SUPPORT=yes .. and then cmake --build . --config Release. When compiling dlib the compiler used is Apple Clang 8. Here's what I get in terminal:
dlibbuild

Cool. Seems good. Do remove that xcode specific message statement though.

Yeah sure, maybe @headupinclouds should do it because he did all the work

@kino-dome : Please feel free to send any of those changes in your own PR. It would be nice to resolve this.

In particular, the -std=c++11 CUDA_NVCC_FLAGS flag triggers an error

:point_up: I recall there was still some weirdness related to the -std=c++11 flag breaking Cuda builds in Xcode/Clang (after fixing the OpenMP issue). From your comment above, it sounds like this might be resolved in the Xcode+Cuda+Cudnn combination you tested here:

OS: macOS 10.12.6; Xcode: Version 8.3.3 (8E3004b); Cuda Version: 9.0.197; cudnn: cudnn-9.0-osx-x64-v7

Thanks for providing the configuration details. It looks like that may be addressed by upgrading. I'm looking forward to trying it.

@kino-dome : Please feel free to send any of those changes in your own PR. It would be nice to resolve this.

Thanks @headupinclouds. I'll take care of it then.

鈽濓笍 I recall there was still some weirdness related to the -std=c++11 flag breaking Cuda builds in Xcode/Clang (after fixing the OpenMP issue). From your comment above, it sounds like this might be resolved in the Xcode+Cuda+Cudnn combination you tested here:

I actually didn't use your first commit with the -std=c++11 flag. It was just the second commit that omits OpenMP.

I recall there was still some weirdness related to the -std=c++11 flag breaking CUDA builds

In case anyone else encounters this part of the issue...

I believe the core issue is due to a typo in FindCUDA.cmake

If I run the following patch on my FindCUDA.cmake the dlib try_compile test works fine:

sed -i .bk 's|MATCHES \"-std;|MATCHES \"-std=|g' /usr/local/Cellar/cmake/3.10.1/share/cmake/Modules/FindCUDA.cmake

This MATCH operation was failing to detect the current -std=c++11 flag and was adding a second one, which nvcc didn't like (note the ";"):

    if( NOT "${CUDA_NVCC_FLAGS}" MATCHES "-std;c\\+\\+11" ) # needs 's|;|=|'
      list(APPEND nvcc_flags --std c++11)
    endif()

This was tested w/ CUDA 9.0 on OSX and CMake 3.10.1. The actual error from the test (--debug-trycompile) was:

/usr/local/cuda/bin/nvcc -M -D__CUDACC__ /dlib/dlib/cmake_utils/test_for_cuda/cuda_test.cu -o /dlib/_builds/libcxx-Release/dlib/cuda_test_build/CMakeFiles/cuda_test.dir//cuda_test_generated_cuda_test.cu.o.NVCC-depend -ccbin /usr/bin/clang -m64 --std c++11 -DDLIB_USE_CUDA -Xcompiler ,\"-stdlib=libc++\",\"-g\" -arch=sm_30 -std=c++11 -D__STRICT_ANSI__ -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -DNVCC -I/usr/local/cuda/include -I/dlib/dlib/cmake_utils/test_for_cuda/../../dnn
nvcc fatal   : redefinition of argument 'std'
CMake Error at cuda_test_generated_cuda_test.cu.o.cmake:219 (message):
  Error generating
  /dlib/_builds/libcxx-Release/dlib/cuda_test_build/CMakeFiles/cuda_test.dir//./cuda_test_generated_cuda_test.cu.o

see: https://gitlab.kitware.com/cmake/cmake/merge_requests/1628

If this PR is merged, then it should be resolved in a future CMake release.

A workaround would be to omit the -std=c++11 flag in CUDA_NVCC_FLAGS here:

https://github.com/davisking/dlib/blob/29f85d065b88eb40741a5b2cb8fad4d2eed58aab/dlib/cmake_utils/test_for_cuda/CMakeLists.txt#L10

This seems to be a combination of nvcc's intolerance of duplicate flags (at least some versions) and a CMake FindCUDA typo. Given that the flags are applied inside cuda_add_library(), and it isn't easy to enumerate the platforms that will have the problem, the best and most future safe local dlib workaround might be to simply run try_compile a second time without -std=c++ if the first one fails. I can send a PR for that. Let me know if you have another preference.

Is the error still happening as far as anyone knows though? I was under the impression it was worked around in a previous PR.

Is the error still happening as far as anyone knows though

I'm not able to build with DLIB_USE_CUDA=ON unless I apply the patch above. The local FindCUDA.cmake patch is workable for me.

Since Apple hasn't built HW using NVIDIA since 2013 or so, I'm guessing the intersection of dlib dnn + OS X users who are likely to hit this issue is fairly small (Hackintosh, early eGPU, old MacBooks). Still, it does seem curious.

@kino-dome : You mentioned you are now able to build. You provided the following spec.

OS: macOS 10.12.6; Xcode: Version 8.3.3 (8E3004b); Cuda Version: 9.0.197; cudnn: cudnn-9.0-osx-x64-v7

This is the setup I'm using, except I'm on macOS 10.12.2, which I think is unrelated to the issue. The one thing you didn't share was your cmake -version (specifically your FindCUDA.cmake in case it is different). Any chance you can report that here? I would like to understand how you are able to build.


I also realized the other part of the issue if (NOT openmp_libraries AND NOT MSVC AND NOT XCODE) should really be if (NOT openmp_libraries AND NOT MSVC AND NOT APPLE). I hit the OpenMP issue on an OS X system using a pure (Apple) clang toolchain without Xcode. I believe it has to do with the OS X NVIDIA CUDA release, which, from CMake's perspective, is effectively APPLE ("Darwin") and is unrelated to XCODE. I can send a PR for that.

Sounds good. Please send a PR for this stuff then :)

Hey @headupinclouds, sorry for the delay. I checked my cmake version and it's 3.3.1 . Also what you said about replacing NOT XCODE with NOT APPLE makes sense, I only tried the first variant and it worked for me but your logic is valid.

Is the error still happening as far as anyone knows though? I was under the impression it was worked around in a previous PR.

The PR I made was in relation to the OpenMP issue not the -std=c++11 issue headupinclouds mentioned later. Hope his PR solves this once and for all :)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

AeroClassics picture AeroClassics  路  4Comments

AivanF picture AivanF  路  4Comments

srikanthreddybethi picture srikanthreddybethi  路  4Comments

jackweiwang picture jackweiwang  路  4Comments

reunanen picture reunanen  路  3Comments