Describe the bug
I built onnxruntime as described, by running the build.sh script.
When linking libonnxruntime_mlas.a into my shared library, I get the following error message:
libonnxruntime_mlas.a(SconvKernelSse2.S.o): relocation R_X86_64_PC32 against symbol `MlasConvPostProcessFloatSseFilter4Output1' can not be used when making a shared object; recompile with -fPIC
The CMake file is nothing special; here is an excerpt:
TARGET_LINK_LIBRARIES(testLib
-Xlinker --gc-sections
#-Wl,--start-group
${BEGIN_WHOLE_ARCHIVE}
${CUDA_LIBS}
#${PROVIDERS_CUDA}
${ORT_LIBS_WHOLE_ARCHIVE}
${END_WHOLE_ARCHIVE}
RDTools
${Boost_LIBRARIES}
${ORT_LIBS_DIR}/external/protobuf/cmake/libprotobuf.a
#-Wl,--end-group
${ORT_LIBS_NO_WHOLE_ARCHIVE}
dl
rt
pthread
)
System information
To Reproduce
Build from source and try to link all onnx libs into a dynamic lib. Currently I cannot provide my complete sources, as it is a commercial project.
Expected behavior
It should link without problems, as libonnxruntime_mlas.a has been compiled with -fPIC.
onnxruntime has a C API and dynamic lib, could you use that instead?
Actually, I would like to link everything statically. SconvKernelSse2.S was introduced recently; before that, static linking worked fine.
If that is true, why do you need wholearchive?
See: https://wiki.gentoo.org/wiki/Project:AMD64/Fixing_-fPIC_Errors_Guide
I tried to link everything the same way the onnxruntime .so is linked. There, whole-archive is used for some onnx dependencies and not for others. Actually, I don't know the reason; I just reproduced it.
To make things a bit clearer: ${ORT_LIBS_NO_WHOLE_ARCHIVE} contains libonnxruntime_mlas.a.
What I don't understand is that I made dead sure that libonnxruntime_mlas.a has been compiled with -fPIC, yet the linker still complains. As mentioned, things work with older versions, where SconvKernelSse2.S had not yet been added. I would just like to know whether someone else has experienced similar behaviour, or whether the error is completely on my side. I link other static libs as well (boost, cuda, one of my own), so the error could also be related to some interplay between these dependencies.
I attached the CMake file used.
sampleCMake.txt
You specified libonnxruntime_mlas twice on the link command line.
Please try to build onnxruntime with "--build_shared_lib", even if you don't want to use the shared lib, because it makes sure every object is built with "-fPIC". If any object is not, the build will fail.
That's right... ${ORT_LIBS_DIR}/external/re2/libre2.a has also been defined twice, as I noticed. I corrected this, but the root problem remains. I will try your suggestion. Thanks for your fast help by the way.
I rebuilt with --build_shared_lib.
readelf --relocs libonnxruntime_mlas.a | egrep '(GOT|PLT|JU?MP_SLOT)' returns output, so the lib has evidently been compiled with -fPIC. However, the problem remains. I am wondering why libonnxruntime.so builds without problems, while I am not able to build an .so myself from the built static onnx dependencies with my own CMake file. Perhaps I am missing some flags; I will try the same build flags that are used when building libonnxruntime.so.
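The grep above only shows that some objects in the archive contain GOT/PLT relocations; it says nothing about the object named in the linker error. Also, -fPIC only affects compiler-generated code, not hand-written assembly such as a .S file. A minimal sketch for narrowing this down, assuming GNU binutils readelf and using a hypothetical helper name filter_pc32_relocs, keeps only the archive-member headers and the R_X86_64_PC32 relocations against the reported symbol prefix:

```shell
# Hypothetical helper (not part of onnxruntime): reads the output of
# `readelf --relocs --wide` on stdin and keeps the "File:" header of each
# archive member plus every R_X86_64_PC32 relocation whose symbol name
# starts with the given prefix.
filter_pc32_relocs() {
  prefix="$1"
  grep -E "^File: |R_X86_64_PC32[[:space:]].*[[:space:]]${prefix}"
}

# Usage (the archive path is a placeholder for your build output):
#   readelf --relocs --wide libonnxruntime_mlas.a | filter_pc32_relocs MlasConvPostProcess
```

Any member that shows such a relocation against a global, default-visibility symbol is a candidate for the "recompile with -fPIC" message, regardless of the flags used for the C++ sources.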
@commanderka were you able to resolve the issue or do you still need assistance?
I decided to use an older version of onnx runtime (without SconvKernelSse2.S)
I will report if the problem should persist when using the current master branch, as soon as I can find time for it.
@commanderka Hi, I know that you are trying the static linking route. How did that go? If it works for you, thanks for sharing.
Had the same problem, except we needed a dynamic library. For some reason it compiled with gcc-9.1, but I wasn't able to get it compiled with gcc-7.3.
But as @commanderka mentioned, libonnxruntime.so compiled without problems (well, I had some issues with gcc-7.3, specifically undefined reference to `vtable for std::_Sp_counted_deleter<void*, void (*)(void*), std::allocator<void>, (__gnu_cxx::_Lock_policy)2>', but I didn't really need a dynamic version of onnxruntime, and it compiled under gcc-8.3). Then I tried to reproduce the same error by varying the linker options, and removing --version-script did the job.
So basically the minimum you need to do is pass -Wl,--version-script=version_file, where version_file has roughly the following content:
MYLIB_0.1 {
  global:
    *;
  local:
    MlasConvPostProcess*;
};
What is important here is that MlasConvPostProcess* is in the local section, which hides all symbols starting with MlasConvPostProcess, so they are no longer part of the shared library's interface. For more info, take a look at "Control over symbol exports in GCC".
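As a rough sketch of how this could be wired into a manual link step, assuming a GNU toolchain (g++, nm): the helper names link_shared and check_not_exported and the library/object names in the usage comments are hypothetical and need to be adapted to your project.

```shell
# Write the version script (same content as above; MYLIB_0.1 is just a label).
cat > version_file <<'EOF'
MYLIB_0.1 {
  global:
    *;
  local:
    MlasConvPostProcess*;
};
EOF

# Hypothetical helper: link objects/archives into a shared library using the script.
# Usage: link_shared libtestLib.so my_code.o libonnxruntime_mlas.a
link_shared() {
  out="$1"; shift
  g++ -shared -o "$out" "$@" -Wl,--version-script=version_file
}

# Hypothetical helper: succeeds only if no MlasConvPostProcess* symbol is
# present in the dynamic symbol table of the given shared library.
# Usage: check_not_exported libtestLib.so
check_not_exported() {
  ! nm -D --defined-only "$1" | grep -q 'MlasConvPostProcess'
}
```

With CMake, the same flag can be passed to the link step through the target's linker options; the shell form is used here only to keep the sketch independent of any particular CMake setup.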
Note: This fix works only for a dynamic library. Also, we encountered this error only after upgrading to the recently released onnxruntime-0.5.0 and never had any problem with static linking.
Thanks for the pointer. These new assembly symbols needed to be marked as ".hidden" to avoid exporting the symbols through the Procedure Linkage Table (PLT). I was able to repro your link failure with libonnxruntime.so after removing the version_script that onnxruntime.cmake builds. With the changes, the error went away.
The changes are currently in tracysh/qgemm which will be submitted to master very soon. I'll resolve this issue when that is done.
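To illustrate the ".hidden" marking mentioned above: in GNU assembler syntax, a symbol can stay global for the static link while being kept out of the shared library's dynamic symbol table. The sketch below is not the actual MLAS source; DemoHelper, DemoEntry and hidden_demo.S are hypothetical names, and it assumes x86-64 with a GNU toolchain.

```shell
# Write a tiny assembly file in which one global symbol is marked hidden.
cat > hidden_demo.S <<'EOF'
        .text
        .globl  DemoHelper
        .hidden DemoHelper              /* usable by other objects, not exported from the .so */
        .type   DemoHelper, @function
DemoHelper:
        ret

        .globl  DemoEntry
        .type   DemoEntry, @function
DemoEntry:
        lea     DemoHelper(%rip), %rax  /* PC-relative reference to DemoHelper */
        jmp     *%rax
EOF

# Build a shared object and list its exported symbols; DemoHelper should be
# absent from the list. Left commented out so that nothing is assumed about
# the local toolchain:
#   gcc -shared -o libhidden_demo.so hidden_demo.S
#   nm -D --defined-only libhidden_demo.so
```

Removing the .hidden line should reproduce a "relocation R_X86_64_PC32 against symbol ... can not be used when making a shared object" error of the kind reported here, because the linker then has to treat DemoHelper as a symbol that another module could override.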
This was resolved with PR #1644
Hi @JOndra91,
Were you able to get it to compile with gcc-7.3? I have the exact same issue, and I am able to compile using gcc-8.3. But CUDA 10.0 doesn't seem to support gcc-8, so using gcc-7 seems to be the only way forward...
@hariharans29 Are you asking about the shared library? Unfortunately I wasn't able to compile it with gcc-7.3.
Thanks @JOndra91
Thanks. I can confirm that the fix proposed by @hariharans29 works.