When I build a shared object with Clang from separate C sources on Linux using the -fembed-bitcode option, I cannot load it into Sulong.
Create the following two files:
test1.c:
int f() { return 42; }
test2.c:
int g() { return 24; }
Build libtest.so and try to load it:
$ clang-6.0 -fPIC -shared -O1 -g -fembed-bitcode test1.c test2.c -o libtest.so
$ ~/soft/graalvm-ce-1.0.0-rc10/bin/lli ./libtest.so
ERROR: java.lang.IllegalStateException: Unexpected Record Type Id: 0
org.graalvm.polyglot.PolyglotException: java.lang.IllegalStateException: Unexpected Record Type Id: 0
at java.lang.Throwable.<init>(Throwable.java:265)
at java.lang.Exception.<init>(Exception.java:66)
at java.lang.RuntimeException.<init>(RuntimeException.java:62)
at java.lang.IllegalStateException.<init>(IllegalStateException.java:55)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.defineAbbreviation(LLVMScanner.java:319)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.scanToOffset(LLVMScanner.java:218)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.scanToEnd(LLVMScanner.java:201)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.parseBitcodeBlock(LLVMScanner.java:160)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.parse(LLVMScanner.java:132)
at com.oracle.truffle.llvm.Runner.parse(Runner.java:642)
at com.oracle.truffle.llvm.Runner.parse(Runner.java:372)
at com.oracle.truffle.llvm.Runner.parse(Runner.java:331)
at com.oracle.truffle.llvm.Sulong.parse(Sulong.java:113)
at com.oracle.truffle.api.TruffleLanguage$ParsingRequest.parse(TruffleLanguage.java:798)
at com.oracle.truffle.api.TruffleLanguage.parse(TruffleLanguage.java:1233)
at com.oracle.truffle.api.TruffleLanguage$LanguageImpl.parse(TruffleLanguage.java:2413)
at org.graalvm.polyglot.Context.eval(Context.java:335)
at com.oracle.truffle.llvm.launcher.LLVMLauncher.execute(LLVMLauncher.java:210)
at com.oracle.truffle.llvm.launcher.LLVMLauncher.launch(LLVMLauncher.java:63)
at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:132)
at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:72)
at com.oracle.truffle.llvm.launcher.LLVMLauncher.main(LLVMLauncher.java:53)
at com.oracle.svm.core.JavaMainWrapper.run(JavaMainWrapper.java:164)
Original Internal Error:
java.lang.IllegalStateException: Unexpected Record Type Id: 0
at java.lang.Throwable.<init>(Throwable.java:265)
at java.lang.Exception.<init>(Exception.java:66)
at java.lang.RuntimeException.<init>(RuntimeException.java:62)
at java.lang.IllegalStateException.<init>(IllegalStateException.java:55)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.defineAbbreviation(LLVMScanner.java:319)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.scanToOffset(LLVMScanner.java:218)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.scanToEnd(LLVMScanner.java:201)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.parseBitcodeBlock(LLVMScanner.java:160)
at com.oracle.truffle.llvm.parser.scanner.LLVMScanner.parse(LLVMScanner.java:132)
at com.oracle.truffle.llvm.Runner.parse(Runner.java:642)
at com.oracle.truffle.llvm.Runner.parse(Runner.java:372)
at com.oracle.truffle.llvm.Runner.parse(Runner.java:331)
at com.oracle.truffle.llvm.Sulong.parse(Sulong.java:113)
at com.oracle.truffle.api.TruffleLanguage$ParsingRequest.parse(TruffleLanguage.java:798)
at com.oracle.truffle.api.TruffleLanguage.parse(TruffleLanguage.java:1233)
at com.oracle.truffle.api.TruffleLanguage$LanguageImpl.parse(TruffleLanguage.java:2413)
at com.oracle.truffle.polyglot.PolyglotSourceCache.parseImpl(PolyglotSourceCache.java:92)
at com.oracle.truffle.polyglot.PolyglotSourceCache.parseCached(PolyglotSourceCache.java:73)
at com.oracle.truffle.polyglot.PolyglotLanguageContext.parseCached(PolyglotLanguageContext.java:186)
at com.oracle.truffle.polyglot.PolyglotContextImpl.eval(PolyglotContextImpl.java:718)
at org.graalvm.polyglot.Context.eval(Context.java:335)
at com.oracle.truffle.llvm.launcher.LLVMLauncher.execute(LLVMLauncher.java:210)
at com.oracle.truffle.llvm.launcher.LLVMLauncher.launch(LLVMLauncher.java:63)
at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:132)
at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:72)
at com.oracle.truffle.llvm.launcher.LLVMLauncher.main(LLVMLauncher.java:53)
at com.oracle.svm.core.JavaMainWrapper.run(JavaMainWrapper.java:164)
Caused by: Attached Guest Language Frames (0)
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.10
DISTRIB_CODENAME=cosmic
DISTRIB_DESCRIPTION="Ubuntu 18.10"
$ clang-6.0 --version
clang version 6.0.1-9 (tags/RELEASE_601/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ ~/soft/graalvm-ce-1.0.0-rc10/bin/lli --version
Graal llvm 6.0.0 (GraalVM CE Native 1.0.0-rc10)
The problem with -fembed-bitcode is that this is a compiler flag, and it's not doing what you might expect when you use it during linking.
To understand what is happening, split the command into a separate compiler and linker step:
$ clang -fPIC -O1 -g -fembed-bitcode -c test1.c
$ clang -fPIC -O1 -g -fembed-bitcode -c test2.c
These commands compile the C files to object files with embedded bitcode. These object files contain an .llvmbc section, and you can look at this section:
$ objcopy -O binary --only-section=.llvmbc test1.o test1.bc
$ llvm-dis test1.bc
$ less test1.ll
So far so good, still exactly what you would expect. The problem is now that the linker doesn't understand embedded bitcode sections, it just treats them as random data.
$ clang -fPIC -shared -O1 -g test1.o test2.o -o libtest.so
Note that there is no -fembed-bitcode flag here. You can pass it, but it won't do anything. Since it's a compiler flag, clang will not forward it to the linker anyway.
What this does is just standard linking. It sees two object files, both of them contain an .llvmbc section, so it will just concatenate these sections. The result is not a valid bitcode file:
$ objcopy -O binary --only-section=.llvmbc libtest.so libtest.bc
$ llvm-dis libtest.bc
LLVM ERROR: Invalid encoding
There are several workarounds for that problem, depending on what your requirements are.
The simplest way is to use the -emit-llvm flag to emit bitcode only, and then use llvm-link to link the individual bitcode files together. One way to automate this is to use wllvm. It will do all the bitcode linking for you.
Note that wllvm does not produce ELF files with embedded bitcode, you just get a raw bitcode file. This file will work for Sulong. If you absolutely need embedded bitcode for some reason, you still have to use objcopy to manually insert the bitcode in the final shared object. See our test Makefile, where we do this.
We are currently evaluating options for making this simpler and more automated, but we have nothing ready yet.
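The manual route described above (-emit-llvm per source file, then llvm-link) can be sketched as a small driver script. This is only an illustration: the function name bitcode_link_commands is made up for this example, and it assumes unversioned clang and llvm-link binaries on PATH (on Ubuntu they may be called clang-6.0 and llvm-link-6.0 instead). By default it only prints the commands; pass --run to execute them.

```python
import subprocess
import sys
from pathlib import Path


def bitcode_link_commands(sources, output, clang="clang", llvm_link="llvm-link"):
    """Return argv lists: one compile-to-bitcode step per source, then one link step."""
    commands = []
    bitcode_files = []
    for src in sources:
        bc = str(Path(src).with_suffix(".bc"))
        # -c together with -emit-llvm writes LLVM bitcode instead of a native object file.
        commands.append([clang, "-O1", "-g", "-c", "-emit-llvm", src, "-o", bc])
        bitcode_files.append(bc)
    # llvm-link merges the per-file modules into a single valid bitcode module.
    commands.append([llvm_link, *bitcode_files, "-o", output])
    return commands


if __name__ == "__main__":
    for cmd in bitcode_link_commands(["test1.c", "test2.c"], "libtest.bc"):
        print("$ " + " ".join(cmd))
        # Hypothetical opt-in switch so that nothing is executed by accident.
        if "--run" in sys.argv[1:]:
            subprocess.run(cmd, check=True)
```

The resulting libtest.bc is a raw bitcode file, not an ELF shared object, which is the form described above as working for Sulong.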
Thank you very much for thorough explanation!
"We are currently evaluating options for making this simpler and more automated, but we have nothing ready yet."
Just as a hack: can the boundaries of the original .llvmbc sections be determined, or do they require some non-trivial linking anyway?
But after all, it seems that when doing link-time optimization, one needs to specify -fuse-ld=gold, and LLVM's linker plugin is then loaded into the gold linker. So shouldn't this plugin invoke llvm-link on the .llvmbc sections as well? It looks like this is a task for the LLVM toolchain, not yours.
Since this issue is one of the few Google search results (and Google led me here), I'd like to point out that -fembed-bitcode is usually the least intrusive way of getting bitcode without modifying the build system. The .llvmbc sections get merged by the linker, as pointed out above, and you can extract the individual modules by looking for the bitcode magic, 0x4243c0de. This is unfortunately not enough, because the linker adds zero padding at the end that needs to be removed; after that, the files can be combined properly using llvm-link.
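A rough sketch of that extraction, assuming the merged section has already been dumped to a file with objcopy as shown earlier in the thread. The function name split_bitcode and the output names part0.bc, part1.bc, ... are invented for this example. It is a heuristic rather than a real parser: the magic bytes could in principle occur inside a module, and stripping trailing zeros may remove bytes that belonged to the module's last 32-bit word, so it is worth checking each part with llvm-dis before linking.

```python
import sys

# Raw bitcode magic as a byte sequence: 'B' 'C' 0xC0 0xDE.
MAGIC = bytes([0x42, 0x43, 0xC0, 0xDE])


def split_bitcode(blob):
    """Split a concatenated .llvmbc dump into candidate bitcode modules."""
    # Collect every offset at which the magic starts.
    starts = []
    pos = blob.find(MAGIC)
    while pos != -1:
        starts.append(pos)
        pos = blob.find(MAGIC, pos + len(MAGIC))

    modules = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(blob)
        # Drop the zero padding between modules (heuristic, see the note above).
        chunk = blob[start:end].rstrip(b"\x00")
        # Bitcode is written in 32-bit words, so pad back to a multiple of 4 bytes.
        chunk += b"\x00" * (-len(chunk) % 4)
        modules.append(chunk)
    return modules


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as f:
        parts = split_bitcode(f.read())
    for i, part in enumerate(parts):
        # Hypothetical output naming; these files are the inputs for llvm-link.
        with open(f"part{i}.bc", "wb") as out:
            out.write(part)
    print(f"wrote {len(parts)} module(s)")
```

Running it on libtest.bc from the objcopy step should produce one part per original object file, which can then be passed to llvm-link.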
@Hoernchen Thanks, this idea looks promising and quite straightforward in implementation!
@atrosinenko @Hoernchen starting with the upcoming LLVM 10, it is now possible to embed bitcode when doing LTO with lld. To make things even simpler, GraalVM 19.3.0 and later ship with an LLVM toolchain as well as wrappers which set the right flags for the GraalVM LLVM runtime:
$ export LLVM_TOOLCHAIN=$($GRAALVM_HOME/bin/lli --print-toolchain-path)
$ $LLVM_TOOLCHAIN/clang hello.c -o hello
If you want to know more, please see our updated manual as well as our blog post on that topic.