Split off from #514.
The config file in that issue produces the following error at link time:
ERROR: "__memcat_p" [drivers/hwtracing/stm/stm_core.ko] undefined!
A simpler/faster build is just x86_64_defconfig + CONFIG_STM=m.
This is really odd as __memcat_p is marked as EXPORT_SYMBOL_GPL in lib/memcat_p.c. Even changing it to EXPORT_SYMBOL doesn't resolve the build failure.
I narrowed this down to an ld.lld issue:
$ echo "CONFIG_STM=m" >> arch/x86/configs/x86_64_defconfig
No error:
$ make -j$(nproc) CC=clang O=out defconfig modules_prepare vmlinux modules
Error:
$ make -j$(nproc) CC=clang LD=ld.lld O=out defconfig modules_prepare vmlinux modules
I assume the problem here is that somehow __memcat_p isn't getting added to the __ksymtab section?
I assume the problem here is that somehow __memcat_p isn't getting added to the __ksymtab section?
Looks like it, yes. The build steps create a linker script containing the exported symbols, including:
$ grep memcat lib/.lib-ksyms.o.lds
EXTERN(__memcat_p)
But the behavior is different when ld.lld is invoked; it doesn't add any of the symbols in that script to the output file:
$ ld -m elf_x86_64 -z max-page-size=0x200000 -r -o lib/lib-ksyms.o -T lib/.lib-ksyms.o.lds lib/.lib_exports.o; objdump -x lib/lib-ksyms.o | grep memcat
0000000000000000 *UND* 0000000000000000 __memcat_p
$
$ ld.lld-9 -m elf_x86_64 -z max-page-size=0x200000 -r -o lib/lib-ksyms.o -T lib/.lib-ksyms.o.lds lib/.lib_exports.o; objdump -x lib/lib-ksyms.o | grep memcat
$
I tried a number of command-line params for ld.lld, with no success.
Added review for lld: https://reviews.llvm.org/D63564
great work @ihalip !
I can confirm that fix works, thanks!
should this have a 32b or 64b x86 ARCH tag?
I did not test other architectures but I doubt this is an x86 specific issue.
I replied on https://reviews.llvm.org/D63564#1551417. My question is why the -r link needs EXTERN(__memcat_p)
@MaskRay I believe that comes from: https://github.com/torvalds/linux/blob/8c79f4cd441b27df6cadd11b70a50e06b3b3a2bf/scripts/Makefile.build#L438-L449
@masahir0y can also weigh if he has time.
My question is why the
-rlink needsEXTERN(__memcat_p)
The trick is to create an "empty" object file (lib-ksyms.o) containing undefined references to all the symbols that need to be exported and are not built-in. For example, memcat_p is not used internally by the kernel but it is part of the kernel API (and it's used by the stm module), so it should be exported at all times.
The build process in the lib/ directory is as follows:
lib.a, containing regular symbolslib-ksyms.o, containing undefined references to symbols that need to be exportedbuilt-in.a, containing built-in symbols and lib-ksyms.ovmlinux against both lib.a and built-in.alib.a is added to the final linking step with --no-whole-archive. Because lib-ksyms.o contains undefined references which are found in lib.a, they are then added into the resulting binary.
@ihalip Is it possible to change it to:
-u or a linker script that contains EXTERN())This way, no lld change is needed. (The lld side change isn't easy https://reviews.llvm.org/D63564#1552017)
I can try to do that.
-u/--undefined has different meanings in ld.bfd and ld.lld, which is why we're seeing this issue right now. What do you think about the possibility of adding another option to ld.lld, say something like --output-undefined, which would behave like ld.bfd's -u?
Looking into this a little. The kernel defines memcat_p as a macro which calls into the externally defined __memcat_p in include/linux/string.h, that has a BUILD_BUG_ON_MSG if the arg types do not match. We've had issues with BUILD_BUG_ related macros in the past where failed assertions resulted in obscure linkage failures.
drivers/hwtracing/stm/policy.c has a call to memcat_p.
merged = memcat_p(stp_policy_node_attrs, attrs);
stp_policy_node_attrs is of type struct configfs_attribute * [], and attrs is of type struct configfs_attribute **. Clang and GCC agree that with one level of indirection stripped off, the types match: https://godbolt.org/z/iTRj-D, so that's not the bug.
I see ld.lld invoked as followed, then the error message. I don't see the invocation @ihalip posted in https://github.com/ClangBuiltLinux/linux/issues/515#issuecomment-503528414.
ld.lld -m elf_x86_64 -z max-page-size=0x200000 -r -o drivers/hwtracing/stm/stm_core.o drivers/hwtracing/stm/core.o drivers/hwtracing/stm/policy.o
Maybe step 6 of scripts/Makefile.modpost?
drivers/hwtracing/stm/policy.o has 28 undefined symbol references. Why is __memcat_p special/different?
Looks like it's using EXPORT_SYMBOL_GPL where as some others are EXPORT_SYMBOL as @nathanchance noted (changing the export and a clean build makes no difference). Let's compare __memcat_p to strcmp.
__memcat_p is defined in lib/memcat_p.o.
strcmp is defined in lib/string.o
Both object files are included in the same lib-y statement in lib/Makefile.
TODO: compare output of:
$ make CC=clang -j71 drivers/hwtracing/stm/stm_core.ko V=1
$ make CC=clang -j71 drivers/hwtracing/stm/stm_core.ko V=1 LD=ld.lld
and difference between drivers/hwtracing/stm/.*.cmd files.
hmm...definitely perplexing.
Why is
__memcat_pspecial/different?
It's only used by the stm module, which is built outside the kernel. Since it's not used anywhere inside the kernel, ld.lld strips it out in the final link step. Using CONFIG_STM=y, there's no error.
The meaning of --undefined X/-u X/EXTERN(x) is different for ld.bfd and ld.lld. The manpage of ld.lld says that this option forces a symbol to be undefined during linking. Since it's not referenced, the link succeeds and the symbol is discarded. Instead, ld.bfd outputs a binary containing the undefined symbol, even if it's not referenced.
I think I could implement a workaround like @MaskRay suggested, but I doubt it will be readily accepted by the kernel folks.
I would say it's worth seeing what the kernel workaround looks like first.
It would be something like this: https://gist.githubusercontent.com/ihalip/82a932259be696f2c0534f6f6a331a10/raw/331cf062a030df71d441ccf790258f4179406886/gistfile1.txt
I changed the build scripts to not delete the intermediate linker scripts and re-use them when linking vmlinux.
What about moving memcat_p from the kernel's lib to the stm module, since that's the only user in the tree?
It would be something like this: https://gist.githubusercontent.com/ihalip/82a932259be696f2c0534f6f6a331a10/raw/331cf062a030df71d441ccf790258f4179406886/gistfile1.txt
I changed the build scripts to not delete the intermediate linker scripts and re-use them when linking vmlinux.
That does not seem too bad? If that fixes the issue, it might be worth sending as an RFC (or @masahir0y can take a look at it here).
What about moving
memcat_pfrom the kernel's lib to thestmmodule, since that's the only user in the tree?
Hmmm, I guess we could do that but that feels like papering over the issue.
Looping back to this now that I have started testing allmodconfig builds locally.
There is also a test module that gets enabled:
ERROR: "__memcat_p" [lib/test_memcat_p.ko] undefined!
ERROR: "__memcat_p" [drivers/hwtracing/stm/stm_core.ko] undefined!
I ran @ihalip 's patch through its paces locally and I did not experience any regressions (with ccache, I'll do a full set of clean builds on mainline and linux-next overnight):
Build results
arm:multi_v7_defconfig | Build successful in 0 minutes and 40 seconds | Boot successful
arm:allnoconfig | Build successful in 0 minutes and 7 seconds
arm:allyesconfig | Build successful in 3 minutes and 10 seconds
arm:allyesconfig-be | Build successful in 3 minutes and 54 seconds
arm:allmodconfig | Build successful in 2 minutes and 48 seconds
arm:allmodconfig-be | Build successful in 2 minutes and 44 seconds
arm64:defconfig | Build successful in 0 minutes and 45 seconds | Boot successful
arm64:allnoconfig | Build successful in 0 minutes and 7 seconds
arm64:allyesconfig | Build successful in 3 minutes and 42 seconds
arm64:allyesconfig-be | Build successful in 4 minutes and 35 seconds
arm64:allmodconfig | Build successful in 2 minutes and 56 seconds
arm64:allmodconfig-be | Build successful in 2 minutes and 50 seconds
mipsel:malta_defconfig | Build successful in 0 minutes and 43 seconds | Boot successful
ppc32:ppc44x_defconfig | Build successful in 0 minutes and 19 seconds | Boot successful
ppc32:allnoconfig | Build successful in 0 minutes and 7 seconds
ppc64:pseries_defconfig | Build successful in 0 minutes and 45 seconds | Boot successful
ppc64:allnoconfig | Build successful in 0 minutes and 7 seconds
ppc64le:powernv_defconfig | Build successful in 0 minutes and 50 seconds | Boot successful
x86:defconfig | Build successful in 0 minutes and 29 seconds | Boot successful
x86:allyesconfig | Build successful in 4 minutes and 7 seconds
x86:allmodconfig | Build successful in 5 minutes and 24 seconds
I think that it is worth sending as an RFC so Masahiro can give his input.
cc @adelva1984 @metti
FWIW, the issues goes away by:
It's not clear to me _why_ this solves the issue, but hopefully that can help narrowing it down.
It's not clear to me why this solves the issue
obj-y -> object file gets added to lib/built-in.a
lib-y -> object file gets added to lib/lib.a
The difference between them is that lib/built-in.a is linked into vmlinux with --whole-archive. I guess such a change is a very easy and reasonable fix.
Ultimately, the root cause of this issue is that ld.bfd and ld.lld have different interpretations for the --undefined/-u option. ld.bfd forces symbols to be undefined, ld.lld doesn't:
$ echo | gcc -x c -c - -o a.o
$ ld.bfd a.o -u test -o a.bfd.o 2>/dev/null && objdump -t a.bfd.o | grep test
0000000000000000 *UND* 0000000000000000 test
$ ld.lld-10 a.o -u test -o a.lld.o 2>/dev/null && objdump -t a.lld.o | grep test
$
Right, thanks for the explanation. And what about reverting 93048c0 ? This basically just moves __memcat_p to lib/string.o, but it is still in lib-y. It feels a bit inconsistent to strip the symbol in one case and not the other no ?
I guess such a change is a very easy and reasonable fix.
Ack, as a one-off fix. But looking forwards stripping EXPORT_SYMBOL_GPL'd symbols is not something we can live with, so a more generic fix is still very much welcome IMO.
A more generic fix in the kernel build system would be something similar to my previous comment. This reuses a temporary linker script (initially used when linking lib/lib.a) when linking vmlinux. The change is quite ugly and intrusive.
patch exists: https://groups.google.com/forum/#!topic/clang-built-linux/vXGwrSRJniY
What's the status of this issue? Needs a Reviewed-By: ping:) ? If yes, where should it go?
Commented: https://groups.google.com/forum/#!topic/clang-built-linux/vXGwrSRJniY
If a.o has an undefined symbol foo, {ld.bfd,ld.lld} -shared a.o will place foo in .dynsym.
While GNU ld's EXTERN(foo) and -u foo behavior is consistent with regard to .symtab, it is not consistent with regard to .dynsym. ld.bfd -shared -u test a.o does not place test in .dynsym. So the symbol is different from a regular undefined symbol.
The behavior is understandable: if GNU ld places foo in .dynsym, at runtime, the shared object will likely have an unresolved reference at run time and will thus be rejected by ld.so. So for .dynsym, GNU ld's behavior is actually similar to lld's: -u foo triggers linking of an archive.
Sorry for the long delay.
As I replied in LKML, moving memcat_p.o to obj-y is not a proper fix. It is hiding just one symptom.
Irrespective of the ld.lld issue, I have been thinking of getting rid of
the dummy lib-ksyms.o since it is ugly.
Many of lib-y objects contain EXPORT_SYMBOL.
When CONFIG_MODULES=y, we need to link them to vmlinux
with/without users of them in vmlinux.
So, this is a patch for simplification.
https://gist.github.com/masahir0y/de751187a32eef10822d91f48e60c863
This fixes LD=ld.lld builds for CONFIG_STM=m.
Thanks for the patch (commented)
submitted: https://lore.kernel.org/lkml/CAKwvOdksxVa=NGtyT3hsuHg6SJG4CbNWAepf+dxwVDC1+36zyw@mail.gmail.com/T/#md2a1140be178c25b6b58d6cb5a090abc7f02dbf3
Masahiro picked up his patch: https://lore.kernel.org/lkml/CAK7LNAT3SR3rc5F1BYA0=Wxp3PRbd+ueDZ-h_UzCj=9m8CLWLQ@mail.gmail.com/
Will close once it hits mainline.
Merged into mainline: https://git.kernel.org/linus/7273ad2b08f8ac9563579d16a3cf528857b26f49
With that, we can now link an allmodconfig x86_64 kernel with ld.lld!
congrats @masahir0y , there was a fair amount of cleanup of various architectures to get that landed.