This happens on x86 randconfigs, currently looking at linux-next-20201203:
arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at offset 0x3e
Example .config: https://pastebin.com/wwwhUL8L
I assume this happens with LLVM_IAS=1; that is how I reproduced it. Looks like in this case, some section symbols are not being output.
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 SECTION LOCAL DEFAULT 1 <---- missing with LLVM_IAS=1
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3 <---- missing with LLVM_IAS=1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 4 <---- missing with LLVM_IAS=1
4: 0000000000000000 24 FUNC GLOBAL DEFAULT 1 preempt_schedule_thunk
5: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND preempt_schedule
6: 0000000000000018 24 FUNC GLOBAL DEFAULT 1 preempt_schedule_notrace_
7: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND preempt_schedule_notrace
> Looks like in this case, some section symbols are not being output.
Yes, this is a known difference, and objtool has been patched for it in the past. It sounds like something may have changed in objtool so that it relies on section symbols again. But the warning is about a symbol for an instruction; I wouldn't expect section symbols to be related to instructions. See also: https://github.com/ClangBuiltLinux/linux/issues/669#issuecomment-534303910
cc @jcai19 @groeck looks like CrOS just hit this as well.
Yes. Unfortunately, it is all but impossible for "outsiders" to determine whether this is an LLVM assembler problem, an objtool problem, or maybe both.
In the bad case, arch/x86/entry/thunk_64.s does not define:
STATIC_JUMP_IF_TRUE
STATIC_JUMP_IF_FALSE
SET_NOFLUSH_BIT
ADJUST_KERNEL_CR3
SWITCH_TO_KERNEL_CR3
SWITCH_TO_USER_CR3_NOSTACK
SWITCH_TO_USER_CR3_STACK
SAVE_AND_SWITCH_TO_KERNEL_CR3
RESTORE_CR3
LOAD_CPU_AND_NODE_SEG_LIMIT
GET_PERCPU_BASE looks like it is just pcpu_unit_offsets(%rip), \reg. ___EXPORT_SYMBOL looks different. There are thunks for preempt_schedule_thunk and preempt_schedule_notrace_thunk, and a .L_restore symbol.
So CONFIG_PREEMPTION=y (bad), CONFIG_SMP=n (bad), CONFIG_PAGE_TABLE_ISOLATION=n (bad). But those alone on top of defconfig are not enough to repro. I'm having trouble pinning down the configs via tools/testing/ktest/config-bisect.pl.
$ clang -Wp,-MMD,arch/x86/entry/.thunk_64.o.d -nostdinc -isystem /android0/llvm-project/llvm/build/lib/clang/12.0.0/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -fmacro-prefix-map=./= -D__ASSEMBLY__ -fno-PIE -Werror=unknown-warning-option -m64 -DCONFIG_X86_X32_ABI -c -o arch/x86/entry/thunk_64.o arch/x86/entry/thunk_64.S
$ ./tools/objtool/objtool orc generate --no-fp --no-unreachable --retpoline arch/x86/entry/thunk_64.o
arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at offset 0x3e
$ llvm-objdump -Dr arch/x86/entry/thunk_64.o
0000000000000018 <preempt_schedule_notrace_thunk>:
...
3e: c3 retq
This still happens even with Josh's patch from https://groups.google.com/g/clang-built-linux/c/1C6YoJKBsQQ/m/a8IS1NjGAgAJ applied.
insn_to_reloc_sym_addend() calls find_symbol_containing twice; once with offset, once with offset - 1. Both calls return NULL in the bad config.
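For anyone following along, here is a rough Python model of that lookup. It is a sketch, not objtool's C code: the `Symbol` class and `reloc_target` helper are made-up names, the "prefer the section symbol" step is my understanding of the intent rather than a quote of the source, and I am assuming section index 1 is `.text`. The offsets and sizes come from the symbol table earlier in the thread.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Symbol:
    name: str
    offset: int
    size: int
    is_section: bool = False


def find_symbol_containing(symbols: List[Symbol], offset: int) -> Optional[Symbol]:
    """Return the non-section symbol whose [offset, offset + size) range covers `offset`."""
    for sym in symbols:
        if not sym.is_section and sym.offset <= offset < sym.offset + sym.size:
            return sym
    return None


def reloc_target(symbols: List[Symbol], offset: int) -> Optional[Tuple[str, int]]:
    """Pick a (symbol name, addend) pair that can refer to `offset`, or None."""
    # Assumption: when a section symbol exists, it can always be used as the base.
    for sym in symbols:
        if sym.is_section:
            return (sym.name, offset)
    # Otherwise try the containing symbol, then retry with offset - 1,
    # mirroring the two calls described above.
    sym = find_symbol_containing(symbols, offset)
    if sym is None:
        sym = find_symbol_containing(symbols, offset - 1)
    if sym is None:
        return None
    return (sym.name, offset - sym.offset)


# Both FUNC symbols end at 0x30, so 0x3e is outside of them.
FUNCS = [
    Symbol("preempt_schedule_thunk", 0x00, 24),
    Symbol("preempt_schedule_notrace_thunk", 0x18, 24),
]
WITH_SECTION_SYM = [Symbol(".text", 0, 0, is_section=True)] + FUNCS
WITHOUT_SECTION_SYM = FUNCS

print(reloc_target(WITH_SECTION_SYM, 0x3E))     # section symbol present
print(reloc_target(WITHOUT_SECTION_SYM, 0x3E))  # section symbol stripped
print(reloc_target(WITHOUT_SECTION_SYM, 0x20))  # inside a FUNC symbol
```

In this model the first call resolves against `.text`, the second finds nothing (which corresponds to the "missing symbol for insn" warning), and the third still resolves because 0x20 is inside a function symbol.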
Can somebody attach the .o file? Not sure why my mailer's rejecting it.
thunk_64.o.txt
cc @jpoimboe
The problem is this code segment at the bottom of arch/x86/entry/thunk_64.S. With the LLVM assembler stripping the .text section symbol, objtool has no way to reference this code when it generates ORC unwinder entries, because this code is outside of any ELF function.
SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
popq %r11
popq %r10
popq %r9
popq %r8
popq %rax
popq %rcx
popq %rdx
popq %rsi
popq %rdi
popq %rbp
ret
_ASM_NOKPROBE(.L_restore)
SYM_CODE_END(.L_restore)
I don't have access to this assembler at the moment, but is there a way to encourage it to not strip the ".text" section symbol in this file, either with a flag or some trick to create a section reference?
Something like the below would probably fix it, by forcing the code into an ELF symbol.
Still, it would be really nice to have a flag to turn off the section symbol stripping, as I don't think it has any benefits for the kernel (and it definitely has drawbacks for cases like this).
diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index ccd32877a3c4..4920037445fb 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -44,7 +44,7 @@ SYM_FUNC_END(\name)
#endif
#ifdef CONFIG_PREEMPTION
-SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
+SYM_CODE_START_GLOBAL_NOALIGN(__thunk_restore)
popq %r11
popq %r10
popq %r9
@@ -56,6 +56,6 @@ SYM_CODE_START_LOCAL_NOALIGN(.L_restore)
popq %rdi
popq %rbp
ret
- _ASM_NOKPROBE(.L_restore)
-SYM_CODE_END(.L_restore)
+ _ASM_NOKPROBE(__thunk_restore)
+SYM_CODE_END(__thunk_restore)
#endif
> I don't have access to this assembler at the moment, but is there a way to encourage it to not strip the ".text" section symbol in this file, either with a flag or some trick to create a section reference?
@MaskRay didn't we have a thread about this previously? I don't think adding support for emitting symbols for sections would be too bad.
https://github.com/llvm/llvm-project/commit/a401eee22fabea8d214ab604037c937277477c38 seems relevant.
```diff
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index 10c61fc8b453..bb9089ebe85b 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -598,9 +598,6 @@ bool ELFWriter::isInSymtab(const MCAsmLayout &Layout, const MCSymbolELF &Symbol,
if (Symbol.isTemporary())
return false;
```
Failed Tests (17):
LLVM :: CodeGen/PowerPC/pcrel-tls-general-dynamic.ll
LLVM :: CodeGen/PowerPC/pcrel-tls-initial-exec.ll
LLVM :: ExecutionEngine/JITLink/X86/ELF_x86-64_relocations.s
LLVM :: MC/AArch64/size-directive.s
LLVM :: MC/ELF/alias.s
LLVM :: MC/ELF/cgprofile.s
LLVM :: MC/ELF/comdat.s
LLVM :: MC/ELF/empty.s
LLVM :: MC/ELF/reloc-same-name-section.s
LLVM :: MC/ELF/relocation-alias.s
LLVM :: MC/ELF/section-sym.s
LLVM :: MC/ELF/undef.s
LLVM :: tools/llvm-objdump/X86/demangle.s
LLVM :: tools/llvm-readobj/ELF/many-sections2.s
lld :: ELF/error-handling-script-linux.test
lld :: ELF/icf-safe.s
lld :: ELF/lto/thinlto-single-module.ll
WIP: https://reviews.llvm.org/D93783. Let's see what the feedback is. It fixes the reported issue.
Also submitted: https://lore.kernel.org/lkml/[email protected]/T/#u.
Replied on https://reviews.llvm.org/D93783#2470728
I understand it is a (continuous) pain for objtool but we probably should take it. The LLVM integrated assembler behavior is very much desired and emitting unneeded STT_SECTION can cause unneeded .o size bloat for many users. GNU as feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=27109 (I thought I had reported this earlier but it turned out that I hadn't. oops)
Fair enough, it does make sense for '-ffunction-sections' at least.
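To put a rough number on the size argument, here is a back-of-the-envelope sketch. It only counts `.symtab` entries (an `Elf64_Sym` is 24 bytes) and ignores everything else; the section count of 10,000 is a hypothetical figure for illustration, not a measurement from any real build.

```python
import struct

# Elf64_Sym layout: st_name (u32), st_info (u8), st_other (u8),
# st_shndx (u16), st_value (u64), st_size (u64).
ELF64_SYM_SIZE = struct.calcsize("<IBBHQQ")


def section_symbol_bytes(n_sections: int) -> int:
    """Bytes of .symtab spent on one STT_SECTION symbol per section."""
    return n_sections * ELF64_SYM_SIZE


# Hypothetical: -ffunction-sections producing 10,000 sections across a build.
print(ELF64_SYM_SIZE)
print(section_symbol_bytes(10_000))
```

With one section per function, the per-section cost is paid once per function, which is why the saving matters more for `-ffunction-sections` builds than for a small hand-written assembly file like thunk_64.S.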
I see a second file produce a similar warning:
arch/x86/kernel/ftrace_64.o: warning: objtool: missing symbol for insn at offset 0x16
HJL posted a patch for GAS to omit STT_SECTION symbols as well: https://sourceware.org/pipermail/binutils/2020-December/114671.html
> I see a second file produce a similar warning:
> arch/x86/kernel/ftrace_64.o: warning: objtool: missing symbol for insn at offset 0x16
This one's slightly different.
Symbol table '.symtab' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 NOTYPE LOCAL DEFAULT 4 __ksym_marker___[...]
2: 000000000000000e 0 NOTYPE LOCAL DEFAULT 2 fgraph_trace
3: 000000000000000f 0 NOTYPE LOCAL DEFAULT 2 trace
4: 0000000000000000 165 FUNC GLOBAL DEFAULT 2 __fentry__
5: 000000000000000e 0 NOTYPE GLOBAL DEFAULT 2 ftrace_stub
6: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND ftrace_trace_function
The instruction at 0x16 is inside the __fentry__ function, but objtool can't figure that out because its red-black tree of symbols is sorted by offset and it gets confused by the fact that the NOTYPE symbols overlap the function.
This will probably need an objtool fix.
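To make the overlap problem concrete, here is a small Python sketch. A sorted list with a binary search stands in for objtool's red-black tree, so the exact path through the real tree will differ; `Sym` and `find_containing` are illustrative names, not objtool's. The offsets and sizes are those of the section 2 symbols listed above. Skipping zero-length symbols is shown only as one possible way out, not as a description of the eventual fix.

```python
from typing import List, NamedTuple, Optional


class Sym(NamedTuple):
    name: str
    offset: int
    size: int


def find_containing(syms: List[Sym], offset: int) -> Optional[Sym]:
    """Binary search over symbols sorted by offset (larger symbols first on ties)."""
    ordered = sorted(syms, key=lambda s: (s.offset, -s.size))
    lo, hi = 0, len(ordered)
    while lo < hi:
        mid = (lo + hi) // 2
        s = ordered[mid]
        if offset < s.offset:
            hi = mid          # target is to the left
        elif offset >= s.offset + s.size:
            lo = mid + 1      # target is to the right; a zero-size symbol always lands here
        else:
            return s
    return None


SECTION_2 = [
    Sym("fgraph_trace", 0x0E, 0),
    Sym("trace", 0x0F, 0),
    Sym("__fentry__", 0x00, 165),
    Sym("ftrace_stub", 0x0E, 0),
]

# The search probes a zero-length symbol at 0xe, moves right, and never
# comes back to __fentry__ even though it covers 0x16.
print(find_containing(SECTION_2, 0x16))

# Leaving zero-length symbols out of the search structure avoids the detour.
sized_only = [s for s in SECTION_2 if s.size > 0]
print(find_containing(sized_only, 0x16))
```

The point is that a search keyed on "is the offset before, inside, or after this symbol" assumes symbols do not overlap; once a zero-length symbol sits inside a function, the search can step past the function that actually contains the instruction.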
Fix for this 2nd issue posted:
https://lkml.kernel.org/r/9638ee49574226218d978ce7e26f7a107021f509.1609990368.git.jpoimboe@redhat.com
accepted: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/entry&id=bde718b7e154afc99e1956b18a848401ce8e1f8e
Looks like CrOS just hit this issue.