Linux: Linux v5.3-rc1+: bpf issue when linking with lld

Created on 26 Jul 2019 · 42 Comments · Source: ClangBuiltLinux/linux

Hi,

I see a problem in next-20190723 when linking with ld.lld.

After login I land in the maintenance mode as some systemd services failed to load:

  1. systemd-journald
  2. systemd-udevd
[Fri Jul 26 08:08:43 2019] systemd[453]: systemd-udevd.service: Failed to connect stdout to the journal socket, ignoring: Connection refused

This is not seen when I link with ld.bfd from Debian/buster.

In both cases I use clang-9.

$ clang-9 --version
ClangBuiltLinux clang version 9.0.0 (git://github.com/llvm/llvm-project 1e6b12dd3158a554c63d332df6b25f407eed9556) (based on LLVM 9.0.0)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/sdi/src/llvm-toolchain/install/bin

Applied patches (just for the record):

drm/i915: Remove redundant user_access_end() from __copy_from_user() error path
objtool: Improve UACCESS coverage
xen/trace: avoid clang warning on function pointers
drm/i915: Fix i915_gemfs_init() NULL dereference

Excerpt from my dmesg-log:

[Fri Jul 26 08:08:42 2019] BUG: unable to handle page fault for address: ffffffff85403370
[Fri Jul 26 08:08:42 2019] #PF: supervisor read access in kernel mode
[Fri Jul 26 08:08:42 2019] #PF: error_code(0x0000) - not-present page
[Fri Jul 26 08:08:42 2019] PGD 7620e067 P4D 7620e067 PUD 7620f063 PMD 44fe85063 PTE 800fffff8a3fc062
[Fri Jul 26 08:08:42 2019] Oops: 0000 [#1] SMP PTI 
[Fri Jul 26 08:08:42 2019] CPU: 2 PID: 417 Comm: (journald) Not tainted 5.3.0-rc1-5-amd64-cbl-asmgoto #5~buster+dileks1
[Fri Jul 26 08:08:42 2019] Hardware name: LENOVO 20HDCTO1WW/20HDCTO1WW, BIOS N1QET83W (1.58 ) 04/18/2019
[Fri Jul 26 08:08:42 2019] RIP: 0010:___bpf_prog_run+0x40/0x14f0
[Fri Jul 26 08:08:42 2019] Code: f3 eb 24 48 83 f8 38 0f 84 a9 0c 00 00 48 83 f8 39 0f 85 8a 14 00 00 0f 1f 00 48 0f bf 43 02 48 8d 1c c3 48 83 c3 08 0f b6 33 <48> 8b 04 f5 10 2e 40 85 48 83 f8 3b 7f 62 48 83 f8 1e 0f 8f c8 00
[Fri Jul 26 08:08:42 2019] RSP: 0018:ffff992ec028fcb8 EFLAGS: 00010246 
[Fri Jul 26 08:08:42 2019] RAX: ffff992ec028fd60 RBX: ffff992ec00e9038 RCX: 0000000000000002
[Fri Jul 26 08:08:42 2019] RDX: ffff992ec028fd40 RSI: 00000000000000ac RDI: ffff992ec028fce0
[Fri Jul 26 08:08:42 2019] RBP: ffff992ec028fcd0 R08: 0000000000000000 R09: ffff992ec028ff58
[Fri Jul 26 08:08:42 2019] R10: 0000000000000000 R11: ffffffff849b8210 R12: 000000007fff0000
[Fri Jul 26 08:08:42 2019] R13: ffff992ec028feb8 R14: 0000000000000000 R15: ffff992ec028fce0
[Fri Jul 26 08:08:42 2019] FS:  00007f5d20f1d940(0000) GS:ffff8ba3d2500000(0000) knlGS:0000000000000000
[Fri Jul 26 08:08:42 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Jul 26 08:08:42 2019] CR2: ffffffff85403370 CR3: 0000000445b3e001 CR4: 00000000003606e0
[Fri Jul 26 08:08:42 2019] Call Trace:
[Fri Jul 26 08:08:42 2019]  __bpf_prog_run32+0x44/0x70
[Fri Jul 26 08:08:42 2019]  ? flush_tlb_func_common+0xd8/0x230
[Fri Jul 26 08:08:42 2019]  ? mem_cgroup_commit_charge+0x8c/0x120
[Fri Jul 26 08:08:42 2019]  ? wp_page_copy+0x464/0x7a0
[Fri Jul 26 08:08:42 2019]  seccomp_run_filters+0x54/0x110
[Fri Jul 26 08:08:42 2019]  __seccomp_filter+0xf7/0x6e0
[Fri Jul 26 08:08:42 2019]  ? do_wp_page+0x32b/0x5d0
[Fri Jul 26 08:08:42 2019]  ? handle_mm_fault+0x90d/0xbf0
[Fri Jul 26 08:08:42 2019]  syscall_trace_enter+0x182/0x290
[Fri Jul 26 08:08:42 2019]  do_syscall_64+0x30/0x90
[Fri Jul 26 08:08:42 2019]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Fri Jul 26 08:08:42 2019] RIP: 0033:0x7f5d220d7f59
[Fri Jul 26 08:08:42 2019] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
[Fri Jul 26 08:08:42 2019] RSP: 002b:00007ffd11332b48 EFLAGS: 00000246 ORIG_RAX: 000000000000013d
[Fri Jul 26 08:08:42 2019] RAX: ffffffffffffffda RBX: 000055bf8ab34010 RCX: 00007f5d220d7f59
[Fri Jul 26 08:08:42 2019] RDX: 000055bf8ab34010 RSI: 0000000000000000 RDI: 0000000000000001
[Fri Jul 26 08:08:42 2019] RBP: 000055bf8ab97fb0 R08: 000055bf8abbe180 R09: 00000000c000003e
[Fri Jul 26 08:08:42 2019] R10: 000055bf8abbe1e0 R11: 0000000000000246 R12: 00007ffd11332ba0
[Fri Jul 26 08:08:42 2019] R13: 00007ffd11332b98 R14: 00007f5d21f087f8 R15: 000000000000002c
[Fri Jul 26 08:08:42 2019] Modules linked in: i2c_dev parport_pc sunrpc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic mbcache crc16 jbd2 btrfs zstd_decompress zstd_compress algif_skcipher af_alg sd_mod dm_crypt dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 uas raid0 usb_storage multipath linear scsi_mod md_mod hid_cherry hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 i915 glue_helper crypto_simd nvme i2c_algo_bit cryptd psmouse xhci_pci drm_kms_helper e1000e i2c_i801 xhci_hcd intel_lpss_pci nvme_core intel_lpss drm usbcore thermal wmi video button
[Fri Jul 26 08:08:42 2019] CR2: ffffffff85403370
[Fri Jul 26 08:08:42 2019] ---[ end trace 867b35c7d6c6705a ]---
[Fri Jul 26 08:08:42 2019] RIP: 0010:___bpf_prog_run+0x40/0x14f0
[Fri Jul 26 08:08:42 2019] Code: f3 eb 24 48 83 f8 38 0f 84 a9 0c 00 00 48 83 f8 39 0f 85 8a 14 00 00 0f 1f 00 48 0f bf 43 02 48 8d 1c c3 48 83 c3 08 0f b6 33 <48> 8b 04 f5 10 2e 40 85 48 83 f8 3b 7f 62 48 83 f8 1e 0f 8f c8 00
[Fri Jul 26 08:08:42 2019] RSP: 0018:ffff992ec028fcb8 EFLAGS: 00010246
[Fri Jul 26 08:08:42 2019] RAX: ffff992ec028fd60 RBX: ffff992ec00e9038 RCX: 0000000000000002
[Fri Jul 26 08:08:42 2019] RDX: ffff992ec028fd40 RSI: 00000000000000ac RDI: ffff992ec028fce0
[Fri Jul 26 08:08:42 2019] RBP: ffff992ec028fcd0 R08: 0000000000000000 R09: ffff992ec028ff58
[Fri Jul 26 08:08:42 2019] R10: 0000000000000000 R11: ffffffff849b8210 R12: 000000007fff0000
[Fri Jul 26 08:08:42 2019] R13: ffff992ec028feb8 R14: 0000000000000000 R15: ffff992ec028fce0
[Fri Jul 26 08:08:42 2019] FS:  00007f5d20f1d940(0000) GS:ffff8ba3d2500000(0000) knlGS:0000000000000000
[Fri Jul 26 08:08:42 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Jul 26 08:08:42 2019] CR2: ffffffff85403370 CR3: 0000000445b3e001 CR4: 00000000003606e0

Inspired by #282 ("CONFIG_SECCOMP panic with LLD"), I set CONFIG_SECCOMP=n and now see a different call trace.
Do you want to see that one?

What files do you need?

  1. vmlinux?
  2. kernel/seccomp.o?
  3. bpf?

@kees
Can you explain how bpf and seccomp are related?

I tried some kernel boot parameters:

  1. nokaslr
  2. mitigations=off
  3. apparmor=off (default enabled in Debian/buster)
  4. hardened_usercopy=off

Any other parameters I can test?

I will test in QEMU, too.

Attached are my kernel-config and dmesg-log.

config-5.3.0-rc1-5-amd64-cbl-asmgoto.txt
dmesg_5.3.0-rc1-5-amd64-cbl-asmgoto.txt

[ARCH] x86_64 [BUG] linux [FIXED][LINUX] 5.3 [TOOL] lld

All 42 comments

clang-9 and ld.lld-9 with CONFIG_SECCOMP=n:

[Fri Jul 26 07:41:30 2019] BUG: unable to handle page fault for address: ffffffffbac03370
[Fri Jul 26 07:41:30 2019] RIP: 0010:___bpf_prog_run+0x40/0x14f0
[Fri Jul 26 07:41:30 2019] #PF: supervisor read access in kernel mode
[Fri Jul 26 07:41:30 2019] Code: f3 eb 24 48 83 f8 38 0f 84 a9 0c 00 00 48 83 f8 39 0f 85 8a 14 00 00 0f 1f 00 48 0f bf 43 02 48 8d 1c c3 48 83 c3 08 0f b6 33 <48> 8b 04 f5 10 2e c0 ba 48 83 f8 3b 7f 62 48 83 f8 1e 0f 8f c8 00
[Fri Jul 26 07:41:30 2019] #PF: error_code(0x0000) - not-present page
[Fri Jul 26 07:41:30 2019] RSP: 0018:ffffa8380025fa88 EFLAGS: 00010246
[Fri Jul 26 07:41:30 2019] PGD 21140e067
[Fri Jul 26 07:41:30 2019] P4D 21140e067
[Fri Jul 26 07:41:30 2019] RAX: ffffa8380025fb30 RBX: ffffa838000d1038 RCX: 0000000000000000
[Fri Jul 26 07:41:30 2019] PUD 21140f063
[Fri Jul 26 07:41:30 2019] RDX: ffffa8380025fb10 RSI: 00000000000000ac RDI: ffffa8380025fab0
[Fri Jul 26 07:41:30 2019] PMD 450828063
[Fri Jul 26 07:41:30 2019] RBP: ffffa8380025faa0 R08: ffff8c7ac5888e00 R09: 0000000000000000
[Fri Jul 26 07:41:30 2019] PTE 800ffffdef1fc062
[Fri Jul 26 07:41:30 2019] R10: ffff8c7acfe2e400 R11: ffffffffba1b5de0 R12: 0000000000000000
[Fri Jul 26 07:41:30 2019] R13: ffffa838000d1000 R14: 0000000000000000 R15: ffffa8380025fab0
[Fri Jul 26 07:41:30 2019] Oops: 0000 [#16] SMP PTI
[Fri Jul 26 07:41:30 2019] FS:  00007f80f0e98d40(0000) GS:ffff8c7ad2480000(0000) knlGS:0000000000000000
[Fri Jul 26 07:41:30 2019] CPU: 3 PID: 453 Comm: systemd-udevd Tainted: G      D           5.3.0-rc1-6-amd64-cbl-asmgoto #6~buster+dileks1
[Fri Jul 26 07:41:30 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Jul 26 07:41:30 2019] Hardware name: LENOVO 20HDCTO1WW/20HDCTO1WW, BIOS N1QET83W (1.58 ) 04/18/2019
[Fri Jul 26 07:41:30 2019] CR2: ffffffffbac03370 CR3: 00000004494ce005 CR4: 00000000003606e0
[Fri Jul 26 07:41:30 2019] RIP: 0010:___bpf_prog_run+0x40/0x14f0
[Fri Jul 26 07:41:30 2019] Code: f3 eb 24 48 83 f8 38 0f 84 a9 0c 00 00 48 83 f8 39 0f 85 8a 14 00 00 0f 1f 00 48 0f bf 43 02 48 8d 1c c3 48 83 c3 08 0f b6 33 <48> 8b 04 f5 10 2e c0 ba 48 83 f8 3b 7f 62 48 83 f8 1e 0f 8f c8 00
[Fri Jul 26 07:41:30 2019] RSP: 0018:ffffa83800423a88 EFLAGS: 00010246
[Fri Jul 26 07:41:30 2019] RAX: ffffa83800423b30 RBX: ffffa838000d1038 RCX: 0000000000000000
[Fri Jul 26 07:41:30 2019] RDX: ffffa83800423b10 RSI: 00000000000000ac RDI: ffffa83800423ab0
[Fri Jul 26 07:41:30 2019] RBP: ffffa83800423aa0 R08: ffff8c7ac496b800 R09: 0000000000000000
[Fri Jul 26 07:41:30 2019] R10: ffff8c7acb3ff500 R11: ffffffffba1b5de0 R12: 0000000000000000
[Fri Jul 26 07:41:30 2019] R13: ffffa838000d1000 R14: 0000000000000000 R15: ffffa83800423ab0
[Fri Jul 26 07:41:30 2019] FS:  00007f80f0e98d40(0000) GS:ffff8c7ad2580000(0000) knlGS:0000000000000000
[Fri Jul 26 07:41:30 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Jul 26 07:41:30 2019] CR2: ffffffffbac03370 CR3: 000000044673c005 CR4: 00000000003606e0
[Fri Jul 26 07:41:30 2019] Call Trace:
[Fri Jul 26 07:41:30 2019]  __bpf_prog_run32+0x44/0x70
[Fri Jul 26 07:41:30 2019]  ? security_sock_rcv_skb+0x3f/0x60
[Fri Jul 26 07:41:30 2019]  sk_filter_trim_cap+0xe4/0x220
[Fri Jul 26 07:41:30 2019]  ? __skb_clone+0x2e/0x100
[Fri Jul 26 07:41:30 2019]  netlink_broadcast_filtered+0x2df/0x4f0
[Fri Jul 26 07:41:30 2019]  netlink_sendmsg+0x34f/0x3c0
[Fri Jul 26 07:41:30 2019]  ___sys_sendmsg+0x315/0x330
[Fri Jul 26 07:41:30 2019]  ? alloc_set_pte+0x17f/0x650
[Fri Jul 26 07:41:30 2019]  ? filemap_map_pages+0xa2/0x470
[Fri Jul 26 07:41:30 2019]  ? do_read_fault+0x104/0x2a0
[Fri Jul 26 07:41:30 2019]  ? handle_mm_fault+0x768/0xbf0
[Fri Jul 26 07:41:30 2019]  __x64_sys_sendmsg+0x97/0xe0
[Fri Jul 26 07:41:30 2019]  do_syscall_64+0x59/0x90
[Fri Jul 26 07:41:30 2019]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Fri Jul 26 07:41:30 2019] RIP: 0033:0x7f80f1689914
[Fri Jul 26 07:41:30 2019] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 48 8d 05 e9 5d 0c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 41 89 d4 55 48 89 f5 53
[Fri Jul 26 07:41:30 2019] RSP: 002b:00007fffc5743008 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[Fri Jul 26 07:41:30 2019] RAX: ffffffffffffffda RBX: 000055a1947c0360 RCX: 00007f80f1689914
[Fri Jul 26 07:41:30 2019] RDX: 0000000000000000 RSI: 00007fffc5743030 RDI: 000000000000000e
[Fri Jul 26 07:41:30 2019] RBP: 000055a1947c0820 R08: 000000000000000f R09: 000055a1947758f0
[Fri Jul 26 07:41:30 2019] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[Fri Jul 26 07:41:30 2019] R13: 0000000000000000 R14: 000000000000008a R15: 00007fffc5743020
[Fri Jul 26 07:41:30 2019] Modules linked in: i2c_dev parport_pc nfsd ppdev auth_rpcgss nfs_acl lp lockd parport grace sunrpc efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic mbcache crc16 jbd2 btrfs zstd_decompress zstd_compress algif_skcipher af_alg sd_mod dm_crypt dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 uas raid0 usb_storage multipath linear scsi_mod md_mod hid_cherry hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel xhci_pci nvme i915 aesni_intel xhci_hcd aes_x86_64 glue_helper crypto_simd i2c_algo_bit cryptd e1000e psmouse drm_kms_helper i2c_i801 nvme_core intel_lpss_pci usbcore intel_lpss drm thermal wmi video button
[Fri Jul 26 07:41:30 2019] CR2: ffffffffbac03370
[Fri Jul 26 07:41:30 2019] ---[ end trace 5096168d3266d949 ]---

Attached are my kernel-config and dmesg-log.

config-5.3.0-rc1-6-amd64-cbl-asmgoto.txt
dmesg_5.3.0-rc1-6-amd64-cbl-asmgoto.txt

From [1]:

It sounds like clang miscompiles interpreter.
modprobe test_bpf
should be able to point out which part of interpreter is broken.

BROKEN: test_bpf: #294 BPF_MAXINSNS: Jump, gap, jump, ... jited:0

  • Sedat -

Steps to reproduce:

# sysctl -n net.core.bpf_jit_enable
1

# modprobe -v test_bpf

For more details, see the thread on the netdev ML [2].

@nickdesaulniers @jpoimboe
Looking at "bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()" [3]

Can we have something similar for clang?

[1] https://marc.info/?l=linux-netdev&m=156420984607234&w=2
[2] https://marc.info/?t=156412966200001&r=1&w=2
[3] https://git.kernel.org/linus/3193c0836f203a91bef96d88c64cccf0be090d9c

@MaskRay

With clang-9 + ld.bfd I am not seeing this issue.

So playing with clang-9 kbuild flags does not make much sense to me.

Do you have an idea where to look?

From [1]:

I tried the following, hoping to turn off "global common subexpression elimination":

index 383c87300b0d..92f934a1e9ff 100644
--- a/arch/x86/net/Makefile
+++ b/arch/x86/net/Makefile
@@ -3,6 +3,8 @@
 # Arch-specific network modules
 #

+KBUILD_CFLAGS += -O0
+
 ifeq ($(CONFIG_X86_32),y)
         obj-$(CONFIG_BPF_JIT) += bpf_jit_comp32.o
 else

Still see...

BROKEN: test_bpf: #294 BPF_MAXINSNS: Jump, gap, jump, ... jited:0

[1] https://marc.info/?l=linux-netdev&m=156421540608064&w=2

Looking at "bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()" [3]

Can we have something similar for clang?

Not sure if clang has that optimization, but either way it shouldn't cause a panic. I disabled that option in that function in GCC because it was producing code which objtool couldn't understand.

@MaskRay @GeorgiiR

I switched to Linux v5.3-rc2 and the issue still remains.

My clang-9 and ld.lld are from llvm-project.git#release/9.x Git branch:

$ clang-9 --version
ClangBuiltLinux clang version 9.0.0 (git://github.com/llvm/llvm-project 1634b4bc934d67cb5fa356a925ba8efca2259f12) (based on LLVM 9.0.0)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/sdi/src/llvm-toolchain/install/bin

When linking with ld.bfd I do not see the issue and can boot normally on bare metal.

I ran objdump on vmlinux.o for both builds and generated a diff like this:

$ objdump -M intel -d linux.clang9-bfd/vmlinux.o > vmlinux_o_clang9-bfd.txt
$ objdump -M intel -d linux.clang9-lld/vmlinux.o > vmlinux_o_clang9-lld.txt
$ diff -uprN vmlinux_o_clang9-bfd.txt vmlinux_o_clang9-lld.txt > vmlinux_o.diff

This is an example:

 0000000000082600 <bpf_int_jit_compile>:
    82600:      e8 00 00 00 00          call   82605 <bpf_int_jit_compile+0x5>
@@ -148773,9 +152339,16 @@ Disassembly of section .text:
    8431b:      be cc 00 00 00          mov    esi,0xcc
    84320:      5d                      pop    rbp
    84321:      e9 00 00 00 00          jmp    84326 <jit_fill_hole+0x16>
-   84326:      66 90                   xchg   ax,ax
-   84328:      0f 1f 84 00 00 00 00    nop    DWORD PTR [rax+rax*1+0x0]
-   8432f:      00 
+   84326:      cc                      int3   
+   84327:      cc                      int3   
+   84328:      cc                      int3   
+   84329:      cc                      int3   
+   8432a:      cc                      int3   
+   8432b:      cc                      int3   
+   8432c:      cc                      int3   
+   8432d:      cc                      int3   
+   8432e:      cc                      int3   
+   8432f:      cc                      int3

I see a lot of nop and xchg (ld.bfd) lines turned into cc int3 (ld.lld).

I looked through recent changes in lld tree and wanted to run some LLD tests, but was not able to include and run the tests.

I opened tc-build issue #42 [1].

[1] https://github.com/ClangBuiltLinux/tc-build/issues/42

I'm going to try to reproduce this in a VM as I don't have a local x86 machine to test this on.

I am curious though, did this first start on next-20190723 or does it happen on earlier versions?

 0000000000082600 <bpf_int_jit_compile>:
    82600:      e8 00 00 00 00          call   82605 <bpf_int_jit_compile+0x5>
@@ -148773,9 +152339,16 @@ Disassembly of section .text:
    8431b:      be cc 00 00 00          mov    esi,0xcc
    84320:      5d                      pop    rbp
    84321:      e9 00 00 00 00          jmp    84326 <jit_fill_hole+0x16>
-   84326:      66 90                   xchg   ax,ax
-   84328:      0f 1f 84 00 00 00 00    nop    DWORD PTR [rax+rax*1+0x0]
-   8432f:      00 
+   84326:      cc                      int3   
+   84327:      cc                      int3   
+   84328:      cc                      int3   
+   84329:      cc                      int3   
+   8432a:      cc                      int3   
+   8432b:      cc                      int3   
+   8432c:      cc                      int3   
+   8432d:      cc                      int3   
+   8432e:      cc                      int3   
+   8432f:      cc                      int3

In an SHF_EXECINSTR output section, if there is a gap between two adjacent input sections, ld.bfd fills the gap with nops while lld uses trap instructions (0xcc on x86):

// binutils-gdb/bfd/cpu-i386.c: bfd_arch_i386_fill
  /* xchg %ax,%ax */
  static const char nop_2[] = { 0x66, 0x90 };

  /* nopl 0L(%[re]ax,%[re]ax,1) */
  static const char nop_8[] =
    { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00};
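To tell the two fill styles apart when looking at raw bytes, a small helper along these lines could be used. This is only a sketch: `classify_padding` and the `pad_kind` enum are hypothetical names made up for illustration, and the only thing it checks is whether a gap consists entirely of 0xcc (int3) bytes, as in the lld side of the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: what kind of bytes fill a gap. */
enum pad_kind { PAD_EMPTY, PAD_TRAP, PAD_OTHER };

/*
 * Classify a padding run between two input sections.
 * PAD_TRAP:  every byte is 0xcc (int3), the lld-style fill seen in the diff.
 * PAD_OTHER: at least one byte is not 0xcc, e.g. the multi-byte nops
 *            emitted by ld.bfd (66 90, 0f 1f 84 00 ...).
 */
enum pad_kind classify_padding(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len == 0)
        return PAD_EMPTY;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xcc)
            return PAD_OTHER;
    }
    return PAD_TRAP;
}
```

Fed the ten bytes at offsets 84326..8432f from each build, this would report PAD_OTHER for the ld.bfd object and PAD_TRAP for the ld.lld object.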

Does the bfd code somehow assume the gaps consist of nops?

Can you upload linux.clang9-{bfd,lld}/vmlinux.o somewhere or make it easier to reproduce 🙂?

I downgraded to Linux v5.3-rc2 on Monday to get a base for a Git bisect.

The issue still remains with clang-9 and lld from upgraded LLVM toolchain
version 9.0.0-rc1.

The last known good was Linux v5.1.18.
I have to check which versions of clang-9 and lld I used back then (AFAICS they were not built with tc-build in those days).
Yes, I was able to boot on bare metal.

Alternatively, I could rebuild the last known good Linux with LLVM toolchain v9.0.0-rc1.

@MaskRay @nathanchance

This is with Linux v5.3-rc2 and these two patches from @jpoimboe and @arndb:

drm/i915: Remove redundant user_access_end() from __copy_from_user() error path
xen/trace: avoid clang warning on function pointers

LLVM toolchain versions:

[ clang9-bfd ]

clang-9: 1634b4bc934d67cb5fa356a925ba8efca2259f12 (from llvm-project.git#release/9.x Git branch)
ld.bfd:  2.31.1-16 (from binutils Debian/buster aka version 10 package)
ld.lld:  see clang-9

Looked into the archived build-dirs:

$ cd linux.clang9-bfd
$ readelf --string-dump .comment vmlinux

String dump of section '.comment':
  [     0]  ClangBuiltLinux clang version 9.0.0 (git://github.com/llvm/llvm-project 1634b4bc934d67cb5fa356a925ba8efca2259f12) (based on LLVM 9.0.0)


$ cd linux.clang9-lld
$ readelf --string-dump .comment vmlinux

String dump of section '.comment':
  [     0]  ClangBuiltLinux clang version 9.0.0 (git://github.com/llvm/llvm-project 1634b4bc934d67cb5fa356a925ba8efca2259f12) (based on LLVM 9.0.0)
  [    88]  Linker: LLD 9.0.0 (git://github.com/llvm/llvm-project 1634b4bc934d67cb5fa356a925ba8efca2259f12)

UPDATE-1: Both clang-9 versions are identical.
UPDATE-2: I upgraded my llvm-toolchain to version v9.0.0-rc1 - nope.

[ Attachments ]

MD5SUM.txt
vmlinux_o_clang9-bfd.txt.gz
vmlinux_o_clang9-lld.txt.gz

This is the first time I hoped it does not work :-).

I took the archived linux-5.1.y kernel-config and did:

$ scripts/config -d DRM_AMDGPU

I can boot on bare metal:

$ cat /proc/version 
Linux version 5.1.18-1-amd64-cbl-asmgoto ([email protected]@iniza) (ClangBuiltLinux clang version 9.0.0 (git://github.com/llvm/llvm-project 6aa75a25bdeea9cdc4b04cdd91e82e680444bf4b) (based on LLVM 9.0.0)) #1~buster+dileks1 SMP 2019-07-31

$ readelf --string-dump .comment vmlinux

String dump of section '.comment':
  [     0]  ClangBuiltLinux clang version 9.0.0 (git://github.com/llvm/llvm-project 6aa75a25bdeea9cdc4b04cdd91e82e680444bf4b) (based on LLVM 9.0.0)
  [    89]  Linker: LLD 9.0.0 (git://github.com/llvm/llvm-project 6aa75a25bdeea9cdc4b04cdd91e82e680444bf4b)

Loading module test_bpf is OK:

# modprobe -v test_bpf 
insmod /lib/modules/5.1.18-1-amd64-cbl-asmgoto/kernel/lib/test_bpf.ko

I will try latest Linux v5.2.5 and report later.

[ Attachments ]

config-5.1.18-1-amd64-cbl-asmgoto.txt
dmesg_5.1.18-1-amd64-cbl-asmgoto_modprobe-test_bpf.txt

All good with Linux v5.2.5.

I did a very time intensive git-bisect:

$ git bisect log
git bisect start
# good: [2519374d2a6b8aa5d395393f21e74232409c2e82] Linux 5.2.5
git bisect good 2519374d2a6b8aa5d395393f21e74232409c2e82
# bad: [609488bc979f99f805f34e9a32c1e3b71179d10b] Linux 5.3-rc2
git bisect bad 609488bc979f99f805f34e9a32c1e3b71179d10b
# good: [0ecfebd2b52404ae0c54a878c872bb93363ada36] Linux 5.2
git bisect good 0ecfebd2b52404ae0c54a878c872bb93363ada36
# good: [17a20acaf171124017f43bc70bb4d7ca88070659] Merge tag 'usb-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 17a20acaf171124017f43bc70bb4d7ca88070659
# good: [8de262531f5fbb7458463224a7587429800c24bf] Merge tag 'mfd-next-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
git bisect good 8de262531f5fbb7458463224a7587429800c24bf
# good: [5f4fc6d440d77a2cf74fe4ea56955674ac7e35e7] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect good 5f4fc6d440d77a2cf74fe4ea56955674ac7e35e7
# good: [af6af87d7e4ff67324425daa699b9cda32e3161d] Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect good af6af87d7e4ff67324425daa699b9cda32e3161d
# bad: [44b912cd0b55777796c5ae8ae857bd1d5ff83ed5] Merge tag 'for-linus-20190722' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
git bisect bad 44b912cd0b55777796c5ae8ae857bd1d5ff83ed5
# bad: [e6023adc5c6af79ac8ac5b17939f58091fa0d870] Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad e6023adc5c6af79ac8ac5b17939f58091fa0d870
# good: [168c79971b4a7be7011e73bf488b740a8e1135c8] Merge tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
git bisect good 168c79971b4a7be7011e73bf488b740a8e1135c8
# good: [07ab9d5bc53d7fe84047be1d403566123ab9cfaa] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 07ab9d5bc53d7fe84047be1d403566123ab9cfaa
# bad: [3193c0836f203a91bef96d88c64cccf0be090d9c] bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()
git bisect bad 3193c0836f203a91bef96d88c64cccf0be090d9c
# bad: [d99a6ce70ec6ed990b74bd4e34232fd830d20d27] x86/kvm: Fix fastop function ELF metadata
git bisect bad d99a6ce70ec6ed990b74bd4e34232fd830d20d27
# good: [cac9b9a4b08304f11daace03b8b48659355e44c1] stacktrace: Force USER_DS for stack_trace_save_user()
git bisect good cac9b9a4b08304f11daace03b8b48659355e44c1
# bad: [e55a73251da335873a6e87d68fb17e5aabb8978e] bpf: Fix ORC unwinding in non-JIT BPF code
git bisect bad e55a73251da335873a6e87d68fb17e5aabb8978e
# good: [87b512def792579641499d9bef1d640994ea9c18] objtool: Add support for C jump tables
git bisect good 87b512def792579641499d9bef1d640994ea9c18
# first bad commit: [e55a73251da335873a6e87d68fb17e5aabb8978e] bpf: Fix ORC unwinding in non-JIT BPF code

@jpoimboe

Can you look at this?

e55a73251da335873a6e87d68fb17e5aabb8978e is the first bad commit
commit e55a73251da335873a6e87d68fb17e5aabb8978e
Author: Josh Poimboeuf <[email protected]>
Date:   Thu Jun 27 20:50:47 2019 -0500

    bpf: Fix ORC unwinding in non-JIT BPF code

    Objtool previously ignored ___bpf_prog_run() because it didn't understand
    the jump table.  This resulted in the ORC unwinder not being able to unwind
    through non-JIT BPF code.

    Now that objtool knows how to read jump tables, remove the whitelist and
    annotate the jump table so objtool can recognize it.

    Also add an additional "const" to the jump table definition to clarify that
    the text pointers are constant.  Otherwise GCC sets the section writable
    flag and the assembler spits out warnings.

    Fixes: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
    Reported-by: Song Liu <[email protected]>
    Signed-off-by: Josh Poimboeuf <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Acked-by: Alexei Starovoitov <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Cc: Kairui Song <[email protected]>
    Cc: Steven Rostedt <[email protected]>
    Cc: Borislav Petkov <[email protected]>
    Cc: Daniel Borkmann <[email protected]>
    Link: https://lkml.kernel.org/r/881939122b88f32be4c374d248c09d7527a87e35.1561685471.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar <[email protected]>

:040000 040000 4735e9d14fa416c1c361ec3923440a3d586a627d 31de80b85c7b0292e47a719ecb6b1a451de2f8ef M      kernel

@jpoimboe @MaskRay

Attached is the object file with reverted:

commit e55a73251da3 "bpf: Fix ORC unwinding in non-JIT BPF code"

bpf-core_clang9-lld_reverted-e55a73251da3.o.gz

After reverting the above culprit commit I can boot into Linux v5.3-rc2 with clang-9 and lld with no issues.

cc @jpoimboe .

Thanks for reporting and finding the offending commit @dileks .

Otherwise GCC sets the section writable flag and the assembler spits out warnings.

So I'd be curious which section this symbol should be in, then we can check what's the difference between GCC and Clang.

@nickdesaulniers @jpoimboe

Who or where is this quote from?

To clarify:
Identical clang-9 snapshot version.
Changing from BFD to LLD shows this issue.

I am not saying that clang-9 is mis- or not mis-compiling.

The diff from culprit commit e55a73251da335873a6e87d68fb17e5aabb8978e:

--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1299,7 +1299,7 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
 {
 #define BPF_INSN_2_LBL(x, y)    [BPF_##x | BPF_##y] = &&x##_##y
 #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
-       static const void *jumptable[256] = {
+       static const void * const jumptable[256] __annotate_jump_table = {
                [0 ... 255] = &&default_label,
                /* Now overwrite non-defaults ... */
                BPF_INSN_MAP(BPF_INSN_2_LBL, BPF_INSN_3_LBL),
@@ -1558,7 +1558,6 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack)
                BUG_ON(1);
                return 0;
 }
-STACK_FRAME_NON_STANDARD(___bpf_prog_run); /* jump table */

 #define PROG_NAME(stack_size) __bpf_prog_run##stack_size
 #define DEFINE_BPF_PROG_RUN(stack_size) \
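For readers not familiar with the construct touched by this diff: the interpreter dispatches through a static table of label addresses (GNU C "labels as values"), indexed by the opcode byte. Below is a minimal userspace sketch of the same pattern. The opcodes, `struct toy_insn` and `toy_run` are invented for illustration and are not the kernel's; only the table shape (`[0 ... 255] = &&default_label` followed by overrides) mirrors the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes, not real BPF opcodes. */
enum { OP_ADD = 0x01, OP_SUB = 0x02, OP_EXIT = 0x03 };

struct toy_insn {
    uint8_t code;   /* opcode, used directly as jump table index */
    int32_t imm;    /* immediate operand */
};

/* Run a toy program; returns the accumulator, or -1 on an unknown opcode. */
int64_t toy_run(const struct toy_insn *insn)
{
    /* Same shape as the kernel table: default everywhere, then overrides. */
    static const void * const jumptable[256] = {
        [0 ... 255] = &&default_label,
        [OP_ADD]    = &&do_add,
        [OP_SUB]    = &&do_sub,
        [OP_EXIT]   = &&do_exit,
    };
    int64_t acc = 0;

    assert(insn != NULL);

#define DISPATCH() goto *jumptable[insn->code]
    DISPATCH();

do_add:
    acc += insn->imm;
    insn++;
    DISPATCH();
do_sub:
    acc -= insn->imm;
    insn++;
    DISPATCH();
do_exit:
    return acc;
default_label:
    return -1;
#undef DISPATCH
}
```

This needs GCC or clang, since both the range initializer and `&&label` are GNU extensions. The compiler may warn about the overridden initializers; the kernel disables that warning for core.o (see the `override-init` line in kernel/bpf/Makefile). Every dispatch is a load from the table followed by an indirect jump, so the table must be mapped at the address the code expects.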

The last thing I checked was to see if .rodata..c_jump_table section exists when linking with LLD.

Here is where __annotate_jump_table is defined (I have set CONFIG_STACK_VALIDATION=y):

[ include/linux/compiler.h ]

/* Unreachable code */
#ifdef CONFIG_STACK_VALIDATION
/*
 * These macros help objtool understand GCC code flow for unreachable code.
 * The __COUNTER__ based labels are a hack to make each instance of the macros
 * unique, to convince GCC not to merge duplicate inline asm statements.
 */
#define annotate_reachable() ({                                         \
        asm volatile("%c0:\n\t"                                         \
                     ".pushsection .discard.reachable\n\t"              \
                     ".long %c0b - .\n\t"                               \
                     ".popsection\n\t" : : "i" (__COUNTER__));          \
})
#define annotate_unreachable() ({                                       \
        asm volatile("%c0:\n\t"                                         \
                     ".pushsection .discard.unreachable\n\t"            \
                     ".long %c0b - .\n\t"                               \
                     ".popsection\n\t" : : "i" (__COUNTER__));          \
})
#define ASM_UNREACHABLE                                                 \
        "999:\n\t"                                                      \
        ".pushsection .discard.unreachable\n\t"                         \
        ".long 999b - .\n\t"                                            \
        ".popsection\n\t"

/* Annotate a C jump table to allow objtool to follow the code flow */
#define __annotate_jump_table __section(".rodata..c_jump_table") <--- XXX: See here

#else
#define annotate_reachable()
#define annotate_unreachable()
#define __annotate_jump_table
#endif

#ifndef ASM_UNREACHABLE
# define ASM_UNREACHABLE
#endif
#ifndef unreachable
# define unreachable() do {             \
        annotate_unreachable();         \
        __builtin_unreachable();        \
} while (0)
#endif <--- XXX: CONFIG_STACK_VALIDATION

Blame blame blame:

$ git blame include/linux/compiler.h | grep annotate_jump_table
87b512def7925 (Josh Poimboeuf          2019-06-27 20:50:46 -0500 121) #define __annotate_jump_table __section(".rodata..c_jump_table")
87b512def7925 (Josh Poimboeuf          2019-06-27 20:50:46 -0500 126) #define __annotate_jump_table

$ git log --oneline 87b512def7925 -1
87b512def792 (refs/bisect/good-87b512def792579641499d9bef1d640994ea9c18) objtool: Add support for C jump tables

Checking for this section in object-files:

$ ll bpf-core_clang9-**_o.txt
-rw-r--r-- 1 sdi sdi 6097712 Aug  7 00:45 bpf-core_clang9-bfd_o.txt
-rw-r--r-- 1 sdi sdi 6098622 Aug  7 00:46 bpf-core_clang9-lld_o.txt
-rw-r--r-- 1 sdi sdi 6067542 Aug  7 00:50 bpf-core_clang9-lld_reverted-e55a73251da3_o.txt

$ grep '.rodata..c_jump_table' bpf-core_clang9-**_o.txt
bpf-core_clang9-bfd_o.txt:Disassembly of section ".rodata..c_jump_table":
bpf-core_clang9-lld_o.txt:Disassembly of section ".rodata..c_jump_table":

Maybe I am looking at the wrong files or inspecting with the wrong tools and its options?
Maybe some post-processing eliminates sections?

BTW, I would like to extract a specific section from beginning to its end out of an object-file.

Peter Z. gave me this hint when checking for del_timer_sync in vmlinux:

objdump -D vmlinux | awk '/<[^>]*>:$/ { p=0; } /<del_timer_sync>:/ { p=1; } { if (p) print $0; }'

I would like to have this for .rodata..c_jump_table.

Looking at -fjump-tables/-fno-jump-tables CFLAGS:

[ arch/x86/Makefile ]

# Avoid indirect branches in kernel to deal with Spectre
ifdef CONFIG_RETPOLINE
  KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
  # Additionally, avoid generating expensive indirect jumps which
  # are subject to retpolines for small number of switch cases.
  # clang turns off jump table generation by default when under
  # retpoline builds, however, gcc does not for x86. This has
  # only been fixed starting from gcc stable version 8.4.0 and
  # onwards, but not for older ones. See gcc bug #86952.
  ifndef CONFIG_CC_IS_CLANG
    KBUILD_CFLAGS += $(call cc-option,-fno-jump-tables)
  endif
endif

For GCC yes, CLANG no.

Checking docs BPF and CLANG and -fno-jump-tables...

[ Documentation/bpf/bpf_devel_QA.rst ]

Q: In some cases clang flag ``-target bpf`` is used but in other cases the
default clang target, which matches the underlying architecture, is used.
What is the difference and when I should use which?

A: Although LLVM IR generation and optimization try to stay architecture
independent, ``-target <arch>`` still has some impact on generated code:
...
- The default target may turn a C switch statement into a switch table
  lookup and jump operation. Since the switch table is placed
  in the global readonly section, the bpf program will fail to load.
  The bpf target does not support switch table optimization.
  The clang option ``-fno-jump-tables`` can be used to disable
  switch table generation.

...thus I tried...

--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-y := core.o
+CFLAGS_core.o += $(call cc-option,-fno-jump-tables)
 CFLAGS_core.o += $(call cc-disable-warning, override-init)

 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o

NOPE.

So I'd be curious which section this symbol should be in, then we can check what's the difference between GCC and Clang.

The BPF jump table is placed in .rodata, IIRC. Sorry I don't have time to look deeper into this issue at the moment.

Sorry, should be ".rodata..c_jump_table" as @dileks mentioned. @dileks you can use readelf -a --wide core.o (or on vmlinux.o) to see all the sections (and a lot more interesting ELF data).

@jpoimboe

I have poor ELF knowledge, so I don't know what to look at ...

I did a readelf -a -W and files are attached.

readelf-aW_bpf-core_o_clang9-bfd.txt.gz
readelf-aW_bpf-core_o_clang9-lld.txt.gz

I dumped the .rodata..c_jump_table data; both objects have identical data. The section header entries are also identical. Both objects have identical code. Looking at the above panic, it's failing with

[Fri Jul 26 08:08:42 2019] BUG: unable to handle page fault for address: ffffffff85403370

at the following instruction:

    3d80:       48 8b 04 f5 00 00 00    mov    0x0(,%rsi,8),%rax
    3d87:       00
                        3d84: R_X86_64_32S      ".rodata..c_jump_table"

That instruction loads an entry from the jump table, indexed by %rsi. So apparently the jump table isn't getting linked or loaded where it's supposed to be.

It might be interesting to compare the vmlinux.o files, specifically:

objdump -s -j '".rodata..c_jump_table"' vmlinux.o
readelf -S vmlinux.o
objdump -dr vmlinux.o # (don't need the whole file, just care about ___bpf_prog_run())

If you could attach the before/after of those files, I can take a look.

In addition to vmlinux.o, vmlinux might be interesting as well:

readelf -S vmlinux
readelf -s vmlinux
objdump -dr vmlinux # ___bpf_prog_run only

I believe this is the problem. From readelf-S_vmlinux.txt:

There are 94 section headers, starting at offset 0x180e4660:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         ffffffff81000000  00200000
       0000000000c00f11  0000000000000000  AX       0     0     4096
  [ 2] .notes            NOTE             ffffffff81c00f14  00e00f14
       00000000000001fc  0000000000000000   A       0     0     4
  [ 3] __ex_table        PROGBITS         ffffffff81c01110  00e01110
       0000000000001cf8  0000000000000000   A       0     0     4
  [ 4] ".rodata..c_jump_ PROGBITS         ffffffff81c02e10  00e02e10
       0000000000000800  0000000000000000   A       0     0     16
  [ 5] .rodata           PROGBITS         ffffffff81e00000  01000000
       000000000031ffc2  0000000000000000 WAMS       0     0     4096
  [ 6] ".discard.address PROGBITS         ffffffff8211ffc8  0131ffc8
       0000000000013f88  0000000000000000  WA       0     0     8
  [ 7] .rodata1          PROGBITS         ffffffff82133f50  01333f50
       0000000000000000  0000000000000000  WA       0     0     1

The .rodata..c_jump_table section shouldn't be there. The linker should have put it in .rodata because the RO_DATA_SECTION macro in include/asm-generic/vmlinux.lds.h has:

                *(.rodata) *(.rodata.*)                                 \

And a similar issue exists for the .discard.addressable section. It should have been discarded before the final linking.

The problem is most likely caused by the fact that the section names have quotes in them.

Probable fix:

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f0fd5636fddb..a1955482e064 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -118,7 +118,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
        ".popsection\n\t"

 /* Annotate a C jump table to allow objtool to follow the code flow */
-#define __annotate_jump_table __section(".rodata..c_jump_table")
+#define __annotate_jump_table __section(.rodata..c_jump_table)

 #else
 #define annotate_reachable()

But this is a treewide issue. I guess GCC ignores the quotes and clang takes them literally.

I have re-added the parts of the reverted commit and removed quotes from two relevant section defines:

--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -118,7 +118,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
        ".popsection\n\t"

 /* Annotate a C jump table to allow objtool to follow the code flow */
-#define __annotate_jump_table __section(".rodata..c_jump_table")
+#define __annotate_jump_table __section(.rodata..c_jump_table)

 #else
 #define annotate_reachable()
@@ -298,7 +298,7 @@ unsigned long read_word_at_a_time(const void *addr)
  * visible to the compiler.
  */
 #define __ADDRESSABLE(sym) \
-       static void * __section(".discard.addressable") __used \
+       static void * __section(.discard.addressable) __used \
                __PASTE(__addressable_##sym, __LINE__) = (void *)&sym;

 /**
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1299,7 +1299,7 @@ static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u6
 {
 #define BPF_INSN_2_LBL(x, y)    [BPF_##x | BPF_##y] = &&x##_##y
 #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
-       static const void *jumptable[256] = {
+       static const void * const jumptable[256] __annotate_jump_table = {
                [0 ... 255] = &&default_label,
                /* Now overwrite non-defaults ... */
                BPF_INSN_MAP(BPF_INSN_2_LBL, BPF_INSN_3_LBL),
@@ -1558,7 +1558,6 @@ static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u6
                BUG_ON(1);
                return 0;
 }
-STACK_FRAME_NON_STANDARD(___bpf_prog_run); /* jump table */

 #define PROG_NAME(stack_size) __bpf_prog_run##stack_size
 #define DEFINE_BPF_PROG_RUN(stack_size) \

Still building... Will report later.

@jpoimboe @MaskRay @GeorgiiR @smithp35

Above snippet fixes the issue for me.

From attached file:

$ egrep 'rodata|discard' readelf-S_vmlinux.txt
  [ 4] .rodata           PROGBITS         ffffffff81e00000  01000000
  [ 5] .rodata1          PROGBITS         ffffffff820edfb2  012edfb2
  [19] __init_rodata     PROGBITS         ffffffff8213671a  0133671a
  [73] .rela.rodata      RELA             0000000000000000  17a49330

Will you cook up a patch, Josh?
Or do you wanna do a treewide fixup?

Any fix needed for LLD?

Thanks for your vital help!

readelf-S_vmlinux.txt.gz

I guess GCC ignores the quotes and clang takes them literally.

https://godbolt.org/z/IF4DC_

https://bugs.llvm.org/show_bug.cgi?id=42950

➜  kernel-all git:(master) ✗ ag __section\\\(\"
arch/s390/boot/startup.c
49:static struct diag210 _diag210_tmp_dma __section(".dma.data");

arch/arm64/kernel/smp_spin_table.c
22:volatile unsigned long __section(".mmuoff.data.read")

include/linux/srcutree.h
127:        __section("___srcu_struct_ptrs") = &name

include/linux/compiler.h
27:             __section("_ftrace_annotated_branch")   \
63:     __section("_ftrace_branch")     \
121:#define __annotate_jump_table __section(".rodata..c_jump_table")
158:    __section("___kentry" "+" #sym )            \
301:    static void * __section(".discard.addressable") __used \

include/linux/export.h
107:    static int __ksym_marker_##sym[0] __section(".discard.ksym") __used

Will you cook up a patch, Josh?
Or do you wanna do a treewide fixup?

How about if I break the above list into 5 patches and send them with @dileks's Reported-by tag and @jpoimboe's Suggested-by tag?

There are a lot more than that:

arch/arc/include/asm/linkage.h:#define __arcfp_code __attribute__((__section__(".text.arcfp")))
arch/arc/include/asm/linkage.h:#define __arcfp_code __attribute__((__section__(".text")))
arch/arc/include/asm/linkage.h:#define __arcfp_data __attribute__((__section__(".data.arcfp")))
arch/arc/include/asm/linkage.h:#define __arcfp_data __attribute__((__section__(".data")))
arch/arc/include/asm/mach_desc.h:__attribute__((__section__(".arch.info.init"))) = {    \
arch/arm/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/arm/include/asm/mach/arch.h: __attribute__((__section__(".arch.info.init"))) = {   \
arch/arm/include/asm/mach/arch.h: __attribute__((__section__(".arch.info.init"))) = {   \
arch/arm/include/asm/setup.h:#define __tag __used __attribute__((__section__(".taglist.init")))
arch/arm64/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/arm64/kernel/smp_spin_table.c:volatile unsigned long __section(".mmuoff.data.read")
arch/ia64/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/ia64/include/asm/machvec_init.h: struct ia64_machine_vector machvec_##name __attribute__ ((unused, __section__ (".machvec")))  \
arch/mips/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/parisc/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/parisc/include/asm/ldcw.h:# define __lock_aligned __attribute__((__section__(".data..lock_aligned")))
arch/parisc/kernel/ftrace.c:#define __hot __attribute__ ((__section__ (".text.hot")))
arch/parisc/mm/init.c:pmd_t pmd0[PTRS_PER_PMD] __attribute__ ((__section__ (".data..vm0.pmd"), aligned(PAGE_SIZE)));
arch/parisc/mm/init.c:pgd_t swapper_pg_dir[PTRS_PER_PGD] __attribute__ ((__section__ (".data..vm0.pgd"), aligned(PAGE_SIZE)));
arch/parisc/mm/init.c:pte_t pg0[PT_INITIAL * PTRS_PER_PTE] __attribute__ ((__section__ (".data..vm0.pte"), aligned(PAGE_SIZE)));
arch/powerpc/boot/main.c:   __attribute__((__section__("__builtin_cmdline")));
arch/powerpc/boot/ps3.c:    __attribute__((__section__("__builtin_cmdline")));
arch/powerpc/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/powerpc/include/asm/machdep.h:#define __machine_desc __attribute__ ((__section__ (".machine.desc")))
arch/powerpc/kernel/btext.c:#define __force_data __attribute__((__section__(".data")))
arch/s390/boot/startup.c:static struct diag210 _diag210_tmp_dma __section(".dma.data");
arch/sh/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/sparc/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/sparc/kernel/btext.c:#define __force_data __attribute__((__section__(".data")))
arch/um/include/shared/init.h:  __attribute__((__section__(".initcall" level ".init"))) = fn
arch/um/kernel/skas/clone.c:void __attribute__ ((__section__ (".__syscall_stub")))
arch/um/kernel/um_arch.c:   __attribute__((__section__(".data..init_irqstack"))) =
arch/x86/include/asm/cache.h:#define __read_mostly __attribute__((__section__(".data..read_mostly")))
arch/x86/include/asm/intel-mid.h:   __attribute__((__section__(".x86_intel_mid_dev.init"))) = &i
arch/x86/include/asm/iommu_table.h: __attribute__ ((unused, __section__(".iommu_table"),        \
arch/x86/include/asm/irqflags.h:#define __cpuidle __attribute__((__section__(".cpuidle.text")))
arch/x86/include/asm/mem_encrypt.h:#define __bss_decrypted __attribute__((__section__(".bss..decrypted")))
arch/x86/kernel/cpu/cpu.h:  __attribute__((__section__(".x86_cpu_dev.init"))) = \
arch/x86/um/stub_segv.c:void __attribute__ ((__section__ (".__syscall_stub")))
include/asm-generic/error-injection.h:  __attribute__((__section__("_error_injection_whitelist")))  \
include/asm-generic/kprobes.h:  __attribute__((__section__("_kprobe_blacklist")))   \
include/asm-generic/kprobes.h:# define __kprobes    __attribute__((__section__(".kprobes.text")))
include/linux/cache.h:#define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
include/linux/cache.h:       __section__(".data..cacheline_aligned")))
include/linux/compiler.h:               __section("_ftrace_annotated_branch")   \
include/linux/compiler.h:       __section("_ftrace_branch")     \
include/linux/compiler.h:   __section("___kentry" "+" #sym )            \
include/linux/compiler.h:   static void * __section(".discard.addressable") __used \
include/linux/cpu.h:#define __cpuidle   __attribute__((__section__(".cpuidle.text")))
include/linux/export.h: static int __ksym_marker_##sym[0] __section(".discard.ksym") __used
include/linux/init.h:       __attribute__((__section__(#__sec ".init"))) = fn;
include/linux/init_task.h:#define __init_task_data __attribute__((__section__(".data..init_task")))
include/linux/init_task.h:#define __init_thread_info __attribute__((__section__(".data..init_thread_info")))
include/linux/interrupt.h:#define __irq_entry        __attribute__((__section__(".irqentry.text")))
include/linux/interrupt.h:  __attribute__((__section__(".softirqentry.text")))
include/linux/module.h: __used __attribute__ ((__section__ ("__modver")))       \
include/linux/moduleparam.h:    __attribute__ ((unused,__section__ ("__param"),aligned(sizeof(void *)))) \
include/linux/mtd/xip.h:#define __xipram noinline __attribute__ ((__section__ (".xiptext")))
include/linux/sched/debug.h:#define __sched     __attribute__((__section__(".sched.text")))
include/linux/srcutree.h:       __section("___srcu_struct_ptrs") = &name

Looks like those should all be using __section to properly check for compiler support for __attribute__((section)).

$ grep -e __section\(\" -e __section__\(\" -r

I have tested with my patchset "for-5.3/x86-section-name-escaping" (5 patches):

16/16 compiler_attributes.h: add note about __section
15/16 include/linux/compiler.h: remove unused KENTRY macro
14/16 include/linux: prefer __section from compiler_attributes.h
13/16 include/asm-generic: prefer __section from compiler_attributes.h
11/16 x86: prefer __section from compiler_attributes.h

...against Linux v5.3-rc5 on Debian/buster AMD64.

LLVM-Toolchain: Clang compiler and LLD linker version 9.0.0-rc2 (from Debian/experimental)

Original patchset by @nickdesaulniers "[PATCH 00/16] treewide: prefer __section from compiler_attributes.h".

[1] https://lkml.org/lkml/2019/8/12/1173
[2] https://marc.info/?l=linux-netdev&m=156564680015261&w=2
[3] https://lore.kernel.org/patchwork/project/lkml/list/?series=406161

Just to clarify the important patch [1] to fix this issue here is...

14/16 include/linux: prefer __section from compiler_attributes.h

...which includes the above snippet.

[1] https://lore.kernel.org/patchwork/patch/1114125/

I'll split that hunk out of the patch and try to fast-track it back through stable, then send everything else in v2.

@nickdesaulniers
Cool, thanks.

PR sent to Linus

Linus preferred to take only the Oops-fixing commit for 5.3, so I sent him another PR, which has just been merged: https://github.com/torvalds/linux/commit/983f700eab89c73562f308fc49b1561377d3920e

For 5.4 we will send the rest of the series, i.e. the cleanup. For this, we should consider going the opposite way: we can change __section to avoid stringification. This makes it easier for everyone to understand and allows us to use __section for all use cases.

I agree. Thanks @ojeda. Closing this issue as @dileks's observed Oops should now be resolved.
