Linux: error: invalid reassignment of non-absolute variable 'var_ddq_add'

Created on 28 Apr 2020 · 30 Comments · Source: ClangBuiltLinux/linux

Reproduce:

make LLVM=1 LLVM_IAS=1 O=/tmp/out/x86_64 -j 50 defconfig
echo 'CONFIG_CRYPTO_AES_NI_INTEL=y' >> /tmp/out/x86_64/.config
make LLVM=1 LLVM_IAS=1 O=/tmp/out/x86_64 -j 50 arch/x86/crypto/aes_ctrby8_avx-x86_64.o
<instantiation>:1:15: error: invalid reassignment of non-absolute variable 'var_ddq_add'
var_ddq_add = ddq_add_1
              ^
<instantiation>:3:3: note: while in macro instantiation
  setddq %i
  ^
<instantiation>:1:1: note: while in macro instantiation
club 0, i
^
<instantiation>:12:2: note: while in macro instantiation
 .rept (by - 1)
 ^
<instantiation>:1:1: note: while in macro instantiation
do_aes 2, 1, 1
^
<instantiation>:30:2: note: while in macro instantiation
 do_aes_load 2, 1
 ^
Labels: Reported upstream, [ARCH] x86_64, [FIXED][LINUX] 5.9, [TOOL] integrated-as

All 30 comments

The error message is reminiscent of https://github.com/ClangBuiltLinux/linux/issues/920 and probably comes from the same LLVM code path. Unlike #920, however, the values assigned to var_ddq_add (ddq_add_*) are all constants, so this should not have failed. I will start investigating.

Reduced repro code:

$cat bad.s
ddq_add_1:
.octa 0x00000000000000000000000000000001

.text

.macro setddq n
var_ddq_add = ddq_add_\n
.endm

.set i, 1
setddq 1
vpaddq var_ddq_add(%rip), %xmm8, %xmm8
setddq 1

$ llvm-mc -triple=x86_64 -filetype=obj bad.s -o bad.o
<instantiation>:1:15: error: invalid reassignment of non-absolute variable 'var_ddq_add'
var_ddq_add = ddq_add_1
^
bad.s:13:1: note: while in macro instantiation
setddq 1

Looking at the code again, I realized that var_ddq_add is assigned different labels multiple times, not the constant values those labels refer to. Clang treats such an assignment like .set and disallows reassigning a symbol whose current value refers to other symbols. GNU as supports this on some targets, although its documentation does not say which ones: "Values that are based on expressions involving other symbols are allowed, but some targets may restrict this to only being done once per assembly".
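
The rule described above can be modeled with a toy symbol table. This is a minimal sketch in Python, not LLVM's implementation: the class ToyAssembler, its assign method, and the simplification that a value is either an integer (absolute) or a label name (non-absolute) are all assumptions made for illustration. The real check in the integrated assembler involves additional conditions.

```python
class ReassignmentError(Exception):
    """Raised when the toy assembler rejects an assignment."""


class ToyAssembler:
    """Hypothetical model of the reassignment rule; not LLVM's code."""

    def __init__(self):
        # name -> int (absolute value) or str (reference to a label)
        self.variables = {}

    def assign(self, name, value):
        current = self.variables.get(name)
        # A variable whose current value refers to another symbol
        # (modeled here as a str) may not be given a new value.
        if isinstance(current, str):
            raise ReassignmentError(
                f"invalid reassignment of non-absolute variable '{name}'"
            )
        self.variables[name] = value


asm = ToyAssembler()
asm.assign("i", 1)
asm.assign("i", 2)                          # absolute value: accepted
asm.assign("var_ddq_add", "ddq_add_1")      # first assignment: accepted
try:
    asm.assign("var_ddq_add", "ddq_add_1")  # second assignment: rejected
    rejected = False
    message = ""
except ReassignmentError as err:
    rejected = True
    message = str(err)

print(rejected, message)
```

In this model the second setddq call from the reduced repro is rejected even though it assigns the same label again, which mirrors the behavior reported by llvm-mc.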

The code in arch/x86/crypto/aes_ctrby8_avx-x86_64.S appears to use the addresses of the labels to load the corresponding values through RIP-relative addressing, as the following example demonstrates.

$ cat foo.s
ddq_high_add_1:
.octa 0x00000000000000010000000000000000
ddq_add_1:
.octa 0x00000000000000000000000000000001

.text

var_ddq_add = ddq_add_1
vpaddq var_ddq_add(%rip), %xmm8, %xmm8
var_ddq_add = ddq_high_add_1
vpaddq var_ddq_add(%rip), %xmm8, %xmm8

$ gcc -c foo.s -o foo.o
$ objdump -d foo.o

foo.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <ddq_high_add_1>:
    ...
   8:   01 00                   add    %eax,(%rax)
   a:   00 00                   add    %al,(%rax)
   c:   00 00                   add    %al,(%rax)
    ...

0000000000000010 <ddq_add_1>:
  10:   01 00                   add    %eax,(%rax)
    ...
  1e:   00 00                   add    %al,(%rax)
  20:   c5 39 d4 05 e8 ff ff    vpaddq -0x18(%rip),%xmm8,%xmm8        # 10 <ddq_add_1>
  27:   ff 
  28:   c5 39 d4 05 d0 ff ff    vpaddq -0x30(%rip),%xmm8,%xmm8        # 0 <ddq_high_add_1>
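
The two displacements in the disassembly above can be checked by hand. The sketch below, in Python, recomputes the target of each RIP-relative operand from the instruction address and its bytes. The helper name rip_relative_target is hypothetical, and the last byte of the second instruction is an assumption: it is cut off in the listing, but the printed displacement of -0x30 implies ff.

```python
def rip_relative_target(insn_addr, insn_bytes):
    """Return the address a RIP-relative operand refers to.

    Assumes the 32-bit displacement is the last four bytes of the
    instruction, which holds for the two vpaddq encodings shown above.
    """
    disp = int.from_bytes(insn_bytes[-4:], "little", signed=True)
    next_insn = insn_addr + len(insn_bytes)  # RIP points past the instruction
    return next_insn + disp


first = bytes.fromhex("c5 39 d4 05 e8 ff ff ff")
# The final ff is assumed; the listing above is truncated after 7 bytes.
second = bytes.fromhex("c5 39 d4 05 d0 ff ff ff")

print(hex(rip_relative_target(0x20, first)))   # 0x28 - 0x18 = 0x10 (ddq_add_1)
print(hex(rip_relative_target(0x28, second)))  # 0x30 - 0x30 = 0x0 (ddq_high_add_1)
```

Both instructions have the same encoding apart from the displacement, so GNU as resolved var_ddq_add to a different label at each use.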

I think we can fix this in the kernel with the following change:

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index ec437db1fa54..6794c54e9699 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -106,22 +106,15 @@ ddq_low_msk:
    .octa 0x0000000000000000FFFFFFFFFFFFFFFF
 ddq_high_add_1:
    .octa 0x00000000000000010000000000000000
-ddq_add_1:
-   .octa 0x00000000000000000000000000000001
-ddq_add_2:
-   .octa 0x00000000000000000000000000000002
-ddq_add_3:
-   .octa 0x00000000000000000000000000000003
-ddq_add_4:
-   .octa 0x00000000000000000000000000000004
-ddq_add_5:
-   .octa 0x00000000000000000000000000000005
-ddq_add_6:
-   .octa 0x00000000000000000000000000000006
-ddq_add_7:
-   .octa 0x00000000000000000000000000000007
-ddq_add_8:
-   .octa 0x00000000000000000000000000000008
+
+.set ddq_add_1, 0x00000000000000000000000000000001
+.set ddq_add_2, 0x00000000000000000000000000000002
+.set ddq_add_3, 0x00000000000000000000000000000003
+.set ddq_add_4, 0x00000000000000000000000000000004
+.set ddq_add_5, 0x00000000000000000000000000000005
+.set ddq_add_6, 0x00000000000000000000000000000006
+.set ddq_add_7, 0x00000000000000000000000000000007
+.set ddq_add_8, 0x00000000000000000000000000000008

 .text

@@ -167,7 +160,7 @@ ddq_add_8:
    .rept (by - 1)
        club DDQ_DATA, i
        club XDATA, i
-       vpaddq  var_ddq_add(%rip), xcounter, var_xdata
+       vpaddq  var_ddq_add, xcounter, var_xdata
        vptest  ddq_low_msk(%rip), var_xdata
        jnz 1f
        vpaddq  ddq_high_add_1(%rip), var_xdata, var_xdata
@@ -181,7 +174,7 @@ ddq_add_8:

    vpxor   xkey0, xdata0, xdata0
    club DDQ_DATA, by
-   vpaddq  var_ddq_add(%rip), xcounter, xcounter
+   vpaddq  var_ddq_add, xcounter, xcounter
    vptest  ddq_low_msk(%rip), xcounter
    jnz 1f
    vpaddq  ddq_high_add_1(%rip), xcounter, xcounter

@jcai19
I have tested the above snippet and it fixes the issue for me.

clang-10 -Wp,-MD,arch/x86/crypto/.aes_ctrby8_avx-x86_64.o.d -nostdinc -isystem /home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -D__ASSEMBLY__ -fno-PIE -Werror=unknown-warning-option -Wa,-gdwarf-4 -m64 -Wa,-gdwarf-4 -Wa,--compress-debug-sections=zlib -DCC_USING_FENTRY -DMODULE -c -o arch/x86/crypto/aes_ctrby8_avx-x86_64.o arch/x86/crypto/aes_ctrby8_avx-x86_64.S

It produces an aes_ctrby8_avx-x86_64.o:

$ ll arch/x86/crypto/aes_ctrby8_avx-x86_64.o
-rw-r--r-- 1 dileks dileks 26K Jun 12 15:43 arch/x86/crypto/aes_ctrby8_avx-x86_64.o

@jcai19

Can you look at this, please?

I am unsure what exactly causes the problem here.

Workaround: scripts/config -e CRYPTO_MANAGER_DISABLE_TESTS

With your snippet applied I see in dmesg:

[Mon Jun 15 10:25:36 2020] AVX version of gcm_enc/dec engaged.
[Mon Jun 15 10:25:36 2020] AES CTR mode by8 optimization enabled
[Mon Jun 15 10:25:36 2020] BUG: kernel NULL pointer dereference, address: 0000000000000001
[Mon Jun 15 10:25:36 2020] #PF: supervisor read access in kernel mode
[Mon Jun 15 10:25:36 2020] #PF: error_code(0x0000) - not-present page
[Mon Jun 15 10:25:36 2020] PGD 0 P4D 0 
[Mon Jun 15 10:25:36 2020] Oops: 0000 [#1] SMP PTI
[Mon Jun 15 10:25:36 2020] CPU: 3 PID: 428 Comm: cryptomgr_test Tainted: G            E     5.7.2-10-amd64-clang #10~bullseye+dileks1
[Mon Jun 15 10:25:36 2020] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
[Mon Jun 15 10:25:36 2020] RIP: 0010:aes_ctr_enc_128_avx_by8+0x3d2/0x1270 [aesni_intel]
[Mon Jun 15 10:25:36 2020] Code: c5 fa 7f 01 c5 fa 7f 49 10 c5 fa 7f 51 20 48 83 c1 30 49 83 e0 80 0f 84 9f 0e 00 00 e9 e7 0a 00 00 c5 79 6f 12 c4 c2 39 00 c1 <c5> b9 d4 0c 25 01 00 00 00 c4 e2 79 17 0d 9c 4a 00 00 75 10 c5 f1
[Mon Jun 15 10:25:36 2020] RSP: 0000:ffffb03b00a57a78 EFLAGS: 00010246
[Mon Jun 15 10:25:36 2020] RAX: ffff9e4b5499e000 RBX: 0000000000000040 RCX: ffff9e4b5499e000
[Mon Jun 15 10:25:36 2020] RDX: ffff9e4b5615e420 RSI: ffffb03b00a57d00 RDI: ffff9e4b5499e000
[Mon Jun 15 10:25:36 2020] RBP: ffffb03b00a57b70 R08: 0000000000000040 R09: ffffb03b00a57d00
[Mon Jun 15 10:25:36 2020] R10: 0000000000000040 R11: ffffffffc07db920 R12: ffffb03b00a57a88
[Mon Jun 15 10:25:36 2020] R13: ffffb03b00a57d00 R14: ffff9e4b5615e420 R15: 0000000000000000
[Mon Jun 15 10:25:36 2020] FS:  0000000000000000(0000) GS:ffff9e4b57ac0000(0000) knlGS:0000000000000000
[Mon Jun 15 10:25:36 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jun 15 10:25:36 2020] CR2: 0000000000000001 CR3: 0000000210758005 CR4: 00000000000606e0
[Mon Jun 15 10:25:36 2020] Call Trace:
[Mon Jun 15 10:25:36 2020]  ? ctr_crypt+0x96/0x140 [aesni_intel]
[Mon Jun 15 10:25:36 2020]  ? test_skcipher+0x284/0xa20
[Mon Jun 15 10:25:36 2020]  test_skcipher+0x4c5/0xa20
[Mon Jun 15 10:25:36 2020]  ? pcpu_get_vm_areas+0x470/0x1180
[Mon Jun 15 10:25:36 2020]  ? __get_free_pages+0x11/0x30
[Mon Jun 15 10:25:36 2020]  alg_test_skcipher+0xd4/0x2f0
[Mon Jun 15 10:25:36 2020]  alg_test+0x21c/0x650
[Mon Jun 15 10:25:36 2020]  ? __switch_to_asm+0x34/0x70
[Mon Jun 15 10:25:36 2020]  ? __switch_to_asm+0x34/0x70
[Mon Jun 15 10:25:36 2020]  ? __switch_to_asm+0x40/0x70
[Mon Jun 15 10:25:36 2020]  ? __switch_to+0x78/0x2e0
[Mon Jun 15 10:25:36 2020]  ? __schedule+0x405/0x520
[Mon Jun 15 10:25:36 2020]  cryptomgr_test+0x31/0x50
[Mon Jun 15 10:25:36 2020]  kthread+0x131/0x140
[Mon Jun 15 10:25:36 2020]  ? crypto_alg_put+0x40/0x40
[Mon Jun 15 10:25:36 2020]  ? kthread_blkcg+0x30/0x30
[Mon Jun 15 10:25:36 2020]  ret_from_fork+0x35/0x40
[Mon Jun 15 10:25:36 2020] Modules linked in: ecdh_generic(E+) aesni_intel(E+) libarc4(E) ecc(E) libaes(E) iwlwifi(E) mc(E) crypto_simd(E) cryptd(E) glue_helper(E) snd_hda_intel(E) snd_intel_dspcfg(E) intel_cstate(E) intel_uncore(E) snd_hda_codec(E) cfg80211(E) snd_hda_core(E) intel_rapl_perf(E) samsung_laptop(E) snd_hwdep(E) evdev(E) joydev(E) drm_kms_helper(E) snd_pcm(E) iTCO_wdt(E) rfkill(E) mei_me(E) serio_raw(E) cec(E) sg(E) iTCO_vendor_support(E) watchdog(E) snd_timer(E) i2c_algo_bit(E) mei(E) snd(E) soundcore(E) ac(E) button(E) pcspkr(E) parport_pc(E) ppdev(E) lp(E) drm(E) parport(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) mbcache(E) crc16(E) jbd2(E) sd_mod(E) t10_pi(E) crc_t10dif(E) crct10dif_generic(E) hid_generic(E) usbhid(E) hid(E) uas(E) usb_storage(E) crct10dif_pclmul(E) crct10dif_common(E) xhci_pci(E) ahci(E) crc32_pclmul(E) libahci(E) ehci_pci(E) ehci_hcd(E) r8169(E) libata(E) realtek(E) crc32c_intel(E) xhci_hcd(E) psmouse(E) scsi_mod(E) lpc_ich(E) i2c_i801(E)
[Mon Jun 15 10:25:36 2020]  mfd_core(E) libphy(E) usbcore(E) fan(E) battery(E) video(E) wmi(E)
[Mon Jun 15 10:25:36 2020] CR2: 0000000000000001
[Mon Jun 15 10:25:36 2020] ---[ end trace 0231bb2a6956914c ]---
[Mon Jun 15 10:25:36 2020] RIP: 0010:aes_ctr_enc_128_avx_by8+0x3d2/0x1270 [aesni_intel]
[Mon Jun 15 10:25:36 2020] Code: c5 fa 7f 01 c5 fa 7f 49 10 c5 fa 7f 51 20 48 83 c1 30 49 83 e0 80 0f 84 9f 0e 00 00 e9 e7 0a 00 00 c5 79 6f 12 c4 c2 39 00 c1 <c5> b9 d4 0c 25 01 00 00 00 c4 e2 79 17 0d 9c 4a 00 00 75 10 c5 f1
[Mon Jun 15 10:25:36 2020] RSP: 0000:ffffb03b00a57a78 EFLAGS: 00010246
[Mon Jun 15 10:25:36 2020] RAX: ffff9e4b5499e000 RBX: 0000000000000040 RCX: ffff9e4b5499e000
[Mon Jun 15 10:25:36 2020] RDX: ffff9e4b5615e420 RSI: ffffb03b00a57d00 RDI: ffff9e4b5499e000
[Mon Jun 15 10:25:36 2020] RBP: ffffb03b00a57b70 R08: 0000000000000040 R09: ffffb03b00a57d00
[Mon Jun 15 10:25:36 2020] R10: 0000000000000040 R11: ffffffffc07db920 R12: ffffb03b00a57a88
[Mon Jun 15 10:25:36 2020] R13: ffffb03b00a57d00 R14: ffff9e4b5615e420 R15: 0000000000000000
[Mon Jun 15 10:25:36 2020] FS:  0000000000000000(0000) GS:ffff9e4b57ac0000(0000) knlGS:0000000000000000
[Mon Jun 15 10:25:36 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Jun 15 10:25:36 2020] CR2: 0000000000000001 CR3: 0000000210758005 CR4: 00000000000606e0
[Mon Jun 15 10:25:36 2020] videodev: Linux video capture interface: v2.00
[Mon Jun 15 10:25:36 2020] i915 0000:00:02.0: vgaarb: deactivate vga console
[Mon Jun 15 10:25:36 2020] Console: switching to colour dummy device 80x25
[Mon Jun 15 10:25:36 2020] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[Mon Jun 15 10:25:36 2020] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[Mon Jun 15 10:25:36 2020] [drm] Initialized i915 1.6.0 20200313 for 0000:00:02.0 on minor 0
[Mon Jun 15 10:25:36 2020] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[Mon Jun 15 10:25:36 2020] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input9
[Mon Jun 15 10:25:36 2020] snd_hda_intel 0000:00:1b.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[Mon Jun 15 10:25:36 2020] fbcon: i915drmfb (fb0) is primary device
[Mon Jun 15 10:25:36 2020] alg: No test for fips(ansi_cprng) (fips_ansi_cprng)
[Mon Jun 15 10:25:36 2020] ------------[ cut here ]------------

I can provide full dmesg-output and/or objdumps if needed.
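
The faulting address in the oops can be cross-checked against the Code: line. The sketch below, in Python, decodes the ModRM and SIB bytes of the marked instruction (<c5> b9 d4 0c 25 01 00 00 00) by hand. The helper name decode_memory_operand is hypothetical, and it handles only the fields needed for this one encoding plus the RIP-relative form for comparison.

```python
def decode_memory_operand(modrm, sib, disp_bytes):
    """Decode just enough of ModRM/SIB to classify a memory operand.

    Hypothetical helper: it recognizes only the two addressing forms
    that matter here and labels everything else as "other".
    """
    mod, rm = modrm >> 6, modrm & 0b111
    index, base = (sib >> 3) & 0b111, sib & 0b111
    disp = int.from_bytes(disp_bytes, "little", signed=True)
    if mod == 0b00 and rm == 0b100 and index == 0b100 and base == 0b101:
        # SIB form with no base and no index: disp32 is the whole address.
        return ("absolute", disp)
    if mod == 0b00 and rm == 0b101:
        # In 64-bit mode this ModRM form means disp32 relative to RIP.
        return ("rip-relative", disp)
    return ("other", disp)


# Bytes following the two-byte VEX prefix (c5 b9) and the opcode (d4):
kind, address = decode_memory_operand(0x0C, 0x25, bytes.fromhex("01000000"))
print(kind, hex(address))  # absolute 0x1
```

Under this reading, the operand is a load from absolute address 0x1, which matches CR2. With ddq_add_1 defined as the constant 1 via .set, a bare var_ddq_add operand is a memory reference to address 1, not the constant itself.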

@dileks Thank you for trying out the patch. Essentially, Clang's integrated assembler does not support reassigning symbols that hold a non-absolute value, in this case var_ddq_add. If we could simply replace var_ddq_add(%rip) with something like ddq_add_\i(%rip), we would have a cleaner fix. Unfortunately, neither IAS nor GAS seems to support such syntax, which is why I ended up with the current patch. Can you please verify whether this also happens with GNU as after applying the patch? If so, my guess is that removing .octa caused the change, and in that case we may have to look for a different fix.

Thanks @jcai19

I have reverted this snippet and excluded aes_ctrby8_avx-x86_64.o from the integrated assembler when LLVM_IAS is set.

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -53,6 +53,9 @@ chacha-x86_64-$(CONFIG_AS_AVX512) += chacha-avx512vl-x86_64.o
 obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
 aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o
 aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o aes_ctrby8_avx-x86_64.o
+ifdef LLVM_IAS
+AFLAGS_aes_ctrby8_avx-x86_64.o += -no-integrated-as
+endif

 obj-$(CONFIG_CRYPTO_SHA1_SSSE3) += sha1-ssse3.o
 sha1-ssse3-y := sha1_avx2_x86_64_asm.o sha1_ssse3_asm.o sha1_ssse3_glue.o
diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index bdff3be5fba7..ec437db1fa54 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -106,14 +106,22 @@ ddq_low_msk:
        .octa 0x0000000000000000FFFFFFFFFFFFFFFF
 ddq_high_add_1:
        .octa 0x00000000000000010000000000000000
-.set ddq_add_1, 0x00000000000000000000000000000001
-.set ddq_add_2, 0x00000000000000000000000000000002
-.set ddq_add_3, 0x00000000000000000000000000000003
-.set ddq_add_4, 0x00000000000000000000000000000004
-.set ddq_add_5, 0x00000000000000000000000000000005
-.set ddq_add_6, 0x00000000000000000000000000000006
-.set ddq_add_7, 0x00000000000000000000000000000007
-.set ddq_add_8, 0x00000000000000000000000000000008
+ddq_add_1:
+       .octa 0x00000000000000000000000000000001
+ddq_add_2:
+       .octa 0x00000000000000000000000000000002
+ddq_add_3:
+       .octa 0x00000000000000000000000000000003
+ddq_add_4:
+       .octa 0x00000000000000000000000000000004
+ddq_add_5:
+       .octa 0x00000000000000000000000000000005
+ddq_add_6:
+       .octa 0x00000000000000000000000000000006
+ddq_add_7:
+       .octa 0x00000000000000000000000000000007
+ddq_add_8:
+       .octa 0x00000000000000000000000000000008

 .text

@@ -159,7 +167,7 @@ ddq_high_add_1:
        .rept (by - 1)
                club DDQ_DATA, i
                club XDATA, i
-               vpaddq  var_ddq_add, xcounter, var_xdata
+               vpaddq  var_ddq_add(%rip), xcounter, var_xdata
                vptest  ddq_low_msk(%rip), var_xdata
                jnz 1f
                vpaddq  ddq_high_add_1(%rip), var_xdata, var_xdata
@@ -173,7 +181,7 @@ ddq_high_add_1:

        vpxor   xkey0, xdata0, xdata0
        club DDQ_DATA, by
-       vpaddq  var_ddq_add, xcounter, xcounter
+       vpaddq  var_ddq_add(%rip), xcounter, xcounter
        vptest  ddq_low_msk(%rip), xcounter
        jnz     1f
        vpaddq  ddq_high_add_1(%rip), xcounter, xcounter

Before doing any further tests I will inspect the above .o file.

Sounds good! Thanks.

@jcai19 @MaskRay

Is this the same root cause as seen in #1043?

If yes, should this be fixed in the llvm-toolchain?

Yesterday, I looked into llvm-project Git and found the corresponding warning/error comment in llvm/lib/MC/MCParser/AsmParser.cpp.

But hey, you are the experts.
What do you think?

Looks similar to #1043. It is easy to relax https://github.com/llvm/llvm-project/blob/master/llvm/lib/MC/MCParser/AsmParser.cpp and call it a fix, but I think we should first understand better what GNU as allows and then relax llvm-mc accordingly if that is reasonable. It is not uncommon for GNU as to have backend differences, where one backend allows something that another rejects. It is also possible that the kernel assembly should simply be rewritten in a more portable way, with fewer reassignments, for example.

Please see my comment at https://github.com/ClangBuiltLinux/linux/issues/1043#issuecomment-641571200. It's not easy to add this feature because of IAS's one-pass design. Another complication is that symbol reassignment is implemented in a machine-independent way in IAS, whereas according to GNU's documentation the behavior may vary by target. Personally, I prefer to rewrite this kind of code in the kernel so that it is compatible with different toolchains.

@MaskRay
What do you mean by "It is easy to relax https://github.com/llvm/llvm-project/blob/master/llvm/lib/MC/MCParser/AsmParser.cpp and call it a fix"?
Currently, I use https://github.com/llvm/llvm-project/commit/8a5aea7b50429cd4a459511286a7a9f1a7f4f5e2 as llvm-toolchain.
Do you propose to use a higher Git version?

@jcai19
If it is fixable in the Linux kernel, I am fine with that.
But currently we do not have a fix for this one here.
A similar issue exists in Linux v5.8-rc1 with #1043.
That's why I asked about possibly fixing it in the llvm-toolchain.
Yesterday, I wanted to start using LLVM_IAS=1 with Linux v5.8-rc1+ but was too tired and unmotivated.

Sorry if I am being pedantic and/or impatient.

Agreed. We should think about what a good long-term solution would be. I was just pointing out that there is no easy fix in LLVM because of the issues I mentioned. My understanding of the original code is that it tries to replace var_ddq_add(%rip) in the vpaddq instructions with the result of one of those .octa directives. I am not familiar enough with the code to suggest other changes. Maybe @nickdesaulniers knows someone we can ask for help.

I remember that Eric Biggers from Google once helped with a crypto issue, so he is familiar with the code.

@dileks

Can you please verify whether the above issue still happens with the following patch? I've verified that it assembles with IAS, but I don't know how to test it. Thanks.

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index ec437db1fa54..494a3bda8487 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -127,10 +127,6 @@ ddq_add_8:

 /* generate a unique variable for ddq_add_x */

-.macro setddq n
-   var_ddq_add = ddq_add_\n
-.endm
-
 /* generate a unique variable for xmm register */
 .macro setxdata n
    var_xdata = %xmm\n
@@ -140,9 +136,7 @@ ddq_add_8:

 .macro club name, id
 .altmacro
-   .if \name == DDQ_DATA
-       setddq %\id
-   .elseif \name == XDATA
+   .if \name == XDATA
        setxdata %\id
    .endif
 .noaltmacro
@@ -165,9 +159,8 @@ ddq_add_8:

    .set i, 1
    .rept (by - 1)
-       club DDQ_DATA, i
        club XDATA, i
-       vpaddq  var_ddq_add(%rip), xcounter, var_xdata
+       vpaddq  (ddq_add_1 + 16 * (i - 1))(%rip), xcounter, var_xdata
        vptest  ddq_low_msk(%rip), var_xdata
        jnz 1f
        vpaddq  ddq_high_add_1(%rip), var_xdata, var_xdata
@@ -180,8 +173,7 @@ ddq_add_8:
    vmovdqa 1*16(p_keys), xkeyA

    vpxor   xkey0, xdata0, xdata0
-   club DDQ_DATA, by
-   vpaddq  var_ddq_add(%rip), xcounter, xcounter
+   vpaddq  (ddq_add_1 + 16 * (by - 1))(%rip), xcounter, xcounter
    vptest  ddq_low_msk(%rip), xcounter
    jnz 1f
    vpaddq  ddq_high_add_1(%rip), xcounter, xcounter
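
The replacement operand in this patch relies on the ddq_add_* constants being laid out back to back, 16 bytes each (one .octa per label), as the earlier diff context shows. The following is a minimal sketch of that arithmetic in Python. BASE is an arbitrary assumed address standing in for wherever ddq_add_1 ends up, and the names labels, blob and load_via_expression are hypothetical.

```python
OCTA_SIZE = 16   # an .octa directive emits one 128-bit (16-byte) value
BASE = 0x20      # assumed address of ddq_add_1, for illustration only

# Simulate emitting ddq_add_1 .. ddq_add_8 consecutively and record
# the address at which each label lands.
labels = {}
blob = bytearray()
for n in range(1, 9):
    labels[f"ddq_add_{n}"] = BASE + len(blob)
    blob += n.to_bytes(OCTA_SIZE, "little")


def load_via_expression(i):
    """Read the 128-bit value at ddq_add_1 + 16 * (i - 1)."""
    offset = (labels["ddq_add_1"] + 16 * (i - 1)) - BASE
    return int.from_bytes(blob[offset:offset + OCTA_SIZE], "little")


values = [load_via_expression(i) for i in range(1, 9)]
print(values)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Because the expression is computed from a single label plus a constant offset, no symbol has to be reassigned, so the construct that IAS rejects is no longer needed. The sketch holds only as long as the eight constants stay contiguous and in order.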

@jcai19

Sorry for the late response.
Last weekend I cleaned up my Debian system.

I tested against my Linux Git tree from last Friday with v5.7.5-rc1.
Cool, your patch fixes the issue for me.

My diff:

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -53,9 +53,6 @@ chacha-x86_64-$(CONFIG_AS_AVX512) += chacha-avx512vl-x86_64.o
 obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
 aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o
 aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o aes_ctrby8_avx-x86_64.o
-ifdef LLVM_IAS
-AFLAGS_aes_ctrby8_avx-x86_64.o += -no-integrated-as
-endif

 obj-$(CONFIG_CRYPTO_SHA1_SSSE3) += sha1-ssse3.o
 sha1-ssse3-y := sha1_avx2_x86_64_asm.o sha1_ssse3_asm.o sha1_ssse3_glue.o
diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -127,10 +127,6 @@ ddq_add_8:

 /* generate a unique variable for ddq_add_x */

-.macro setddq n
-       var_ddq_add = ddq_add_\n
-.endm
-
 /* generate a unique variable for xmm register */
 .macro setxdata n
        var_xdata = %xmm\n
@@ -140,9 +136,7 @@ ddq_add_8:

 .macro club name, id
 .altmacro
-       .if \name == DDQ_DATA
-               setddq %\id
-       .elseif \name == XDATA
+       .if \name == XDATA
                setxdata %\id
        .endif
 .noaltmacro
@@ -165,9 +159,8 @@ ddq_add_8:

        .set i, 1
        .rept (by - 1)
-               club DDQ_DATA, i
                club XDATA, i
-               vpaddq  var_ddq_add(%rip), xcounter, var_xdata
+               vpaddq  (ddq_add_1 + 16 * (i - 1))(%rip), xcounter, var_xdata
                vptest  ddq_low_msk(%rip), var_xdata
                jnz 1f
                vpaddq  ddq_high_add_1(%rip), var_xdata, var_xdata
@@ -180,8 +173,7 @@ ddq_add_8:
        vmovdqa 1*16(p_keys), xkeyA

        vpxor   xkey0, xdata0, xdata0
-       club DDQ_DATA, by
-       vpaddq  var_ddq_add(%rip), xcounter, xcounter
+       vpaddq  (ddq_add_1 + 16 * (by - 1))(%rip), xcounter, xcounter
        vptest  ddq_low_msk(%rip), xcounter
        jnz     1f
        vpaddq  ddq_high_add_1(%rip), xcounter, xcounter

I kept both .o files if you want a look at it.

This means I do not need any -no-integrated-as workaround to build/assemble with LLVM_IAS=1.

@jcai19

Time to cook up your 2nd patch :-)?

Please feel free to add appropriate credits:
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # build+boot Linux v5.7.5; clang v11.0.0-git

[1] https://git.kernel.org/linus/a780e485b5768e78aef087502499714901b68cc4

@jcai19

For testing and getting information when CONFIG_CRYPTO_AES_NI_INTEL=m (built as a module):

[ arch/x86/crypto/Makefile ]

obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
aesni-intel-y := aesni-intel_asm.o aesni-intel_glue.o
aesni-intel-$(CONFIG_64BIT) += aesni-intel_avx-x86_64.o aes_ctrby8_avx-x86_64.o

Info kernel-module:
root# modinfo aesni_intel
filename:       /lib/modules/5.7.5-rc1-1-amd64-clang/kernel/arch/x86/crypto/aesni-intel.ko
description:    Rijndael (AES) Cipher Algorithm, Intel AES-NI instructions optimized
license:        GPL
alias:          aes
alias:          crypto-aes
vermagic:       5.7.5-rc1-1-amd64-clang SMP mod_unload modversions 
name:           aesni_intel
intree:         Y
retpoline:      Y
depends:        crypto_simd,glue_helper,libaes
alias:          cpu:type:x86,ven*fam*mod*:feature:*0099*

Load and unload/remove kernel-module:
root# modprobe -v aesni_intel
root# modprobe -v -r aesni_intel

List specific kernel-modules:
root# lsmod | egrep 'aes|crypt|crc32c' | sort
aesni_intel           368640  4
crc32c_generic         16384  0
crc32c_intel           24576  4
cryptd                 28672  2 crypto_simd,ghash_clmulni_intel
crypto_simd            16384  1 aesni_intel
glue_helper            16384  1 aesni_intel
libaes                 20480  2 bluetooth,aesni_intel

@jcai19

Checking dmesg with suggested CONFIG_CRYPTO_TEST=m:

root# LC_ALL=C dmesg -T | egrep -i 'alg:|aes|crypt' | egrep -v 'systemd|fscrypt'
[Mon Jun 22 15:40:00 2020] cryptd: max_cpu_qlen set to 1000
[Mon Jun 22 15:40:00 2020] AES CTR mode by8 optimization enabled <--- XXX: **aes_ctrby8_avx-x86_64.o**
[Mon Jun 22 15:40:00 2020] alg: aead: rfc4106-gcm-aesni encryption test failed (wrong result) on test vector 0, cfg="two even aligned splits"
[Mon Jun 22 15:40:00 2020] alg: aead: generic-gcm-aesni encryption test failed (wrong result) on test vector 1, cfg="two even aligned splits"
[Mon Jun 22 15:40:01 2020] alg: No test for fips(ansi_cprng) (fips_ansi_cprng)

[1] https://cateee.net/lkddb/web-lkddb/CRYPTO_TEST.html

Thank you for the verification! Will send the patch.

upstream bug open for this TU: https://bugs.llvm.org/show_bug.cgi?id=24494

Thanks for looking at this old bug report and updating it with new information.

Applied to cryptodev-2.6.git.

commit 44069737ac9625a0f02f0f7f5ab96aae4cd819bc
"crypto: aesni - add compatibility with IAS"

[1] https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/commit/?id=44069737ac9625a0f02f0f7f5ab96aae4cd819bc

Thanks for the update!

Hi @jcai19 please don't close bugs affecting the kernel until the patch hits mainline. We try to carefully track what the final SHA was and what the latest git tag that contains the fix is, in case we need to backport it in the future. @dileks comment above was that the patch was accepted into a maintainers tree, but patches can take an indeterminate amount of time to hit mainline from there, based on the discretion of the maintainer who still has to send a pull request to Linus.

@nickdesaulniers Sorry about that. Thanks for the clarification.

Updated LLVM bug #24494.

[1] https://bugs.llvm.org/show_bug.cgi?id=24494#c6
