arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173:2: error: invalid instruction mnemonic 'movzxw'
movzxw (%rdi, %rax, 2), %rsi
^~~~~~
AR certs/built-in.a
CC arch/x86/realmode/rm/video-mode.o
make[2]: *** [scripts/Makefile.build:350: arch/x86/crypto/crc32c-pcl-intel-asm_64.o] Error 1
make[1]: *** [scripts/Makefile.build:505: arch/x86/crypto] Error 2
make[1]: *** Waiting for unfinished jobs....
based on
movzxw is a psuedo instruction that expands to different instructions based on targeting 32b vs 64b x86. cc @jcai19 The gcc support of movzxw seems to be a hack as it does not support the other suffixes of movzx,
$ cat bad.s
movzxw (%rdi, %rax, 2), %rsi
$ gcc -c good.s -o good.o
$ echo $?
0
$ objdump -d good.o
good.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <.text>:
0: 48 0f b7 34 47 movzwq (%rdi,%rax,2),%rsi
$ cat bad.s
movzxq (%rdi, %rax, 2), %rsi
$ gcc -c bad.s -o bad.o
bad.s: Assembler messages:
bad.s:1: Error: unsupported syntax for `movzx'
Maybe we should fix this in the kernel instead of the integrated assembler. Also, may I know which config I should use to reproduce? Thanks.
In bad.s, you have movzxq (no w), which is not the same as the disassembly of good.o's movzwq (yes w). Please triple check.
Yes, that was my point. GCC supports movzx with 'w' suffix, but not other suffix such as 'q'. Is that intentional?
@jcai19 maybe I'm wrong but doesn't movzxw expand to movzwl _or_ movzwq depending on the platform?
@jcai19 maybe I'm wrong but doesn't movzxw expand to movzwl or movzwq depending on the platform?
That's precisely how this psuedo behaves.
➜ /tmp gcc -m64 foo.S -o foo.o -c
➜ /tmp objdump -d foo.o
foo.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <.text>:
0: 48 0f b7 34 47 movzwq (%rdi,%rax,2),%rsi
➜ /tmp gcc -m32 foo.S -o foo.o -c
➜ /tmp objdump -d foo.o
foo.o: file format elf32-i386
Disassembly of section .text:
00000000 <.text>:
0: 0f b7 34 47 movzwl (%edi,%eax,2),%esi
➜ /tmp cat foo.S
#ifdef __x86_64__
movzxw (%rdi, %rax, 2), %rsi
#else
movzxw (%edi, %eax, 2), %esi
#endif
I have a possible patch https://github.com/ZestProjects/linux/commit/0648fc8e0dcfbad8b8a35d804d9197034551e641.
Thanks @lazerl0rd and @nickdesaulniers for the clarification and example. I guess my confusion came from the fact gnu as supported different suffixes of instructions like mov (e.g. movw and movl) and generated consistent opcodes, but for movzw it only supported its 'w' form but not the others, and treated it as a pseudo instruction and lowered it into different opcodes, depending on the target architecture and the type of operands -- on x64, gnu as assembles movzxw into movzxl if the registers used are 32-bit, or movzxq for 64-bit.
$ cat foo32.s
movw 0x8, %ax
movl $0x000F, %eax
movzxw (%edi, %eax, 2), %esi
$ gcc -m32 -c foo32.s -o foo32.o
$ objdump -d foo32.o
foo32.o: file format elf32-i386
Disassembly of section .text:
00000000 <.text>:
0: 66 a1 08 00 00 00 mov 0x8,%ax
6: b8 0f 00 00 00 mov $0xf,%eax
b: 0f b7 34 47 movzwl (%edi,%eax,2),%esi
$ cat foo64.s
movw 0x8, %ax
movl $0x000F, %eax
movzxw (%edi, %eax, 2), %esi
movzxw (%rdi, %rax, 2), %rsi
$ gcc -m64 -c foo64.s -o foo64.o
$ objdump -d foo64.o
foo64.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <.text>:
0: 66 8b 04 25 08 00 00 mov 0x8,%ax
7: 00
8: b8 0f 00 00 00 mov $0xf,%eax
d: 67 0f b7 34 47 movzwl (%edi,%eax,2),%esi
12: 48 0f b7 34 47 movzwq (%rdi,%rax,2),%rsi
IMO it is clearer to change this in the kernel to explicitly use an instruction instead of a pseudo one, as suggested in the last comment. Also, not sure if this is the right source but it seems movzxw is not explicitly supported by x86 assembly.
After multiple formatting tries, I seem to sit comfortably at https://github.com/ZestProjects/linux/commit/65d285ac51d89a02d8acd4901b763211de6706da.
Thanks for the update! FWIW, gnu as assembles movzxw into either movzxl or movzxq if the operands are 32-bit or 64-bit respectively, as in the example.
nice @lazerl0rd , please send it.
Arnd independently sent a patch: https://lore.kernel.org/lkml/[email protected]/
Arnd independently sent a patch: https://lore.kernel.org/lkml/[email protected]/
No worries, I've been having email issues and will be for a week or two so I'm kinda MIA.
Hmm, with clang-10 and @arndb patch I do not get this fixed...
diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -170,7 +170,7 @@ continue_block:
## branch into array
lea jump_table(%rip), %bufp
- movzxw (%bufp, %rax, 2), len
+ movzxq (%bufp, %rax, 2), len
lea crc_array(%rip), %bufp
lea (%bufp, len, 1), %bufp
JMP_NOSPEC bufp
...and I see now:
> clang-10 -Wp,-MD,arch/x86/crypto/.crc32c-pcl-intel-asm_64.o.d -nostdinc -isystem /home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -D__ASSEMBLY__ -fno-PIE -Werror=unknown-warning-option -Wa,-gdwarf-4 -m64 -Wa,-gdwarf-4 -Wa,--compress-debug-sections=zlib -DCC_USING_FENTRY -DMODULE -c -o arch/x86/crypto/crc32c-pcl-intel-asm_64.o arch/x86/crypto/crc32c-pcl-intel-asm_64.S
arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173:2: error: invalid instruction mnemonic 'movzxq'
movzxq (%rdi, %rax, 2), %rsi
^~~~~~
@jcai19
Can you look at this please?
I am here on Linux v5.7.2 and llvm-toolchain v10.0.1-rc+.
I build with -no-integrated-as and looked at the generated object-file.
0000000000000414 <continue_block>:
414: 48 81 fe c8 00 00 00 cmp rsi,0xc8
41b: 0f 82 4d 0f 00 00 jb 136e <small>
421: 48 c7 c0 ab 0a 00 00 mov rax,0xaab
428: f7 e6 mul esi
42a: 48 c1 e8 10 shr rax,0x10
42e: 48 8d 0c c1 lea rcx,[rcx+rax*8]
432: 48 8d 14 c1 lea rdx,[rcx+rax*8]
436: 4c 8d 1c c2 lea r11,[rdx+rax*8]
43a: 4d 31 c9 xor r9,r9
43d: 4d 31 d2 xor r10,r10
440: 48 8d 3d 00 00 00 00 lea rdi,[rip+0x0] # 447 <continue_block+0x33>
447: 48 0f b7 34 47 movzx rsi,WORD PTR [rdi+rax*2]
44c: 48 8d 3d 29 00 00 00 lea rdi,[rip+0x29] # 47c <crc_128>
453: 48 8d 3c 37 lea rdi,[rdi+rsi*1]
457: ff e7 jmp rdi
459: 90 nop
45a: 90 nop
45b: 90 nop
000000000000045c <full_block>:
...
Hmm, when using make OBJDUMP=llvm-objdump:
$ llvm-objdump -d ./arch/x86/crypto/crc32c-intel.o | grep -A50 continue_block
0000000000000414 continue_block:
414: 48 81 fe c8 00 00 00 cmpq $200, %rsi
41b: 0f 82 4d 0f 00 00 jb 3917 <small>
421: 48 c7 c0 ab 0a 00 00 movq $2731, %rax
428: f7 e6 mull %esi
42a: 48 c1 e8 10 shrq $16, %rax
42e: 48 8d 0c c1 leaq (%rcx,%rax,8), %rcx
432: 48 8d 14 c1 leaq (%rcx,%rax,8), %rdx
436: 4c 8d 1c c2 leaq (%rdx,%rax,8), %r11
43a: 4d 31 c9 xorq %r9, %r9
43d: 4d 31 d2 xorq %r10, %r10
440: 48 8d 3d 00 00 00 00 leaq (%rip), %rdi
447: 48 0f b7 34 47 movzwq (%rdi,%rax,2), %rsi
44c: 48 8d 3d 29 00 00 00 leaq 41(%rip), %rdi
453: 48 8d 3c 37 leaq (%rdi,%rsi), %rdi
457: ff e7 jmpq *%rdi
459: 90 nop
45a: 90 nop
45b: 90 nop
000000000000045c full_block:
OK, with make LLVM_IAS=1 OBJDUMP=llvm-objdump I am able to compile with @arndb patch.
$ ll arch/x86/crypto/crc32c-pcl-intel-asm_64.o
-rw-r--r-- 1 dileks dileks 18K Jun 12 16:24 arch/x86/crypto/crc32c-pcl-intel-asm_64.o
Objdump:
$ llvm-objdump -d arch/x86/crypto/crc32c-pcl-intel-asm_64.o
0000000000000044 continue_block:
44: 48 81 fe c8 00 00 00 cmpq $200, %rsi
4b: 0f 82 4d 0f 00 00 jb 3917 <small>
51: 48 c7 c0 ab 0a 00 00 movq $2731, %rax
58: f7 e6 mull %esi
5a: 48 c1 e8 10 shrq $16, %rax
5e: 48 8d 0c c1 leaq (%rcx,%rax,8), %rcx
62: 48 8d 14 c1 leaq (%rcx,%rax,8), %rdx
66: 4c 8d 1c c2 leaq (%rdx,%rax,8), %r11
6a: 4d 31 c9 xorq %r9, %r9
6d: 4d 31 d2 xorq %r10, %r10
70: 48 8d 3d 00 00 00 00 leaq (%rip), %rdi
77: 48 0f b7 34 47 movzwq (%rdi,%rax,2), %rsi
7c: 48 8d 3d 29 00 00 00 leaq 41(%rip), %rdi
83: 48 8d 3c 37 leaq (%rdi,%rsi), %rdi
87: ff e7 jmpq *%rdi
89: 90 nop
8a: 90 nop
8b: 90 nop
000000000000008c full_block:
Hmm, I have respinned my LLVM_IAS=1 patches and see this again:
clang-10 -Wp,-MD,arch/x86/crypto/.crc32c-pcl-intel-asm_64.o.d -nostdinc -isystem /home/dileks/src/llvm-toolchain/install/lib/clang/10.0.1/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -no-integrated-as -D__ASSEMBLY__ -fno-PIE -Werror=unknown-warning-option -Wa,-gdwarf-4 -m64 -Wa,-gdwarf-4 -Wa,--compress-debug-sections=zlib -DCC_USING_FENTRY -DMODULE -c -o arch/x86/crypto/crc32c-pcl-intel-asm_64.o arch/x86/crypto/crc32c-pcl-intel-asm_64.S
arch/x86/crypto/crc32c-pcl-intel-asm_64.S: Assembler messages:
arch/x86/crypto/crc32c-pcl-intel-asm_64.S:173: Error: unsupported syntax for `movzx'
make[5]: *** [scripts/Makefile.build:349: arch/x86/crypto/crc32c-pcl-intel-asm_64.o] Error 1
UPDATE: Aaargh forget to throw out -no-integrated-as.
NO, still seeing the Error.
More coffee. Everything OK.
@arndb
What is the status of this patch?
I do not see it in any crypto Git tree or latest Linux-next.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git/log/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
[2] https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git/log/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
[3] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/log/arch/x86/crypto/crc32c-pcl-intel-asm_64.S?h=next-20200703
@dileks go chase it upstream ;) it was sent to our mailing list if, you're subscribed.
@nickdesaulniers @arndb
Reported upstream:
[1] https://marc.info/?l=linux-crypto-vger&m=159437266425164&w=2
[2] https://marc.info/?t=159378704700001&r=1&w=2
@arndb
You want to send a v2 with some updates?
Link: https://github.com/ClangBuiltLinux/linux/issues/1010
Reviewed-by: Nathan Chancellor <[email protected]>
Reported-by: Sedat Dilek <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Credits for (Suggested-by: ?) : Diab Neiroukh <[email protected]>
Patch accepted in <cryptodev-2.6.git>.
@lazerl0rd
Sorry if I have ignored your patch and of course this should have your S-o-b.
@lazerl0rd
Sorry if I have ignored your patch and of course this should have your S-o-b.
No worries, it's just great to see this bug resolved :).
Now in Linus Git.
[1] https://git.kernel.org/linus/44623b2818f4a442726639572f44fd9b6d0ef68c
@dileks thanks for noticing, commenting, and tagging. Don't forget to close out the bug once a FIXED tag is applied!
No I did not forget. I thought when I close you will not notice. But on "closed" you get a message via GitHub, so next time I close myself.
Most helpful comment
@arndb
You want to send a v2 with some updates?