Zfs: SSE register return with SSE disabled

Created on 20 Aug 2020  Â·  21Comments  Â·  Source: openzfs/zfs

System information


Type | Version/Name
--- | ---
Distribution Name | Gentoo
Distribution Version |
Linux Kernel | 5.8.2 (compiling on 5.8.0-gentoo-r1)
Architecture | x86_64

Describe the problem you're observing

CC [M] /var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/module/zstd/lib/zstd.o
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/immintrin.h:99,
from /var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/module/zstd/lib/zstd.c:1336:
/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/shaintrin.h: In function '_mm_sha1msg1_epu32':
/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/include/shaintrin.h:40:1: error: SSE register return with SSE disabled
40 | {

Describe how to reproduce the problem

emerge zfs-kmod

Building

All 21 comments

this is likely user error.
User has CFLAGS configured as -march=znver1 -O3 -fgraphite-identity -floop-nest-optimize -fdevirtualize-at-ltrans -fipa-pta -fno-semantic-interposition -flto=7 -fuse-linker-plugin -pipe
and seems to be passing those to the build system (which is unsupported in gentoo and not a default configuration).

Still waiting for full build log.

Downstream bug here: https://bugs.gentoo.org/738270

ok flags seems not to be leaking to the build and correctly filtered out.
I'm suspicious about systems using those flags globally, who knows what else is "optimized".
Can't reproduce on my systems.

FYI I just ran info this with 2.0.0-rc1 on gentoo as well.
My CFLAGS are: "-march=znver1 -O2 -pipe"

custom-cflags for the KMOD is NOT set

make[1]: Leaving directory '/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/zstd' make -C /usr/src/linux M=pwdCONFIG_ZFS=m modules make[1]: Entering directory '/usr/src/linux-5.4.60-gentoo' CC [M] /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/zstd/lib/zstd.o In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/immintrin.h:103, from /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/zstd/lib/zstd.c:1336: /usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/shaintrin.h: In function ‘_mm_sha1msg1_epu32’: /usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include/shaintrin.h:40:1: error: SSE register return with SSE disabled 40 | { | ^ make[3]: *** [scripts/Makefile.build:266: /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/zstd/lib/zstd.o] Error 1 make[2]: *** [scripts/Makefile.build:500: /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/zstd] Error 2 make[1]: *** [Makefile:1706: /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module] Error 2 make[1]: Leaving directory '/usr/src/linux-5.4.60-gentoo' make: *** [Makefile:48: modules-Linux] Error 2

gcc -Wp,-MD,/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/icp/api/.kcf_ctxops.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/incl ude/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implic it-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -march=znver1 -mno-red-zone -mcmodel=ker nel -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -DCONFIG_AS_ADX=1 -Wno-sign-compare -fno-asynchronous -unwind-tables -fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation -Wno-format-overflow -Wno-address-of-packed-member -O2 -fno-allow-store-data-races -Wframe-larger-than=1024 -fstack-protector-strong -Wno-unused-but-set-variable -Wimplicit-fallthrough -Wno -unused-const-variable -fomit-frame-pointer -fno-var-tracking-assignments -g -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-stringop-truncation -Wno-zero-length-bounds -Wno-array-bounds -Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -fno-strict-overfl ow -fno-merge-all-constants -fmerge-constants -fno-stack-check -fconserve-stack -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -fmacro-prefix-map=./= -Wno-packed-not-aligned -std=gnu99 -Wno-declaration-after-statement -Wmissing-prototypes -Wno-form at-zero-length -include /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/zfs_config.h -I/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/include -I/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/include/os/linux/kernel -I/var/tmp/portage/sys-fs/zfs-k mod-2.0.0_rc1/work/zfs-2.0.0/include/os/linux/spl -I/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/include/os/linux/zfs -I/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/include -D_KERNEL -UDEBUG -DNDEBUG -I/var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zf s-2.0.0/module/icp/include -DMODULE -DKBUILD_BASENAME='"kcf_ctxops"' -DKBUILD_MODNAME='"icp"' -c -o /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/icp/api/kcf_ctxops.o /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/icp/api/kcf_ctxops.c ./tools/objtool/objtool orc generate --module --no-fp --uaccess /var/tmp/portage/sys-fs/zfs-kmod-2.0.0_rc1/work/zfs-2.0.0/module/icp/api/kcf_mac.o

I don't know what the "correct" flags are, but this is pretty much vanilla gentoo installation (~amd64 though).

One diversion that comes to mind is that I am using the "experimental" gentoo patches to get -march=znver1 for kernel as well, if that matters. 0.8.4 builds correctly. Also, CCACHE.

Let me know if you want me to try something...

Same issue here with 2.0.0_rc1, USE="experimental" on gentoo-sources for building kernel with -march=znver2 and global CFLAGS="-O2 -pipe -march=znver2".

While not a developer at all, I dug around and AFAIK the problem is that there's no -msoft-float argument passed to the compiler (it is there while building the kernel).

Makefiles have CFLAGS set correctly (well, at least it inherits -msoft-float if I put it temporarily in make.conf) and it does use it when I run make in subdirs by hand.

However, kbuild seems to use ZFS_MODULE_CFLAGS which don't contain it (I'm not even sure where -march=znver1 that actually ends up in the arguments comes from). I think this is simply a case of gentoo doing something different, and openzfs not being careful enough to explicitly put -msoft-float in there?

P.S. For the love of god, don't try to run it with -msoft-float just because it compiles (yes it does). It might be a terrible idea which will corrupt your memory/registers. Apparently floating point math in kernel is not trivial and this might absolutely be a very bad solution.

So after digging some more, I don't think the offending include (immintrin.h) or any functions within are needed anymore, as usage of bextr instruction was removed from zstd some time ago (and yes, it compiles just fine without it).
This include is coming from upstream.
I'm not sure whether I should create an issue with zstd project for the removal, in case my conclusion is wrong. It would be better if a dev took a look at it.

Also not sure why other distros are not complaining. Maybe gentoo doesn't do some optimization that would just ignore this dead code so it doesn't trigger error? LTO? Is it connected to KBUILD_CFLAGS? No idea really.

@zviratko I believe you're right. It looks like on the Gentoo builds __BMI__ is getting defined so the header is included, it's not defined for the kernel build on other distributions. I'd suggest we remove this since as you point out bextr is no longer used in upstream zstd.

Same issue for me on Ubuntu 20.04 with kernel 5.8.2 and GCC 10.2.

Making all in module
In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:103,
                 from /home/ghost/SRC/openzfs/module/zstd/lib/zstd.c:1336:
/usr/lib/gcc/x86_64-linux-gnu/10/include/shaintrin.h: In function ‘_mm_sha1msg1_epu32’:
/usr/lib/gcc/x86_64-linux-gnu/10/include/shaintrin.h:40:1: error: SSE register return with SSE disabled
   40 | {
      | ^
make[5]: *** [scripts/Makefile.build:281: /home/ghost/SRC/openzfs/module/zstd/lib/zstd.o] Error 1
make[4]: *** [scripts/Makefile.build:497: /home/ghost/SRC/openzfs/module/zstd] Error 2
make[3]: *** [Makefile:1757: /home/ghost/SRC/openzfs/module] Error 2

I have a feeling gcc 10 + znver* causes that. I can't reproduce on gcc9.

I have a feeling gcc 10 + znver* causes that. I can't reproduce on gcc9.

I can reproduce it on 9.3.

I can reproduce it on 9.3.

So can I, same GCC version 9.3.

ok so experimental patchset and znver for kernel triggers it.

what happens is -march=znver1 gets passed to kernel which effectively defines __BMI__ even with -mno-sse -mno-sse2

but passing -mno-bmi should prevent this define.

what's strange build with -march=skylake actually succeeds, yet it also defines __BMI__.

toolchain guys pointed out that part of BMI is amd-specific, that may explain the difference.

In FreeBSD we had to add -U__BMI__ to CFLAGS for zstd.c as a workaround. We actually fail to build when it is defined because kernel modules are built with -nostdinc so the immintrin.h header isn't even found.

@freqlabs thanks for suggestion! opened a PR to address this without modifying zstd source.
patch already pushed to gentoo as well. build failures with 2.0.0-rc1 should be gone.

9999(git master) ebuild remains unpatched.

@gyakovlev thanks - your patch worked fine for me on Ubuntu 20.04 as well.

Works perfectly for me on my Gentoo LTO/GCC10 + zen system, thanks a billion c:

Is it really a sufficient fix? From what I understood gcc can use floating point math and use SSE even with -mno-sse, and I actually found one function that used the intrinsic function from gcc which uses the double data type. Can I suggest someone who actually knows C and what functions from zstd are used in openzfs take a deeper look at it? It would be pretty bad if those registers got mingled with what userland is doing...

Yes, it should be. My understanding is that this https://github.com/facebook/zstd/issues/1183#issuecomment-396694906 is still accurate and zstd does not contain any intrinsic functions. The kernel compiler options should also prevent any auto-vectorization because as you correctly noted some special handling is required to safely use these instructions at all in the kernel.

I would guess that undefining __BMI__ doesn't stop the compiler from generating BMI instructions, or disable BMI builtins like __builtin_clz(), but I don't know for sure. If it does that would destroy zstd performance, so it is probably worth testing.

Additionally, I have a PR up (https://github.com/facebook/zstd/pull/2289) to prepare upstream zstd for use as-is in the Linux Kernel. As part of that change I am adding a macro ZSTD_NO_INTRINSICS that when defined avoids all explicit intrinsics, including the #include <immintrin.h>.

Since that code is currently unused, I would also accept a PR that just removes that include.

Was this page helpful?
0 / 5 - 0 ratings