When building an arm32 allyesconfig build, this error pops up because __ARM_NEON__ is not defined:
arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
^
These flags are very clearly supplied to Clang:
/usr/bin/ccache clang -Wp,-MD,arch/arm/lib/.xor-neon.o.d -nostdinc -isystem /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include -I./arch/arm/include -I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi -I./arch/arm/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -include ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 --target=arm-linux-gnueabi --prefix=/home/nathan/cbl/prebuilt/bin/ --gcc-toolchain=/home/nathan/cbl/prebuilt -no-integrated-as -fno-PIE -fno-dwarf2-cfi-asm -mabi=aapcs-linux -mfpu=vfp -funwind-tables -marm -Wa,-mno-warn-deprecated -D__LINUX_ARM_ARCH__=6 -march=armv6k -mtune=arm1136j-s -msoft-float -Uarm -fno-delete-null-pointer-checks -O2 -Wframe-larger-than=1024 -fstack-protector-strong -Wno-format-invalid-specifier -Wno-gnu -Wno-address-of-packed-member -Wno-tautological-compare -mno-global-merge -Wno-unused-const-variable -fno-inline-functions -pg -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -mfloat-abi=softfp -mfpu=neon -fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp -DKBUILD_BASENAME='"xor_neon"' -DKBUILD_MODNAME='"xor_neon"' -c -o arch/arm/lib/xor-neon.o arch/arm/lib/xor-neon.c
However, it looks like the -march=armv6k is probably the problematic flag, according to lib/Basic/Targets/ARM.cpp in Clang (link):
// This only gets set when Neon instructions are actually available, unlike
// the VFP define, hence the soft float and arch check. This is subtly
// different from gcc, we follow the intent which was that it should be set
// when Neon instructions are actually available.
if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
Builder.defineMacro("__ARM_NEON", "1");
Builder.defineMacro("__ARM_NEON__");
// current AArch32 NEON implementations do not support double-precision
// floating-point even when it is present in VFP.
Builder.defineMacro("__ARM_NEON_FP",
"0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
}
No idea how to reconcile this.
I'm sure this is the same issue:
In file included from lib/raid6/neon1.c:27:
/home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2: error: "NEON support not enabled"
#error "NEON support not enabled"
^
/usr/bin/ccache clang -Wp,-MD,lib/raid6/.neon1.o.d -nostdinc -isystem /home/nathan/cbl/prebuilt/lib/clang/8.0.0/include -I./arch/arm/include -I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi -I./arch/arm/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -incl
ude ./include/linux/compiler_types.h -D__KERNEL__ -mlittle-endian -Qunused-arguments -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 --target=arm-linux-gnueabi --prefix=/home/nathan/cbl/prebuilt/bin/ --gcc-toolchain=
/home/nathan/cbl/prebuilt -no-integrated-as -fno-PIE -fno-dwarf2-cfi-asm -mabi=aapcs-linux -mfpu=vfp -funwind-tables -marm -Wa,-mno-warn-deprecated -D__LINUX_ARM_ARCH__=6 -march=armv6k -mtune=arm1136j-s -msoft-float -Uarm -fno-delete-null-pointer-checks -O2 -Wframe-larger-than=1024 -fstack-protector-strong -Wno-format-invalid-specif
ier -Wno-gnu -Wno-address-of-packed-member -Wno-tautological-compare -mno-global-merge -Wno-unused-const-variable -fno-inline-functions -pg -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -fno-strict-overflow -fno-merge-all-constants -fno-stack-check -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=inco
mpatible-pointer-types -Wno-initializer-overrides -Wno-unused-value -Wno-format -Wno-sign-compare -Wno-format-zero-length -Wno-uninitialized -ffreestanding -mfloat-abi=softfp -mfpu=neon -fsanitize-coverage=trace-pc -fsanitize-coverage=trace-cmp -DKBUILD_BASENAME='"neon1"' -DKBUILD_MODNAME='"raid6_pq"' -c -o lib/raid6/neon1.o li
b/raid6/neon1.c
Which config(s) flip this?
CONFIG_XOR_BLOCKS for the first one and CONFIG_RAID6_PQ for the second (which gets selected by a few various drivers).
CONFIG_KERNEL_MODE_NEON is required, which is on in multi_v7_defconfig.
allyesconfig has these configs enabled:
CONFIG_CPU_32v6=y
CONFIG_CPU_32v6K=y
CONFIG_CPU_32v7=y
They specify which ISA to use
https://github.com/torvalds/linux/blob/master/arch/arm/Makefile#L62
I'm not sure how well-defined it is to set multiple of those. We could remove CONFIG_CPU_32v6 and CONFIG_CPU_32v6K out of the config or change the Makefile so that in case multiple ISAs are specified highest version is set. WDYT?
I'm not sure how well-defined it is to set multiple of those.
I'm not sure that make allyesconfig is selecting all of those, otherwise the command line flags @nathanchance posted in his first comment would have multiple -march= options, yet I only see -march=armv6k. I'll bet that Kconfig already marks them as mutually exclusive.
#error You should compile this file with '-mfloat-abi=softfp -mfpu=neon'
Well that seems very GCC specific, and at odds with Clang:
if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
specifically SoftFloat which I'll guess is set by -mfloat-abi=softfp. So either Clang is wrong (the comment explicitly says it tries to be different from GCC). Note that Clang also expect armv7, which is maybe too conservative? I have a meeting tomorrow morning with some LLVM folks @ ARM who might be able to clarify. cc @kbeyls @smithp35
Side note, all of those cc-options can probably be removed. What year is it?! -march=armv7-a is supported by arm gcc 4.5.4, and the kernel supports minimally gcc 4.6.
I'm sure selecting multiple ISAs should be invalid. I will check the Makefile to see how it selects appropriate -march= option.
From a purely architectural perspective neon (referred to as Advanced SIMD) in the architecture is an optional extension for Arm v7-A and Arm v7-R, and (less optionally) in Arm v8. It is not defined for Arm v6 which is probably why gcc is giving an error message for trying to use Neon with architecture v6k (something like arm1176jzf-s).
SoftFloat will come from -mfloat-abi=soft (use no hardware floating point), this is different to -mfloat-abi=softfp (use hardware floating point, but use the software floating point calling convention).
The only subtle difference I could find with clang is that __ARM_NEON, __ARM_NEON__ and __ARM_NEON_FP are defined for clang with -march=armv7-a even without -mfpu=neon, whereas gcc needs -mfpu=neon.
Neither gcc or clang will enable neon for Arm v6k.
Yes, from what I can find, the following is true: ARMv8 – NEON is mandatory, ARMv7 – NEON is optional (but usually implemented), ARMv6 – NEON is impossible. Thus config parameter should have similar behavior: force NEON=y, if arch is ARMv8, force NEON=n if arch is ARMv6 and provide choice if arch is ARMv7. Selecting multiple arches in one kernel build should not be possible.
Selecting multiple arches in one kernel build should not be possible.
$ make ARCH=arm allyesconfig \
...
$ cat .config | grep CONFIG_CPU_32
CONFIG_CPU_32v6=y
CONFIG_CPU_32v6K=y
CONFIG_CPU_32v7=y
You're right, multiple ISAs should be invalid. allyesconfig does select multiple though. However, in the Makefile CONFIG_CPU_32v6K just happens to override the other two.
ARMv6 – NEON is impossible
If so, a build-time error we see with clang is a reasonable behavior. GCC on the other hand sets __ARM_NEON__ with these flags -march=armv6k -mfpu=neon -mfloat-abi=softfp. See example. Since ARMv6 doesn't support NEON instructions, it looks like gcc can miscompile in this case. Although I'm not sure if gcc somehow handles things further down the pipeline to avoid miscompilation.
Expanding this guard to check for target arch might be a solution.
I'd be curious see what Ard Biesheuvel thinks of this conundrum given he wrote all of the files that Clang is warning about here. I assume it will probably be along the lines of only building these files when JUST building 32_v7 but I wonder if there is another solution.
cc @ardbiesheuvel
The NEON code is only executed on cores where NEON is detected at runtime, so it does make sense to mix ARMv6 and v7 in this way, e.g., for armhf distro kernels that can run on a wide range of hardware.
Just adding -march=armv7-a to any occurrence of -mfpu=neon should do the trick afaict
I wonder how do gcc and clang handle multiple march options. If the later options override all the previous occurrences (which is probably the case), we can add -march=armv7-a to NEON_FLAGS as a workaround.
This diff resolves both errors, thanks Ard! I can draft up commits and send them upstream unless someone else (@vo4?) wants to (if so, add Tested-by: Nathan Chancellor <[email protected]>).
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index ad25fd1872c7..0bff0176db2c 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -39,7 +39,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
$(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
- NEON_FLAGS := -mfloat-abi=softfp -mfpu=neon
+ NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
CFLAGS_xor-neon.o += $(NEON_FLAGS)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif
diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 2f8b61dfd9b0..bfec7c87c61e 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -25,7 +25,7 @@ endif
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -ffreestanding
ifeq ($(ARCH),arm)
-NEON_FLAGS += -mfloat-abi=softfp -mfpu=neon
+NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
endif
CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
ifeq ($(ARCH),arm64)
@nathanchance Please go ahead and upstream. Thanks!
With the above diff the following combination of configs will result in -march=armv7-a being used.
CONFIG_KERNEL_MODE_NEON=y
CONFIG_CPU_32v6=y
# CONFIG_CPU_32v7 is not set
Not very intuitive, but I suppose it's OK since we expect the kernel to handle these arch-specific details at runtime anyway?
Maybe we should only add this flag when building with Clang and not GCC?
No, it is perfectly appropriate to always pass -march=armv7-a and -mfpu=neon at the same time, and I prefer to have a single setting for all compilers.
Patch sent: https://lore.kernel.org/lkml/[email protected]/
KernelCI is now reporting this regression. @nathanchance what's the status of the patch submission?
@nickdesaulniers I will go ahead and resend it tonight to hopefully collect some more review/ack/testing tags then hopefully submit it to the patch system later this week but it seems like core arm32 maintenance has super slowed down so I don't know how quickly it will be queued up.
Patch resent now I found a workaround for the error in #325: https://lore.kernel.org/lkml/[email protected]/
Submitted to the ARM patch system: https://www.armlinux.org.uk/developer/patches/viewpatch.php?id=8833/1
Merged into mainline: https://git.kernel.org/torvalds/c/de9c0d49d85dc563549972edc5589d195cd5e859