I figured it would be worth having a single issue to track all the known issues on Apple Silicon. I'll try to keep this list updated as things get fixed or people encounter additional issues.
MacOS(:aarch64) as a valid platform (https://github.com/JuliaLang/Pkg.jl/pull/1916)
sysctl hw.optional)
From worker 14: While deleting: i8* %splitgep
From worker 14: An asserting value handle still pointed to this value!
From worker 14: UNREACHABLE executed at /Users/julia/julia/deps/srccache/llvm-10.0.0/lib/IR/Value.cpp:917!
From worker 14:
From worker 14: signal (6): Abort trap: 6
From worker 14: in expression starting at /Users/julia/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/test/diagonal.jl:11
Generating REPL precompile statements... 22/28
ERROR: LoadError: IOError: stream is closed or unusable
worlds test:
worlds (4) | failed at 2020-11-13T00:31:04.270
On worker 4:
BoundsError: attempt to access 3-element BitVector at index [0:3]
numbers test (related to SIGFPE handling):
Worker 6 terminated.
numbers (6) | failed at 2020-11-13T00:31:34.703
ProcessExitedException(6)
complex test:
From worker 2:
From worker 2: signal (11): Segmentation fault: 11
From worker 2: in expression starting at /Users/julia/julia23/test/complex.jl:30
From worker 2: jl_method_error_bare at /Users/julia/julia23/usr/lib/libjulia.1.6.dylib (unknown line)
From worker 2: jl_method_error at /Users/julia/julia23/usr/lib/libjulia.1.6.dylib (unknown line)
From worker 2: jl_apply_generic at /Users/julia/julia23/usr/lib/libjulia.1.6.dylib (unknown line)
From worker 2: do_call at /Users/julia/julia23/usr/lib/libjulia.1.6.dylib (unknown line)
complex test (filed as #38419)
LinearAlgebra/triangular (running for 61 minutes)
LinearAlgebra/addmul (running for 55 minutes)
bitarray (running for 53 minutes)
iterators (running for 52 minutes)
ccall (running for 39 minutes)
loading (running for 39 minutes)
sorting (running for 24 minutes)
compiler/inference (5) | failed at 2020-11-13T01:24:18.980
Test Failed at /Users/julia/julia23/test/compiler/inference.jl:944
Expression: break_21369()
Expected: ErrorException
Thrown: BoundsError
Is the compiler enabling all the available features by default?
In other words, does it pass https://github.com/JuliaLang/julia/blob/a23a4ff08da5b6d95e9a35eee96e3260a452c02b/src/crc32c.c#L328 by default, or do we have to add +crc one way or another ourselves?
Here's what's enabled by default:
#define __ARM64_ARCH_8__ 1
#define __ARM_64BIT_STATE 1
#define __ARM_ACLE 200
#define __ARM_ALIGN_MAX_STACK_PWR 4
#define __ARM_ARCH 8
#define __ARM_ARCH_ISA_A64 1
#define __ARM_ARCH_PROFILE 'A'
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_CRYPTO 1
#define __ARM_FEATURE_DIRECTED_ROUNDING 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_LDREX 0xF
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FP 0xE
#define __ARM_FP16_ARGS 1
#define __ARM_FP16_FORMAT_IEEE 1
#define __ARM_NEON 1
#define __ARM_NEON_FP 0xE
#define __ARM_NEON__ 1
#define __ARM_PCS_AAPCS64 1
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_SIZEOF_WCHAR_T 4
Since I doubt there'll be a mac without crc32, we should just add that to the default feature flags in our Makefile. For everything else we can do runtime detection with sysctl.
I'm surprised that it enables crypto but not crc.... Yeah, I don't think it's worth doing a runtime detection here.
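To illustrate the compile-time check under discussion, here is a minimal sketch (not the code in Julia's crc32c.c) of gating a hardware CRC-32C path on the ACLE macro `__ARM_FEATURE_CRC32`, with a portable bitwise fallback. The function names `crc32c_sw` and `crc32c_any` are hypothetical. The sketch assumes the compiler only defines the macro when something like +crc is passed, which is what the macro dump above suggests for the default target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* ACLE intrinsics such as __crc32cb */
#endif

/* Portable bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_sw(uint32_t crc, const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical wrapper: uses the CRC32C instruction only when the compiler
 * was told the target has it (i.e. __ARM_FEATURE_CRC32 is defined);
 * otherwise falls back to the software loop above. */
static uint32_t crc32c_any(uint32_t crc, const unsigned char *buf, size_t len)
{
#if defined(__ARM_FEATURE_CRC32)
    assert(buf != NULL || len == 0);
    crc = ~crc;
    while (len--)
        crc = __crc32cb(crc, *buf++);   /* one byte per instruction */
    return ~crc;
#else
    return crc32c_sw(crc, buf, len);
#endif
}
```

With the default flags shown above, the `#else` branch is the one that would be compiled. Adding +crc to the default feature flags, as suggested, would switch the build to the hardware path without any runtime check.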
And from https://github.com/JuliaLang/julia/pull/36592#issuecomment-656984903 it doesn't seem to provide all the features that LLVM may use.
The features currently detectable appear to be:
hw.optional.neon_fp16: fullfp16
hw.optional.armv8_1_atomics: lse
hw.optional.armv8_crc32: crc
hw.optional.armv8_2_fhm: fp16fml
__ARM_FEATURE_CRYPTO (compile time): aes, sha2
The ones that should be supported on that CPU (all required by armv8.3-a) are jsconv, complxnum, rcpc, ccpp, rdm. Some of the floating point ones are quite interesting.
It's also interesting that, since fp16fml is reported, the feature set is closer to that of a13 than a12 (that, or the LLVM feature set for a12 is wrong...).
Anyway, this is probably a low priority item...
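A rough sketch of how the runtime-detectable part of that mapping could be queried. This is illustrative only and not Julia's actual detection code: the table contains just the four sysctl keys listed above, the helper names (`llvm_feature_for_sysctl`, `sysctl_flag`, `detected_features`) are hypothetical, and it assumes each `hw.optional.*` key reads back as an `int`. On non-Apple systems the query helper simply reports "unknown".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

struct feature_map { const char *sysctl_name; const char *llvm_feature; };

/* sysctl key -> LLVM feature name, as listed in the comment above. */
static const struct feature_map FEATURE_MAP[] = {
    {"hw.optional.neon_fp16",       "fullfp16"},
    {"hw.optional.armv8_1_atomics", "lse"},
    {"hw.optional.armv8_crc32",     "crc"},
    {"hw.optional.armv8_2_fhm",     "fp16fml"},
};
#define FEATURE_MAP_LEN (sizeof(FEATURE_MAP) / sizeof(FEATURE_MAP[0]))

/* Returns the LLVM feature name for a sysctl key, or NULL if unmapped. */
static const char *llvm_feature_for_sysctl(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < FEATURE_MAP_LEN; i++)
        if (strcmp(FEATURE_MAP[i].sysctl_name, name) == 0)
            return FEATURE_MAP[i].llvm_feature;
    return NULL;
}

/* Returns 1 if the key is set, 0 if it is clear, -1 if it cannot be read
 * (including on platforms without sysctlbyname). */
static int sysctl_flag(const char *name)
{
#if defined(__APPLE__)
    int value = 0;
    size_t size = sizeof(value);
    if (sysctlbyname(name, &value, &size, NULL, 0) != 0)
        return -1;
    return value != 0;
#else
    (void)name;
    return -1;
#endif
}

/* Writes up to cap detected LLVM feature names into out; returns the count. */
static size_t detected_features(const char **out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < FEATURE_MAP_LEN && n < cap; i++)
        if (sysctl_flag(FEATURE_MAP[i].sysctl_name) == 1)
            out[n++] = FEATURE_MAP[i].llvm_feature;
    return n;
}
```

Features with no sysctl key (jsconv, complxnum, rcpc, ccpp, rdm) cannot be found this way and would have to be inferred from the CPU model or assumed.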
Looks like they're just shipping an old LLVM. E.g. if I try to build with jsconv (just to see whether it would run), I get:
fatal error: error in backend: Cannot select: intrinsic %llvm.aarch64.fjcvtzs
Huh, which LLVM version do they have? Over at https://github.com/JuliaLang/julia/blob/a23a4ff08da5b6d95e9a35eee96e3260a452c02b/src/features_aarch64.h#L24 I was assuming as long as the feature is available in AArch64.td it's usable... Is that not the case? (and/or is that a mac only problem?)
Huh, which LLVM version do they have
I don't know. It claims to be LLVM 12, but Apple lies about versions. I'm building upstream clang now to try it out.
It also seems that although the feature was added in https://reviews.llvm.org/D54633, which is in LLVM 8.0, the intrinsic wasn't added until https://reviews.llvm.org/D64495, much later. Does that error mean that it's a recognized intrinsic but just isn't supported by the backend? I guess just writing inline assembly should be good enough for testing.
Fails upstream too.
Works with raw llc and +mattr though, so I'm gonna say it does exist.
... I thought the error you got is a backend one..... (so llc should behave the same as clang = = ....., unless clang emits the wrong IR...)
I manually added the correct mattr to llc. I also managed to get it to work with -mcpu=apple-a12 at the clang level (appears to default to apple-a7). I filed an issue with Apple to get a better error message as well as bumping the default.
Ah, OK. So you didn't set the target when running with clang.
I tried, but mattr=armv8.3-a+jsconv didn't seem to do it.
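For anyone wanting to repeat the jsconv experiment, the following is a hedged sketch of the inline-assembly approach mentioned earlier. It is not the test code used in this thread. The function names are hypothetical, and the hardware path is only compiled when the ACLE macro `__ARM_FEATURE_JCVT` is defined, which is assumed to require something like the -mcpu=apple-a12 setting mentioned above. Otherwise a software version of the same JavaScript-style conversion (truncate, then wrap modulo 2^32; NaN and infinities give 0) is used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Software reference: double -> int32 with JavaScript ToInt32 semantics,
 * done on the IEEE-754 bit pattern so no libm calls are needed. */
static int32_t js_to_int32_soft(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);

    int exponent = (int)((bits >> 52) & 0x7FF) - 1023;
    uint64_t mantissa = (bits & ((UINT64_C(1) << 52) - 1)) | (UINT64_C(1) << 52);
    int negative = (int)(bits >> 63);

    /* |x| < 1 (including zero and subnormals), or NaN/Inf: result is 0. */
    if (exponent < 0 || exponent == 1024)
        return 0;

    uint32_t u;                                   /* |x| truncated, mod 2^32 */
    if (exponent <= 52)
        u = (uint32_t)(mantissa >> (52 - exponent));
    else if (exponent - 52 < 32)
        u = (uint32_t)(mantissa << (exponent - 52));
    else
        u = 0;                                    /* multiple of 2^32 */
    if (negative)
        u = 0u - u;

    /* Reinterpret as two's complement without implementation-defined casts. */
    return (u <= (uint32_t)INT32_MAX) ? (int32_t)u
                                      : (int32_t)(u - 0x80000000u) + INT32_MIN;
}

/* Uses FJCVTZS via inline assembly when the target advertises it. */
static int32_t js_to_int32(double x)
{
#if defined(__ARM_FEATURE_JCVT)
    int32_t result;
    __asm__("fjcvtzs %w0, %d1" : "=r"(result) : "w"(x) : "cc");
    return result;
#else
    return js_to_int32_soft(x);
#endif
}
```

Comparing `js_to_int32` against `js_to_int32_soft` on a few values is one way to check that the instruction was actually selected and behaves as expected.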
From worker 14: While deleting: i8* %splitgep
From worker 14: An asserting value handle still pointed to this value!
From worker 14: UNREACHABLE executed at /Users/julia/julia/deps/srccache/llvm-10.0.0/lib/IR/Value.cpp:917!
Ah, this is where I've seen this issue... It's not Darwin or ARM/AArch64 specific and it's fixed by https://reviews.llvm.org/D84031
Can we get a BB shard going without the Fortran compiler, and see how much of the BB ecosystem can be built?
Just thinking out loud here. The major use of Fortran in the Julia build is to build LAPACK (part of the OpenBLAS build). We could have a Fortran to Julia translator and move LAPACK to Julia. Of course BB has a bunch of other Fortran libraries, and there are a lot of commercial software packages that need Fortran compilers.
We could have a Fortran to Julia translator and move LAPACK to Julia.
If anyone is interested in helping, I'll be happy to add and maintain a Fortran to Julia translator in LFortran. We already have LLVM and C++ backends. It took us quite some time to get to this point, as a lot of infrastructure had to be figured out and implemented, but we now have the foundation of a production C++ implementation of the compiler and are making rapid progress in adding features. As an example of what works already, this Fortran code:
gets correctly translated to this C++ code (and it compiles and runs):
https://gitlab.com/lfortran/lfortran/-/blob/master/tests/reference/cpp-arrays_04-ae9bd17.stdout
The C++ translator itself is implemented here: https://gitlab.com/lfortran/lfortran/-/blob/7384b0ff81eaa2043281e48ae5158d34fcbf26f6/src/lfortran/codegen/asr_to_cpp.cpp. As you can see, it is a simple visitor pattern over the Abstract Semantic Representation (ASR), which contains all the types; everything is figured out and ready for LLVM or C++ translation.
I don't like making predictions about how long it will take us to be able to compile LAPACK, but I am hoping it is in the range of months now.
Assuming we could translate Lapack to C++ (or Julia also) automatically and correctly and quickly in a few months, what would be the workflow?
I can imagine two workflows in the future:
1. You translate once and just maintain the resulting code in C++ (or Julia). We will try to ensure the translator produces nice, readable and maintainable C++ code.
2. You keep Lapack in Fortran, but translate each new version to C++ or Julia. That way when upstream makes some changes, you will get them.
Regarding speed and performance of the translated code, it is currently unclear to me whether there is some obstacle that would prevent it from matching the performance of the original Fortran code. But we will find out, and I would think it should be possible to translate in a way that keeps the performance.
LAPACK will keep moving upstream, so we have to keep running the translator on any new version; perhaps it could even be integrated into BinaryBuilder. Performance shouldn't be a major problem, since 90% of the performance comes from calling the BLAS anyway. The main problem will be testing correctness. Presumably the translated LAPACK tests plus the Julia tests may be sufficient to get started.
@ViralBShah that makes sense. Regarding correctness: my goal is for people to use LFortran as a regular Fortran compiler via LLVM, which will ensure that the parsing -> AST -> ASR -> LLVM is all correct. The ASR -> C++ backend is thus starting from a well tested starting point (ASR) that has been exercised well via the LLVM route, so there will be bugs, but they will be well isolated, and engineering-wise I think this can be delivered and made robust. The ASR -> Julia backend would be similar.
I am very excited about this, and I will keep you updated. As I said, it will take us probably months to get something initially usable, and then it takes time to mature everything, so I don't want to give you false hope that it can fix your immediate problem; but I will work towards this, I think it will become very useful to a lot of people once it matures.
I think for actively developed upstream projects, we'd rather just use lfortran as a straight LLVM compiler. The automatic translation part mostly makes sense where people want to do new development in Julia.
Just learned that there’s some ongoing effort at porting the GCC backend: https://github.com/iains/gcc-darwin-arm64
Yep, we're on top of it (https://github.com/JuliaPackaging/Yggdrasil/pull/1626), thanks!
Should "LLVM9 process ARM64 relocations incorrectly" be marked done, since the linked PR is merged?
I've updated the tracking list with all items I currently know about.
I wonder how well Julia will run on Rosetta 2.
Works ok, but at reduced perf of course.