Julia: LLVM ERROR: Broken function found, compilation aborted! when precompiling Base.permutedims

Created on 23 Feb 2019  ·  54 Comments  ·  Source: JuliaLang/julia

I get LLVM ERROR: Broken function found, compilation aborted! when including Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}}) for system image compilation. I can reproduce this in Julia 1.0 to 1.2.

First I created compile.bash:

#!/bin/bash

code_print_image_file="print(unsafe_string(Base.JLOptions().image_file))"

JULIA="${JULIA:-julia}"
IMAGE="$("$JULIA" -e "$code_print_image_file")"

OUTPUT_O=sys.a

set -ex

${JULIA} --output-o="$OUTPUT_O" \
    -g1 --startup-file=no --code-coverage=none \
    --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes \
    --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes \
    --cpu-target=native --track-allocation=none --sysimage-native-code=yes \
    --sysimage="$IMAGE" \
    --compiled-modules=yes --optimize=2 \
    ./precompile_wrapper.jl

and ./precompile_wrapper.jl:

atexit_hook_copy = copy(Base.atexit_hooks) # make backup
# clean state so that any package we use can carelessly call atexit
empty!(Base.atexit_hooks)
Base.__init__()
Sys.__init__() #fix https://github.com/JuliaLang/julia/issues/30479
using REPL
Base.REPL_MODULE_REF[] = REPL

Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})

Base._atexit() # run all exit hooks we registered during precompile
empty!(Base.atexit_hooks) # don't serialize the exit hooks we run + added
# atexit_hook_copy should be empty, but who knows what base will do in the future
append!(Base.atexit_hooks, atexit_hook_copy)

Then run:

$ julia --version
julia version 1.1.0

$ ./compile.bash
+ julia --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/takafumi/opt/julia/julia-1.1.0/lib/julia/sys.so --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl
ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

$ julia-1.0 --version
julia version 1.0.3

$ JULIA=julia-1.0 ./compile.bash
+ julia-1.0 --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/takafumi/opt/julia/julia-1.0.3/lib/julia/sys.so --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl
ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

$ ~/repos/watch/julia/usr/bin/julia -E VERSION
v"1.2.0-DEV.339"

$ JULIA=~/repos/watch/julia/usr/bin/julia ./compile.bash
+ /home/takafumi/repos/watch/julia/usr/bin/julia --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/takafumi/repos/watch/julia/usr/lib/julia/sys.so --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl
ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

(precompile_wrapper.jl is taken from run_julia_code.jl generated by PackageCompiler.jl)

All 54 comments

I see a crash in libuv, which makes me think you're trying to print something without libuv being initialized properly.

+ /home/keno/julia/julia --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/keno/julia/usr/lib/julia/sys.so --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl

signal (11): Segmentation fault
in expression starting at none:0
uv_write2 at /home/keno/julia/deps/srccache/libuv-2348256acf5759a544e5ca7935f638d2bc091d60/src/unix/stream.c:1434
jl_uv_write at /home/keno/julia/src/jl_uv.c:442
uv_write_async at ./stream.jl:877
uv_write at ./stream.jl:845
unsafe_write at ./stream.jl:901
macro expansion at ./gcutils.jl:87 [inlined]
write at ./strings/io.jl:177 [inlined]
print at ./strings/io.jl:179 [inlined]
#with_output_color#672 at ./util.jl:370
#with_output_color at ./tuple.jl:0 [inlined]
#printstyled#673 at ./util.jl:398 [inlined]
#printstyled at ./none:0
jl_fptr_trampoline at /home/keno/julia/src/gf.c:1895
jl_apply_generic at /home/keno/julia/src/gf.c:2250
display_error at ./client.jl:104
jl_fptr_trampoline at /home/keno/julia/src/gf.c:1895
jl_apply_generic at /home/keno/julia/src/gf.c:2250
display_error at ./client.jl:124
jl_fptr_trampoline at /home/keno/julia/src/gf.c:1895
jl_apply_generic at /home/keno/julia/src/gf.c:2250
jl_apply at /home/keno/julia/src/julia.h:1594 [inlined]
jl_f__apply at /home/keno/julia/src/builtins.c:563
jl_f__apply_latest at /home/keno/julia/src/builtins.c:601
#invokelatest#1 at ./essentials.jl:761 [inlined]
invokelatest at ./essentials.jl:760 [inlined]
exec_options at ./client.jl:309
_start at ./client.jl:476
jl_fptr_trampoline at /home/keno/julia/src/gf.c:1895
jl_apply_generic at /home/keno/julia/src/gf.c:2250
jl_apply at /home/keno/julia/ui/../src/julia.h:1594 [inlined]
true_main at /home/keno/julia/ui/repl.c:96
main at /home/keno/julia/ui/repl.c:217
__libc_start_main at /build/glibc-OTsEL5/glibc-2.27/csu/../csu/libc-start.c:310
_start at /home/keno/julia/julia (unknown line)
Allocations: 872735 (Pool: 872287; Big: 448); GC: 1
./compile.bash: line 19: 22733 Segmentation fault      ${JULIA} --output-o="$OUTPUT_O" -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage="$IMAGE" --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl

I thought the error you showed me was solved by calling Base.__init__() and Sys.__init__() manually. I can get your error by commenting them out:

diff --git a/precompile_wrapper.jl b/precompile_wrapper.jl
index 9ebfb2e..a8e8554 100644
--- a/precompile_wrapper.jl
+++ b/precompile_wrapper.jl
@@ -1,8 +1,8 @@
 atexit_hook_copy = copy(Base.atexit_hooks) # make backup
 # clean state so that any package we use can carelessly call atexit
 empty!(Base.atexit_hooks)
-Base.__init__()
-Sys.__init__() #fix https://github.com/JuliaLang/julia/issues/30479
+# Base.__init__()
+# Sys.__init__() #fix https://github.com/JuliaLang/julia/issues/30479
 using REPL
 Base.REPL_MODULE_REF[] = REPL

I uploaded the scripts I'm using here: https://gist.github.com/tkf/bb020c3e2d64d049696c7e549f0120ad/ab16ec9f94598a64f9e242f5d8b7ba1a9f7fc94e They are the same as the ones in the first post; I uploaded them in case I made a copy-and-paste mistake. I can reproduce LLVM ERROR: Broken function found, compilation aborted! by running the following commands in an empty directory:

wget https://gist.githubusercontent.com/tkf/bb020c3e2d64d049696c7e549f0120ad/raw/ab16ec9f94598a64f9e242f5d8b7ba1a9f7fc94e/compile.bash
wget https://gist.githubusercontent.com/tkf/bb020c3e2d64d049696c7e549f0120ad/raw/ab16ec9f94598a64f9e242f5d8b7ba1a9f7fc94e/precompile_wrapper.jl
bash compile.bash

Note also that this script can generate sys.a if I comment out Base.precompile:

$ git diff
diff --git a/precompile_wrapper.jl b/precompile_wrapper.jl
index 9ebfb2e..f6789aa 100644
--- a/precompile_wrapper.jl
+++ b/precompile_wrapper.jl
@@ -6,7 +6,7 @@ Sys.__init__() #fix https://github.com/JuliaLang/julia/issues/30479
 using REPL
 Base.REPL_MODULE_REF[] = REPL

-Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})
+# Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})

 Base._atexit() # run all exit hooks we registered during precompile
 empty!(Base.atexit_hooks) # don't serialize the exit hooks we run + added

$ rm -f sys.a

$ ./compile.bash
+ julia --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/takafumi/opt/julia/julia-1.1.0/lib/julia/sys.so --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl

$ file sys.a
sys.a: current ar archive

FYI

$ julia -e 'using InteractiveUtils; versioninfo()'
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

$ julia-1.0 -e 'using InteractiveUtils; versioninfo()'
Julia Version 1.0.3
Commit 099e826241 (2018-12-18 01:34 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

$ ~/repos/watch/julia/usr/bin/julia -e 'using InteractiveUtils; versioninfo()'
Julia Version 1.2.0-DEV.339
Commit 00f257d603 (2019-02-16 01:47 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

Bump. I can still reproduce the bug with current master 1.2.0-DEV.669 (4671132ee1).

It is still reproducible with 1.4.0-DEV.297 (a68237f9c9).

I'm seeing this when using the new PackageCompiler on aarch64 on 1.3.1
https://github.com/JuliaLang/PackageCompiler.jl/issues/295

ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

I can reproduce the error with the first example

$ julia --version
julia version 1.3.1
$ ./compile.bash
+ julia --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/ian/Documents/julia-1.3.1/lib/julia/sys.so --compiled-modules=yes --optimize=2 ./precompile_wrapper.jl
ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

Some further testing.
In the example above, each of these precompile statements fails:

Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 2}, Array{Int64, 1}})
Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})
Base.precompile(Tuple{typeof(Base.permutedims), Array{UInt8, 3}, Array{Int64, 1}})

These don't fail:

Base.precompile(Tuple{typeof(Base.permutedims), Array{Int64, 3}, Array{Int64, 1}})
Base.precompile(Tuple{typeof(Base.permutedims), Array{Float64, 3}, Array{Int64, 1}})
Base.precompile(Tuple{typeof(Base.rand), Int64}) (sanity check)

Stating the obvious to be complete... in the regular REPL the equivalent methods work. i.e. permutedims(rand(Bool,2,2,2), [1,3,2])

And I noticed that the blame on the permutedims functions shows no change in at least 2 years
https://github.com/JuliaLang/julia/blame/2d5741174ce3e6a394010d2e470e4269ca54607f/base/permuteddimsarray.jl

@timholy I wondered if you had any insight given you seem to have driven the permutedims approach, from the blame (I hate that name..)

Does the fact that this happens for Bool and UInt8 arrays, but not Int64 or Float64 arrays point to anything?

On MacOS I see the same as Keno:

signal (11): Segmentation fault: 11
in expression starting at none:0
uv_write2 at /workspace/srcdir/libuv/src/unix/stream.c:1397
...

Maybe some error happens before/during Base.__init__()? Does calling Base.reinit_stdio at the very beginning of the script help?

Note that it is enough to only have a bare

Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})

in the precompile_wrapper.jl.

and Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}}) returns true in REPL

Yeah, it returns true when using PackageCompiler as well. It isn't until the code is being written to the object file that LLVM asserts.

What happens when you use different optimization flags? -O0 vs -O1 vs -O2 vs -O3

With Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 2}, Array{Int64, 1}}) in precompile_wrapper.jl alone:

--optimize=0 - success
--optimize=1 - segfault
--optimize=2 - ptrtoint not supported for non-integral pointers etc.
--optimize=3 - ptrtoint not supported for non-integral pointers etc.
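
A minimal sketch of how the four runs above could be scripted, assuming a wrapper script like precompile_wrapper.jl from the first post. The function name try_opt_levels and the sys-O*.a / opt-*.log file names are made up for illustration, and only a subset of the flags from compile.bash is passed:

```shell
#!/bin/sh
# try_opt_levels JULIA SCRIPT
# Runs SCRIPT once per optimization level and reports whether the object
# file could be produced. JULIA is the path to the julia binary to test.
# (Hypothetical helper; not part of Julia or PackageCompiler.)
try_opt_levels() {
    julia_bin="$1"
    script="$2"
    for level in 0 1 2 3; do
        # Keep the output of each run so a segfault can be told apart
        # from the "Broken function found" abort afterwards.
        if "$julia_bin" --startup-file=no --output-o="sys-O$level.a" \
                --optimize="$level" "$script" >"opt-$level.log" 2>&1; then
            echo "--optimize=$level: success"
        else
            echo "--optimize=$level: failed with exit status $?"
        fi
    done
}

# Example call (paths are assumptions):
#   try_opt_levels julia ./precompile_wrapper.jl
```

The per-level log files are kept on purpose: the exit status alone does not say whether a run died from a segfault or from the IR verifier.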

@tkf Just out of interest, how did you figure out this was caused by Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}}) ?

I'm trying to figure out a way to narrow in on other causes of this, as it seems to not just be limited to permutedims

I just tried in a build of 1.3.1 with LLVM_ASSERTIONS := 1 but didn't get anything more

ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

Now trying with LLVM_ASSERTIONS := 1 & LLVM_DEBUG := 1

@ianshmean Can you do the same thing (running with -O0 vs -O1 vs -O2 vs -O3) in Julia 1.0, 1.1, 1.2, and 1.3 to see if we can figure out when this bug was introduced?

From https://discourse.julialang.org/t/debugging-aot-compile-errors/20829/2 it looks like it started in 1.1

I'll try to cover the versions

Also, you said that this bug does occur for Array{Bool, 3} but does not occur for Array{Int64, 3}, right?

Can you see if it occurs for BitArray{1}, BitArray{2}, BitArray{3}, etc?

Indeed.
I'll test bitarrays. Any other types worth testing to narrow-in?

Bool, UInt8 = bad
Int64, Float64 = good

With --optimize=2

Bad

  • Array{Bool, 2-10}
  • Array{UInt8, 2-10}
  • Array{Int8, 2}

Good

  • Array{Int64, 3}
  • Array{Float64, 3}
  • BitArray{2-3}
  • Array{Int16, 2}
  • Array{Int32, 2}

And I don't see any more from LLVM with LLVM_ASSERTIONS := 1 & LLVM_DEBUG := 1

Given Bool is stored as a UInt8, it seems like an 8-bit problem

how did you figure out this was caused by Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}}) ?

@ianshmean What I did was something stupid-simple. Copy the file with precompile statements generated by PackageCompiler and then manually do "binary search" (keep deleting first or second half of the file).

Ok, I think I'll follow suit. I might even automate it.. my precompile_statements list is very long..
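
A rough sketch of what automating that manual binary search could look like. It assumes one precompile statement per line, a newline-terminated file, and that a single statement is enough to trigger the failure (if the failure needs two statements together, halving will not find it). find_failing_statement and the commented-out reproduces check are hypothetical names, not PackageCompiler functions:

```shell
#!/bin/sh
# find_failing_statement FILE CHECK
# FILE  : precompile statements, one per line; the whole file is known to fail.
# CHECK : a command or shell function that takes a file path and exits 0
#         when that file still reproduces the failure.
# Prints the single statement the search narrows down to.
find_failing_statement() {
    file="$1"
    check="$2"
    work="$(mktemp -d)"
    cp "$file" "$work/cur"
    # $(( ... )) strips the padding some wc implementations add.
    while [ "$(($(wc -l < "$work/cur")))" -gt 1 ]; do
        total="$(($(wc -l < "$work/cur")))"
        half=$((total / 2))
        head -n "$half" "$work/cur" > "$work/first"
        tail -n +"$((half + 1))" "$work/cur" > "$work/second"
        if "$check" "$work/first"; then
            # The first half still fails: keep it.
            mv "$work/first" "$work/cur"
        else
            # Otherwise assume the culprit is in the second half.
            mv "$work/second" "$work/cur"
        fi
    done
    cat "$work/cur"
    rm -rf "$work"
}

# Hypothetical check for this thread's setup. It assumes precompile_wrapper.jl
# has been edited to include("candidate_statements.jl"):
# reproduces() {
#     cp "$1" candidate_statements.jl
#     ./compile.bash 2>&1 | grep -q 'Broken function found'
# }
# find_failing_statement precompile_statements.jl reproduces
```

Each round costs one full compile, so a list of a few thousand statements needs roughly a dozen runs rather than thousands.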

Can you test arrays of UInt16 and UInt32, just to confirm that it is definitely a problem with 8-bit integers, and not a problem with unsigned integers?

@ianshmean If you can identify a commit (probably near the tag for Julia 1.0.3) where the bug doesn't occur, and a commit (probably near the tag for Julia 1.1.0) where the bug does occur, you could run git bisect so we can figure out which commit introduced the bug.

Can you test arrays of UInt16 and UInt32, just to confirm that it is definitely a problem with 8-bit integers, and not a problem with unsigned integers?

Neither fail. Looks like an 8-bit integer issue

Are there any julian knacks for doing a git bisect? Given it involves making each step, I wondered if anyone's automated it before?

The error suggests that this is a problem when trying to apply ptrtoint to non-integer pointers?

It looks like in the past, @Keno and @yuyichao have added some LLVM patches to fix this. For example:

  1. https://github.com/JuliaLang/julia/commit/2136bbf9e1c5179ac93e016aee6bb0a15353825c
  2. https://github.com/JuliaLang/julia/commit/943c8e57753022213b08a6e67faccdad5572d723
  3. https://github.com/JuliaLang/julia/commit/cadbe233a67674d5fa406c930a5aad3a5bcc6765
  4. https://github.com/JuliaLang/julia/commit/df451468a14e0b0f7985f8396a6c15ef5a411422
  5. https://github.com/JuliaLang/julia/commit/b72af507df65c73b16766b8799f7e7ce70cd39bf

Perhaps we need to add some more patches to LLVM?

Hopefully @Keno and @yuyichao can help out, since they seem familiar with this stuff.

Also maybe @vchuravy and @vtjnash can take a look, since I think they are also familiar with the LLVM stuff.

Just to add, this also happens back in 0.7.0 and 1.0.5

+ /home/parallels/Documents/julia-0.7.0/bin/julia --output-o=sys.a -g1 --startup-file=no --code-coverage=none --history-file=yes --inline=yes --math-mode=ieee --handle-signals=yes --startup-file=no --warn-overwrite=no --compile=yes --depwarn=yes --cpu-target=native --track-allocation=none --sysimage-native-code=yes --sysimage=/home/parallels/Documents/julia-0.7.0/lib/julia/sys.so --compiled-modules=yes --optimize=2 -e 'Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})'
ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
LLVM ERROR: Broken function found, compilation aborted!

I just found that the julia nightly is printing the function name. Note the 3rd line here. Oddly I didn't see this when I built master earlier today..

ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
in function japi1_permutedims!_3933
LLVM ERROR: Broken function found, compilation aborted!

I dumped the bad IR, on Julia master, as well as the IR that's normally generated at run time: https://gist.github.com/maleadt/f93ba85a91ba0860e00d883ff4052a8c
Just process with opt to see the IR verifier fail with the same error.

I'm using https://github.com/JuliaLang/PackageCompiler.jl/pull/333 to blacklist anything that causes this issue, and after blacklisting permutedims I'm getting the same thing for copyto!:

ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
ptrtoint not supported for non-integral pointers
inttoptr not supported for non-integral pointers
in function julia_copyto!_18518
LLVM ERROR: Broken function found, compilation aborted!

After @vtjnash suggested in slack that it could be the vectorizer optimizations, @KristofferC suggested I disable those lines and build master, so I commented out
https://github.com/JuliaLang/julia/blob/6d86384eadebef4f8512b662368b8a79c522c6ef/src/jitlayers.cpp#L234
https://github.com/JuliaLang/julia/blob/6d86384eadebef4f8512b662368b8a79c522c6ef/src/jitlayers.cpp#L237

And it worked. The example now passes with --optimize=2

Unfortunately it still hit the julia_copyto!_18518 error though when I ran PackageCompiler with the julia master build with no vectorizer.

@ianshmean You said that this is architecture-dependent, right? Can you post which architectures it works on, and which architectures you get the error on?

The MWE in this thread errors on both aarch64 and amd64 for me. That's not platform specific.

However, the package I'm PackageCompiling only hits this error on aarch64.
Somehow there's a significant difference in the precomp statements being generated for my package between aarch64 and amd64, and the list is too long to visually parse

[ Info: Num precomp statements generated on aarch64: 6406
[ Info: Num precomp statements generated  on amd64: 6096
[ Info: Statements uniquely generated on aarch64 (not on amd64): 1537
[ Info: Statements uniquely generated on amd64 (not on aarch64): 1277

And frustratingly neither set contains a permutedims on an 8-bit array type
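
One way such per-architecture differences could be computed and searched, sketched with standard tools. The helper name unique_statements and the file names in the usage comment are assumptions; the input is taken to be one precompile statement per line:

```shell
#!/bin/sh
# unique_statements A B
# Prints the lines that occur in file A but not in file B.
unique_statements() {
    a_sorted="$(mktemp)"
    b_sorted="$(mktemp)"
    # comm needs both inputs sorted; -u also drops duplicate statements.
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # -2 hides lines only in B, -3 hides lines common to both.
    comm -23 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}

# Example (hypothetical file names): list aarch64-only statements that
# mention an 8-bit element type.
#   unique_statements aarch64_statements.jl amd64_statements.jl \
#       | grep -E 'Array\{(Bool|UInt8|Int8),'
```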

To summarize: my take is that there's a bug in either or both of these passes, since disabling them fixes the permutedims 8-bit example (I haven't built julia with each of them independently disabled, but can do that if it's helpful):

PM->add(createSLPVectorizerPass());         // Vectorize straight-line code
PM->add(createLoopVectorizePass());         // Vectorize loops 

And another bug somewhere else that julia_copyto!_18518 is hitting.

I tried to use export JULIA_LLVM_ARGS=-print-before-all with my PackageCompiler setup, but the output text seemed never ending (over an hour) so I cancelled before it errored.

If anyone has a suggestion, I'm happy to explore

I tried to use export JULIA_LLVM_ARGS=-print-before-all with my PackageCompiler setup, but the output text seemed never ending (over an hour)

Pipe it to a file instead, so that you don't get slowed down by your terminal.

I haven't yet piped the output out..

But, I figured out the 2 out of 6406 precompile statements that were invoking the ptrtoint not supported for non-integral pointers, inttoptr not supported for non-integral pointers errors for my package:

precompile(Tuple{typeof(Base.copyto!), Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}, Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}}) 

precompile(Tuple{typeof(Base.circshift!), Array{UInt8, 2}, Array{UInt8, 2}, Tuple{Int64, Int64}})

Details on the bisector approach I used: https://github.com/JuliaLang/PackageCompiler.jl/issues/295#issuecomment-588551514

Of the three known:

  1. precompile(Tuple{typeof(Base.permutedims), Array{Bool, 3}, Array{Int64, 1}})
  2. precompile(Tuple{typeof(Base.copyto!), Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}, Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}})
  3. precompile(Tuple{typeof(Base.circshift!), Array{UInt8, 2}, Array{UInt8, 2}, Tuple{Int64, Int64}})

Fails on Ubuntu aarch64: [1,2,3]
Fails on MacOS: [1]

Here's a pure LLVM reproducer: https://gist.github.com/Keno/60d900bf197bfda75e2f9f72dec4411f

Reproduce with opt -loop-reduce. I'll look into patching this.

For those wondering how to do that, the steps are basically:

  1. Build LLVM with debug symbols
  2. Record the failing process with rr
  3. Run rr -a -M to find the timestamp when it prints the error
  4. Go there with rr -g
  5. Find the address of the failing instruction (call it I)
  6. watch *I (but put in the actual address for I - don't use a variable)
  7. Keep doing rc until you find the constructor of that value
  8. Set a breakpoint at the start of whatever pass you're in
  9. Use jl_write_bitcode_func to dump the module
  10. Verify that the dumped module fails under the problematic pass using opt
  11. Use llvm-extract to just get the problematic function
  12. Verify it still fails
  13. (Optional) Use llvm-reduce or bugpoint to reduce it even further
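
Steps 10 to 13 might look roughly like this on the command line. This is a sketch under assumptions: the module has already been dumped to a file (dumped.bc is a made-up name), the pass flag uses the legacy opt spelling seen in this thread (-loop-reduce), and the two helper functions are hypothetical wrappers around opt and llvm-extract:

```shell
#!/bin/sh
# Tool names can be overridden, e.g. to point at a debug build of LLVM.
OPT="${OPT:-opt}"
LLVM_EXTRACT="${LLVM_EXTRACT:-llvm-extract}"

# Steps 10 and 12: exit 0 when opt rejects MODULE under PASS_FLAG.
# Note that any opt failure counts, including a missing input file.
fails_under_pass() {
    module="$1"
    pass_flag="$2"
    ! "$OPT" "$pass_flag" "$module" -o /dev/null 2>/dev/null
}

# Step 11: copy a single function out of MODULE into OUTPUT.
extract_function() {
    func="$1"
    module="$2"
    output="$3"
    "$LLVM_EXTRACT" -func="$func" "$module" -o "$output"
}

# Example session (file names are hypothetical):
#   fails_under_pass dumped.bc -loop-reduce && echo "module reproduces"
#   extract_function 'japi1_permutedims!_4259' dumped.bc func.bc
#   fails_under_pass func.bc -loop-reduce && echo "function reproduces"
#   bugpoint func.bc -loop-reduce    # step 13, optional further reduction
```

Checking again after extraction matters because pulling a function out of its module can change what the pass sees, so the reduced file is only useful if it still fails.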

Awesome.

Also, for the precompile statements failing on aarch64, I just got IR dumps (and posted how I did each at the top). Note that each is caused by a differently numbered copyto! function:

Base.precompile(Tuple{typeof(Base.circshift!), Array{UInt8, 2}, Array{UInt8, 2}, Tuple{Int64, Int64}})

https://gist.github.com/ianshmean/18202bed7aa6ecc433f344bbce1d8dd2

Base.precompile(Tuple{typeof(Base.copyto!), Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}, Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}})

https://gist.github.com/ianshmean/79de0928d534eaee5c68edb168bb6f92

This is what comes out of bugpoint, BTW. It's usually easiest to do further debugging on the bugpoint reduced output, because it reduces the amount of code that gets executed, so debugging with rr is faster:

; ModuleID = 'bugpoint-reduced-simplified.bc'
source_filename = "julia"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128-ni:10:11:12:13"
target triple = "x86_64-unknown-linux-gnu"

define hidden void @"japi1_permutedims!_4259"() #0 {
top:
  br label %L42.L46_crit_edge.us

L42.L46_crit_edge.us:                             ; preds = %L82.us.us.loopexit, %top
  %value_phi11.us = phi i64 [ undef, %top ], [ %2, %L82.us.us.loopexit ]
  %0 = sub i64 %value_phi11.us, undef
  %1 = add i64 %0, undef
  %spec.select = select i1 undef, i64 undef, i64 0
  br label %L62.us.us

L82.us.us.loopexit:                               ; preds = %L62.us.us
  %2 = add i64 undef, %value_phi11.us
  br label %L42.L46_crit_edge.us

L62.us.us:                                        ; preds = %L62.us.us, %L42.L46_crit_edge.us
  %value_phi21.us.us = phi i64 [ %6, %L62.us.us ], [ %spec.select, %L42.L46_crit_edge.us ]
  %3 = add i64 %1, %value_phi21.us.us
  %4 = getelementptr inbounds i8, i8 addrspace(13)* undef, i64 %3
  %5 = load i8, i8 addrspace(13)* %4, align 1
  %6 = add i64 undef, %value_phi21.us.us
  br i1 undef, label %L82.us.us.loopexit, label %L62.us.us, !llvm.loop !1
}

attributes #0 = { "thunk" }

!llvm.module.flags = !{!0}

!0 = !{i32 1, !"Debug Info Version", i32 3}
!1 = distinct !{!1, !2}
!2 = !{!"llvm.loop.isvectorized", i32 1}

Candidate patch: https://reviews.llvm.org/D75072
Try it out and let me know if it fixes things.

Backport to LLVM9 is in https://github.com/JuliaLang/julia/pull/34860 for your trial convenience.

It worked! 🎉🎉🎉🎉

I built #34860 with export USE_BINARYBUILDER_LLVM=0, and ran the compile.bash example up top, with all three failing precomp statements on my aarch64 machine and it built successfully!

Thank you so much @Keno !

Notes:

  • It took about 12 hours to build julia and LLVM on this AARCH64 machine..
  • The stdlibs of the build are messed up for me. I can't install any package that has a stdlib in it (so can't test my actual PackageCompiler test case)

Test precomp statements

Base.precompile(Tuple{typeof(Base.permutedims), Array{Bool, 2}, Array{Int64, 1}})
Base.precompile(Tuple{typeof(Base.copyto!), Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}, Array{UInt8, 2}, Base.IteratorsMD.CartesianIndices{2, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}}})
Base.precompile(Tuple{typeof(Base.circshift!), Array{UInt8, 2}, Array{UInt8, 2}, Tuple{Int64, Int64}})

@ianshmean Were you able to fix the stdlib problem?

Yes. Deleting usr and rebuilding julia worked (thankfully I didn't have to rebuild LLVM so it was much faster to build).

I posted a comment over on https://github.com/JuliaLang/julia/pull/34860#issuecomment-591060835
