Julia: LLVM Segfault in ccall with `&`

Created on 30 May 2018  ·  11 Comments  ·  Source: JuliaLang/julia

This script works fine on 0.6, but segfaults on master:

using Base.LinAlg
using Base.LinAlg.BLAS: libblas, BlasInt, @blasfunc

function gemm!(transA::Char, transB::Char, M::Int, N::Int, K::Int, alpha::(Float64), A::Ptr{Float64}, B::Ptr{Float64}, beta::(Float64), C::Ptr{Float64})
    if transA=='N'; lda=M; else; lda=K; end
    if transB=='N'; ldb=K; else; ldb=N; end
    ldc = M;
    ccall((@blasfunc(dgemm_), libblas), Void,
          (Ptr{UInt8}, Ptr{UInt8}, Ptr{BlasInt}, Ptr{BlasInt},
           Ptr{BlasInt}, Ptr{Float64}, Ptr{Float64}, Ptr{BlasInt},
           Ptr{Float64}, Ptr{BlasInt}, Ptr{Float64}, Ptr{Float64},
           Ptr{BlasInt}),
          &transA, &transB, &M, &N, &K,
          &alpha, A, &lda, B, &ldb, &beta, C, &ldc)
end

x = zeros(12, 4)
w = zeros(2, 2, 1, 1)
y = zeros(4, 3, 1, 1)

gemm!('N','N',12,1,4,1.0,pointer(x),pointer(w),0.,pointer(y))
➜  NNlib.jl git:(julia-0.7) ✗ j --depwarn=no test.jl

signal (11): Segmentation fault
in expression starting at /home/mike/projects/flux/NNlib.jl/test.jl:21
_ZNK4llvm5Value16DoPHITranslationEPKNS_10BasicBlockES3_ at /home/mike/projects/julia/usr/bin/../lib/libLLVM-6.0.so (unknown line)
Allocations: 341387 (Pool: 341240; Big: 147); GC: 0
[1]    3837 segmentation fault (core dumped)  j --depwarn=no test.jl

I tried some older builds and the breakage doesn't seem to be recent (e.g. it doesn't look related to the new optimizer).

Labels: bug, deprecation, lowering, optimizer, won't change

All 11 comments

I think you are supposed to use Ref in 0.7; see LinearAlgebra/blas.jl. It probably still shouldn't segfault, though.

Could this be a failure to protect memory from GC being exposed by more aggressive optimization? Having an API that takes raw pointers from Julia code is pretty much asking for trouble. The right way to do this would seem to be accepting arrays and doing the GC rooting and pointer conversion inside the gemm! function; alternatively, use Ref instead in the ccall signature.
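For reference, the two suggestions above (accept arrays rather than raw pointers, and use Ref in the ccall signature) could be combined roughly as follows. This is only a sketch, assuming Julia 0.7 or later, where Void became Cvoid and Base.LinAlg became the LinearAlgebra stdlib; the name gemm_ref! is made up for illustration, and it assumes that LinearAlgebra.BLAS still provides libblas and @blasfunc in the Julia version at hand. Like the original script, it omits the hidden Fortran character-length arguments.

```julia
using LinearAlgebra
using LinearAlgebra.BLAS: libblas, BlasInt, @blasfunc

# Hypothetical rewrite of the gemm! from the report: scalars are declared as
# Ref{T} in the ccall signature (no `&`), and arrays are passed directly so
# that ccall performs the pointer conversion and keeps them rooted itself.
function gemm_ref!(transA::Char, transB::Char, M::Int, N::Int, K::Int,
                   alpha::Float64, A::Array{Float64}, B::Array{Float64},
                   beta::Float64, C::Array{Float64})
    # Leading dimensions, computed the same way as in the original script.
    lda = transA == 'N' ? M : K
    ldb = transB == 'N' ? K : N
    ldc = M
    ccall((@blasfunc(dgemm_), libblas), Cvoid,
          (Ref{UInt8}, Ref{UInt8}, Ref{BlasInt}, Ref{BlasInt},
           Ref{BlasInt}, Ref{Float64}, Ptr{Float64}, Ref{BlasInt},
           Ptr{Float64}, Ref{BlasInt}, Ref{Float64}, Ptr{Float64},
           Ref{BlasInt}),
          transA, transB, M, N, K,
          alpha, A, lda, B, ldb, beta, C, ldc)
    return C
end

# Same shapes as in the report, but passing the arrays themselves.
x = zeros(12, 4)
w = zeros(2, 2, 1, 1)
y = zeros(4, 3, 1, 1)
gemm_ref!('N', 'N', 12, 1, 4, 1.0, x, w, 0.0, y)
```

Because the arrays are handed to ccall as arguments, the caller never holds a raw pointer that could outlive its array, which addresses the GC concern independently of the compile-time crash.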

I experienced the same in Nemo when trying it on 0.7. Using Ref fixed the issue for us.

I don't think it's a GC issue given that the arrays are global, and the issue seems to be coming from some LLVM code at compile time. Switching to refs does seem to fix the issue though.

Probably worth understanding as others will hit this when trying to upgrade, and if there's a miscompile it could cause issues elsewhere.

The following code should not be asking for trouble but still segfaults:

julia> z = BigFloat()
NaN

julia> ccall((:mpfr_set_si, :libmpfr), Int32, (Ref{BigFloat}, Clong, Int32), z, Clong(1), 0)
0

julia> z
1.0

julia> ccall((:mpfr_set_si, :libmpfr), Int32, (Ptr{BigFloat}, Clong, Int32), &z, Clong(1), 0)
┌ Warning: Syntax `&argument` is deprecated. Remove the `&` and use a `Ref` argument type instead.
└ @ nothing none:0
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector<bool>::_M_range_check: __n (which is 4) >= this->size() (which is 3)

signal (6): Aborted
in expression starting at no file:0
raise at /build/glibc-Cl5G7W/glibc-2.23/signal/../sysdeps/unix/sysv/linux/raise.c:54
abort at /build/glibc-Cl5G7W/glibc-2.23/stdlib/abort.c:89
_ZN9__gnu_cxx27__verbose_terminate_handlerEv at /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (unknown line)
unknown function (ip: 0x7fac9b5756b5)
_ZSt9terminatev at /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (unknown line)
__cxa_throw at /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (unknown line)
_ZSt24__throw_out_of_range_fmtPKcz at /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (unknown line)
_M_range_check at /usr/include/c++/5/bits/stl_bvector.h:872
at at /usr/include/c++/5/bits/stl_bvector.h:881
emit_expr at /home/thofmann/julia/julia-dev/src/codegen.cpp:3888
emit_call at /home/thofmann/julia/julia-dev/src/codegen.cpp:3151
emit_expr at /home/thofmann/julia/julia-dev/src/codegen.cpp:3957
emit_ccall at /home/thofmann/julia/julia-dev/src/ccall.cpp:1557
emit_expr at /home/thofmann/julia/julia-dev/src/codegen.cpp:3968
emit_ssaval_assign at /home/thofmann/julia/julia-dev/src/codegen.cpp:3620
emit_stmtpos at /home/thofmann/julia/julia-dev/src/codegen.cpp:3868
emit_function at /home/thofmann/julia/julia-dev/src/codegen.cpp:6478
jl_compile_linfo at /home/thofmann/julia/julia-dev/src/codegen.cpp:1181
jl_compile_method_internal at /home/thofmann/julia/julia-dev/src/gf.c:1788
jl_fptr_trampoline at /home/thofmann/julia/julia-dev/src/gf.c:1822
jl_toplevel_eval_flex at /home/thofmann/julia/julia-dev/src/toplevel.c:850
jl_toplevel_eval at /home/thofmann/julia/julia-dev/src/toplevel.c:869
jl_toplevel_eval_in at /home/thofmann/julia/julia-dev/src/builtins.c:631
eval at ./boot.jl:316
jl_fptr_args at /home/thofmann/julia/julia-dev/src/gf.c:1833
jl_apply_generic at /home/thofmann/julia/julia-dev/src/gf.c:2140
eval_user_input at /home/thofmann/julia/julia-dev/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:85
jl_fptr_args at /home/thofmann/julia/julia-dev/src/gf.c:1833
jl_apply_generic at /home/thofmann/julia/julia-dev/src/gf.c:2140
macro expansion at /home/thofmann/julia/julia-dev/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:116 [inlined]
#28 at ./task.jl:254
jl_fptr_args at /home/thofmann/julia/julia-dev/src/gf.c:1833
jl_fptr_trampoline at /home/thofmann/julia/julia-dev/src/gf.c:1823
jl_apply_generic at /home/thofmann/julia/julia-dev/src/gf.c:2140
jl_apply at /home/thofmann/julia/julia-dev/src/julia.h:1540
start_task at /home/thofmann/julia/julia-dev/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 5372351 (Pool: 5371408; Big: 943); GC: 11
Aborted (core dumped)

This is definitely a problem during compilation. With the definitions above, it's not necessary to call gemm! to get the segfault; it's enough to run:

@code_llvm gemm!('N','N',12,1,4,1.0,pointer(x),pointer(w),0.,pointer(y))

This seems to go away when the deprecation is fixed, so not high priority.

Lowering does not linearize & expressions and the new optimizer probably does not handle & either. Any code that tries to use this deprecated syntax will emit nonsense and crash. I thought we already had an issue for it, but don't see it.

If we're not going to fix it, we should probably just hard break the & syntax lowering now (since with this bug, it's already 99% broken).

100% agree: hard break if it's not straightforward to fix (which it seems like it isn't).

Closing since this is a syntax error on 1.0 and we're unlikely to do a pre-1.0 release that fixes this.
