Julia: Function returning incorrect values on Julia 0.6.3, Windows 32-bit

Created on 3 Jun 2018  Â·  16Comments  Â·  Source: JuliaLang/julia

The following function extracted from ErrorfreeArithmetic.jl returns the incorrect result on Julia 0.6.3, Windows 32-bit. It is correct on 0.6.2 and 0.7-alpha.

The very strange thing is that setting a = 1.1; b = 0.1 and executing by hand the instructions in the function one after another gives the correct result in x and y.

function sub(a, b)
  x = a - b
  t = x - a
  y = (a - (x - t)) - (b + t)
  return x, y
end

Running this on Julia 0.6.3, Windows 32-bit gives:

julia> sub(1.1, 0.1)   
(1.0, 0.0)

On all other versions tested:

julia> sub(1.1, 0.1)
(1.0, 8.326672684688674e-17)

Most helpful comment

fldl

Looks like the official binaries might have the wrong JULIA_CPU_TARGET flag (e.g. generic or i686, instead of pentium4) – we should never be generating this instruction.

All 16 comments

Here's @code_llvm sub(1.1, 0.1):

julia> @code_llvm sub(1.1, 0.1)

define void @julia_sub_62624([2 x double]* noalias nocapture sret, double, double) #0 !dbg !5 {
top:
  %3 = fsub double %1, %2
  %4 = fsub double %3, %1
  %5 = fsub double %3, %4
  %6 = fsub double %1, %5
  %7 = fadd double %4, %2
  %8 = fsub double %6, %7
  %9 = getelementptr inbounds [2 x double], [2 x double]* %0, i64 0, i64 0
  store double %3, double* %9, align 8
  %.sroa.2.0..sroa_idx1 = getelementptr inbounds [2 x double], [2 x double]* %0, i64 0, i64 1
  store double %8, double* %.sroa.2.0..sroa_idx1, align 8
  ret void
}

On the failing 32-bit 0.6.3, the i64s at the end of the line 3rd from bottom (beginning %.sroa) are replaced with i32.

And the @code_native is completely different.

Can you compare code of:

julia> f(a, b) = (Core.tuple)(a, b)
f (generic function with 1 method)

julia> @code_llvm f(2.2, 3.3)
... 

I get this (in both cases f(2.2, 3.3) returns (2.2, 3.3)).
Custom build which works:

julia> @code_llvm f(2.2, 3.3)

define void @julia_f_65702([2 x double]* noalias nocapture sret, double, double) #0 !dbg !5 {
top:
  %.sroa.0.0..sroa_idx = getelementptr inbounds [2 x double], [2 x double]* %0, i32 0, i32 0
  store double %1, double* %.sroa.0.0..sroa_idx, align 4
  %.sroa.2.0..sroa_idx1 = getelementptr inbounds [2 x double], [2 x double]* %0, i32 0, i32 1
  store double %2, double* %.sroa.2.0..sroa_idx1, align 4
  ret void
}

julia> @code_native f(2.2, 3.3)
    .text
Filename: REPL[6]
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %eax
    movsd   12(%ebp), %xmm1         # xmm1 = mem[0],zero
    movsd   20(%ebp), %xmm0         # xmm0 = mem[0],zero
Source line: 1
    movsd   %xmm1, (%eax)
    movsd   %xmm0, 8(%eax)
    popl    %ebp
    retl    $4
    nopl    (%eax)

Official build which fails:

julia> @code_llvm f(2.2, 3.3)

define void @julia_f_65720([2 x double]* noalias nocapture sret, double, double) #0 !dbg !5 {
top:
  %.sroa.0.0..sroa_idx = getelementptr inbounds [2 x double], [2 x double]* %0, i32 0, i32 0
  store double %1, double* %.sroa.0.0..sroa_idx, align 4
  %.sroa.2.0..sroa_idx1 = getelementptr inbounds [2 x double], [2 x double]* %0, i32 0, i32 1
  store double %2, double* %.sroa.2.0..sroa_idx1, align 4
  ret void
}

julia> @code_native f(2.2, 3.3)
    .text
Filename: REPL[6]
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %eax
    fldl    20(%ebp)
    fldl    12(%ebp)
Source line: 1
    fstpl   (%eax)
    fstpl   8(%eax)
    popl    %ebp
    retl    $4
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop
    nop

fldl

Looks like the official binaries might have the wrong JULIA_CPU_TARGET flag (e.g. generic or i686, instead of pentium4) – we should never be generating this instruction.

You're right, that's my bad: I used JULIA_CPU_TARGET=generic for the 32-bit builds. I'm rebuilding the official 0.6.3 binaries for 32-bit Linux and Windows now. Good catch, @vtjnash.

Cool. While you're at it, it would be nice to change the directory name to julia-0.6.3.

Can we automate this process so that this kind of problem doesn't happen in the future?

The only reason this happened is because I had to set JULIA_CPU_TARGET manually; the buildbots automatically set it correctly, but with the 0.7 format which isn't recognized in 0.6. It shouldn't happen again unless we do another 0.6 release, but in that case I know what to avoid. 😉

The directory name thing isn't a property of the buildbots but rather of make install.

New 32-bit binaries for Linux and Windows have been uploaded. Let me know whether this fixes it for you.

The new tarball fixes the problem here.

Is there any reason why we can't detect when JULIA_CPU_TARGET is i[3-6]86 and either throw an error or use pentium4 automatically?

Like I said, this is already set up properly for 0.7-based builds. We could add logic to handle 0.6 as well but it doesn't seem worth the effort given how close we are to 0.7.

No, I meant adding logic to Makefiles to ensure that this mistake cannot happen again under any circumstances. It's too easy to shoot yourself in the foot (not only for official builds), and I don't see any reason not to do it.

(Also, in the future the format of JULIA_CPU_TARGET could change again, and the same problematic situation could happen say when preparing 1.4.3 during the 1.5 development cycle.)

Ah, I see what you mean. Yeah, that could be worthwhile, I think.

We actually do have some logic for this. It failed in this case because:
(a) in the past, we recommending setting MARCH= instead of JULIA_CPU_TARGET (since we the whole build to benefit from the enhanced performance options) – so we are checking the wrong variable now
(b) it doesn't consider that generic would also be bad on this ARCH
https://github.com/JuliaLang/julia/blob/61c01aa22f89b85406058f294bd4ea1009957b4e/Make.inc#L648

Indeed. So I've filed https://github.com/JuliaLang/julia/pull/27439.

Shouldn't we set MARCH on the buildbots rather than JULIA_CPU_TARGET then? Maybe it could slightly increase performance by enabling MMX/SSE instructions in the C/C++/Fortran code.

Was this page helpful?
0 / 5 - 0 ratings