Please answer these questions before submitting your issue. Thanks!
go version)?go version devel +9ce6b5c Thu Dec 7 17:38:51 2017 +0000 linux/amd64
a.k.a "go1.10beta1"
No, it can not be reproduced on go1.9 release.
go env)?GOARCH="amd64"
GOBIN="/work/go/go/bin/"
GOCACHE="/home/user/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/work/user/go"
GORACE=""
GOROOT="/work/go/go"
GOTMPDIR=""
GOTOOLDIR="/work/go/go/pkg/tool/linux_amd64"
GCCGO="/usr/bin/gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build975595394=/tmp/go-build -gno-record-gcc-switches"
Ran the following (abstracted/extracted from my production code).
https://play.golang.org/p/jBT07pKiFE2
I expected to see the identical functions, generate identical results, and so the example linked should print nothing.
In particular I expect an int32 value cast to uint32 cast to uint64, to produce an output that is representable in 32b, with the top 32b always 0.
What I see instead is that the function inside the closure produces an extended result; it appears the that something happens to the conversion --> uint64(uint32(int32)). I end up with a result where the top 32b are not zero, which looks broken to me, with reference to the language spec discussion on conversion.
In anycase I'd expect both the closure-generated function (via generator L17) and the function mask (L7) to produce identical results.
In case this is helpful or relevant it appears adding //go:noinline pragma in front of the non-closure generated method, makes the results match, although they match in the broken sense i.e. they both produce a result which has bits set in the upper 32b.
There have been some commits in the 1.10 window to teach the compiler to inline certain closures - see the commits from late 2017 in https://github.com/golang/go/commits?author=huguesb.
Not certain that those are the cause as I haven't checked, but they do seem good candidates. /cc @huguesb @randall77 @mdempsky
@mvdan note that the second playground example, which adds //go:noinline to the non-closure function, breaks the non-closure function i.e. it appears that if we prevent inlining then we get the incorrect result.
For clarity, here is just that example, identical functions with pragma noinline, and without that pragma, producing different results against go1.10beta1 (amd64/linux).
https://play.golang.org/p/p40iIhUIWcP
... I guess I'm saying that closures may have been a red-herring here, but because in general they do not inline, this may be why I was seeing it with a closure, the problem looks more generic.
@mvdan Looking at the play link it seems unlikely that my commits are related: closures can currently only be inlined when they're invoked inside the same function where they are defined, and functions that define a closure cannot be inlined. In other words, the closure part of this example should be compiled identically with or without my changes.
I stand corrected then - when I read closures and inlining, those commits were the first thing that came to mind.
/cc @mdempsky @griesemer @ianlancetaylor @aclements @randall77
Simplified repro case ( https://play.golang.org/p/uDjsdv8MCTQ ):
package main
import "fmt"
var mask1 = func(a, b uint64) uint64 {
op1 := int32(a)
op2 := int32(b)
return uint64(uint32(op1 / op2))
}
func mask2(a, b uint64) uint64 {
op1 := int32(a)
op2 := int32(b)
return uint64(uint32(op1 / op2))
}
func main() {
res1 := mask1(0x1, 0xfffffffeffffffff)
res2 := mask2(0x1, 0xfffffffeffffffff)
if res1 != res2 {
fmt.Printf("got %x, want %x\n", res1, res2)
}
}
Simpler repro: https://play.golang.org/p/W75JlEAUlaT
The issue appears to be in SSA lowering. I see these SSA ops:
v18 (8) = Div32 <int32> v10 v12
v20 (8) = ZeroExt32to64 <uint64> v18
v6 (?) = Addr <*uint64> {~r2} v2
v23 (8) = Store <mem> {uint64} v6 v20 v22
get lowered into:
v19 (8) = DIVL <int32,int32> v7 v8
v18 (8) = Select0 <int32> v19
v23 (8) = MOVQstore <mem> {~r2} v2 v18 v22
It looks like the zero-extend op is getting lost.
/cc @randall77 @cherrymui
As far as I can tell, the normal function is misbehaving. It happens to work when inlined though.
(MOVLQZX x) && zeroUpper32Bits(x,3) -> x
The problem is this rule. In zeroUpper32Bits,
case OpArg, OpSelect0, OpSelect1:
return x.Type.Width == 4
it assumes a Select Op has upper bit zeroed, which seems not true.
In theory, zeroUpper32Bits could look into the argument of a Select. As it is late in release cycle, we would probably just disable this optimization for Select for now.
Introduced from https://go-review.googlesource.com/c/go/+/58090.
cc @TocarIP
Change https://golang.org/cl/85736 mentions this issue: cmd/compile: disable "redundant zeroextensions" optimization for Select on AMD64
Do these conversions get any test coverage in the standard testing suite?
Looks like a problem with DIVL lowering
v22 = DIVL <int32,int32> v23 v9 : <AX,DX>
v24 = Select0 <int32> v22 : AX
Is lowered into:
v22 00012 (13) IDIVL CX
v22 00013 (13) JMP 16
v22 00014 (13) NEGQ AX // Why is this a 64-bit version that sets upper bits instead of zeroing them?
v22 00015 (13) XORL DX, DX
v29 00016 (13) MOVQ AX, "".~r2+16(SP)
I've just checked and generating NEGL instead of NEGQ for DIVL also fixes this issue.
But this can wait for go1.11
Reopening for 1.10 cherry pick.
@mdempsky, no need for re-open. We don't branch Go 1.10 until the rc1. It's not there yet: https://go.googlesource.com/go/+refs
Closing for no 1.10 cherry pick.
Should we open a new issue or queue up a CL for 1.11 to re-enable and fix the optimization?
@aclements Done: #23310.
Most helpful comment
I've just checked and generating NEGL instead of NEGQ for DIVL also fixes this issue.
But this can wait for go1.11