Discussion at https://groups.google.com/d/topic/golang-nuts/CN_rTywuD2U/discussion.
On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a global variable or in an allocated struct or slice can be relied upon to be 64-bit aligned.
It would be highly desirable if go/vet could detect when this rule is violated and warn about such cases. This note is easy to overlook, and non-trivial to satisfy. It's also not obvious how it applies in more complicated scenarios - e.g.:
etc. (see the discussion linked above)
At my company we have been bitten multiple times by non-64bits-aligned pointers used with sync/atomic functions on arm. Thanks to the fact that it panics at runtime, it only ever happened during tests/development. Developer were not aware of the BUG section at the bottom of the sync/atomic godoc page.
Each time it happened to us, the atomically accessed pointer was always a struct field and the struct was never embedded in another one, nor was it in a slice. I think this use case is the most common one, it is relatively rare to see a slice of structs having fields that are atomically accessed, but it might happen.
I'd like to work on a CL adding a vet check to cover for that case, that is also probably the simplest to detect, that is: detecting non-64bits alignment of a 64bit variable, when the variable is a field of a non-embedded struct.
This check would not cover all the cases, but I believe it would cover the most common one. Also it would be better than having no check at all, and could be a basis on which to improve upon.
related: https://github.com/golang/go/issues/599
@arl could you post a code snippet to show how the struct variable (edit: is declared)?
There are some ways to guarantee 64-bit alignments for struct fileds.
@go101 for example you can define your struct such as the atomically accessed field is the first struct field. In that case it's guaranteed it will be 64bits aligned, even on 32bits architectures.
package main
import (
"sync/atomic"
)
type A struct {
b int64 // b is accessed via a sync/atomic function
c int32
d int64 // d should NOT be accessed via a sync/atomic function as it's not 64bits aligned.
}
func main() {
a := A{}
// Atomically increment a.b. This is valid, b is the first field of the struct, so it's
// guaranteed to be 64-bits aligned.
atomic.AddInt64(&a.b, 1)
}
edit: updated the example to a complete one
I briefly spoke with @arl this past week, and adding a simple version of this check sounds fine to me. It could be expanded to add support for the more complex cases later, such as slices of structs.
As far as I can tell, the check would pass the correctness and precision criteria; the issue causes runtime panics, and it should be possible to have zero false positives. Frequency is a bit more subjective, as the check would simply save developer's time. Any piece of software that is properly tested shouldn't have this bug shipped out, as it should be caught by a panic at run time.
Labelling as NeedsDecision, but this gets a thumbs up from me. /cc @alandonovan @robpike @josharian
@go101 just saw your edit.
In our case moving the variable at the top of the struct solved the problem. We didn't have to rely on any other technique like padding because the variable, and the struct, were one of a kind (no slices involved).
The same code using sync/atomic can either be totally valid for, say, amd64, while invalid for arm32 (triggering a runtime panic). That means this vet check should somewhat be aware of the target architecture, otherwise the majority of users would see false positives.
The only way I know to avoid false positives:
arm32 build tag. That would make the check mostly useless for the majority of cases where users crosscompile the same code for various platforms (one of which being arm32)GOHOST is arm32.Thoughts ? CC @ianlancetaylor @alandonovan
I would be in favor of an opt-in check, enabled with an option or with -all.
We already have an arch-aware vet checks, asmdecl. It works by checking the code in play assuming GOOS/GOARCH. So if you only ever compile for amd64, you鈥檒l never see the vet check fire, but arguably, you also don鈥檛 have a problem. As long as you run your tests at least once for any target architecture, vet will check that target architecture, and catch the problem.
I haven鈥檛 thought much about this particular check, I just wanted to point out that we have a good arch-dependent vet mechanism.
@josharian @ianlancetaylor It looks like using GOOS/GOARCH/GOARM could also be a good solution, with some false positives only in a very specific case (i.e the user is running vet on arm32 but never had the intention to run their code from arm32).
Compared to hiding the vet check behind an experimental flag, this solution has the advantage of covering users that follow vet best practices (running it once per arch)
Notice that, if they鈥檙e compiling on arm32, I think it鈥檚 safe to assume that they would try to run at least once on arm32, if only for development purposes.
I hadn't realised we did have arch-dependent checks in vet; Josh's suggestion sounds great. I agree that if someone runs go vet or go test on a 32-bit architecture, this check should run. We shouldn't have false positives there.
We might have false negatives if a package is compiled for 32-bit, but only tested on 64-bit. I presume said package will have bigger issues than atomic parameter alignment, though :)
I have been working on a new vet check for that issue, the code was initially based off the standard Go tree and has now been rebased off of x/tools/go/analysis because as you know that's where the vet code has been moved.
The new check comes under the form of a new vet flag named atomicalign, it only runs on 32 bits archs and is a no-op everywhere else.
Basically the code does the following:
If vet is ran on one of the target platform (arm32 and x86) it will start to look for function calls of sync/atomic funcs acting on 64bits variables. When argument to atomic functions are struct fields, it then compute its alignment in order to issue vet errors in case the field is not aligned on multiple of 64 bits.
The check currently does not recurse into embedded structs (embedded or not) to know their size, so when that case is encountered, the vet check issues no warning in order to avoid false positives. This could potentially be a useful addition, but I'd like somebody to have a look at my code before adding that.
I added some tests, right now I'm not sure about how to make those tests platform-dependent, that is exclude them on non-targeted platforms?
Change https://golang.org/cl/158277 mentions this issue: go/analysis/passes/atomicalign: add atomicalign ckecker
It's been four months since @arl suggested implementing this. All replies have been in favor so far, so I'll replace NeedsDecision with NeedsFix. If Alan has any reservations, he can always -2 the CL itself :)
Looks like this was reverted in https://github.com/golang/tools/commit/718ddee9569ab27cee4900b77d6f38f1ddbbbf39 and a new CL with tests is required.
@tmm1
The first CL is still merged. The second one, which is an improvement over the first, has been reverted due to falling tests on some platforms.
@arl Could you elaborate on which cases are currently handled and which ones aren't? Should users simply be able to run go vet in a package to be alerted of atomic alignment problems?
Is the alignment check specific to the host architecture, and if so is there a way to check for alignment issues on a different architecture?
I have been working on a new
vetcheck for that issue, the code was initially based off the standard Go tree and has now been rebased off ofx/tools/go/analysisbecause as you know that's where thevetcode has been moved.
How does one build and use vet from x/tools/go/analysis?
The vet included with my go1.12.6 installation doesn't seem to have the atomicalign check:
$ go tool vet help 2>&1 | grep atomicalign
$
@tmm1
Currently only very simple cases are handled, but with no false positives. The objective was to first handle those, then build up on it in successive CLs.
Only cases handled are atomic variables used inside a global struct, however it fails to recognize when atomic functions are called from inside a struct method (when potentially affected struct is a pointer receiver).
I pushed a CL to handle exactly this some time ago, see https://go-review.googlesource.com/c/tools/+/163997 , but tests fail on mips and x86, they pass on arm though, it didn't receive much love though :).
Strange thing is there are cases where alignment is 64-bit on mips and x86, while, at least to my understanding of the bug note, they shouldn't.
That would mean not all platforms are affected in the same way by the bug note. That's what I'm trying to discover to go further but I'd like a feedback from somebody working on the compiler that may know why it is this way.
To try it yourself, you should run atomicalign on one of the affected architectures, as this check is a no-op on non-affected ones.
Most helpful comment
We already have an arch-aware vet checks, asmdecl. It works by checking the code in play assuming GOOS/GOARCH. So if you only ever compile for amd64, you鈥檒l never see the vet check fire, but arguably, you also don鈥檛 have a problem. As long as you run your tests at least once for any target architecture, vet will check that target architecture, and catch the problem.
I haven鈥檛 thought much about this particular check, I just wanted to point out that we have a good arch-dependent vet mechanism.