Dead stores are writes to variables that are never followed by a corresponding read. While dead store elimination is usually a desirable optimization, sometimes it is essential for program correctness that dead store elimination (or the dead code elimination it can imply) does not happen.
The two examples on my mind are:
Secure zeroization of sensitive data (also known as cleansing, wiping, etc.), where the store to memory is itself essential and is legitimately never followed by a read.
Benchmarking, where the problem is the elimination of the code that is meant to be benchmarked.
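As a minimal illustration (not part of the original proposal text; fill and use are hypothetical helpers), the zeroing loop below is a sequence of dead stores that an optimizing compiler is in principle free to remove:

func process() {
	buf := make([]byte, 64)
	fill(buf) // hypothetical: put sensitive data into buf
	use(buf)  // hypothetical: last real read of buf
	// buf is never read after this point, so the zeroing writes below
	// are dead stores and the whole loop may be optimized away.
	for i := range buf {
		buf[i] = 0
	}
}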
In C these problems might be solved with memset_s, volatile-qualified objects, or inline assembly tricks.
In current Go user code, the convoluted efforts to prevent dead store elimination when wiping memory, and to prevent dead code elimination when benchmarking, ultimately rely only on the lack of such optimizations in current Go implementations rather than on any formal guarantee. This proposal aims to provide that guarantee and, at the same time, to simplify user code.
The new built-in formallyRead returns nothing and either:
takes one pointer or an arbitrary number of pointers,
takes one slice or an arbitrary number of slices, or,
if package unsafe is imported, additionally can take an unsafe.Pointer and a uintptr (indicating the base and size of a memory object).
The pointer arguments to formallyRead indicate objects that must be considered as having formally been read at the point of the formallyRead invocation. Slice arguments indicate the same for the slice's underlying array in the range up to len(slice).
The following minimal example is meant to show the effects of formallyRead:
package main

func main() {
	// ...
	var key [32]byte
	// Do cryptographic operations with the key.
	// ...
	for i := range key {
		key[i] = 0
	}
	// Pretend to read the key so the compiler would not optimize out the preceding loop.
	formallyRead(&key) // or formallyRead(key[:])
	// ...
}
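For the unsafe form, which the example above does not cover, a call might look roughly like the following sketch (hypothetical, since the built-in does not exist; the names buf and wipeBuf are made up for illustration):

import "unsafe"

var buf [64]byte

func wipeBuf() {
	for i := range buf {
		buf[i] = 0
	}
	// Hypothetical: mark the len(buf) bytes starting at &buf[0]
	// as formally read, so the zeroing loop above is not eliminated.
	formallyRead(unsafe.Pointer(&buf[0]), uintptr(len(buf)))
}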
The following example shows how formallyRead would allow benchmark code to be cut down, because package-level sink variables would no longer be necessary:
var GlobalF float64

func BenchmarkAcos(b *testing.B) {
	x := 0.0
	for i := 0; i < b.N; i++ {
		x = Acos(.5)
	}
	GlobalF = x
}
turns into:
func BenchmarkAcos(b *testing.B) {
	for i := 0; i < b.N; i++ {
		x := Acos(.5)
		formallyRead(&x)
	}
}
See also:
Regarding benchmarking:
https://groups.google.com/forum/#!topic/golang-dev/eCPwwvqs2Cg
A more ambitious and ambiguous proposal regarding the zeroization use case:
https://github.com/golang/go/issues/21374
Discusses some ugly hacks that are done in the name of zeroization:
My initial thought is that a builtin function zero that zeros out the memory associated with a pointer or slice would be easier to understand than formallyRead.
The benefits to benchmarking seem less interesting to me. Using global variables is easy enough to understand, and it doesn't seem likely to fail any time soon.
But you said in the linked mailing list discussion that:
Note that gccgo will already discard assignments to an unexported variable that is never read, among other optimizations.
Also, from the same discussion it seems that both you and rsc think that using sink variables complicates benchmarking code, yet you say that formallyRead is not easy to understand... Maybe the main problem is that formallyRead is not a good name?
Edit: I was confused about the "unexported variable" part and thus misunderstood what GCC does and does not optimize, sorry.
Maybe the main problem is that formallyRead is not a good name?
Yes, maybe. For the benchmarking case, perhaps we could repurpose runtime.KeepAlive. Perhaps it already works.
I think runtime.KeepAlive should work as currently implemented (for both uses discussed here).
For benchmarking particularly, perhaps there should be a method on testing.B which could delegate to runtime.KeepAlive for now, but have a more evocative name.
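For the benchmarking case, a sketch of the Acos benchmark above rewritten to use runtime.KeepAlive might look like this (assuming KeepAlive does act as such a barrier, which is exactly what the rest of this thread examines; math.Acos stands in for the Acos used earlier):

import (
	"math"
	"runtime"
	"testing"
)

func BenchmarkAcos(b *testing.B) {
	for i := 0; i < b.N; i++ {
		x := math.Acos(.5)
		// Passing x to runtime.KeepAlive requires x to be computed and
		// available here, so the call to Acos is not dead code.
		runtime.KeepAlive(x)
	}
}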
func wipe(key []byte) {
	for i := range key {
		key[i] = 0
	}
	runtime.KeepAlive(key)
}
What's to stop the compiler from optimizing out the entire function? KeepAlive is currently implemented as:
func KeepAlive(x interface{}) {
	// Introduce a use of x that the compiler can't eliminate.
	// This makes sure x is alive on entry. We need x to be alive
	// on entry for "defer runtime.KeepAlive(x)"; see issue 21402.
	if cgoAlwaysFalse {
		println(x)
	}
}
I think the branch is there to confuse it into thinking there's a possibility of the value being needed?
runtime.KeepAlive cannot be optimized away.
Or perhaps more precisely, if we come up with an optimization that causes runtime.KeepAlive to be optimized away, we will change the implementation of runtime.KeepAlive to avoid that optimization.
I think the branch is there to confuse it into thinking there's a possibility of the value being needed?
Correct.
@randall77 A question: Suppose one passes a slice to runtime.KeepAlive (like in awnumar's example above). Does runtime.KeepAlive then protect just the slice structure (the one that has the pointer to the backing array, length and capacity; I think) or does it also protect the slice's backing array?
I am asking because I think println does not need to read the slice's backing array.
@nsajko Both. runtime.KeepAlive keeps its argument live directly (the slice structure). Anything that is kept live will also keep all its referents live, so the backing store is thus kept live also. If the slice backing store contained pointers, anything those pointed to would also be kept live, etc.
@randall77 Sorry for bothering you again, but, in your last message, did you mean "live" just in the GC sense, or also in the "prevents dead code elimination" sense?
I mean in the "live" sense. Dead code elimination isn't observable in the language.
@randall77 So when you said,
I think runtime.KeepAlive should work as currently implemented (for both uses discussed here).
How does this hold?
Sorry, I was talking about the language semantics. Yes, runtime.KeepAlive also makes sure its argument is computed.
Anything passed to runtime.KeepAlive must be computed and available at the runtime.KeepAlive call, as if someone might read it and check its contents, whether that hypothetical someone is the GC or another goroutine.
Based on the discussion above, runtime.KeepAlive already does what the proposed new builtin function would do. So, closing.