Dead stores are writes to variables that are never followed by a corresponding read. While dead store elimination is usually a desirable optimization, sometimes it is essential for program correctness that dead store elimination (or the dead code elimination it can imply) does not happen.
The two examples on my mind are:
Secure zeroization of sensitive data (also known as cleansing, wiping, etc.), where the store to memory is itself essential and is legitimately never followed by a read.
Benchmarking, where the problem is the elimination of the code that is meant to be benchmarked.
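As a minimal illustration (not part of the original proposal text; fill and use are hypothetical helpers), the zeroing loop below is a sequence of dead stores that an optimizing compiler is in principle free to remove:

func process() {
	buf := make([]byte, 64)
	fill(buf) // hypothetical: put sensitive data into buf
	use(buf)  // hypothetical: last real read of buf
	// buf is never read after this point, so the zeroing writes below
	// are dead stores and the whole loop may be optimized away.
	for i := range buf {
		buf[i] = 0
	}
}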
In C these problems might be solved with memset_s, volatile-qualified objects, or inline assembly tricks.
In current Go user code, the convoluted efforts to prevent dead store elimination when wiping memory, and to prevent dead code elimination when benchmarking, ultimately rely only on the lack of such optimizations in current Go implementations rather than on any formal guarantee. This proposal aims to provide that guarantee and, at the same time, to simplify user code.
The new built-in formallyRead returns nothing and either:
takes one pointer or an arbitrary number of pointers,
takes one slice or an arbitrary number of slices, or,
if package unsafe is imported, additionally can take an unsafe.Pointer and a uintptr (indicating the base and size of a memory object).
The pointer arguments to formallyRead indicate objects that must be considered as having formally been read at the point of the formallyRead invocation. Slice arguments indicate the same for the slice's underlying array in the range up to len(slice).
The following minimal example is meant to show the effects of formallyRead:
package main

func main() {
	// ...
	var key [32]byte
	// Do cryptographic operations with the key.
	// ...
	for i := range key {
		key[i] = 0
	}
	// Pretend to read the key so the compiler would not optimize out the preceding loop.
	formallyRead(&key) // or formallyRead(key[:])
	// ...
}
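For the unsafe form, which the example above does not cover, a call might look roughly like the following sketch (hypothetical, since the built-in does not exist; the names buf and wipeBuf are made up for illustration):

import "unsafe"

var buf [64]byte

func wipeBuf() {
	for i := range buf {
		buf[i] = 0
	}
	// Hypothetical: mark the len(buf) bytes starting at &buf[0]
	// as formally read, so the zeroing loop above is not eliminated.
	formallyRead(unsafe.Pointer(&buf[0]), uintptr(len(buf)))
}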
The following example shows how formallyRead would allow benchmark code to be cut down, because package-level sink variables would no longer be necessary:
var GlobalF float64

func BenchmarkAcos(b *testing.B) {
	x := 0.0
	for i := 0; i < b.N; i++ {
		x = Acos(.5)
	}
	GlobalF = x
}
turns into:
func BenchmarkAcos(b *testing.B) {
	for i := 0; i < b.N; i++ {
		x := Acos(.5)
		formallyRead(&x)
	}
}
See also:
Regarding benchmarking:
https://groups.google.com/forum/#!topic/golang-dev/eCPwwvqs2Cg
A more ambitious and ambiguous proposal regarding the zeroization use case:
https://github.com/golang/go/issues/21374
Discusses some ugly hacks that are done in the name of zeroization:
My initial thought is that a builtin function zero that zeros out the memory associated with a pointer or slice would be easier to understand than formallyRead.
The benefits to benchmarking seem less interesting to me. Using global variables is easy enough to understand, and it doesn't seem likely to fail any time soon.
But you said in the linked mailing list discussion that:
Note that gccgo will already discard assignments to an unexported variable that is never read, among other optimizations.
Also, from the same discussion it seems that both you and rsc think that using sink variables complicates benchmarking code, yet you say that formallyRead is not easy to understand... Maybe the main problem is that formallyRead is not a good name?
Edit: I was confused about the "unexported variable" part and thus misunderstood what GCC does and does not optimize, sorry.
Maybe the main problem is that formallyRead is not a good name?
Yes, maybe. For the benchmarking case, perhaps we could repurpose runtime.KeepAlive. Perhaps it already works.
I think runtime.KeepAlive should work as currently implemented (for both uses discussed here).
For benchmarking particularly, perhaps there should be a method on testing.B which could delegate to runtime.KeepAlive for now, but have a more evocative name.
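For the benchmarking case, a sketch of the Acos benchmark above rewritten to use runtime.KeepAlive might look like this (assuming KeepAlive does act as such a barrier, which is exactly what the rest of this thread examines; math.Acos stands in for the Acos used earlier):

import (
	"math"
	"runtime"
	"testing"
)

func BenchmarkAcos(b *testing.B) {
	for i := 0; i < b.N; i++ {
		x := math.Acos(.5)
		// Passing x to runtime.KeepAlive requires x to be computed and
		// available here, so the call to Acos is not dead code.
		runtime.KeepAlive(x)
	}
}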
func wipe(key []byte) {
	for i := range key {
		key[i] = 0
	}
	runtime.KeepAlive(key)
}
What's to stop the compiler from optimizing out the entire function? KeepAlive is currently implemented as:
func KeepAlive(x interface{}) {
	// Introduce a use of x that the compiler can't eliminate.
	// This makes sure x is alive on entry. We need x to be alive
	// on entry for "defer runtime.KeepAlive(x)"; see issue 21402.
	if cgoAlwaysFalse {
		println(x)
	}
}
I think the branch is there to confuse it into thinking there's a possibility of the value being needed?
runtime.KeepAlive cannot be optimized away.
Or perhaps more precisely, if we come up with an optimization that causes runtime.KeepAlive to be optimized away, we will change the implementation of runtime.KeepAlive to avoid that optimization.
I think the branch is there to confuse it into thinking there's a possibility of the value being needed?
Correct.
@randall77 A question: Suppose one passes a slice to runtime.KeepAlive (like in awnumar's example above). Does runtime.KeepAlive then protect just the slice structure (the one that has the pointer to the backing array, length and capacity; I think) or does it also protect the slice's backing array?
I am asking because I think println does not need to read the slice's backing array.
@nsajko Both. runtime.KeepAlive keeps its argument live directly (the slice structure). Anything that is kept live will also keep all its referents live, so the backing store is thus kept live also. If the slice backing store contained pointers, anything those pointed to would also be kept live, etc.
@randall77 Sorry for bothering you again, but, in your last message, did you mean "live" just in the GC sense, or also in the "prevents dead code elimination" sense?
I mean in the "live" sense. Dead code elimination isn't observable in the language.
@randall77 So when you said,
I think runtime.KeepAlive should work as currently implemented (for both uses discussed here).
How does this hold?
Sorry, I was talking about the language semantics. Yes, runtime.KeepAlive also makes sure its argument is computed.
Anything passed to runtime.KeepAlive must be computed and available at the runtime.KeepAlive call, as if someone might read it and check its contents, whether that hypothetical someone is the GC or another goroutine.
Based on the discussion above, runtime.KeepAlive already does what the proposed new builtin function would do. So, closing.