Go: reflect: optimize v.Set(reflect.Zero(v.Type())) to not allocate

Created on 17 Jul 2019  ·  18 Comments  ·  Source: golang/go

The Value API lacks an efficient way to zero out a value. Currently, one can do v.Set(reflect.Zero(v.Type())). However, this is inefficient: reflect.Zero may need to allocate a large object, and reflect.Value.Set uses runtime.typedmemmove under the hood instead of the more efficient runtime.typedmemclr.

I propose adding a reflect.Value.SetZero method as a more efficient way to set a value to its zero value.
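For concreteness, a minimal sketch of the idiom in question next to the proposed method (SetZero is the proposed API, so it is commented out here):

    package main

    import "reflect"

    type big struct{ a, b, c, d [32]byte }

    func main() {
        v := reflect.ValueOf(&big{}).Elem()

        // Today's idiom: reflect.Zero allocates a zeroed copy of big,
        // and Set copies it in with runtime.typedmemmove.
        v.Set(reflect.Zero(v.Type()))

        // Proposed: clear v in place via runtime.typedmemclr, no allocation.
        // v.SetZero()
    }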

Labels: Proposal, Proposal-Accepted

All 18 comments

I would also like to see this added, and I want to give some context on a use case where this can have a big effect:
I recently came across this inefficiency while looking into code for a proto sanitizer for log writing, which wipes out sensitive fields and therefore needs to "unset"/"wipe" some proto fields to their zero value. It did this by reflecting over structs, and creating zero values showed up in profiling as a non-trivial amount of time and memory allocation. A future new proto API might provide a better method for this, but that method might itself need an efficient reflect operation under the hood.
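A minimal sketch of that sanitizer pattern (the names and the selection of fields by name are hypothetical, not the actual implementation):

    package main

    import (
        "fmt"
        "reflect"
    )

    // wipeSensitive zeroes the struct fields whose names are marked sensitive.
    // Hypothetical sketch of the pattern described above, not the real code.
    func wipeSensitive(msg interface{}, sensitive map[string]bool) {
        v := reflect.ValueOf(msg).Elem()
        t := v.Type()
        for i := 0; i < t.NumField(); i++ {
            if sensitive[t.Field(i).Name] {
                f := v.Field(i)
                // The idiom under discussion: reflect.Zero allocates for
                // any field type larger than a single word.
                f.Set(reflect.Zero(f.Type()))
            }
        }
    }

    func main() {
        type record struct{ Name, SSN string }
        r := record{Name: "gopher", SSN: "123-45-6789"}
        wipeSensitive(&r, map[string]bool{"SSN": true})
        fmt.Println(r) // {gopher }
    }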

A future new proto API might provide a better method for this, but that method might itself need an efficient reflect operation under the hood.

Heh. Ironically, I filed this issue while trying to optimize the internal protobuf implementation itself and couldn't figure out an efficient way to do this.

Given v.Set(reflect.Zero(v.Type())), the inefficiency arises when v.Type() describes a value larger than a single word, so that the reflect.Value contains a pointer to a zeroed allocation.

I think we could change the representation of reflect.Value to define that if that pointer is nil, it is interpreted as pointing to an appropriate number of zeroed bytes. Then the idiom in question starts being more efficient without any changes to client code, no new API to learn, and so on.
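To make that concrete, a hypothetical sketch (not the actual CL) of how an accessor would honor the convention, with names loosely mirroring reflect's internals:

    // Hypothetical, not the real reflect code: an accessor that reads
    // through an indirect data pointer treats nil as zeroed memory.
    func (v Value) Int() int64 {
        if v.flag&flagIndir != 0 && v.ptr == nil {
            return 0 // nil data pointer reads as the zero value
        }
        return *(*int64)(v.ptr)
    }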

The change seems like it could be done completely invisibly. The Value containing a nil pointer could not be addressable, but the result of reflect.Zero is not addressable anyway, so that wouldn't be a problem.

Should we think about doing that invisible optimization instead of new API?

I tried a simple version of that optimization and it did save 75% of the time on this benchmark. So the idea does seem worth pursuing. @dsnet Is this a plausible benchmark for the kinds of code you are concerned about?

(It's not a tiny change to the reflect package; diffstat reports 159 insertions, 35 deletions.)

func BenchmarkZero(b *testing.B) {
    // Four pointer fields: larger than a single word, so Zero must allocate.
    t := TypeOf(struct{ a, b, c, d *byte }{})
    v := New(t)
    for i := 0; i < b.N; i++ {
        v.Elem().Set(Zero(t))
    }
}
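The allocation is also easy to observe directly; a minimal check from outside the package (the exact counts depend on which Go version and patches you run it against):

    package main

    import (
        "fmt"
        "reflect"
        "testing"
    )

    func main() {
        t := reflect.TypeOf(struct{ a, b, c, d *byte }{})
        allocs := testing.AllocsPerRun(1000, func() {
            _ = reflect.Zero(t)
        })
        // 1 alloc/op without the optimization; 0 once Zero can return
        // a pointer into shared zeroed memory.
        fmt.Println("allocs per Zero:", allocs)
    }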

Do you have a CL for this experiment that I can take for a spin?

Change https://golang.org/cl/191199 mentions this issue: reflect: treat nil Value ptr as zero value of type

The runtime already has a 1K region of zeros that we can use as the backing store for the result of reflect.Zero.
The change would then be as simple as:

    if ifaceIndir(t) {
        var p unsafe.Pointer
        if t.size <= maxZero {
            p = unsafe.Pointer(&zeroVal[0])
        } else {
            p = unsafe_New(t)
        }
        return Value{t, p, fl | flagIndir}
    }

// Buffer of zeros. We could share this with runtime.zeroVal with linkname tricks, or keep it separate.
const maxZero = 1024
var zeroVal [maxZero]byte

We could also detect this particular buffer in Set and use typedmemclr instead of typedmemmove.
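A sketch of that check in the same spirit (hypothetical placement inside Value.Set, after the usual assignability checks; x is the incoming value):

    if x.flag&flagIndir != 0 && x.ptr == unsafe.Pointer(&zeroVal[0]) {
        // The source is the shared zero buffer: clear the destination
        // in place instead of copying zeros with typedmemmove.
        typedmemclr(v.typ, v.ptr)
        return
    }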

It would only require allocation in the >1K case, which I think is rare.

This is also a big inefficiency in packages like encoding/json. Please let me know when a CL is ready and I'll be happy to help review and test.

If this can be done without new API, all the better :)

Change https://golang.org/cl/192331 mentions this issue: reflect: use zero buffer to back the Value returned by Zero

@dsnet If you want one of these changes to go into 1.14, please do some benchmarking with real code. Thanks.

It seems like this proposal is stalled on deciding whether it is possible to make v.Set(reflect.Zero(v.Type())) as efficient (by recognizing it in the library, not the compiler) as the proposed v.SetZero(), to avoid new API surface.

@dsnet, does either Ian's or Keith's patch (both above) work well enough for you?

The use case I originally needed this for is no longer relevant. Perhaps one of the other :+1:'s could comment?

My CL is simpler but has a performance discontinuity depending on size.
Ian's is more invasive but will handle all sizes.

I briefly mentioned encoding/json last year; in particular, we set reflect values to zero in three places:

$ grep -nE 'Zero.*Type' *.go
decode.go:601:          z := reflect.Zero(v.Type().Elem())
decode.go:708:              mapElem.Set(reflect.Zero(elemType))
decode.go:917:          v.Set(reflect.Zero(v.Type()))

I'm travelling at the moment, so I can't spend the time to test both of the CLs against the current set of decode/unmarshal benchmarks. I can have a look next week, if no one beats me to it. From a quick look, they seem to be, respectively:

  • zeroing remaining array elements (unlikely path)
  • zeroing a map element value before reusing it to decode (hot path when decoding maps; see the sketch after this list)
  • zeroing a value when decoding null into it (depends on the input)
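To make the second point concrete, a simplified standalone sketch of that map-decoding pattern (the real decode.go is more involved; decodeInto here is a hypothetical stand-in for the per-value decode step):

    package main

    import (
        "fmt"
        "reflect"
    )

    // decodeInto is a placeholder for encoding/json's per-value decoding.
    func decodeInto(v reflect.Value) {
        v.Set(reflect.Append(v, reflect.ValueOf(1)))
    }

    func main() {
        m := map[string][]int{}
        mv := reflect.ValueOf(m)
        elemType := mv.Type().Elem()

        // One reusable element value, zeroed before each entry is decoded,
        // mirroring the mapElem pattern in decode.go.
        mapElem := reflect.New(elemType).Elem()
        for _, key := range []string{"a", "b"} {
            mapElem.Set(reflect.Zero(elemType)) // the idiom under discussion
            decodeInto(mapElem)
            mv.SetMapIndex(reflect.ValueOf(key), mapElem)
        }
        fmt.Println(m) // map[a:[1] b:[1]]
    }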

My hunch is that JSON often involves smaller types. If one is decoding huge types, plenty of other pieces in the decoder would start getting slow, so I don't think extra allocations there would matter. So I'd default to Keith's simpler CL, and perhaps archive Ian's in case anyone has a strong need for it in the future.

It sounds like Ian's change is the more complete one and does not have significant performance problems, so we should probably go with that.

Based on the discussion above and the lack of API change, this now seems like a likely accept.

I think it'd be good to fill out Ian's change before choosing it over Keith's, to confirm that there are no meaningful performance side effects from the as-yet-unimplemented runtime changes. That shouldn't change the prognosis for this proposal, just possibly which implementation we should choose.

No change in consensus, so accepted.

(For clarity, there's no API change here, so if the idea had started as just an optimization, it would not have needed to go through the proposal process.)
