Go: cmd/compile, bytes: bootstrap array causes bytes.Buffer to always be heap-allocated

Created on 2 May 2014 · 23Comments · Source: golang/go

What steps reproduce the problem?

1. Download playground.zip
2. Run go build -gcflags=-m bytes.go
3. Run go build -gcflags=-m bytes2.go

playground/bytes: Resembles a simplified version of Go's bytes.Buffer, with the
exception that NewBuffer returns Buffer{} instead of &Buffer{}.
playground/bytes2: Different implementation that avoids reslicing

What happened?

$ go build -gcflags=-m bytes.go
# playground/bytes
bytes/buffer.go:12: can inline NewBuffer
bytes/buffer.go:57: can inline (*Buffer).Read
bytes/buffer.go:69: can inline (*Buffer).Bytes
bytes/buffer.go:12: leaking param: b to result ~anon1
bytes/buffer.go:16: leaking param: b
bytes/buffer.go:16: leaking param: b
bytes/buffer.go:34: make([]byte, 2 * cap(b.buf) + n) escapes to heap
bytes/buffer.go:16: leaking param: b
bytes/buffer.go:44: leaking param: b
bytes/buffer.go:44: leaking param: b
bytes/buffer.go:52: leaking param: b
bytes/buffer.go:52: (*Buffer).Write p does not escape
bytes/buffer.go:57: (*Buffer).Read b does not escape
bytes/buffer.go:57: (*Buffer).Read p does not escape
bytes/buffer.go:69: (*Buffer).Bytes b does not escape
# command-line-arguments
./bytes.go:9: inlining call to bytes.NewBuffer
./bytes.go:13: inlining call to Read
./bytes.go:18: inlining call to Bytes
./bytes.go:9: moved to heap: myBuf
./bytes.go:11: myBuf escapes to heap
./bytes.go:8: make([]byte, 4) escapes to heap
./bytes.go:14: myBuf escapes to heap
./bytes.go:8: make([]byte, 4) escapes to heap
./bytes.go:11: main []byte literal does not escape
./bytes.go:12: main make([]byte, 2) does not escape
./bytes.go:13: main myBuf does not escape
./bytes.go:14: main []byte literal does not escape
./bytes.go:18: main myBuf does not escape


$ go build -gcflags=-m bytes2.go
# playground/bytes2
bytes2/buffer.go:13: can inline NewBuffer
bytes2/buffer.go:61: can inline (*Buffer).Read
bytes2/buffer.go:73: can inline (*Buffer).Bytes
bytes2/buffer.go:13: leaking param: b to result ~anon1
bytes2/buffer.go:38: make([]byte, size) escapes to heap
bytes2/buffer.go:17: (*Buffer).grow b does not escape
bytes2/buffer.go:47: (*Buffer).Grow b does not escape
bytes2/buffer.go:54: (*Buffer).Write b does not escape
bytes2/buffer.go:54: (*Buffer).Write p does not escape
bytes2/buffer.go:61: (*Buffer).Read b does not escape
bytes2/buffer.go:61: (*Buffer).Read p does not escape
bytes2/buffer.go:73: (*Buffer).Bytes b does not escape
# command-line-arguments
./bytes2.go:9: inlining call to bytes2.NewBuffer
./bytes2.go:13: inlining call to Read
./bytes2.go:18: inlining call to Bytes
./bytes2.go:8: main make([]byte, 4) does not escape
./bytes2.go:11: main myBuf does not escape
./bytes2.go:11: main []byte literal does not escape
./bytes2.go:12: main make([]byte, 2) does not escape
./bytes2.go:13: main myBuf does not escape
./bytes2.go:14: main myBuf does not escape
./bytes2.go:14: main []byte literal does not escape
./bytes2.go:18: main myBuf does not escape
# command-line-arguments


What should have happened instead?

This shouldn't be happening with playground/bytes:

./bytes.go:9: moved to heap: myBuf
./bytes.go:11: myBuf escapes to heap
./bytes.go:8: make([]byte, 4) escapes to heap
./bytes.go:14: myBuf escapes to heap
./bytes.go:8: make([]byte, 4) escapes to heap

A re-slicing shouldn't cause playground/bytes's Buffer to always be heap allocated.

It should be like playground/bytes2, which avoids heap allocation until a resize happens:

bytes2/buffer.go:38: make([]byte, size) escapes to heap

Please provide any additional information below.

If this is working as intended, the implementation of bytes.Buffer should change to
allow it to be completely stack allocated. The playground/bytes2 implementation is
completely stack allocated until the initial buffer needs to be resized. This allows it
to operate on the stack, avoiding a heap allocation if possible.

Attachments:

playground.zip (5470 bytes)

NeedsFix help wanted

Source

lukescott

Most helpful comment

It's hard to solve this issue without any API changes just with improvements to the escape analysis.
Although https://golang.org/cl/133375 does make it possible to implement "proper" bytes.Buffer outside of golang stdlib.

See https://github.com/intel-go/bytebuf.

The only change to buffer.go:

type Buffer struct {
    buf       []byte   // contents are the bytes buf[off : len(buf)]
    off       int      // read at &buf[off], write at &buf[len(buf)]
-   bootstrap [64]byte // memory to hold first slice; helps small buffers avoid allocation.
+   bootstrap *[64]byte // memory to hold first slice; helps small buffers avoid allocation.
    lastRead  readOp   // last read operation, so that Unread* can work correctly.
}

And we get these numbers:

name            old time/op    new time/op    delta
String/empty-8     138ns ±13%      24ns ± 0%   -82.94%  (p=0.000 n=10+8)
String/5-8         186ns ±11%      60ns ± 1%   -67.82%  (p=0.000 n=10+10)
String/64-8        225ns ±10%     108ns ± 6%   -52.26%  (p=0.000 n=10+10)
String/128-8       474ns ±17%     338ns ±13%   -28.57%  (p=0.000 n=10+10)
String/1024-8      889ns ± 0%     740ns ± 1%   -16.78%  (p=0.000 n=9+10)

name            old alloc/op   new alloc/op   delta
String/empty-8      112B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
String/5-8          117B ± 0%        5B ± 0%   -95.73%  (p=0.000 n=10+10)
String/64-8         176B ± 0%       64B ± 0%   -63.64%  (p=0.000 n=10+10)
String/128-8        368B ± 0%      256B ± 0%   -30.43%  (p=0.000 n=10+10)
String/1024-8     2.16kB ± 0%    2.05kB ± 0%    -5.19%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
String/empty-8      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
String/5-8          2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
String/64-8         2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
String/128-8        3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)
String/1024-8       3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)

quasilyte on 4 Sep 2018

👍17 🎉6

All 23 comments

Comment 1:

Another thing to note: If I change "NewBuffer" in playground/bytes2 to return &Buffer{}
then buf (in main()) escapes to the heap even though NewBuffer is inlined:
./bytes2.go:9: inlining call to bytes2.NewBuffer
./bytes2.go:13: inlining call to Read
./bytes2.go:18: inlining call to Bytes
./bytes2.go:8: make([]byte, 4) escapes to heap
./bytes2.go:9: <S> &bytes2.Buffer literal does not escape
./bytes2.go:11: main []byte literal does not escape
./bytes2.go:12: main make([]byte, 2) does not escape
./bytes2.go:14: main []byte literal does not escape
Otherwise if I return Buffer{} directly nothing gets allocated to the heap, unless a
resize happens.

lukescott on 2 May 2014

Comment 2:

_Labels changed: added repo-main, release-go1.4._

ianlancetaylor on 2 May 2014

Comment 3:

_Labels changed: added release-none, removed release-go1.4._

_Status changed to Accepted._

rsc on 15 Sep 2014

With Go 1.8, the provided sample code (the playground/bytes package) no longer escapes:

GOPATH=/tmp/gopath go build -gcflags -m bytes.go
# playground/bytes
/tmp/gopath/src/playground/bytes/buffer.go:12: can inline NewBuffer
/tmp/gopath/src/playground/bytes/buffer.go:57: can inline (*Buffer).Read
/tmp/gopath/src/playground/bytes/buffer.go:69: can inline (*Buffer).Bytes
/tmp/gopath/src/playground/bytes/buffer.go:12: leaking param: b to result ~r1 level=0
/tmp/gopath/src/playground/bytes/buffer.go:21: (*Buffer).grow ignoring self-assignment to b.buf
/tmp/gopath/src/playground/bytes/buffer.go:40: (*Buffer).grow ignoring self-assignment to b.buf
/tmp/gopath/src/playground/bytes/buffer.go:16: leaking param content: b
/tmp/gopath/src/playground/bytes/buffer.go:34: make([]byte, 2 * cap(b.buf) + n) escapes to heap
/tmp/gopath/src/playground/bytes/buffer.go:49: (*Buffer).Grow ignoring self-assignment to b.buf
/tmp/gopath/src/playground/bytes/buffer.go:44: leaking param content: b
/tmp/gopath/src/playground/bytes/buffer.go:52: leaking param content: b
/tmp/gopath/src/playground/bytes/buffer.go:52: leaking param content: p
/tmp/gopath/src/playground/bytes/buffer.go:57: leaking param content: b
/tmp/gopath/src/playground/bytes/buffer.go:57: (*Buffer).Read p does not escape
/tmp/gopath/src/playground/bytes/buffer.go:69: leaking param: b to result ~r0 level=1
# command-line-arguments
./bytes.go:9: inlining call to bytes.NewBuffer
./bytes.go:13: inlining call to (*bytes.Buffer).Read
./bytes.go:18: inlining call to (*bytes.Buffer).Bytes
./bytes.go:8: make([]byte, 4) escapes to heap
./bytes.go:11: main myBuf does not escape
./bytes.go:11: main ([]byte)("test") does not escape
./bytes.go:12: main make([]byte, 2) does not escape
./bytes.go:13: main myBuf does not escape
./bytes.go:14: main myBuf does not escape
./bytes.go:14: main ([]byte)("00") does not escape
./bytes.go:16: main string(buf) does not escape
./bytes.go:18: main myBuf does not escape
./bytes.go:18: main string(([]byte)(~r0)) does not escape

However, a bytes.Buffer still seems to always escape if you call its WriteXxx methods, so the root issue persists.

cespare on 11 Mar 2017

It's the use of the bootstrap array that's forcing bytes.Buffer to always escape now, from what I can tell. Here's a very simple repro:

package main

type B struct {
    buf     []byte
    storage [64]byte
}

func (b *B) X() {
    b.buf = b.storage[:10]
}

func main() {
    var b B
    b.X()
}

./test.go:8: can inline (*B).X
./test.go:12: can inline main
./test.go:14: inlining call to (*B).X
./test.go:9: b.storage escapes to heap
./test.go:8: leaking param: b
./test.go:14: b.storage escapes to heap
./test.go:14: b escapes to heap
./test.go:13: moved to heap: b

Is it the case that any self-referencing pointers foil escape analysis? For instance, it also escapes if you do this:

type B struct {
    p *int
    n int
}

func (b *B) X() {
    b.p = &b.n
}

cc @randall77 for escape analysis thoughts

@lukescott I took the liberty of retitling the issue given that the re-slicing thing was fixed but the underlying problem of bytes.Buffer always being heap-allocated was not (you also discussed this in the now-closed #7661).

cespare on 11 Mar 2017

I don't know why this would cause an escape. Probably a tricky corner case? Cycles in the escapes-to graph?
@drchase might know.

randall77 on 14 Mar 2017

I don't "know", but I suspect. Pretty sure that forms a cycle in the graph, which can look like an escape for various reasons.

dr2chase on 14 Mar 2017

Change https://golang.org/cl/86976 mentions this issue: strings: prevent copyCheck from forcing Builder to escape and allocate

gopherbot on 9 Jan 2018

Change https://golang.org/cl/133375 mentions this issue: cmd/compile/internal/gc: handle array slice self-assign in esc.go

gopherbot on 4 Sep 2018

See https://github.com/intel-go/bytebuf.

The only change to buffer.go:

type Buffer struct {
    buf       []byte   // contents are the bytes buf[off : len(buf)]
    off       int      // read at &buf[off], write at &buf[len(buf)]
-   bootstrap [64]byte // memory to hold first slice; helps small buffers avoid allocation.
+   bootstrap *[64]byte // memory to hold first slice; helps small buffers avoid allocation.
    lastRead  readOp   // last read operation, so that Unread* can work correctly.
}

And we get these numbers:

name            old time/op    new time/op    delta
String/empty-8     138ns ±13%      24ns ± 0%   -82.94%  (p=0.000 n=10+8)
String/5-8         186ns ±11%      60ns ± 1%   -67.82%  (p=0.000 n=10+10)
String/64-8        225ns ±10%     108ns ± 6%   -52.26%  (p=0.000 n=10+10)
String/128-8       474ns ±17%     338ns ±13%   -28.57%  (p=0.000 n=10+10)
String/1024-8      889ns ± 0%     740ns ± 1%   -16.78%  (p=0.000 n=9+10)

name            old alloc/op   new alloc/op   delta
String/empty-8      112B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
String/5-8          117B ± 0%        5B ± 0%   -95.73%  (p=0.000 n=10+10)
String/64-8         176B ± 0%       64B ± 0%   -63.64%  (p=0.000 n=10+10)
String/128-8        368B ± 0%      256B ± 0%   -30.43%  (p=0.000 n=10+10)
String/1024-8     2.16kB ± 0%    2.05kB ± 0%    -5.19%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
String/empty-8      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
String/5-8          2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
String/64-8         2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
String/128-8        3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)
String/1024-8       3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)

quasilyte on 4 Sep 2018

👍17 🎉6

@Quasilyte there's something I don't understand. The theory behind the bootstrap array was to avoid an allocation, by having the initial slice point to it. Unfortunately, our escape analysis is unable to prove that the buffer does not escape (would you be able to explain why, what is the limit?), so we always got 1 allocation (the bytes.Buffer instance).

Your change makes the buffer to be a pointer to an array, so that (bytes.Buffer) does not escape, but it forces an allocation for the bootstrap buffer.

So shouldn't it be the same?

rasky on 5 Sep 2018

@rasky trick is that bootstrap array allocated with new may be proven as non-escaping with the current escape analysis. So, it does achieve the initial goal of avoiding heap allocations for local buffers that may benefit from small buffer optimization.

quasilyte on 5 Sep 2018

Without applying CL133375 and with this patch to buffer.go (keeping zero Buffer value usable), I get these numbers in the bytebuf benchmarks:

name                         old time/op    new time/op    delta
String/bytes.Buffer/empty-4    64.1ns ± 0%    10.9ns ± 1%   -83.07%  (p=0.000 n=8+9)
String/bytes.Buffer/5-4        89.3ns ± 2%    66.5ns ± 1%   -25.52%  (p=0.000 n=10+10)
String/bytes.Buffer/64-4        103ns ± 0%      83ns ± 1%   -19.78%  (p=0.000 n=10+10)
String/bytes.Buffer/128-4       235ns ± 1%     175ns ± 2%   -25.57%  (p=0.000 n=9+10)
String/bytes.Buffer/1024-4      591ns ± 1%     547ns ± 2%    -7.56%  (p=0.000 n=8+9)

name                         old alloc/op   new alloc/op   delta
String/bytes.Buffer/empty-4      112B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
String/bytes.Buffer/5-4          117B ± 0%       69B ± 0%   -41.03%  (p=0.000 n=10+10)
String/bytes.Buffer/64-4         176B ± 0%      128B ± 0%   -27.27%  (p=0.000 n=10+10)
String/bytes.Buffer/128-4        368B ± 0%      256B ± 0%   -30.43%  (p=0.000 n=10+10)
String/bytes.Buffer/1024-4     2.16kB ± 0%    2.05kB ± 0%    -5.19%  (p=0.000 n=10+10)

name                         old allocs/op  new allocs/op  delta
String/bytes.Buffer/empty-4      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
String/bytes.Buffer/5-4          2.00 ± 0%      2.00 ± 0%      ~     (all equal)
String/bytes.Buffer/64-4         2.00 ± 0%      2.00 ± 0%      ~     (all equal)
String/bytes.Buffer/128-4        3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)
String/bytes.Buffer/1024-4       3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)

The bytes package benchmarks aren't so rosy, though.

opennota on 6 Sep 2018

Since this seems to be a big improvement for certain scenarios, but requires an API change, maybe this change would be a candidate for Go 2.

reezer on 6 Sep 2018

@reezer, please, have some more patience.

What @opennota described is almost an optimal thing for go1.
I'll be back soon with explanations and, probably, a CL.

quasilyte on 6 Sep 2018

👍2

Change https://golang.org/cl/133715 mentions this issue: bytes: remove bootstrap array from Buffer

gopherbot on 6 Sep 2018

👍1

@opennota what you're doing is essentially a make([]byte), but in a slightly more convoluted way.
We can remove bootstrap field completely and replace it with make.

The main benefit from not having a bootstrap (or having it as a pointer) is that b receiver
stops leaking from Buffer.grow method. Since Buffer.grow can't be inlined, it's important
to not have receiver leaking in order to make stack allocated buffer possible.

It also opens some new optimization possibilities described below.

Results for benchmarks from bytebuf package (comparing only old and new bytes.Buffer):

name            old time/op    new time/op    delta
String/empty-8     132ns ±14%      21ns ± 0%   -84.30%  (p=0.000 n=10+10)
String/5-8         190ns ±12%     130ns ± 5%   -31.62%  (p=0.000 n=10+9)
String/64-8        205ns ±15%     166ns ±13%   -18.79%  (p=0.000 n=10+10)
String/1024-8      831ns ± 1%     780ns ± 0%    -6.09%  (p=0.000 n=8+7)

name            old alloc/op   new alloc/op   delta
String/empty-8      112B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
String/5-8          117B ± 0%       69B ± 0%   -41.03%  (p=0.000 n=10+10)
String/64-8         176B ± 0%      128B ± 0%   -27.27%  (p=0.000 n=10+10)
String/1024-8     2.16kB ± 0%    2.05kB ± 0%    -5.19%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
String/empty-8      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
String/5-8          2.00 ± 0%      2.00 ± 0%      ~     (all equal)
String/64-8         2.00 ± 0%      2.00 ± 0%      ~     (all equal)
String/1024-8       3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)

With array self-assignment removed, we don't have a leaking param in bytes.Buffer.grow.
This makes it possible to do a fancy pre-allocation which will reside on stack for buffers that do not escape:

- var buf bytes.Buffer
+ buf := bytes.NewBuffer(make([]byte, 0, 64))

Note that one can use different capacity, so we can effectively have bootstrap "array" of more than 64 bytes.
And given that we're not leaking anymore, it really does improve things:

name            old time/op    new time/op    delta
String/empty-8     132ns ±14%      26ns ± 0%   -80.47%  (p=0.000 n=10+8)
String/5-8         190ns ±12%      53ns ± 1%   -71.95%  (p=0.000 n=10+10)
String/64-8        205ns ±15%      99ns ± 9%   -51.62%  (p=0.000 n=10+10)
String/1024-8      831ns ± 1%     931ns ±20%      ~     (p=0.500 n=8+10)

name            old alloc/op   new alloc/op   delta
String/empty-8      112B ± 0%        0B       -100.00%  (p=0.000 n=10+10)
String/5-8          117B ± 0%        5B ± 0%   -95.73%  (p=0.000 n=10+10)
String/64-8         176B ± 0%       64B ± 0%   -63.64%  (p=0.000 n=10+10)
String/1024-8     2.16kB ± 0%    2.18kB ± 0%    +0.74%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
String/empty-8      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
String/5-8          2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
String/64-8         2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
String/1024-8       3.00 ± 0%      2.00 ± 0%   -33.33%  (p=0.000 n=10+10)

Note that we do less allocations for workloads that fit the slice capacity.

Key points of the new implementation:

Zero value bytes.Buffer performs better than before
You can have trully stack-allocated buffer, and it's not even limited to 64 bytes
The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly
Empty writes don't cause allocations

There were no big reasons to use bytes.NewBuffer before. Now there are.

bytes.Buffer benchmark results (go test -v -count=10 -bench='ReadString|WriteByte|WriteRune|Buffer' -benchmem bytes):

name                       old time/op    new time/op    delta
ReadString-8                 9.20µs ± 1%    9.22µs ± 1%     ~     (p=0.148 n=10+10)
WriteByte-8                  28.1µs ± 0%    26.2µs ± 0%   -6.78%  (p=0.000 n=10+10)
WriteRune-8                  64.9µs ± 0%    65.0µs ± 0%   +0.16%  (p=0.000 n=10+10)
BufferNotEmptyWriteRead-8     469µs ± 0%     461µs ± 0%   -1.76%  (p=0.000 n=9+10)
BufferFullSmallReads-8        108µs ± 0%     108µs ± 0%   -0.21%  (p=0.000 n=10+10)

name                       old speed      new speed      delta
ReadString-8               3.56GB/s ± 1%  3.55GB/s ± 1%     ~     (p=0.165 n=10+10)
WriteByte-8                 146MB/s ± 0%   156MB/s ± 0%   +7.26%  (p=0.000 n=9+10)
WriteRune-8                 189MB/s ± 0%   189MB/s ± 0%   -0.16%  (p=0.000 n=10+10)

name                       old alloc/op   new alloc/op   delta
ReadString-8                 32.8kB ± 0%    32.8kB ± 0%     ~     (all equal)
WriteByte-8                   0.00B          0.00B          ~     (all equal)
WriteRune-8                   0.00B          0.00B          ~     (all equal)
BufferNotEmptyWriteRead-8    4.72kB ± 0%    4.67kB ± 0%   -1.02%  (p=0.000 n=10+10)
BufferFullSmallReads-8       3.44kB ± 0%    3.33kB ± 0%   -3.26%  (p=0.000 n=10+10)

name                       old allocs/op  new allocs/op  delta
ReadString-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
WriteByte-8                    0.00           0.00          ~     (all equal)
WriteRune-8                    0.00           0.00          ~     (all equal)
BufferNotEmptyWriteRead-8      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
BufferFullSmallReads-8         3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)

quasilyte on 6 Sep 2018

👍8 🎉5

I had retitled this bug to be about bytes.Buffer's bootstrap array, so now that we got rid of that I suppose we can close this bug.

The underlying issue of self-referential structures tricking escape analysis (as demonstrated in https://github.com/golang/go/issues/7921#issuecomment-285855447) remains, however.

cespare on 7 Sep 2018

There's a lot of history on the thread, but skimming briefly I understand the issue boils down to @cespare's example:

type B struct {
    p *int
    n int
}

func (b *B) X() {
    b.p = &b.n
}

This function is analyzed as "leaking param: b", which means calling b.X() causes b to always be heap allocated.

The issue here is that we're assigning &b.n through a (implicit) pointer dereference, so escape analysis pessimistically assumes the pointer might point to the heap.

There's an optimization esc.go:isSelfAssign that tries to look for things like p.f = p.g to eliminate unnecessary leaks, but it doesn't recognize p.f = &p.g. It probably could though.

(The above explanation applies to both esc.go and escape.go; they use the same approach here and even both use isSelfAssign for this optimization.)

Edit: This is wrong. Simply recognizing p.f = &p.g in isSelfAssign is insufficient, because we need to still record that this assignment makes p self-referential.

mdempsky on 16 Apr 2019

@mdempsky If we eliminate unnecessary leaks in p.f = &p.g, then this code:

package escape

type S struct {
    i  int
    pi *int
}

var sink S

func f(p *S) {
    p.pi = &p.i
    sink = *p
}

func g(p *S) {
    p.pi = &p.i
}

func h() {
    var s S
    g(&s)
    sink = s
}

will only has leaking param content in func f(p *S) { right? Will var s S moves to heap?

cuonglm on 30 Apr 2019

@cuonglm Unfortunately, my earlier suggestion to ignore p.f = &p.g like how we ignore p.f = p.g is wrong.

In the example you pasted, it's important that var s S be heap allocated: after the assignment sink = s, we have sink.pi pointing to s.i. If we naively treated g as a non-escaping function though, s would be stack allocated.

To fix this, we need a way to tag functions with something like "*p becomes self-referential". The current scheme used by esc.go (and inherited by escape.go) doesn't support this though.

Once we remove esc.go, I expect we'll be able to cleanup/simplify the tagging scheme, and there will probably be opportunities for representing more fine-grained semantics like this.

mdempsky on 30 Apr 2019

👍1

I think I may case the same problem:

https://github.com/golang/go/issues/34463#issuecomment-535749969

https://github.com/golang/go/issues/34463#issuecomment-536232425

The memory is leaking