Go: Proposal: add a "merge" built-in function to join several slices.

Created on 18 Feb 2018  路  20Comments  路  Source: golang/go

The problem

Sometime, we need to insert a slice (call it x) into another (call it y) at the index i of y.
If the capacity of y is large enough to accept all elements of x, then things would simple.

func insert1(y, x []T, i int) []T {
    s := y[:len(s)+len(x)]
    copy(s[i+len(x):], s[i:])
    copy(s[i:], x)
    return s
}

However, the capacity of y is not large enough to accept all elements of x,
we must use make to allocate a new slice which is large enough to accept all elements of x and y.

func insert2(y, x []T, i int) []T {
    s := make([]T, 0, len(x)+len(y))
    s = append(s, y[:i]...)
    s = append(s, x...)
    s = append(s, y[i:]...)
    return s
}

The problem here is that the make function will clear all allocated bytes,
which is not essential for this case.

The proposal

So I propose a merge (or join, or concat) built-in function to merge several slices.

func merge(slices ...[][]T) []T

so that we can call

merge(y[:i], elements, y[i:])

which will be more efficient than the insert2 function.

Go2 LanguageChange Proposal

Most helpful comment

The core of this issue is to avoid the make function zeroing a just allocated memory.

You don't need a language change for that. You just need a compiler optimization that infers regions of slices that are unconditionally overwritten, which should be fairly straightforward for the simple case of appending a sequence of slices.

All 20 comments

Append already reuses the array if the bytes fit in the destination.

@as
append is only useful to merge two slices.

@josharian
looks https://github.com/golang/go/issues/18605 is about expanding variadic argument manners.
This proposal is to get a way to merge multiple slices into a new allocated slice without zeroing the new slice when it is allocated.

@dotaheor if you could do append(s, y[:i]..., x..., y[i:]...) then you would get all of this, with no new built-ins. And that's what #18605 is about it.

Ah, yes.

But, with the manner append(s, y[:i]..., x..., y[i:]...), y[:i] and x and y[i:] will be merged into one slice as the variadic parameter of the append function, then the new merged parameter will be merged again with the first parameter of the append function. There will be two allocations. My proposal will only make one allocation.

There鈥檚 no reason I see that it would have to be implemented that way.

Ok, it would be great if that change can satisfy this need.

Or we just do this in library code once we have generics (#15292).

@bradfitz generic is not helpful for this problem.
The core of this issue is to avoid the make function zeroing a just allocated memory.
Surely, it not possible for a make_without_zeroing proposal to get approved.
So I propose the merge function instead, it may be not only useful for the case in my first comment.

So why not a grow function instead, that acts like realloc in C? Then you could do more than merge (which would be grow+copy).

@andlabs grow still needs to zero new elements, which is unnecessary.

But I think that sometimes we really need a grow function (which doesn't zero old elements if a new underlying memory block is allocated), or an assureSliceCap function.
We can't implement a grow in the most efficient way by using make, copy and append.

The core of this issue is to avoid the make function zeroing a just allocated memory.

You don't need a language change for that. You just need a compiler optimization that infers regions of slices that are unconditionally overwritten, which should be fairly straightforward for the simple case of appending a sequence of slices.

It would be great that a compiler optimization can achieve the goal of a merge function.
I think it is possible at least for some scenarios.

This can be done either via #18605, or via generics, or via a compiler optimization. We're not going to accept this as a builtin function directly.

ok, I hope there is a tangible solution in planning to resolve this inefficiency problem,
and to fix the Go slice manipulation completeness problem.

I do have to wonder, and I forget if anyone said this at all in this thread, if the cost of zero-initializing is really significant enough to optimize it away. (This means actual experiments and benchmarks to confirm or deny this.)

I do have to wonder, and I forget if anyone said this at all in this thread, if the cost of zero-initializing is really significant enough to optimize it away.

The current compiler spends about 15% of its execution time zero initializing new allocations.

I doubt that much of that can be optimized away, but the answer is yes: It matters.

Was this page helpful?
0 / 5 - 0 ratings