Go: language: Go 2: allow setting slice cap < len, prohibiting writes

Created on 4 Jun 2018 · 18Comments · Source: golang/go

I'm not sure why there is no way to create a slice's from another, with a capacity lower than it's length.

It has a few uses. For example, I could pass in a copy of my slice set to 0 capacity to benefit from copy-on-write semantics. The length would still appear to have N items, but trying to write to any of them would trigger the slice to re-grow. This would allow for immutable, lightweight slices in general.

I'm not certain if it would break any code, I've seen almost noone calling "cap" in my experience.

FrozenDueToAge Go2 LanguageChange Proposal

Source

dantoye

👎7 😕1

Most helpful comment

It doesn't seem intuitive to me to allow the slice's capacity to be smaller than its length. It especially doesn't seem intuitive that s[i] could panic when i < len(s).

There are other discussions around immutability that seem more promising to me. For example, #22876 and #20443 seem relevant, and there are other proposals and mailing list discussions.

cespare on 5 Jun 2018

👍2

All 18 comments

You can do this already. See "Full slice expressions" in https://golang.org/ref/spec#Slice_expressions.

rogpeppe on 4 Jun 2018

@rogpeppe a bit quick on the trigger there!

"The indices are in range if 0 <= low <= high <= max <= cap(a)"

I'm discussing relaxing the last constraint, and allowing cap to be smaller than the max/length. Then normal semantics kick in, where the slice will grow if you append outside it's length. However, the only new behavior is that writing to an index that is below the cap is meaningless and won't actually do anything, or could even panic. If a function tried to write to index 0, when my cap was 0, it should panic.

It's a method of creating simple, effective readonly slices.

Here's an example where a function will, once in a blue moon, edit one of the underlying array items. I don't want to have to pass in a copy, because it would most likely be wasting memory/cpu to make a copy each time. However, I do want to ensure my slice is readonly. Right now I have no choices. However, it intuitively makes sense that if I create a new slice with 0 capacity, then no one can shove anything new into my slice. Anything people try to do to my slice would be over it's capacity, and so internally would need to grow. The length would still be 10, so they can still read all my items, just can't write to them.

https://play.golang.org/p/xf8F6_-4usD

dantoye on 5 Jun 2018

It doesn't seem intuitive to me to allow the slice's capacity to be smaller than its length. It especially doesn't seem intuitive that s[i] could panic when i < len(s).

There are other discussions around immutability that seem more promising to me. For example, #22876 and #20443 seem relevant, and there are other proposals and mailing list discussions.

cespare on 5 Jun 2018

👍2

Why not, @cespare ?

All the other proposals I've seen include adding new keywords and changing types entirely.

Also, I believe it makes perfect sense for the capacity to be 0, how else would you describe an item that cannot hold anything? Slices with zero capacity but a positive length are simply saying "I have something, but I can't hold anything", which is precisely the definition of readonly.

To me it's a natural extension.

dantoye on 5 Jun 2018

😕2

I agree with @cespare , cap < len is weird.
Bounds checks are now different for a[i] = x and x = a[i] (the former also has an i < cap(a) check).

What does slicing even mean now?
x = a[:i] used to always succeed if i<cap(a), and sets len(a)=i. Now what happens? Can this still set the length if i>cap(a) but i < len(a)?

Plus all the readonly problems, like ranging over a slice of large objects without copying them:

for i := 0; i < len(a); i++ {
    p := &a[i]  //oops! if cap(a)<len(a)
    ... do something with p ...
}

Or needing multiple versions of library functions like bytes.Index, one to search only up to cap(a) because the caller wants to do a modification, and one which searches up to len(a) because the caller is only reading.

randall77 on 5 Jun 2018

@randall77

I appreciate your feedback, but I'm not sure you understand the proposal.

Slicing would be the same as it is now - the cap would work as it does now, where the new slice will have zero capacity and len(a)=1, as you put it.

Your loop is a good example, as getting a reference to a specific element is a unique bit of syntactic sugar that would have to be modified to check for cap, not just len. It is the compilers job, in that situation, to determine if your code can safely use a reference. Alternatively, try take a reference, and when the caller passes you a cap

I'm not sure where the multiple versions comment comes from. bytes.Index would remain unchanged, like most functions that only use "len". The only functions that would "change" are functions that edit a slice in-place, and even then, the change would be to panic only if passed a cap

dantoye on 5 Jun 2018

Your loop is a good example, as getting a reference to a specific element is a unique bit of syntactic sugar that would have to be modified to check for cap, not just len. It is the compilers job, in that situation, to determine if your code can safely use a reference. Alternatively, try take a reference, and when the caller passes you a cap

But what if the ... do something with p... is only reading from p? Somehow you would now want two different &a[i] operations, one of which gives you back a writeable pointer (and panics if i>=len(a) || i >= cap(a)) and one of which gives you back a read-only pointer (and panics if i>=len(a)).
It would be unfortunate if you had to give write permission to a function which only reads a slice, solely to allow it to construct pointers to the elements. As a result, I don't think the cap<len solution can work in isolation; a more holistic read-only data structure solution is required.

For library functions, I think you need multiple versions because the "length" of the operation now depends on what you're going to do with the result. For example:

    a[bytes.IndexByte('a')] = 'b' // replace an 'a' with a 'b' 
    a[bytes.IndexByte('a') + 1] // return the character after 'a'

So does IndexByte stop searching at len(a), or min(len(a),cap(a))?
(This is maybe not the best example, I will try and think of a better one.)

randall77 on 5 Jun 2018

@randall77 I think maybe what's not clear is that this does not change the default behavior of anything. There is no "give write permissions", there is only removing write permissions.

If a function wants a reference to a specific index, Go would simply add inot exist. In can not be referenced, nor dereferenced, only read. This makes complete sense when you remember that you're looking at an element the slice does not have the capacity to be holding.

Also, for all those situations (and all existing code), you would use len as normal. Using slices never changes in any scenario. The only real change is that you can use a cap

dantoye on 5 Jun 2018

@dantoye What do you mean when you say "trying to write to any of them would trigger the slice to re-grow"?

ianlancetaylor on 5 Jun 2018

@ianlancetaylor By "write to", I did mean append to. Apologies.

Writing to an array would be changed in one way: perform a cap check first, then a len check. In most situations it would have equal performance, but would have an extra check if writing to an index between len and cap. If the cap check fails, but the len check passes, a new panic is raised: "runtime error: index not within slice capacity".

For appending, however, the standard mechanics apply, though I'm sure some changes must be made if the append logic assumes the len

dantoye on 5 Jun 2018

It seems to me to be fairly unfortunate to have to do an extra memory load and comparison for every write to a slice, but I agree that this seems to work. I'm going to turn into a language change proposal. Fair warning, though: I do not think it will be accepted.

ianlancetaylor on 5 Jun 2018

This would allow for immutable, lightweight slices in general.

All the other proposals I've seen include adding new keywords and changing types entirely.

Having a clear way of expressing read-only/immutability - be it via keywords and types - helps developers and compilers/tooling.

stemar94 on 5 Jun 2018

👍1

@ianlancetaylor we don't have to do an extra load often. I'm not too certain about the growth factor for slices, but an average chance of needing the extra load could be calculated fairly precisely.

It would add the extra load/cmp for cases where len < idx < cap, which would be equal to half of the growth factor, which is a function of cap. Probably something like log(len)/2 more load/cmp's on average, not a linear growth.

dantoye on 5 Jun 2018

@dantoye I'm probably missing something, but it seems to me that for a write to a slice we must always check against both cap and len. If we skip the cap check, we will mutate an immutable slice. If we skip the len check, we will do the wrong thing when cap > len. In the normal case both comparisons pass, so I don't see how we can skip either one.

ianlancetaylor on 5 Jun 2018

@ianlancetaylor true, now that I think about it.

However, maybe it could be determined at slice creation time. Internally setting a hidden "max" property to max(len,cap) and use that for bounds checking?

Implementation aside, I'm more curious of your thoughts about it's use. As it's just a slice of any type, it automatically can be used for readonly values of anything.

dantoye on 6 Jun 2018

Internally setting a hidden "max" property to max(len,cap) and use that for bounds checking?

Increasing the size of a slice from 3 words to 4 would probably have more far-reaching consequences (performance and otherwise) than adding the extra bounds check, since the compiler could probably eliminate a bunch of them. Although I suspect it'd still be expensive enough to be quite noticeable.

Implementation aside, I'm more curious of your thoughts about it's use. As it's just a slice of any type, it automatically can be used for readonly values of anything.

My 2c, FWIW: It is an interesting idea. But (a) I find it rather unintuitive and (b) like many of the readonly proposals, it enforces one particular kind of readonly-ness. Note that in a readonly slice []*T, you can't change which T a particular element points to, but you can modify the T that is pointed to. Relatedly, some of the blog posts linked to in #22876 are useful reading. If Go 2 is going to include some kind of immutability, I'd personally rather see it be more deeply integrated, rather than a clever but awkward trick that applies only to slices.

josharian on 6 Jun 2018

@josharian Fair points.

It seems there's a lot more to cap

Your []*T example, and all the points above, show that for all intents and purposes it could function well as a backwards-compatible shim for readonly semantics. However, as with all shims, it would not serve well as a permanent fixture.

dantoye on 6 Jun 2018

We aren't going to make this change. It will make all slice writes more costly. It's not a general answer to issues about immutability, as it only applies to slices. It's a special case with relatively few applications, and we don't need another special case.

(By the way, I'm not really clear on what happens when an index outside of the cap is written. The initial message suggests that the slice would "re-grow", but where would the newly grown slice be stored? Later messages suggest a panic on write, which would be easier to implement.)

ianlancetaylor on 25 Sep 2018

Was this page helpful?

0 / 5 - 0 ratings