Rfcs: Extending deref/index with ownership transfer: DerefMove, IndexMove, IndexSet

Created on 21 Mar 2015  Ā·  65Comments  Ā·  Source: rust-lang/rfcs

It is clear that there is a need for the ability to move out of smart pointers and indexable things (DerefMove, IndexMove). The other frequently desired option is to IndexSet, which would be a special-cased version for indexing in this situation:

map[key] = value

currently, this is handled via IndexMut, but that is sub-optimal, because if the key is not already part of the map, the result is a panic.

Basic plan

DerefMove/IndexMove/IndexSet should layer on top of the traits we have now. One needs to be careful here because of subtle interactions with autoref and so forth.

Postponed RFCs

T-lang T-libs

Most helpful comment

Theory

Both deref and index are for types containing values of other types. They allow to conveniently access contained value.
The main difference is that deref types usually contain only one instance of other type, while index types usually contain multiple instances.
There are actually many ways to access internal instance, and index/deref families should cover most of them.

There are two cases currently supported:

(1) accept ref, return ref (Deref/Index)
(2) accept mut ref, return mut ref (DerefMut,IndexMut)

There is one case requested for derefs:

(3) consume value, return value (DerefMove?,IndexMove?)

It's an open question whether or not we need Index counterpart. I personally don't think it will be very useful.

There are two more index cases requested:

(4) accept ref, return value (IndexGet?)
(5) accept mut ref, accept value, set value (IndexSet?)

They can be used for collections not actually storing value instances (e.g. bit array setting/returning bools).
Case 5) can also be used to insert new values to hashmaps, which I think is a highly desirable feature.
These cases apparently do not need deref counterparts.

Practice

Now let's get to practice, and please correct me if I'm wrong about current rules.
Here I assume that S is source type, T is target type and I is index type. s, t and i will be corresponding instances.

Deref

Current deref rules

Current manual deref follows simple rules:

  1. &*s- perform Deref
  2. &mut *s - perform DerefMut
  3. *s and S is Copy - perform Deref and copy
    Note that if there is no DerefMut and S is Copy, then &mut *s fails, but &mut {*s} succeeds.
    If T is not Copy, then *s will fail. Copyability of S does not matter.

Autoderef follows the same rules, but also depends on context:

  1. If context needs &T - perform Deref
  2. If context needs &mut T - perform DerefMut
  3. If context needs value of T and T is Copyable - perform Deref and copy.
    Note that rule autoderef rule 3) is somewhat inconsistent, it works for self in method calls, but does not work for arguments in function calls.
    If context needs value and T is not Copy, then it fails. Copyability of S does not matter.

Proposed deref rules

Deref rules proposed in #178, in short, are: try to move copyables, try to ref non-copyables, otherwise use whatever possible.
I do mostly agree with the proposed rules, but don't really like 'use whatever possible' part.

178 says that, for example, if there is no Deref, but DerefMove, then it should be used for all derefs.

Let's imagine that there is no Deref, but T is Copy. In this case &*s would perform DerefMove and take reference of the result, which is both non-intuitive and contradicting with how Deref currently works (see my note about missing DerefMut).
I'd say it's better to fail in this case. Another, even simpler option is to make Deref implementation mandatory for types implementing DerefMove. I think it's reasonable and makes everything easier.

Proposed manual deref rules:

  1. &*s - perform Deref
  2. &mut *s - perform DerefMut
  3. *s and S is Copy and implements DerefMove - copy source and perform DerefMove
  4. *s and T is Copy and S implements Deref - perform Deref and copy result
  5. *s and neither 3) or 4) apply - perform DerefMove

Proposed autoderef rules:

  1. If context needs &T - perform Deref
  2. If context needs &mut T - perform DerefMut
  3. If context needs value of T, S is Copy and implements DerefMove - copy source and perform DerefMove
  4. If context needs value of T, T is Copy and S implements Deref - perform Deref and copy result
  5. If context needs value of T, but neither 3) or 4) apply - perform DerefMove

If rule condition is satisfied, but required trait is not available, then that's a failure and compiler should not try to apply other rules (that's the only thing different from #178)

It is possible to support DerefGet/DerefSet, but, then again, it's noticeably ramps complexity and likely not very useful.
If needed, DerefGet/DerefSet should follow strategy for IndexGet/IndexSet described below.

Indexing

Current indexing rules

Indexing is more or less follow deref, but is generally simpler, because there is no autoderef counterpart.

Current rules for indexing:

  1. &s[i] - perform Index
  2. &mut s[i] - perform IndexMut
  3. s[i] and T is Copy - perform Index and Copy
    If T is not Copy, then s[i] will fail.

Proposed indexing rules

This extends both #159 and #1129, adds more details and answers some unresolved questions.

Main problem is IndexGet. Option of having it adds lots of ambiguity into rules.
Also, possiblity of different implementation of Index and IndexGet scares me.
I think that if collection need IndexGet, then having Index/IndexMut/IndexMove does not make sense.

I propose to make IndexGet and Index/IndexMut/IndexMove mutually exclusive.
Types implementing IndexGet can have automatically generated default implementation of Index/IndexMove for trait compatibility purposes, but manually implementing Index/IndexMut/IndexMove should be forbidden.

With mutually exclusive IndexGet and Index/IndexMut/IndexMove, indexing rules actually become pretty simple.

For types implementing IndexGet:

  1. s[i] - perform IndexGet
  2. s[i] = expr (asssignment) - perform IndexSet
  3. s[i] += expr (compound assignment) - perform IndexGet, apply non-compound operation, then IndexSet result
    With these rules &s[i] and &mut s[i] are still possible, but they represent references to temporary value returned by IndexGet. Note that it differs from the way Deref works, but I think it worth it. For deref it is not as useful because autoderef can handle most of function interface compatibility issues. Indexing does not have autoderef counterpart and therefore it's not wise to prohibit taking references.

For all other types:

  1. &s[i] - perform Index
  2. &mut s[i] - perform IndexMut
  3. s[i] = expr (asssignment) - perform IndexSet (note that it should NOT fallback to IndexMut)
  4. s[i] += expr (compound asssignment) - perform IndexMut, apply compound operation on it's result (note that it should NOT fallback to IndexSet)
  5. s[i] and T is Copy - perform Index and Copy
  6. s[i] and T is not Copy - perform IndexMove

Just like with derefs, if rule condition is satisfied, but required trait is not available, then that's a failure and compiler should not try to apply other rules.

Note that I'm not sure if we need IndexMove (consume value, return value), but I list it here for completeness sake.

Indexing examples

// Bits is a tightly packed bits array, implements IndexGet/IndexSet
println!("first bit is: {}", bits[0]); // bits.index_get(0)
bits[1] = true; // bits.index_set(1, true)
bits[2] ^= true; // bits.index_set(2, (bits.index_get(2) ^ true))
let tref = &bits[3] // &bits.index_get(3), reference to temporary

// Array implements Index/IndexMut/IndexSet
println!("first element is: {}", array[0]) // *array.index(0)
let second = &array[1] // array.index(1), ref of second element
let third = &mut array[2] // array.index_mut(2), mutable ref of third element
array[3] = 2; // array.index_set(3, 2)
array[4] += 3; // (*array.index_mut(4)) += 3

// Map implements:
//   Index/IndexMut for Index types where Key impl Borrow<Index>
//   IndexSet/IndexMove for Index types where Key impl From<Index>
let vref = &array["key"] // array.index("key"), ref of value
let vrefmut = &mut array["key"] // array.index_mut("key"), mutable ref of value
map["new key"] = 1; // map.index_set("new key", 1), creates new entry
map["new key"] += 2 // (*map.index_mut("new key")) += 2, modifies existing entry
return map["key"]; // map.index_move("key"), consumes map
// failures:
&map["missing key"] // map.index("missing key"), fails
&mut map["missing key"] // map.index_mut("missing key"), fails
map["missing key"] += 2 // (*map.index_mut("missing key")) += 2, does not insert value, fails

P.S.

  • It's the longest comment I've ever wrote
  • If necessary, I can create a new rfc with even more implementation details
  • Or I can start hacking, you know =)

All 65 comments

I'd like DerefMove for similar reasons that Servo wants it: https://github.com/servo/servo/issues/3868

Namely, I have an type Foo, which owns a raw pointer from a crate that has FFI bindings:

struct Foo {
    handle: my_ffi_crate::foo // foo is a *mut to an uninhabited enum
}

and a "borrowed" version of that called BFoo:

struct BFoo<'a> {
    handle: my_ffi_crate::bfoo, // bfoo is the *const equivalent of my_ffi_crate::foo
    _phantom: PhantomData<&'a Foo>
}

Foo is the Rust equivalent of a type in the C library that's a flexible data storage type: think something like a C type that works like JSON. The C library supports things like: list indexing, map lookup, etc. which, in the Rust crate, have signatures like fn get<'a>(&'a self, key: Key) -> BFoo<'a>.

Most of the immutable "accessor" methods of Foo have to be implemented for both Foo and BFoo<'a> (a map might be from keys to more maps), which is annoying. It would make sense, I believe, to implement DerefMove<Target=BFoo<'a>> for &'a Foo, and then just implement those methods on BFoo<'a>. Of course, that might also require some additional autoref/autoderef magic.

A potential problem for the design of DerefMove - How would unsized types be handled? (e.g. FnOnce)

Missing from this list is also IndexGet, which is the "returns a value" version of Index.

@Gankro what is the difference between your IndexGet versus the IndexMove in the list ?

Ah I assumed IndexMove took self by-value (given the comparison to DerefMove), but if it takes &mut self, then never mind.

what does the RFCs being postponed mean? will they simply be back on track as soon as someone has made up a coherent idea of how everything should play together and writes it up?

@flying-sheep it means "we're not rejecting this RFC because it's bad, now is just not the right time." This happened a lot before 1.0, so we could focus on what needed to ship in 1.0.

And yes, if there's a postponed RFC, someone needs to go through and evaluate if it's still valid in its current form, or update it, and re-submit it.

i’m glad that IndexGet/IndexMove will be a thing.

things like ndarray profit from it. generally, all data structures where ā€œsliceā€ views can’t be rust slices.

should i try to update the ā€œremaining Index traitsā€ RFC (#159)?

if so:

  • which name? the IndexRef/IndexValue train is gone, and @Gankro was confused by IndexMove, so maybe IndexGet?
  • better do index_set as default-implemented trait for IndexMut or in a dedicated IndexSet?
  • also merge in IndexAssign (#1129)?

I would say that IndexGet is the clearest, because of the ease of implementing it for both T and &T. IndexMove is too restricted as a name, in that you might not actually want to _move_ anything, but instead you might want to copy.

Perhaps it would be possible to use type inference to decide whether to use Index or IndexGet. Example:

let some_array: [i32; 4] = [1, 2, 3, 4];
let some_slice = &some_array[0..3];
let x = some_slice[2]; // x: &i32 (default behavior)
let y: i32 = some_slice[2]; // y: i32 (would use `IndexGet` to pass by value

This would more or less follow from the current inference of IndexMut.

I do see the potential for compiler confusion in choosing which trait to use for a given operation, but I would at least appreciate a default implementation of IndexGet<E, R> for &[R] where R: Copy. Syntactic sugar ([E]) aside, this would let us assume that slices would be acceptable, and additional implementors of IndexGet could define their own behavior (whether they want to copy, clone, whatever).

I would also like to mention that even though copying-when-indexing is antithetical to C/C++, it _is_ a very common idiom when it comes to Ruby, etc. I don't think this is a weakness in Ruby, but in fact allowing the end-user to use one idiom or other other should help ease the learning curve for people coming from dynamic languages that assume copy.

One of my greatest frustrations with the Ruby language is not knowing whether array indexing was happening by-value or by-reference. I think if Rust could acknowledge both of these domains, and allow programmers the explicit option of choosing one or the other, it would go a long way.

Edit: thanks @comex for pointing out that I indexed a stack array rather than a slice initially

@andrewcsmith some_slice[2] is already i32 - you need &some_slice[2] for &i32. The indexing operator carries an implicit dereference.

(some_slice is also not a slice...)

Theory

Both deref and index are for types containing values of other types. They allow to conveniently access contained value.
The main difference is that deref types usually contain only one instance of other type, while index types usually contain multiple instances.
There are actually many ways to access internal instance, and index/deref families should cover most of them.

There are two cases currently supported:

(1) accept ref, return ref (Deref/Index)
(2) accept mut ref, return mut ref (DerefMut,IndexMut)

There is one case requested for derefs:

(3) consume value, return value (DerefMove?,IndexMove?)

It's an open question whether or not we need Index counterpart. I personally don't think it will be very useful.

There are two more index cases requested:

(4) accept ref, return value (IndexGet?)
(5) accept mut ref, accept value, set value (IndexSet?)

They can be used for collections not actually storing value instances (e.g. bit array setting/returning bools).
Case 5) can also be used to insert new values to hashmaps, which I think is a highly desirable feature.
These cases apparently do not need deref counterparts.

Practice

Now let's get to practice, and please correct me if I'm wrong about current rules.
Here I assume that S is source type, T is target type and I is index type. s, t and i will be corresponding instances.

Deref

Current deref rules

Current manual deref follows simple rules:

  1. &*s- perform Deref
  2. &mut *s - perform DerefMut
  3. *s and S is Copy - perform Deref and copy
    Note that if there is no DerefMut and S is Copy, then &mut *s fails, but &mut {*s} succeeds.
    If T is not Copy, then *s will fail. Copyability of S does not matter.

Autoderef follows the same rules, but also depends on context:

  1. If context needs &T - perform Deref
  2. If context needs &mut T - perform DerefMut
  3. If context needs value of T and T is Copyable - perform Deref and copy.
    Note that rule autoderef rule 3) is somewhat inconsistent, it works for self in method calls, but does not work for arguments in function calls.
    If context needs value and T is not Copy, then it fails. Copyability of S does not matter.

Proposed deref rules

Deref rules proposed in #178, in short, are: try to move copyables, try to ref non-copyables, otherwise use whatever possible.
I do mostly agree with the proposed rules, but don't really like 'use whatever possible' part.

178 says that, for example, if there is no Deref, but DerefMove, then it should be used for all derefs.

Let's imagine that there is no Deref, but T is Copy. In this case &*s would perform DerefMove and take reference of the result, which is both non-intuitive and contradicting with how Deref currently works (see my note about missing DerefMut).
I'd say it's better to fail in this case. Another, even simpler option is to make Deref implementation mandatory for types implementing DerefMove. I think it's reasonable and makes everything easier.

Proposed manual deref rules:

  1. &*s - perform Deref
  2. &mut *s - perform DerefMut
  3. *s and S is Copy and implements DerefMove - copy source and perform DerefMove
  4. *s and T is Copy and S implements Deref - perform Deref and copy result
  5. *s and neither 3) or 4) apply - perform DerefMove

Proposed autoderef rules:

  1. If context needs &T - perform Deref
  2. If context needs &mut T - perform DerefMut
  3. If context needs value of T, S is Copy and implements DerefMove - copy source and perform DerefMove
  4. If context needs value of T, T is Copy and S implements Deref - perform Deref and copy result
  5. If context needs value of T, but neither 3) or 4) apply - perform DerefMove

If rule condition is satisfied, but required trait is not available, then that's a failure and compiler should not try to apply other rules (that's the only thing different from #178)

It is possible to support DerefGet/DerefSet, but, then again, it's noticeably ramps complexity and likely not very useful.
If needed, DerefGet/DerefSet should follow strategy for IndexGet/IndexSet described below.

Indexing

Current indexing rules

Indexing is more or less follow deref, but is generally simpler, because there is no autoderef counterpart.

Current rules for indexing:

  1. &s[i] - perform Index
  2. &mut s[i] - perform IndexMut
  3. s[i] and T is Copy - perform Index and Copy
    If T is not Copy, then s[i] will fail.

Proposed indexing rules

This extends both #159 and #1129, adds more details and answers some unresolved questions.

Main problem is IndexGet. Option of having it adds lots of ambiguity into rules.
Also, possiblity of different implementation of Index and IndexGet scares me.
I think that if collection need IndexGet, then having Index/IndexMut/IndexMove does not make sense.

I propose to make IndexGet and Index/IndexMut/IndexMove mutually exclusive.
Types implementing IndexGet can have automatically generated default implementation of Index/IndexMove for trait compatibility purposes, but manually implementing Index/IndexMut/IndexMove should be forbidden.

With mutually exclusive IndexGet and Index/IndexMut/IndexMove, indexing rules actually become pretty simple.

For types implementing IndexGet:

  1. s[i] - perform IndexGet
  2. s[i] = expr (asssignment) - perform IndexSet
  3. s[i] += expr (compound assignment) - perform IndexGet, apply non-compound operation, then IndexSet result
    With these rules &s[i] and &mut s[i] are still possible, but they represent references to temporary value returned by IndexGet. Note that it differs from the way Deref works, but I think it worth it. For deref it is not as useful because autoderef can handle most of function interface compatibility issues. Indexing does not have autoderef counterpart and therefore it's not wise to prohibit taking references.

For all other types:

  1. &s[i] - perform Index
  2. &mut s[i] - perform IndexMut
  3. s[i] = expr (asssignment) - perform IndexSet (note that it should NOT fallback to IndexMut)
  4. s[i] += expr (compound asssignment) - perform IndexMut, apply compound operation on it's result (note that it should NOT fallback to IndexSet)
  5. s[i] and T is Copy - perform Index and Copy
  6. s[i] and T is not Copy - perform IndexMove

Just like with derefs, if rule condition is satisfied, but required trait is not available, then that's a failure and compiler should not try to apply other rules.

Note that I'm not sure if we need IndexMove (consume value, return value), but I list it here for completeness sake.

Indexing examples

// Bits is a tightly packed bits array, implements IndexGet/IndexSet
println!("first bit is: {}", bits[0]); // bits.index_get(0)
bits[1] = true; // bits.index_set(1, true)
bits[2] ^= true; // bits.index_set(2, (bits.index_get(2) ^ true))
let tref = &bits[3] // &bits.index_get(3), reference to temporary

// Array implements Index/IndexMut/IndexSet
println!("first element is: {}", array[0]) // *array.index(0)
let second = &array[1] // array.index(1), ref of second element
let third = &mut array[2] // array.index_mut(2), mutable ref of third element
array[3] = 2; // array.index_set(3, 2)
array[4] += 3; // (*array.index_mut(4)) += 3

// Map implements:
//   Index/IndexMut for Index types where Key impl Borrow<Index>
//   IndexSet/IndexMove for Index types where Key impl From<Index>
let vref = &array["key"] // array.index("key"), ref of value
let vrefmut = &mut array["key"] // array.index_mut("key"), mutable ref of value
map["new key"] = 1; // map.index_set("new key", 1), creates new entry
map["new key"] += 2 // (*map.index_mut("new key")) += 2, modifies existing entry
return map["key"]; // map.index_move("key"), consumes map
// failures:
&map["missing key"] // map.index("missing key"), fails
&mut map["missing key"] // map.index_mut("missing key"), fails
map["missing key"] += 2 // (*map.index_mut("missing key")) += 2, does not insert value, fails

P.S.

  • It's the longest comment I've ever wrote
  • If necessary, I can create a new rfc with even more implementation details
  • Or I can start hacking, you know =)

In the case of indexing, what would happen to
s[i].do(), for some impl T {fn do(&self);}?

would it become
s.index_move(i).do() by rule 6
or perhaps worse
s.index(i).clone().do() by rule 5, where clone has the Copy implementation

obviously it would be nice if it became (&s[i]).do()

Wow, it's been a while.

You're absolutely right, that's a very important use case I missed.

I guess rules have to be modified like this to make it work:

  1. &s[i] or s[i].method(...) where method needs &self - perform Index
  2. &mut s[i] or s[i].method(...) where method needs &mut self - perform IndexMut
  3. ...

I hate to see how complex it gets, but still can't find a more usable solution.

The same goes for Borrow. The Borrow trait is pretty useless as it is now. It even needs a workaround in the language to work for &[] and &str as the conversion creates a (hidden) new type.

Given that we're in the impl Period, is there any motion to make forward progress on this? In addition to the heavily discussed stuff, this issue seems to be a blocker for things like Box<FnOnce> which have been a sore spot for years now.

I also don't see a clear summary of what is blocking this, in case someone wanted to hack on this, which I think means the design work was never completed? Just trying to get a feel for the situation.

IndexGet plus IndexSet should not be invoked together, say by s[i] += ... It'd become a nightmare for performance when using larger numerical types, like even 256 bits, which may or may not be copy.

A priori, I'd think indexing into collection types should return Entry/Guard like types that provide access to the memory using DerefMut or DerefMove, whatever that looks like. Adding this IndexGet and IndexSet zoo sounds problematic, especially when HashMap's Entry took so much effort to hammer out recently.

Hello. I'd like this to move forward (especially because of Box<FnOnce>). Is there anything I could do to help to move forward?

@vorner A formal RFC needs to be written, possibly based off of @VFLashM's comment.

Anything happened with this lately?

Random thought while getting myself up to speed---IndexMove may not be desirable to overload on existing syntax, because the effect of the move is necessarily dynamic. I can definitely see someone doing something like:

let s = m[i];
{ ... }
let t = m[i];

and getting confused that the latter fails if the result type isn't copyable. This can't be caught at compile-time, since we don't actually move out of m. So perhaps we should require explicitly invoking moves on indexing operations, such as let s = m[move i]?

IndexMove would consume m - that would fail to compile since m no longer exists on the third line.

Err, right. I forget that case is useful. In that case, I meant IndexGet---the one that moves out of a container without consuming it.

@alercah Moving out is consuming it, as far as I can tell. Unless you mean something different by "consuming it"...

Ah, IndexGet isn't quite that. It's designed for cases where you have a data structure which can't return a reference into it (e.g. a bitset). The signature is fn index_get(&self) -> Self::Item, so the underlying datastructure isn't modified.

Ahhh, right, ok! That makes sense, then. :)

What's the use case of IndexMove, then?

I think IndexMove is more of an intellectual curiosity than a thing we have
a concrete usecase for.

On Tue, May 1, 2018 at 9:12 PM, Alexis Hunt notifications@github.com
wrote:

Ahhh, right, ok! That makes sense, then. :)

What's the use case of IndexMove, then?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rfcs/issues/997#issuecomment-385835902, or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABFY4LLu26WeXqHBvZ9Qzqm68KqYycWEks5tuQf1gaJpZM4Dybu7
.

I was under the impression that IndexMove was consuming the container: fn index_move(self, index) -> Self::Item, such that one could use

vec[i]

rather than the current nicest way (as far as I'm aware?)

vec.into_iter().nth(i).unwrap()

Wouldn’t it do the equivalent of remove(index) for collections?

@alexreg It is equivalent to the extent that it returns a T without requiring Copy/Clone. It differs in that remove takes &mut collection, and does does nontrivial work – shifting all the remaining elements 1 to the left. One could continue to use the collection of remaining elements.

The IndexMut I described consumes the collection, but need not perform the potentially expensive shifting. As it's been consumed, one can no longer use it nor the other elements it might have contained.

@alecmotta Right. But with IndexMove, it would just be statically guaranteed that (for example) iterating over the collection after an IndexMove operation would not include that element, or what?

After a single IndexMove, the entire array is moved, so you can’t access any elements at all. IndexMove consumes the container. Honestly, I’m not sure what that’s useful for, but it isn’t doing anything complicated like you’re describing.

Oh, I see. Well it could be useful if e.g. you need to sort a Vec, then pick a certain element to do further analysis with, and don't care about the rest.

I understand the biggest motivation is DerefMove (so you can call consuming methods on DSTs inside a box). The IndexMove is more like for symmetry/consistency.

@vorner Right. DerefMove has the more obvious practicality of increasing efficiency for situations like that. IndexMove would be done for consistency and out of intellectual curiosity, it seems.

I wonder about doing something like this:

#[lang = "deref_move"]
trait DerefMove {
    type Target;
    fn deref_move(self) -> Self::Target;
}

impl<'a, T> DerefMove for &'a T
where
    T: ?Sized + std::ops::Deref,
{
    type Target = &'a T::Target;
    fn deref_move(self) -> Self::Target {
        self
    }
}

impl<'a, T> DerefMove for &'a mut T
where
    T: ?Sized + std::ops::DerefMut,
{
    type Target = &'a mut T::Target;
    fn deref_move(self) -> Self::Target {
        self
    }
}

impl<T> DerefMove for Box<T> {
    type Target = T;
    fn deref_move(self) -> T {
        *self
    }
}

Remove deref and deref_mut lang items, move entire magic to deref_move. Reduce owned_box to be a glue for nightly box syntax (if we want to keep supporting it).

I don't think IndexMove for v[i] is a real use cases. If that really happens frequently, we should have a method for Vec before we have syntax sugar (IndexMove) for it.

impl<T> Vec<T> {
  fn take(self, n: usize) -> T {
    self.into_iter().nth(i).unwrap()
  }
}

It is really uncommon (compared to DerefMove) that we are consuming a container that have large amount of element for just one. Consider the scenario

f(container[0]);
// blabla
f(container[1]);

if some one changes the signature of f from f(&T) to f(T), we would get error message "container is moved" other than "expect T".

I think we could implement IndexSet first and it would never break future IndexMove/DerefMove (if we decide to accept IndexMove finally)

Having IndexGet be mutually exclusive with Index/IndexMut/IndexMove would be fine, because if you're using IndexGet you usually don't have the ability to use references, such as with FFI containers or say a bitset. IndexGet/IndexSet would be a useful combination with plenty of prior art (a lot of dynamic langs use this pattern for one), but there would be a lot issues not overly overloading the [] operator.

Note that &{rvalue} and &mut {rvalue} are valid Rust, they would produce references to a temporary place. So if a type implements IndexGet + IndexSet, the following code will likely compile:

let bit = &mut bitset[4]; // `bit` is a unique reference to a temporary bool
*bit = !*bit;
// note: the actual `bitset[4]` is not mutated

but has totally different semantics than

bitset[4] = !bitset[4];
// note: the actual `bitset[4]` is flipped.

The error rules like https://github.com/rust-lang/rfcs/issues/997#issuecomment-226316485 are still needed if we want to make &mut bitset[4] an error.

I think that's something that definitely should be watched for, but I'm also not sure it needs special rules. As long as &mut bitset[4] always compiles to an invocation of IndexMut regardless of whether or not the trait is implemented, all should work fine.

This points out an important thing about IndexMove, though: it's pretty clear from the extensive discussion of bitsets, buttressed by @clouds56's comment, that a common, desired interpretation of let x = s[i]; is going to be to copy values out of a container that doesn't support references. But if IndexMove is going to be a feature, it's going to use the same basic syntax. @VFLashM's comment above suggests that IndexMove should be the fallback when used on a container that doesn't have copy. But in practice, this is going to be awful. For example, if m: HashMap<K, V>, then let x = m[k]; will delete k from the map iff V is non-Copy. So IndexMove would require some even-more-magic syntax like let x = move m[k]; and at that point, is there really much benefit over let x = m.remove(k).unwrap();?

If we assume that IndexMove necessarily needs more syntax in order to make it work, we can divorce it from the rest of the proposal as there will be no ambiguity. So I think, then, there definitely is enough to move forward with an RFC on IndexGet and IndexSet. I do think VFlashM's comment has some good thoughts in it, but I'm going to go and disagree with just about everything in it:

  1. As discussed above, IndexMove should be killed entirely.
  2. The comment isn't exhaustive about the situations where Index and IndexMut should be used. Fortunately, the answer is not hard: they should be used when required either by autoref for method resolution or by a (mutable) place context. An RFC should endeavour to be more precise about where exactly Index will be preferred.
  3. This comment is spot-on about the use of compound assignment operators. The default behaviour should not be to index twice. A type that wishes to support an in-place mutation, but cannot return a reference to a value, should have to do so with an API that lets it do that explicitly. This could be done by having IndexGet return an Entry-equivalent. Ideally, an RFC would provide a rudimentary example of what this may look like, to prove that this pattern will work with it.
  4. There is a backwards-compatility problem with item 5 of the Index/IndexMut case: nested accesses on partially-Copy types. If have a Vec of a type like struct { c: SomeCopyType, m: SomeNonCopyType }, then v[0].c is currently legal. But it wouldn't be legal if v[0] desugared to Index and then Copy, because then it would require copying SomeNonCopyType. With IndexMove gone, however, we can just revert to the existing behaviour, if we keep the rule that IndexGet and Index are mutually exclusive.
  5. However, I would really like IndexGet to not be mutually-exclusive to Index. It's a nasty conceptual bugbear. It also could in theory also be annoying for sparse containers, if they decide to, instead of an Entry API on IndexGet, to actually populate zero entries with places for mutation on an IndexMut call but want to distinguish IndexGet. The ideal behaviour from a language standpoint, though, is that s[i] desugars to IndexGet if it's being moved out of, and Index if not. There's only one problem:
  6. Backwards-compatibility becomes an issue there, because now we change desugaring to use IndexGet when it's available. This isn't an issue with VFlashM's proposal where Index and IndexGet are mutually distinct, but it is also a problem with IndexSet and IndexMut. Currently, s[i] = x; desugars to a call through IndexMut and this behaviour would break if IndexSet were the desugaring instead.

    We can solve both of these problems by providing blanket implementations that defer to the existing implementations of Index and IndexMut when calling IndexGet and IndexSet, but this would defeat the ability to provide alternate implementations, particularly sad for HashMap and IndexSet. Fortunately, specialization would solve this. Unfortunately, it's specialization. An RFC might be able to propose an implementation that works with min_specialization, though, or failing that a workaround in the language (conditionally calling IndexMut and Index as fallback if IndexSet and/or IndexGet aren't defined) until things can be moved to the library.

tl;dr I think we can move forward with a proposal for IndexGet and IndexSet, relegate IndexMove to the dustbin, and then implement indexing rules as follows, where the first rule that applies is followed, but if a trait is missing you just error rather than continuing:

  1. If it's the target of an assignment, use IndexSet.
  2. If it's a mutable place context or needs a mutable reference receiver, use IndexMut.
  3. If the result is being moved out of, use IndexGet.
  4. Otherwise, use Index.

We'd need to address backwards-compatibility, either by providing specializable impls of IndexGet in terms of Index and IndexSet in terms of IndexMut, or by doing these specific fallback options in the language. If we do it in the language, we might do it with an eye to doing it by specialization eventually.

Making IndexSet implemented in terms of IndexMut by default does bring us into the hashmap footgun of map[k] = v doing subtly different things whether k is a reference or a value (a la #1129).

Although I'm unsure if this will be a big deal in practice? HashMap will obviously not impl IndexMut without IndexSet, but we'd need to magically suppress the default impl so we get a type error and not footgun behaviour.

I think a possibly non footgunny solution would be to preclude trait resolution (via the [] operator) for IndexMut iff an impl for IndexSet exists, with the knowledge the people who do want "fall-through" can add a delegating trait impl. (In a way, implementing IndexSet would be opting out of having a raw(?) IndexMut a[k] on the LHS of an assignment.)

I also have concerns with the interaction of Index and Copy types vs IndexGet, and I fail to see a scenario in which having both would make for a more ergonomic API.

Thanks.

I wasn't aware of that footgun. I also found this comment helpful, particularly the fact that this was last postponed in part because of the possibility of emplacement syntax, which seems pretty dead (the latest version proposes no new syntax and would mesh quite well with IndexSet).

I'm not sure there's any way to resolve that footgun without compiler help, as specialization doesn't provide a mechanism that I'm aware of to delete an impl. My gut is that adding it would be a low-value addition to a feature already bogged down by type problems, so compiler magic may be the only way to get additional work here. I could see other workarounds (impl !IndexSet, say), but I can't imagine anything is much cleaner.

That said, it might be worth considering whether this sort of thing is likely to come up more. In general, is it possible to remove methods from a trait if they're moved to another trait with a blanket impl that ports them over? If this isn't a breaking change, allowing types to opt-out of the blanket impl could possibly be a useful feature? Since that's basically what we'd be looking to do, albeit in syntax rather than a method.

Yeah, I don't think worrying about specialization is particularly useful given that specialization looks to be several months/years from stability.

To flesh out my main worry with IndexGet, it's the fact that taking a mutable reference to the object return by IndexGet is likely to be legal by default, but with nonobvious semantics. On one hand, this code is perfectly legal and sometimes used in code today.

fn main() {

   let a = &mut f();
   *a = 6;
   println!("{}", a);
}

fn f() -> i32 {
    5
}

On the other hand, such code with IndexGet is likely to be a footgun at best, as it appears completely identical to IndexMut.

Hmm, here's my best attempt to write down the decision tree Rustc has to take.

  1. If it's the target of an assignment, use IndexSet iff impled, else use IndexMut
  2. If it's a mutable place context or needs a mutable reference receiver, use IndexMut.
  3. If the result is being moved out of, use IndexGet iff impled, else use Index + Copy.
  4. Otherwise, use Index.

Under this, if a trait exists but is not applicable, rustc will error instead of continuing on. Additionally, the example provided above (&mut on IndexGet) would fail with an appropriate diagnostic (missing IndexMut + why it errored) rather than treating IndexGet as a temporary.

This would also require compiler logic to distinguish no matching impl from no impl, which is nontrivial (impossible?) in the presence of generic constraints.

What types satisfy IndexSet? IndexSet cannot increase Vec length of course (UB), making it like [T]. I'd think foo[k] = v should avoid doing anything unpredictable like allocations, which rules out HashMap too. What remains?

I'd think foo[k] = v should avoid doing anything unpredictable like allocations, which rules out HashMap too.

What value does that provide to anyone?

FYI, HashMap does not currently implement IndexMut, to avoid this sort of ambiguity: https://github.com/rust-lang/rust/commit/5fe0bb743a0af0413f8989a70a4f926fa5c63074.

Here's a Pre-RFC? (not too familiar with how these work) @alercah

There's likely to be deref and "nonsenical"/overlapping impls that I have not considered. IndexGet really complicates this a lot wrt deref.

Rendered.

I just realized that this is just an ad-hoc GAT (but real GATs would be unhelpful because of &mut). I think that IndexGet probably cannot be mutually exclusive, nor can we rely on the no fall-through behavior :(

Take a simple struct AsciiString, which supports IndexGet of itself returning u8. It would be completely logical for it to internally store data as a String and deref to &str, which supports an Index. Ughhhh. We could of course require forwarding impls or a marker trait, but that is certainly less ergonomic.

Maybe a way to restrict users from implementing IndexGet and Index for the same Idx would work, with all the issues on generic constraints that entails.

@carbotaniuman

If it's a mutable place context or needs a mutable reference receiver, use IndexMut.

Under this, if a trait exists but is not applicable, rustc will error instead of continuing on.

This is exactly what I suggested, so we're on the same page here.

This would also require compiler logic to distinguish no matching impl from no impl, which is nontrivial (impossible?) in the presence of generic constraints.

Yeah, that's not a trivial thing to do and likely has bad edge cases. I don't have good ideas to deal with this right now.

Take a simple struct AsciiString, which supports IndexGet of itself returning u8. It would be completely logical for it to internally store data as a String and deref to &str, which supports an Index. Ughhhh. We could of course require forwarding impls or a marker trait, but that is certainly less ergonomic.

That's a sign to me that dereffing to &str is wrong, and it should be to &u8. I'm not really worried about this case.

Comments on the pre-rfc:

let b = &val[3]; // reference to a temporary

I think this should fail as well, when Index isn't defined, for consistency with &mut. In my rules, if you use it in immutable place context, Index needs to be used, so this will fail.

I would also also include, in the alternatives considered, a section for Index and IndexGet being mutually exclusive.

Actually one more thing: this RFC needs to provide more discussion about exactly how the fallback works; "if impld" is, as you mentioned above, a rather complex question and so it needs a precise definition.

What about a.foo(), where foo is a method taking self by reference? Many methods in stdlib take &self even for Copy types.

Ah, that's a good point.

I suppose there's no real downside to allowing taking a reference to a temporary. So maybe that should be fine, then. In which case, we'd have:

  1. Assignments use IndexSet, possibly falling back to IndexMut. Error if not implemented.
  2. Mutable place context uses IndexMut. Error if not implemented.
  3. If attempting to move, use IndexGet. If not implemented, continue.
  4. Use Index if implemented. If not implemented, continue.
  5. Use IndexGet. If not implemented, error.

Note that I left out Index + Copy because it's not required as an explicit option: if there is no IndexGet at step 3, then Index will be called, and then the move out of the result of Index will succeed iff the output type is Copy.

Another note: I think we need to ensure that the output types of Index and IndexGet agree, if implemented. The above list doesn't work properly if they don't. Method resolution will also be weird because Index and IndexGet could generate separate candidate types for s[i].m().

If we do a specialization approach, this could be free; otherwise we would need compiler support for it.

Ok Deref wise I think we can say that the decision tree is followed completely for each struct, only performing an auto-deref iff it would otherwise error.

For example, the a[5] = 5 line here would call the custom IndexMut impl and not HashMap's IndexSet impl when it gets one. That should eliminate all of the auto-ref/deref trickery that we need to worry about.

I think we need to ensure that the output types of Index and IndexGet agree, if implemented.

Can you define agree? Would IndexGet returning Vec<u8> and Index returning &[u8] be ok? What about Index returning &Vec<u8>?

Ok Deref wise I think we can say that the decision tree is followed completely for each struct, only performing an auto-deref iff it would otherwise error.

For example, the a[5] = 5 line here would call the custom IndexMut impl and not HashMap's IndexSet impl when it gets one. That should eliminate all of the auto-ref/deref trickery that we need to worry about.

I think I agree with you, and this is basically no change from currently. The candidate type &mut A would be considered first, and we would look first at the IndexSet impl, which doesn't exist, and then at the IndexMut impl. This matches up with how normal lookups work: inherent methods are prioritized over trait methods, but a trait method on the smart pointer type is prioritized over an inherent method on the dereffed type.

I don't know the exact what that autoderef interacts with Index's contextual desugarings currently, but it sounds like we could implement the fallback during the point where we look for eligible candidate methods, which must already have some special casing for lang items. But I'm not a compiler dev so I don't know exactly how this would work---this is a sign of under-specification of the reference, incidentally.

Can you define agree? Would IndexGet returning Vec and Index returning &[u8] be ok? What about Index returning &Vec?

My instinct is to say no to both. The basic problem is that when we do method candidate resolution, we look at methods on T, &T, and &mut T. But if s[i] could yield two unrelated types, then we have to consider T, &U, &mut U or something similar, which my instincts tell me is going to be badly confusing.

One other note: the RFC should address some plan to handle the fact that HashMap can't have an IndexMut impl added without insta-stabilizing. One option might be a smart pointer wrapper that impls them, and is never stabilized?

My instinct is to say no to both.

Is this a typo (wrt IndexGet returning Vec and Index returning &Vec)? So we would basically enforce all theSelf::Output are the same?

pub trait IndexGet<Idx: ?Sized> {
    type Output;
    fn index(&self, index: Idx) -> Self::Output;
}
pub trait Index<Idx: ?Sized> {
    type Output: ?Sized;
    fn index(&self, index: Idx) -> &Self::Output;
}
pub trait IndexMut<Idx: ?Sized>: Index<Idx> {
    fn index_mut(&mut self, index: Idx) -> &mut Self::Output;
}

It wasn't a typo, but you said Vec and &[u8], not Vec and Vec. I can't think of a use case for that.

And yes, exactly as you say.

Alright, pushed the changes. Previous rendered link is up to date.

I'm sorry I haven't had time to review this; I've been ill lately. It may be a while before I'm able to come back to it.

@carbotaniuman Mostly looks good for me, except for the implementation of IndexSet in terms of IndexMut, as index_set isn't defined for IndexMut.

But also I feel like I've lost some context in the past few weeks, because I'm back to the question of how could HashMap define a custom IndexSet? Won't the blanket impl in terms of IndexMut break that?

@alercah I'll do both questions with one stone - the RFC proposes adding index_set to IndexMut.

d'oh.

Ok, I think that's worth posting as a PR then :D

Was this page helpful?
0 / 5 - 0 ratings