Rust: Tracking issue for RFC #495 (features `slice_patterns` and `advanced_slice_patterns`)

Created on 6 Mar 2015 · 79 Comments · Source: rust-lang/rust

New tracking issue: https://github.com/rust-lang/rust/issues/62254

Old content

Tracking issue for https://github.com/rust-lang/rfcs/pull/495

Breaking Changes

This RFC is a breaking change for most users of slice patterns. The main change is that slice patterns now have the type [_] instead of &[_].

For example, in the old semantics

fn slice_pat(x: &[u8]) {
    // OLD!
    match x {
        [a, b..] => {}
    }
}

the [a, b..] would have the type &[u8], a would have the type u8 and b the type &[u8].

With the new semantics, [a, b..] would have the type [u8] and b the type [u8] - which are the wrong types. To fix that, add a & before the slice and a ref before the tail as if you were matching a struct (of course, use ref mut if you want a mutable reference):

fn slice_pat(x: &[u8]) {
    // NEW
    match x {
        &[a, ref b..] => {}
    }
}
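For readers on a current compiler, here is a minimal sketch of the same match. It assumes the `binding @ ..` subslice spelling proposed later in this thread is available, since the `b..` spelling shown above is not accepted by current compilers. The function names `head_and_tail` and `head_and_tail_refs` are hypothetical and only serve the illustration.

```rust
/// Explicit form, mirroring the "NEW" example above: `&` matches through the
/// slice reference and `ref` borrows the tail instead of moving a `[u8]`.
fn head_and_tail(x: &[u8]) -> Option<(u8, &[u8])> {
    match x {
        &[a, ref b @ ..] => Some((a, b)),
        &[] => None,
    }
}

/// Shorter form relying on default binding modes: matching a `&[u8]`
/// against a slice pattern binds `a` as `&u8` and `b` as `&[u8]`.
fn head_and_tail_refs(x: &[u8]) -> Option<(&u8, &[u8])> {
    match x {
        [a, b @ ..] => Some((a, b)),
        [] => None,
    }
}
```

In both functions the tail is a borrowed `&[u8]`, never an unsized `[u8]` value.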

Concerns to be resolved before stabilization

  • [ ] The syntax conflicts with exclusive range patterns.
  • [ ] #8636 Matches on &mut[] move the .. match & don't consider disjointness
  • [ ] #26736 cannot move into irrefutable slice patterns with multiple elements
  • [x] #34708 double drop with slice patterns
  • [x] #26619 (Only E-needstest) Yet another bug with slice_patterns
  • [x] #23311 (Only E-needstest) LLVM Assertion: Both operands to ICmp instruction are not of the same type!

History and status

B-RFC-implemented B-unstable C-tracking-issue P-medium T-lang

All 79 comments

triage: P-backcompat-lang (1.0 beta) -- at least for the backwards incompatible parts of rust-lang/rfcs#495. This is something of a nice to have, we can live with it if it doesn't get done.

I'm on it, by the way.

https://github.com/jakub-/rust/tree/array-pattern-changes. I'll open a PR soon.

@jakub- go go go!

:racehorse:

(this is puntable to 1.0 if necessary, but we hope to have it by 1.0 beta.)

We are going to feature-gate array patterns, in order to ensure that we can soundly land these changes to the semantics in the future.

Moving off of milestone, since these are now gated.

triage: I-nominated

P-high, not 1.0.

I've repurposed this as a general tracking issue for the slice_patterns and advanced_slice_patterns features.

Syntax for named subslice a.. conflicts with potential future exclusive slice patterns. Either it should be changed, or exclusive slice patterns should be forever rejected, or something else.

So, once MIR is around, I'll feel much better about this feature from the POV of soundness. It's certainly convenient once in a while. But I have realized a few quirks recently:

  1. I think there is still some debate about whether [] patterns should match &[T] values. I think no, you should write the &.
  2. I don't think that there is a syntax for taking a mut subslice.
  3. The syntax conflicts with exclusive slice patterns (as @petrochenkov pointed out).
  4. It's a bit unclear to me if the slice part is intended as a fresh subpattern. Seems like it should be.

It would be also nice to have unnamed .. or even named sub.. sublists in tuple and especially tuple variant patterns, but whatever syntax choice is made for slices it probably can be reused there.

I don't think that there is a syntax for taking a mut subslice.

[a, ref mut b..] seems to compile; it just has not been updated to the by-value semantics from https://github.com/rust-lang/rfcs/pull/495. The subslice is an arbitrary pattern, after all.

On Thu, Nov 05, 2015 at 12:02:03PM -0800, Vadim Petrochenkov wrote:

I don't think that there is a syntax for taking a mut subslice.

[a, ref mut b..] seems to compile; it just has not been updated to the by-value semantics from https://github.com/rust-lang/rfcs/pull/495. The subslice is an arbitrary pattern, after all.

I would expect that to produce a &mut &[T] value. Is that not what it does? (In particular, the ref isn't normally required to get a value of type &[T], is it?)

In particular, the ref isn't normally required to get a value of type &[T], is it?

Currently it isn’t, but according to rust-lang/rfcs#495, array patterns should be changed to make ref be required to get a &[T]:

Make subslice matching in array patterns yield a value of type [T; n] (if the array is of fixed size) or [T] (if not).

Because of this, a ref binding should give a &[T; n] or &[T] (according to the RFC). For example:

let x: [i32; 3] = [1, 2, 3];
match x {
    [1, b..] => { /* b is [i32; 2] */ }
    [2, ref b..] => { /* b is &[i32; 2] */ }
    [a, ref mut b..] => { /* b is &mut [i32; 2] */ }
}

let x: &mut [i32] = &mut [1, 2, 3];
match x {
    // &mut [1, b..] => { /* b is [i32] (invalid because b is !Sized) */ }
    &mut [2, ref b..] => { /* b is &[i32] */ }
    &mut [3, ref mut b..] => { /* b is &mut [i32] */ }
    _ => {}
}

@nikomatsakis
But that's one of the main points of RFC 495.

| Discriminant type | Pattern | Binding types |
| --- | --- | --- |
| [T; N] | [a, b..] | a is T, b is [T; N - 1] |
| [T] | [a, b..] | a is T, b is [T] |
| [T; N] | [ref a, ref b..] | a is &T, b is &[T; N - 1] |
| [T] | [ref a, ref b..] | a is &T, b is &[T] |
| [T; N] | [ref mut a, ref mut b..] | a is &mut T, b is &mut [T; N - 1] |
| [T] | [ref mut a, ref mut b..] | a is &mut T, b is &mut [T] |
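The rows of this table can be checked with a small sketch. It is written with the `binding @ ..` spelling instead of `b..`, on the assumption that a compiler accepting that form is used. The function names are hypothetical, and the explicit type annotations are there only to make the binding types visible.

```rust
/// Array discriminant: the tail binding is itself an array.
fn array_bindings(x: [i32; 3]) -> (i32, [i32; 2], usize) {
    let [a, b @ ..] = x;
    let b: [i32; 2] = b; // by value: [T; N - 1]
    let [_, ref c @ ..] = x;
    let c: &[i32; 2] = c; // by reference: &[T; N - 1]
    (a, b, c.len())
}

/// `ref mut` on the tail of an array yields `&mut [T; N - 1]`.
fn zero_tail(mut x: [i32; 3]) -> [i32; 3] {
    let [_, ref mut tail @ ..] = x;
    let tail: &mut [i32; 2] = tail;
    for t in tail.iter_mut() {
        *t = 0;
    }
    x
}

/// Slice discriminant: the tail is unsized, so it has to be bound with `ref`.
fn slice_tail(x: &[i32]) -> Option<&[i32]> {
    match *x {
        [_, ref tail @ ..] => Some(tail),
        [] => None,
    }
}
```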

On Mon, Nov 09, 2015 at 11:49:37AM -0800, P1start wrote:

In particular, the ref isn't normally required to get a value of type &[T], is it?

Currently it isn’t, but according to rust-lang/rfcs#495, array patterns should be changed to make ref be required to get a &[T]:

OK, seems good. So I definitely wouldn't want to stabilize until we've
implemented these semantics and gained some experience with them.

Currently, needing to create a &[T] for the "rest" sub-pattern causes issues in MIR.
I'll add a hack to generate the len switch followed by treating the slice pattern as an array pattern, which is capable of generating that "rest" slice rvalue.

We should probably switch to the RFC semantics in MIR trans, once we get rid of the old trans code, as they simplify things quite a bit.

What are the remaining blockers for stabilizing this? Is there anything I can do to help?

@sgrif Implementing the RFC is necessary. @arielb1 started in #32202 but there hasn't been any activity on that PR for a while. I'll take over if he's okay with it.

@eddyb

My implementation of slice patterns is done. I only have to figure out what is the best way to get match checking to work (maybe just re-add that hack?)

Hmm perhaps my comment regarding .. should have been posted here instead.

I just tried to follow this snippet for Slice Patterns. The example failed and pointed me to this issue.

I tried to update it based on what was said here but running it results in "Program Ended". No errors are thrown :(

@polarathene It's not supposed to print anything, so that means it worked. The assertions were satisfied and the program ended normally.

@solson Oh right... I'm used to running examples that print something, so the "Program Ended" was unexpected; no idea why it didn't click that there were no actual print statements. Sorry about that!

Status (iiuc): the implementation is complete but recent (2016-06-09), so there are probably some bugs to iron out (see also #26158, #35044).

I believe there is also general uncertainty among the lang team that slice patterns are generally 'right'. Hopefully, we can get some experience with the implemented feature and think about possible interactions with other features.

Question: there was a conflict with exclusive ranges mentioned above by @nikomatsakis and @petrochenkov - is that still an issue or has something changed there?

Note PR #36353 in particular https://github.com/rust-lang/rust/pull/36353/commits/5baa6cf387e4a515e5f758c59e794ccad066f5ae is pointing out an issue with allowing moves of elements out of slices, and is disallowing it (a [breaking-change] that it notes in the description on that commit).

Open Issues (feel free to edit)

  • [ ] #8636 Matches on &mut[] move the .. match & don't consider disjointness
  • [ ] #26736 cannot move into irrefutable slice patterns with multiple elements
  • [ ] #34708 double drop with slice patterns
  • [ ] #26619 (Only E-needstest) Yet another bug with slice_patterns
  • [ ] #23311 (Only E-needstest) LLVM Assertion: Both operands to ICmp instruction are not of the same type!

EDIT by @nikomatsakis: moved list to main issue

The slice-patterns vs. borrowck bugs do not really drop anything. I just have to fix the double drop problem.

Note the concern about how the syntax here conflicts with inclusive range patterns. Not that this is new, just don't want it forgotten.

Proposal to resolve the conflict with inclusive range patterns: add an @ symbol:

[first, second, ..]
[first, second, tail @ ..]
[ref first, ref second, ref tail @ ..]
[ref mut first, ref mut second, ref mut tail @ ..]

It already has a similar role in other places in patterns: x @ Some(_)
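A minimal sketch of how the proposed forms read in practice, assuming a compiler that accepts `..` and `binding @ ..` inside slice patterns. The function names are hypothetical.

```rust
/// `[first, second, ..]`: ignore the rest of the slice.
fn first_two(v: &[i32]) -> Option<(i32, i32)> {
    match *v {
        [first, second, ..] => Some((first, second)),
        _ => None,
    }
}

/// `[_, _, tail @ ..]`: name the rest of the slice.
fn tail_after_two(v: &[i32]) -> &[i32] {
    match v {
        [_, _, tail @ ..] => tail,
        _ => &[],
    }
}

/// `ref mut tail @ ..`: borrow the rest mutably.
fn double_tail(v: &mut [i32]) {
    if let [_, _, ref mut tail @ ..] = *v {
        for t in tail.iter_mut() {
            *t *= 2;
        }
    }
}
```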

From https://github.com/rust-lang/rust/issues/23121#issuecomment-155171250:

Discriminant type | Pattern | Binding types
--- | --- | ---
[T; N] | [a, b..] | a is T, b is [T; N - 1]
[T] | [a, b..] | a is T, b is [T]

The second one would not work, right? Since [T] is dynamically sized and therefore cannot be directly on the stack. b in the pattern would need to be at least ref b. (And the discriminant would need to be behind &_ or another memory indirection, with a corresponding &_ pattern.)

ref tail @ is the start of a pattern with a binding

ref tail @ PATTERN

and while .. itself is not a pattern, a pattern can start with .., for example a half-open exclusive range pattern

ref tail @ ..RANGE_END

The syntax is still viable, but the solution is a bit hacky: add a "part of array pattern" flag to the pattern-parsing function and disambiguate in favor of a subslice if this flag is true and .. is followed by , or ].

One more problem is that the current syntax is PATTERN.. and your suggestion reduces it to BINDING @ .., i.e. the subslice cannot be further deconstructed in place anymore.

Oh, but, range patterns with missing bounds ..END, BEGIN.., ...END, BEGIN... are not implemented yet, so ref tail @ ..RANGE_END won't be a problem if they are never implemented.

There is no .. pattern for RangeFull, right? So one token of look-ahead disambiguates ref range_to @ ..end from ref tail @ .. (where nothing follows the dots).

the subslice cannot be further deconstructed in place anymore.

Can you give an example of how the previous syntax could do that?

@SimonSapin
Something like

    match [1, 2, 3, 4] {
        [a, x @ [10, 11].., d] => {}
        [a, [b, c].., d] => {}
    }

works now.
Actually

[a, ref mut b.., c]

is a special case of [PAT, PAT.., PAT] as well.

To make disambiguation simple and also keep the current functionality it would be ideal to use some prefix disambiguator.

[PAT, PREFIX PAT, PAT]

where PREFIX immediately means that we are going to parse a subslice pattern and not an element pattern.
For example, your syntax in "inverted" form:

[PAT, .. @ PAT, PAT] // full form
[PAT, .., PAT] // shorthand for [PAT, .. @ _, PAT]

[a, [b, c].., d] is the same as [a, b, c, d], isn’t it? As to [a, x @ [10, 11].., d], well… I could live without it. It could be replaced with slice-indexing expressions: y @ [a, 10, 11, d] => { let x = &y[1..3]; /* … */ }
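The slice-indexing replacement suggested here can be sketched as follows. The function name `middle_if_10_11` is hypothetical, and the match is on a borrowed slice instead of an array so that the returned subslice can borrow from the input.

```rust
/// Matches the whole slice first, then recovers the middle part by indexing
/// instead of binding it inside the pattern.
fn middle_if_10_11(y: &[i32]) -> Option<&[i32]> {
    match y {
        // The pattern has already checked the length, so `1..3` is in bounds.
        [_a, 10, 11, _d] => Some(&y[1..3]),
        _ => None,
    }
}
```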

Yeah, I didn't say it's useful, only that it works at the moment 😄

One more problem with BINDING @ .. is that a simple .. stops being convenience sugar for PAT.. (where PAT is _) and becomes a separate special case. .. cannot be desugared to <dummy_name> @ .. due to different ownership semantics (BINDING @ .. moves the subslice and .. doesn't).
This problem doesn't arise for .. @ PAT (or any other PREFIX PAT).

EDIT: Basically, PREFIX PAT would be the least hacky and simultaneously the least invasive re-syntaxing.

EDIT2: Actually, PAT @ .. (instead of BINDING @ ..) is good too if parse_pattern never considers @ to be a part of the pattern if it's followed by .., or ..]. Just a bit more hacky.

Is it too late to change the syntax in more drastic ways?

If no, I'd have this proposal: Add "++" as a concat operator to the language in both expression and pattern form:

let x = vec![1, 2, 3] ++ &[4, 5, 6];
let y = "abc".to_string() ++ "def";
let z = (1, true) ++ ("test", ());

match *x {
    [1, 2, 3, 4] ++ ref v => ()
}

match *y {
    "abcd" ++ ref v => ()
}

match z {
    (a, b) ++ v => ()
}

Pro:

  • No more conflict with a .. pattern
  • We can stop using the + operator for things that are not commutative addition
  • Uniform treatment of slices and string slices

Con:

  • Huge change
  • More verbose syntax

Update: Added tuples

@Kimundi Another con is that it would complicate the choice for VG further, unless we use it there too.
That is, I sort of want something like ...x within [] and () to desugar to ++ x ++ outside them.

why not use + in patterns?

We already support matching on &str on literals and constants. Could we similarly allow (stabilize) matching slices on array literals and array/slice constants, while we're still pondering on some of the more advanced uses?

How about this syntax: [a, b, ..rest..] => {}

I agree with @jethrogb in that it would be amazing if we could move towards stabilising the simple, unambiguous cases for fixed-size arrays first.

Most of the games/graphics ecosystems use something like [f32; 3] for Point/Vertex types, or [[f32; 4]; 4] for matrix types, etc. It would be a big ergonomic win to be able to pattern match on these as they are used very frequently.

let [x, y, z] = add(a, b);
match point {
    [0.0, 0.0] => { ... },
    [x, y] if x > xmin && y > ymin && x < xmax && y < ymax => { ... },
    p => OutOfBounds(p)?,
}
match mat {

    [[1.0, 0.0, 0.0, x],
     [0.0, 1.0, 0.0, y],
     [0.0, 0.0, 1.0, z],
     [0.0, 0.0, 0.0, 1.0]] => foo(x, y, z),

    [a, b, c, _] => bar(a, b, c),
}

Matching on Point or Matrix types sounds like a bad idea considering PartialEq and https://github.com/rust-lang/rust/issues/41255.

@lnicola ahh true, avoiding matching on float literals is a good point. However even without matching on literals, being able to "destructure" fixed size arrays would still be massively useful.
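A minimal sketch of the destructuring use case, without any float literals in the patterns. The `add` function is named in the example above, but its body here is an assumption made for illustration.

```rust
/// Component-wise addition of two points stored as fixed-size arrays.
/// Both arguments are destructured with irrefutable array patterns.
fn add(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    let [ax, ay, az] = a;
    let [bx, by, bz] = b;
    [ax + bx, ay + by, az + bz]
}
```

A caller can destructure the result the same way, for example `let [x, y, z] = add(a, b);`.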

Yup, that sounds like a much better motivating example.

being able to "destructure" fixed size arrays would still be massively useful.

It would, but I don't think that would be covered under a fast-tracked stabilization of trivial slice patterns.

[a, b, c, ..] and [a, b, .., c] (without attaching a pattern to ..) have no syntactic problems and could be stabilized right now if all the bad codegen were fixed (cc @arielb1, is it fixed?). EDIT: I see https://github.com/rust-lang/rust/issues/34708 is still open.

I don't think borrow checker limitations like https://github.com/rust-lang/rust/issues/8636 (where less code is permitted than potentially could be) are blockers for stabilization.

The compiler doesn't do tail call elimination for this snippet:

#![feature(slice_patterns)]
#![feature(advanced_slice_patterns)]

fn foldr<A, B>(f: fn(&A, B) -> B, acc: B, xs: &[A]) -> B {
  match *xs {
    [] => acc,
    [ref xs.., ref x] => foldr(f, f(x, acc), xs),
  }
}
pub fn main() -> i32 {
  foldr(|a, b| a + b, 0, &[1, 2, 3])
}

https://godbolt.org/g/CWc779

Would it be possible to have it? If so, I'd like to request taking it into account.

I guess the lack of tail call elimination is a more pervasive problem, e.g. https://godbolt.org/g/uV3Gmn.
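Since tail call elimination is not something the pattern syntax can guarantee, one workaround is to write the fold as a loop. The sketch below shows the recursive version in the `binding @ ..` spelling (assuming a compiler that accepts it) next to a loop built on `split_last`. The names `foldr_loop` and `sub` are hypothetical.

```rust
/// Recursive right fold, as in the snippet above but with `rest @ ..`.
fn foldr<A, B>(f: fn(&A, B) -> B, acc: B, xs: &[A]) -> B {
    match xs {
        [] => acc,
        [rest @ .., x] => foldr(f, f(x, acc), rest),
    }
}

/// Same fold written as a loop, so stack depth does not grow with the input.
fn foldr_loop<A, B>(f: fn(&A, B) -> B, mut acc: B, mut xs: &[A]) -> B {
    while let Some((x, rest)) = xs.split_last() {
        acc = f(x, acc);
        xs = rest;
    }
    acc
}

/// A non-commutative step function, useful for checking the fold order.
fn sub(a: &i32, b: i32) -> i32 {
    *a - b
}
```

With `sub`, both versions compute `1 - (2 - (3 - 0))` for the input `[1, 2, 3]`.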

I'm not sure this is the right place to post it, but destructuring arrays is resulting in broken codegen. It took me about 2 full days of work to track the cause of a SIGSEGV in my program down to something like:

#![feature(slice_patterns)]

#[inline(never)]
fn test1() -> [Vec<usize>; 2] {
    [vec![0], vec![0]]
}

pub fn main() {
    let [a, b] = test1();
}

https://play.rust-lang.org/?gist=f4963044f5cad1834ca619471c047fb6&version=nightly

If you look at either MIR or Assembly it looks like both a and b are being freed twice.
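One way to observe this kind of bug without reading MIR is to count drops. The sketch below is a hypothetical regression-style check, not the test used in the compiler: `CountsDrops`, `make_pair` and `drops_after_destructuring` are names invented here. On a compiler without the double-drop bug, each element should be dropped exactly once.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Increments a shared counter every time a value is dropped.
struct CountsDrops(Rc<Cell<usize>>);

impl Drop for CountsDrops {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Stand-in for `test1` above: returns an array of two non-Copy values.
fn make_pair(counter: &Rc<Cell<usize>>) -> [CountsDrops; 2] {
    [CountsDrops(Rc::clone(counter)), CountsDrops(Rc::clone(counter))]
}

/// Destructures the array and reports how many drops were observed.
fn drops_after_destructuring() -> usize {
    let counter = Rc::new(Cell::new(0));
    {
        let [a, b] = make_pair(&counter);
        drop(a);
        drop(b);
    }
    counter.get()
}
```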

@nikomatsakis I'm confused, how does drop elaboration not simply take care of it? Is drop elaboration just broken for const indexing?

Is there any way I can use this on stable and beta? I'm not doing anything fancy. Just matching against the last two vector items and the first vector item as a separate match.

&[2,1] => {}

I don't want to turn my crate into a nightly only package just for this.

UPDATE:

Nevermind; I figured out how to do it with indexing, cloning the innards, and tuple results.

Based on @petrochenkov's comment back in July, it seems like we could consider a subset of this for stabilization. @rust-lang/lang, thoughts?

Hmm, I'd like to see the subset clearly defined (and see the relevant tests). I am somewhat nervous about stabilizing things here until we have the drop elaboration story fully worked out.

Summary from the @rust-lang/lang meeting: We need a fix for https://github.com/rust-lang/rust/issues/34708 first. As soon as we have that fix, let's revisit this and decide if we should stabilize a subset or wait to stabilize the whole thing.

My take is that I would rather that we stabilize the whole thing at once. I'm a bit nervous to stabilize syntax involving .. if we haven't yet figured out how to give a name to the slice or otherwise define the recursive pattern.

There's no good alternative to the plain .. (without a name), all other patterns (structs, tuple structs, tuples) use it to mean "rest of the list".

There are few viable alternatives for syntax ".. + name" in this thread, but I wouldn't want to have stabilization of the most useful subset blocked on bikeshedding them.

I'd bet $5 that if we made it an error to use .. on a &[RangeFull] or ..name on a &[RangeTo<T>] that nobody would ever notice.

@petrochenkov

I wouldn't want to have stabilization of the most useful subset blocked on bikeshedding them.

I am not sure that [a, .., b] is the most useful subset. I would have thought that the most useful subset is [a, b..] or [a.., b] -- that is, extract something from the front or back and slice the rest. Certainly that's the main thing that I ever want. (Though we have handy methods for it.)

I just want matching against constant fixed-length arrays. The .. stuff you can already do with split_at.
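A minimal sketch of that combination: `split_at` does the slicing, and a pattern without `..` matches the fixed-length part against constants. The function name `starts_with_magic` and the byte values are hypothetical.

```rust
/// Returns true if `data` begins with the four expected bytes.
fn starts_with_magic(data: &[u8]) -> bool {
    // `split_at` panics if the index is out of bounds, so check the length first.
    if data.len() < 4 {
        return false;
    }
    let (head, _rest) = data.split_at(4);
    match head {
        [0xCA, 0xFE, 0xBA, 0xBE] => true,
        _ => false,
    }
}
```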

I agree with @jethrogb in that it would be amazing if we could move towards stabilising the simple, unambiguous cases for fixed-size arrays first.

I could get behind this plan (i.e., no .. at all).

Why not x @ ..?

If you have something like [ref mut a, x @ .., ref mut b], what's wrong with having a mutable ref to a, a mutable ref to b, and a mutable slice x that doesn't include a or b? The aliasing still holds up, no?

Edit: Oh I see, I should've read the full thing.
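For reference, a sketch of the disjoint-borrow case asked about above, assuming a compiler that accepts `x @ ..` in slice patterns. The function name `bump_ends` is hypothetical. Matching a `&mut [i32]` binds the ends as `&mut i32` and the middle as a `&mut [i32]` that overlaps neither.

```rust
/// Increments the first and last elements and doubles everything in between.
/// Returns the length of the middle part, or `None` for fewer than two elements.
fn bump_ends(v: &mut [i32]) -> Option<usize> {
    match v {
        [a, middle @ .., b] => {
            *a += 1;
            *b += 1;
            for m in middle.iter_mut() {
                *m *= 2;
            }
            Some(middle.len())
        }
        _ => None,
    }
}
```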

Why not remove the .. entirely?

[a, @, b] if you don't want the thing, [a, x@, b] to capture the thing?

(Not sure if this has been suggested)

https://github.com/rust-lang/rust/issues/34708 was fixed by @mikhail-m1.
Any other codegen bugs blocking stabilization?

@petrochenkov of what specific subset?

If you want to stabilize, can you please open up a new issue that describes the behavior we are stabilizing, giving examples of tests that show how it works? It would be great to also specify what is not being stabilized. Example: https://github.com/rust-lang/rust/issues/48453

I don't know if there are other bugs, I know that @mikhail-m1 has been working on improving various double-drop bugs and analysis, but I think more for slice patterns.

My view of the current state:

item | state
-----|------
The syntax conflicts with exclusive range patterns. | completed
#8636 Matches on &mut[] move the .. match & don't consider disjointness | I will look; need to check whether support for nested cases is needed
#26736 cannot move into irrefutable slice patterns with multiple elements | bug only in AST borrowck
#34708 double drop with slice patterns | fixed
#26619 (Only E-needstest) Yet another bug with slice_patterns | not related
#23311 (Only E-needstest) LLVM Assertion: Both operands to ICmp instruction are not of the same type! | marked done

Excellent, looks like there are no codegen issues and no issues that may require breaking backward compatibility in the future.

In this case I'll prepare a stabilization PR for slice patterns without .., and a mini-RFC attempting to finalize the syntax for slice patterns with .. in them.

I just proposed stabilizing the subset without .. patterns in this issue:

https://github.com/rust-lang/rust/issues/48836

@petrochenkov Given the mutual impls of PartialEq for String and &str, I expected this to work: https://play.rust-lang.org/?gist=8157ad5c25d456007b27e3b62b8ce866&version=beta

It seems too strict to require the pattern to have the same element type as the matched slice.

@abonander
This looks more like some extension to https://github.com/rust-lang/rust/issues/42640 using Deref rather than something specific to slice patterns.

Any update on this? Being able to pattern match against Vecs and arrays is hugely important for many people coming from functional languages.

Rust has had basic slice patterns since 1.26. More advanced patterns are being discussed in https://github.com/rust-lang/rfcs/pull/2359.
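A minimal sketch of what the basic subset covers: fixed-length patterns with no `..`, matched directly against a borrowed slice. The function name `classify` is hypothetical.

```rust
/// Classifies a slice using only fixed-length slice patterns.
fn classify(v: &[i32]) -> &'static str {
    match v {
        [] => "empty",
        [_] => "one",
        [2, 1] => "two-one",
        [_, _] => "pair",
        // Without `..`, longer slices need a catch-all arm.
        _ => "longer",
    }
}
```

The same patterns apply to a `Vec<i32>` once it is borrowed as a slice, for example `classify(&v)`.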

@jethrogb Ah gotcha, what's the current recommended approach to destructure a Vec into head/tail?

@jethrogb Ok, thanks. I'm hoping to be able to match against [] vs x :: xs in one fell swoop, but split_at with an if would technically work.

You can match on split_first() for that.
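A minimal sketch of that suggestion: `split_first` returns `None` for an empty slice and `Some((head, tail))` otherwise, which maps onto the `[]` versus `x :: xs` cases. The function name `sum` is hypothetical.

```rust
/// Recursive sum in the head/tail style familiar from functional languages.
fn sum(xs: &[i32]) -> i32 {
    match xs.split_first() {
        None => 0,
        Some((x, rest)) => *x + sum(rest),
    }
}
```

A `Vec<i32>` can be passed as `sum(&v)`, since `&Vec<i32>` coerces to `&[i32]`.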
