Rust: Tracking issue for RFC 2535, 2530, 2175, "Or patterns, i.e `Foo(Bar(x) | Baz(x))`"

Created on 7 Oct 2018  路  44Comments  路  Source: rust-lang/rust

This is a tracking issue for the RFC "Or patterns, i.e Foo(Bar(x) | Baz(x))" (rust-lang/rfcs#2535).

This issue also tracks the changes in rust-lang/rfcs#2175 and rust-lang/rfcs#2530 since RFC 2535 subsume those.

Steps:

  • [x] Implement the RFC (cc @rust-lang/compiler -- can anyone write up mentoring instructions?)
  • [ ] Polish the implementation
  • [x] Dogfood or-patterns in the compiler and standard library (done in #71220, #71221, and #71231)
  • [ ] Accumulate good test coverage
  • [ ] Adjust documentation (see instructions on forge)
  • [ ] Stabilization PR (see instructions on forge)

Unresolved questions:

  • [ ] Should we allow top_pat or pat<allow_top_alt> in inferrable_param such that closures permit |Ok(x) | Err(x)| without first wrapping in parenthesis?

    We defer this decision to stabilization as it may depend on experimentation. Our current inclination is to keep the RFC as-is because the ambiguity is not just for the compiler; for humans, it is likely also ambiguous and thus harder to read.

    This also applies to functions which, although do not look as ambiguous, benefit from better consistency with closures. With respect to function arguments there's also the issue that not disambiguating with parenthesis makes it less clear whether the type ascription applies to the or-pattern as a whole or just the last alternative.

  • [ ] Should the pat macro fragment specifier match top_pat in different
    Rust editions or should it match pat<no_top_alt> as currently specified? We defer such decisions to stabilization because it depends on the outcome of crater runs to see what the extent of the breakage would be.

The benefit of avoiding pat<no_top_alt> in as many places as possible would both be grammatical consistency and fewer surprises for uses. The drawbacks would be possible ambiguity or backtracking for closures and breakage for macros.

Implementation history:

A-exhaustiveness-checking A-patterns B-RFC-approved B-RFC-implemented B-unstable C-tracking-issue F-or_patterns I-nominated T-lang

Most helpful comment

What remains now is primarily hardening of tests, fixing smaller bugs where they show up, as well as dogfooding the implementation in the compiler and standard library.

All 44 comments

I've closed https://github.com/rust-lang/rust/issues/48215 in favor of this issue to track all related changes since https://github.com/rust-lang/rfcs/pull/2535 subsumes RFC 2175.

I would suggest that #![feature(or_patterns)] should imply #![feature(if_while_or_patterns)] and then we will eventually stabilize both gates if/once we stabilize or_patterns.

Minor thing that I just noticed: anonymous enums with the syntax of "tuple-but-|-instead-of-,", (A|B), is syntactically ambiguous with this feature, since we have parenthesized patterns.

@CAD97 example of a snippet that would show the ambiguity?

I confused things between patterns and types:

match it {
    it: (A|B) => (), // it with type ascription (A|B)
    it @ (A|B) => (), // it binding to pattern (A|B)
}

So long as patterns and types aren't valid in the same position, this is ok. (I confused myself since tuples are valid patterns.)

I've got a branch for this that I think is close, but as I'm quite busy at the moment, I'm not sure how soon I'll finish it; if anyone else would like to take this over in the meantime, just leave a comment.

@varkor Where's your branch? I'm happy to take over if you like... especially if it gets const generics to land any sooner. ;-)

@alexreg: it's at https://github.com/varkor/rust/tree/or-patterns. I think where I left it, there was some crash when you tried using or-patterns, but most of the boilerplate is in. Feel free to play around with it to see if you can get it working. (Though I'm not prioritising this over const generics, don't worry 馃槈)

@dlrobertson BTW, have you taken into account the unresolved questions whilst working on this?

@alexreg thanks for the ping. Haven't really thought about it too much. I've mostly been experimenting with the code, but at this point I am not keen on changing more than the pattern matching code for the following reasons:

  • I don't see much of a benefit in allowing |Ok(x) | Err(x)|
  • | feels very ambiguous here. When we hit a | when parsing a closure, is it a or-pattern
    or the end of the closure argument patterns? We would have to parse past the end to find out.

That being said, I'm still new, so I might be missing something.

@dlrobertson Thanks for working on this. The more important part to test with crater is whether top_pat can be used for pat macro fragments or if it has regressions so it would be good to start with top_pat and maybe switch to pat<no_top_alt> based on the data.

I think it's also worth considering fn foo(Ok(x) | Err(x): Result<A, B>) { .. } without parens since that doesn't look ambiguous like with closures (the part about type ascription is also irrelevant since both p and q must have the same type in p | q so it shouldn't matter what the ascription applies to). But we can leave that for later.

@dlrobertson That's fair enough. What @Centril said is good advice.

Back to looking at this. An update for anyone following this (and a note to self so I don't forget things).

TODO:

  • [x] Figure out lowering (particularly figure out if lowering should occur in HAIR -> MIR or HIR -> HAIR).
  • [x] be sure to remove the Vec<Pat> from match and while/if let since we now have a p | q pattern form. - @Centril
  • [x] ensure parsing conforms to the RFC

I have or-patterns in a usable state on a local branch. I'm sure there are bugs, but the examples I have work. The change is much larger than I expected. As a result, I plan to submit support for a subset of the syntax proposed in the RFC over 4 PRs:

  • [x] basic parsing (#61708)
  • [x] lowering for simple patterns (no subpatterns) (#63688)
  • [x] lowering patterns that include subpatterns
  • [x] exhaustive match checks

After these are submitted and merged I will continue to work on implementing the full syntax proposed in the RFC and fix bugs found along the way :smile:

Basic parsing PR https://github.com/rust-lang/rust/pull/61708 landed on 2019-08-18. It was written by @dlrobertson and reviewed by @varkor, @Centril, @petrochenkov, @matthewjasper, and @alexreg.

Would implementation of this RFC also apply to patterns in if conditionals?

For example, could I do something like this?

if some_str == "this" | "that" | "whatever" {
// ...
}

EDIT: Reading the guide level section of the RFC, I would suppose not, as the contents of an if conditional is not really a pattern in the same way that it is in if let or match, and I suppose today you can already do something like this with match:

match some_str {
    "this" | "that" | "whatever" => {},
    _ => {}
}

@jamesmunns Or with this if let:

fn discovered_universal_model_of_computation(name: &str) {
    if let "Alan" | "Alonzo" | "Kurt" = name {
        println!("Yes, {} did discover a turing equivalent model of computation.", name);
    } else {
        println!("Nope, but {} was pretty cool nonetheless.", name);
    }
}

fn main() {
    for name in &["Alan", "Alonzo", "Kurt", "Voevodsky"] {
        discovered_universal_model_of_computation(name);
    }
}

In https://github.com/rust-lang/rust/pull/64128, written by @Centril and reviewed by @davidtwco, issue https://github.com/rust-lang/rust/issues/64106 was fixed where the compiler would spuriously emit unused_parens warnings on:

  • ref? mut? x @ (p_0 | ... | p_n)
  • box (p_0 | ... | p_n)
  • & mut? (p_0 | ... | p_n)
  • |(p_0 | ... | p_n)| $expr
  • fn foo((p_0 | ... | p_n): $type)

In https://github.com/rust-lang/rust/pull/64111, written by @Centril and reviewed by @petrochenkov, the late-resolution behavior of or-patterns in nested positions was fixed as was the already-bound check and binding-consistency check. Moreover, the AST now uniformly uses ast::PatKind::Or and has no special means of representing or-patterns at the top level.

I'd be interested in implementing the exhaustiveness-checking part of this feature, since I've become familiar with the exhaustiveness algorithm

@Nadrieril Take a look at https://github.com/rust-lang/rust/pull/63688/files#diff-11dbfaea8934544f878f807e6285a221 -- it may be that @dlrobertson has already solved the issue there, but we could expedite this part. Also, we should include some tests for some non-exhaustive cases (the exhaustive cases may need further work in MIR to not ICE). It might be best to fix this once https://github.com/rust-lang/rust/pull/64508 has landed tho.

@Nadrieril great to see that you're interested in helping out! Due to the shift in the style of implementation ("expanding" the or-pattern at the Candidate level instead of manually manipulating the CFG) the exhaustiveness changes will be made in the MIR generation PR. That being said, as @Centril noted tests etc will definitely be needed/appreciated.

On 2019-09-25, https://github.com/rust-lang/rust/pull/64508, written by @Centril and reviewed by @matthewjasper, @varkor, and @nikomatsakis, landed. The PR adjusted the HIR and HAIR to consistently use or-patterns at the top & nested levels. Moreover, liveness, typeck, and dead_code analyses were adjusted also. Notably, The PR did however not adjust exhaustiveness checking. Tangentially, https://github.com/rust-lang/rust/pull/64271, landed 2019-09-12, written by @Centril and reviewed by @estebank, adjusted check_match diagnostics logic to be better prepared for or-patterns.

Updates since last time:

On 2019-12-03, #66967, written by @Nadrieril and reviewed by @varkor and @Centril, landed. In this follow-up to #66612, the top-level hack for or-patterns in exhaustiveness checking was removed, making or-patterns truly first-class in match checking.

On 2019-12-21, https://github.com/rust-lang/rust/pull/67428, written by @Centril and reviewed by @matthewjasper, landed. This PR adjusted regionck's resolve_local::is_binding_pat such that the P& grammar includes or-patterns.

On 2019-12-22, https://github.com/rust-lang/rust/pull/67439, written by @Centril and reviewed by @matthewjasper, landed. Among other things, this PR simplified HAIR lowering of or-patterns a smidgen.

On 2019-12-26, https://github.com/rust-lang/rust/pull/67592, written by @matthewjasper and reviewed by @Centril, landed. This PR did some preparatory cleanups in MIR lowering of match expressions before the MIR level support for or-patterns is added. Moreover, the PR gated or-patterns in const contexts under const_if_match.

What remains now is primarily hardening of tests, fixing smaller bugs where they show up, as well as dogfooding the implementation in the compiler and standard library.

  • On 2020-02-15, https://github.com/rust-lang/rust/pull/68856, written by @Centril and reviewed by @matthewjasper landed. The PR polished the type checking of patterns with respect to default binding modes. In relation to or-patterns, the PR declared that or-patterns are pass-through (AdjustMode::Pass) in terms of default binding modes. This was already the case in the implementation (and already stabilized for top level or-patterns) but the PR removed the FIXME that left the question open.
  • On 2020-03-11, https://github.com/rust-lang/rust/pull/69891, written by @Centril and reviewed by @varkor and @Nadrieril, landed. The PR fixed a bug (https://github.com/rust-lang/rust/issues/69875) in exhaustiveness checking where the patterns 0 | (1 | 2) and x @ 0 | x @ (1 | 2) would result in an ICE.

Is there anything blocking a subset of this from being stabilized? It seems like the most basic form of this feature is working and used throughout the compiler and std.

The most pressing blocking issue is the remaining ICE: https://github.com/rust-lang/rust/issues/72680, though there appear to be a couple of outstanding design questions. I would have thought it would be close to ready for stabilisation apart from that.

@varkor I've opened a PR for #72680, and I'm wondering if the design questions can be resolved after stabilization. It seems like both are backwards-compatible changes we can do later, right?

EDIT: I mean #78856

The former design question (about allowing | in closure arguments) does not need to be resolved immediately (alternatively, I think it should just be resolved in the negative now, because allowing them would be very confusing). But the latter seems like a backwards-incompatible change, if macro behaviour depends on it? But it seems like this was already considered for an edition change instead of being implemented immediately, anyway, so perhaps it can be deferred: the potential for increasing ambiguity for macros doesn't sound promising.

Personally, I'm inclined to wait for it to be a problem and address it in a future edition if needed. What would the process be for resolving the question? An MCP? An FCP? Or should I just open a stabilization PR?

One would need to make sure that we have extensive tests for the feature, and also that sufficient documentation exists. Then you could open up a stabilisation PR, and the lang team could FCP on the PR.

I'll nominate this for the @rust-lang/lang meeting, also to discuss the two unresolved questions. I think I agree with @varkor about the first one (closure arguments) -- I would definitely find such notation quite confusing. The second one is less clear to me (modifications to macro pattern matching), but it is almost certainly backwards incompatible. We've definitely been bitten by changes like that in the past breaking macros in the wild.

:+1: for not allowing this notation in closure arguments.

We discussed this in the 2020-11-10 @rust-lang/lang meeting. Our conclusion:

  • We seem to have consensus that we shouldn't allow this at the top level of closure arguments. It should suffice that we allow it within parentheses or deeper into a pattern. For instance, you can write |SomeNewtype(A(x) | B(x))| .
  • We feel that it should be possible to match patterns that use or-patterns using :pat in a macro, and this ought to happen along with the stabilization, so that people don't end up having to work around it in their macros. The most likely case where that would introduce breakage would be a macro that's trying to implement or-pattern syntax, and if you write such a macro in the most obvious way, changing :pat shouldn't break it. There may be other macros that are using | as a generic separator, though. We'd like to see a crater run to gauge the impact of this, but we're favorable.

I created https://github.com/rust-lang/rust/pull/78935 for a crater run.

Was this page helpful?
0 / 5 - 0 ratings