Rust: Change behavior of `?` as a macro separator and Kleene op in 2018 edition

Created on 30 Jun 2018  路  38Comments  路  Source: rust-lang/rust

This is a request for FCP on PR #51587, according to https://github.com/rust-lang/rust/pull/51587#issuecomment-401428361.

Tracking issue: #48075 (? macro repetition)

Current behavior:

  • ? is a macro separator, not a Kleene op.

Proposed new behavior (implemented in #51587):

  • Edition 2015: ? is a macro separator, not a kleene op, but using it as a separator triggers a migration lint.
  • Edition 2018: ? is _not_ a valid separator. ? _is_ a valid Kleene op.

EDIT: To clarify: Under this proposal, $(a)?+ matches + and a+ unambiguously.

Why we chose this behavior:

  • It's the cleanest choice both in terms of end-user experience and implementation. See the alternatives.

Alternatives: The main problem to address is ambiguity. Currently, ? is a valid separator, so there are a lot of ambiguous possibilities like $(a)?+. Is the ? a separator for 1 or more iterations OR is it a zero-or-one kleene op followed by a + token? The solutions fall into 2 categories: make breaking changes or add fallbacks.

  1. Fallbacks

    • The original implementation (currently on nightly behind feature gate macro_at_most_once_rep):



      • The ? kleene op is allowed to take a separator which can never be used (we don't allow trailing separators and ? allows at most one repetition).


      • When parsing a macro, we disambiguate by looking after the separator: if ? is followed by +, *, or ?, we treat the ? as a separator and the second token as a kleene op.



    • On the tracking issue, we decided that it is confusing for ? to take a separator, so the rules would be that ? followed by + or * is a separator, but ? followed by ? or any other token would be a ? kleene op followed by the other token.



      • This is a viable implementation, but it seems like a lot of special cases and the code gets messy. This led us more towards the breaking changes options.



    • There are probably lots of variations of fallbacks we could do, but they all have weird corner cases.

  1. Breaking changes

    • Make a breaking change in the _2015_ edition (implemented in #49719, accidentally merged without FCP, and rolled back).



      • Disallow ? as a separator in _2015_ edition.


      • Disallow ? kleene op from taking a separator.


      • This means that $(a)?+ matches + and a+ unambiguously. However, it is a _breaking change_. This was deemed not worthy of a breaking change in the 2015 edition.



    • Make a breaking change in the _2018_ edition and lint in the _2015_ edition. No change to behavior in _2015_. This is the current proposal implemented in #51587 and is the subject of this FCP.



      • The primary downside here is that it's not clear what happens if you import a macro from 2015 into 2018 that uses ? as a separator. Currently, you would just get a compile error.


      • I believe the breakage is likely to be extremely rare. A crater run showed no breakage (https://github.com/rust-lang/rust/pull/49719#issuecomment-381364742). However, we did found out later that there was apparently some breakage elsewhere (https://github.com/rust-lang/rust/pull/49719#issuecomment-394190428).


      • The implementation is significantly cleaner and more maintainable and the behavior is more consistent for end-users.



cc @nikomatsakis
cc @petrochenkov Reviewed the second attempt
cc @durka @alexreg @clarcharr @Centril @kennytm @ExpHP Who participated extensively in the original design discussions (EDIT: apologies if I missed anyone; there were a lot of people who came in at various points)
cc @oli-obk Who brought up clippy breakage
cc @sgrif @pietroalbini Who wanted to see an FCP

Whew... that's a lot of people :)

T-lang disposition-merge finished-final-comment-period

Most helpful comment

:bell: This is now entering its final comment period, as per the review above. :bell:

All 38 comments

Nice work, @mark-i-m. I think we finally have a fully appropriate solution in all cases, at least as I understand it. LTGM!

I think this is too much hassle over nothing.
This is a very useful feature and crater found zero regressions, we should make a "breaking change" implement it on edition 2015.

The one symbol lookup checking for another Kleene op and reducing the breakage from zero to absolute zero also doesn't look like too much complexity or special casing to me.

Also, the separator should be allowed on both left-hand and right-hand sides of a macro (this is the first thing I tried to use when trying ? in https://github.com/rust-lang/rust/pull/51726/commits/399da7bc351714a0bc829bbbf2ab1f0b3e4f60f8), or conservatively accepted and reported as a non-fatal error.

I don't really see a point in making breaking changes like this in the current edition, especially now that we have editions and the next one is right around the corner.

What would happen for macro definitions across editions?

// edition (4033 - A)
macro_rules! define_macro {
    ($($t:tt)*) => { 
        macro_rules! use_macro {
            ($($t)*) => {}
        }
    }
}

// edition A
define_macro!($(a)?*);

use_macro!(a?a?a);  // ???
use_macro!(a*);     // ???

@kennytm In the current design, if you use ? as a separator in 2018, you get an error. Perhaps this is a rare enough case that that's ok? The fix is frankly trivial: use any other separator than ?...

@kennytm In the current design, if you use ? as a separator in 2018, you get an error. Perhaps this is a rare enough case that that's ok? The fix is frankly trivial: use any other separator than ?...

@mark-i-m but can a 2018 crate use a macro defined in a 2015 crate that uses ? as a separator?

(going even further down the rabbit hole, there is the possibility of a 2015 crate with a macro that generates a macro_rules! definition that uses ? as a separator. ISTM that this clearly would not be callable from 2018... however, the odds of any such macro existing seem to be a trillion to one)

Also, it is unclear to me from the summary, but:

This means that $(a)?+ matches + and a+ unambiguously.

Is this describing the new proposed behavior in 2018, or the behavior of an alternative design? It seems desirable to me.

@ExpHP

@mark-i-m but can a 2018 crate use a macro defined in a 2015 crate that uses ? as a separator?

Sorry, perhaps I misunderstanding. What is the distinction from @kennytm's point?

Yes, in 2018 (according to the current plan) if you were to use ? as a separator (even coming from a 2015 crate), you would get an error. The expectation is that this case is unlikely enough and easy enough to fix that we are ok with the breakage across an edition boundary.

This means that $(a)?+ matches + and a+ unambiguously.

Is this describing the new proposed behavior in 2018, or the behavior of an alternative design? It seems desirable to me.

Yes, this describes the proposed behavior (implemented in #51587).

What is the distinction from @kennytm's point?

None that I know of, it just sounded to me like you and I interpreted his question differently, especially due to the quote

The fix is frankly trivial: use any other separator than ?...

which is not something a 2018 consumer of a 2015 macro crate can do.

which is not something a 2018 consumer of a 2015 macro crate can do.

Yes, that's true. However, I believe the fact that the fix is easy mitigates this a bit, since you can often make a PR to fix the problem, and a PR with this fix ought to be easy to review.

Again, I agree that this is not a super great situation, but I expect it to be rare.

Ignoring the state of the current implementation: I believe the expected behavior for the cross-crate case would be that a macro defined in the 2015 Edition crate would not treat ? as "0 or 1" but rather as separator. Put another way, the behavior of macros across crates is a somewhat orthogonal issue. (We have an open issue or two on that topic, and it does need attention and resolution, don't get me wrong.)

Regarding regressions, I'm curious: @petrochenkov asserted that a crater run found zero regressions, but I believe that @mark-i-m cited some regressions to me. Which is correct?

@rfcbot fcp merge

I'm going to propose that we accept this proposal. I feel like using an edition here for backwards compatibility is the right thing to do and it's an easy enough for us to do.

The one thing that gives me pause here is that we are effectively losing expressiveness: there is no (convenient, at least) way in Rust 2018 to get the equivalent of $(foo)?* -- but I agree with @petrochenkov that this is quite the edge-case and we can live with it. I'm not sure anyone cares.

At worst, I suppose, you could isolate the macro to Rust 2015 crate that you use as a dependency. =)

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

  • [x] @aturon
  • [x] @cramertj
  • [x] @eddyb
  • [x] @joshtriplett
  • [x] @nikomatsakis
  • [x] @nrc
  • [ ] @pnkfelix
  • [x] @scottmcm
  • [x] @withoutboats

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

The one thing that gives me pause here is that we are effectively losing expressiveness

Escaping could be the way forwards, if we decide we really need this...

there is no (convenient, at least) way in Rust 2018 to get the equivalent of $(foo)?*

There's no way to express $(foo)++ in any Rust edition either 馃槃 (and that's one reason we get the weird syntax where T: X + Y +) (rust-lang/rfcs#991). When declarative-macros 2.0 has a solution for using + and * as a separator it could be easily extensible to cover ? as well if anyone cares.

Regarding regressions, I'm curious: @petrochenkov asserted that a crater run found zero regressions, but I believe that @mark-i-m cited some regressions to me. Which is correct?

Both. Crater found no regressions, but we still broke clippy.

Ignoring the state of the current implementation: I believe the expected behavior for the cross-crate case would be that a macro defined in the 2015 Edition crate would not treat ? as "0 or 1" but rather as separator. Put another way, the behavior of macros across crates is a somewhat orthogonal issue. (We have an open issue or two on that topic, and it does need attention and resolution, don't get me wrong.)

So IIUC, the current implementation would not change, but a different edition would be passed to the macros parser? So the macro parser would think it was in edition 2015 even though we are in a 2018 crate?

@mark-i-m I think the right way to think of it is that one can ask what Edition a given span is in.

There wouldn't really be a notion of "current edition" for the most part

  1. Fallbacks
    This is a viable implementation, but it seems like a lot of special cases and the code gets messy.

As you can see in https://github.com/rust-lang/rust/pull/51587 the edition-specific logic is much worse than previously implemented fallback.

I don't understand what kind of priorities you need to have to end up preferring such edition split and removal of a useful feature to introducing one-symbol lookup logic.

As you can see in #51587 the edition-specific logic is much much worse than previously implemented fallback.

I was surprised to read this comment. I find this implementation significantly cleaner than what was rolled back. The 2015 logic is similar to what on nightly today, and the 2018 logic very small without the feature gate. This clean 2018 logic is what well be updating in the future, which I consider a benefit.

removal of a useful feature to introducing one-symbol lookup logic.

Not removing; delaying until the edition in a couple of months...

Mostly a tangent, but I would absolutely love to have a solution to $(...)++ (Which presumably covers this case as well), its pretty commonly needed for me.

Using +, *, and ? as separators we seem to have determined is an extremely rare requirement, and not worth supporting right now. Perhaps in the future it might get supported via escape characters.

Hmm... Perhaps we should explore escaping tokens as separators? Will open a thread on internals.

:bell: This is now entering its final comment period, as per the review above. :bell:

I think the current proposal is forward compatible with separator escaping, right?

@mark-i-m I don't think so, no. Because any future escape char chosen would be recognised as a separator char under the current behaviour (or if $ were chosen, that would screw things up too). It would require a new edition lint. Maybe best to reserve an escape char now?

Hmm... It looks like currently \ is not allowed (the lexer errors out). There are also other combinations that are not allowed (e.g. the suggestion from the internals thread $(a)(?)). So it doesn't seem like too much of a stretch to think that we could find a reasonable escaping scheme after finishing this...

Ah, that's better than I thought.

@mark-i-m BTW, could you kindly expand the bit in the book about lints with the knowledge you picked up from writing this PR? :-)

The final comment period, with a disposition to merge, as per the review above, is now complete.

@mark-i-m Great. Maybe r? @nikomatsakis now?

The corresponding PR has been merged, closing this issue.

@mark-i-m: isn't using the operator in 2015 as a separator supposed to trigger a migration lint? It doesn't seem to at the moment.

@varkor You're not using it as a separator. You are using it as a Kleene op. And it does actually give the correct error:

error: expected `*` or `+`
 --> src/main.rs:7:33
  |
7 |   (do $b1:block $(and $b2:block)?) => {
  |                                 ^
  |
  = note: `?` is not a macro repetition operator

If you want the migration lint, you will need to add #![warn(question_mark_macro_sep)] or #![warn(rust_2018_compatibility)]:

warning: using `?` as a separator is deprecated and will be a hard error in an upcoming edition
 --> src/main.rs:8:33
  |
8 |   (do $b1:block $(and $b2:block)?*) => {
  |                                 ^
  |
note: lint level defined here
 --> src/main.rs:5:9
  |
5 | #![warn(rust_2018_compatibility)]
  |         ^^^^^^^^^^^^^^^^^^^^^^^
  = note: #[warn(question_mark_macro_sep)] implied by #[warn(rust_2018_compatibility)]
  = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in the 2018 edition!
  = note: for more information, see issue #48075 <https://github.com/rust-lang/rust/issues/48075>
Was this page helpful?
0 / 5 - 0 ratings