Rust: Error reporting from attribute macros regressed in 1.46.0

Created on 5 Sep 2020  Â·  11Comments  Â·  Source: rust-lang/rust

We take particular care to make sure that compilation errors associated with the items to which our attribute macros are applied are comprehensible. To that end we have tests to check the output of a variety of expected illegal programs to make sure developers could reasonably be expected to understand and correct those errors. In 1.46.0 we noticed a change in that behavior. While there were some beneficial changes to the way errors qualify structs, traits, etc. there seems to be a regression where 1.46.0 shows less information than in 1.45.0

I've put together a small repo that demonstrates the problem: https://github.com/ahl/span_regression

The first example is this macro:

#[proc_macro_attribute]
pub fn identity(
    _attr: proc_macro::TokenStream,
    item: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    item
}

Now consider this test program:

struct Thing {
    x: i32,
    y: i32,
}

#[identity]
fn doit() -> Thing {
    Thing {
        "howdy".to_string(),
    }
}

Under 1.45.0 one of the errors produced looks like this:

error[E0063]: missing fields `x`, `y` in initializer of `Thing`
  --> examples/test.rs:10:5
   |
10 |     Thing {
   |     ^^^^^ missing `x`, `y`

Under 1.46.0 that has changed to this:

error[E0063]: missing fields `x`, `y` in initializer of `Thing`

Note that under 1.46.0 no code is underlined.

Now consider a slightly more complicated macro:

#[proc_macro_attribute]
pub fn nested(
    _attr: proc_macro::TokenStream,
    item: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    let mut ret = proc_macro::TokenStream::new();
    let pre = vec![
        proc_macro::TokenTree::Ident(proc_macro::Ident::new("fn", proc_macro::Span::call_site())),
        proc_macro::TokenTree::Ident(proc_macro::Ident::new("foo", proc_macro::Span::call_site())),
        proc_macro::TokenTree::Group(proc_macro::Group::new(
            proc_macro::Delimiter::Parenthesis,
            proc_macro::TokenStream::new(),
        )),
        proc_macro::TokenTree::Group(proc_macro::Group::new(proc_macro::Delimiter::Brace, item)),
    ];
    ret.extend(pre);
    ret
}

(Note I had been using quote! but wanted to make sure that wasn't causing the problem)

With a similar example as above on 1.45.0:

error: expected identifier, found `"yo"`
  --> examples/test.rs:20:9
   |
19 |     Thing {
   |     ----- while parsing this struct
20 |         "yo".to_string(),
   |         ^^^^ expected identifier

With 1.46.0:

error[E0308]: mismatched types
  --> examples/test.rs:17:1
   |
17 | #[nested]
   | ^^^^^^^^^
   | |
   | expected struct `Thing`, found `()`
   | expected `Thing` because of return type

Rather than pointing to the offending code, the rustc error now points (unhelpfully) to the macro itself.

The general improvements made to error reporting in 1.46.0 (simpler naming, reduced duplicate errors) is greatly appreciated. This was one small regression I saw amongst the otherwise monotonic improvements.

A-diagnostics A-proc-macros C-bug ICEBreaker-Cleanup-Crew P-high regression-from-stable-to-stable

Most helpful comment

This is caused by https://github.com/rust-lang/rust/pull/73345 - since we no longer pass a Nonterminal to an attribute macro, the pretty-print/reparse check get used (and fails), which causes us to lose spans.

This is a very interesting corner case - the span_regression error is testing the behavior of a proc-macro called on a syntactically invalid Item. It looks like this relies on the parser managing to recover an Item, despite the fact that the body is syntactically invalid.

I think there are two possible ways to fix this:

  1. Disable the pretty-print/reparse check when we recovered from parse errors during the attribute target - this would restore the 1.45.0 behavior of the span_regression repository. However, this would effectively mean that the parser recovery behavior is now part of the stable surface of the compiler, instead of just a trick for producing more meaningful error messages. Any future changes to parser recovery could cause us to start invoking an attribute macro, if we were able to parse enough of the target to produce an AST struct.
  2. Skip invoking attribute macros if there were any errors during the parsing of the attribute target, regardless of recovery behavior. Since attributes produce an arbitrary TokenStream as output, this would require us to remove the target item entirely (since we can't know what the attribute macro would have produced). Effectively, custom attribute macros would disable parser recovery.

I'm leaning towards option 2 (assuming that it doesn't cause significant regressions in practice). Bang-proc-macros are the way to invoke a proc-macro on an arbitrary TokenStream - however, attributes macros are (syntactically) just another attribute, so it doesn't seem to make sense to invoke them on something that's not a syntactically valid attribute target. While the span_regression example is 'good enough' for recovery to be able to kick in, I think going with approach 1 would cause us to end up with an unprincipled list of a special cases where we invoke an attribute macro on a syntactically invalid target.

On the other hand, it's possible that option 1 could lead to a large number of spurious errors in practice - if the user introduces a syntax error when modifying a commonly-used struct (which has an attribute macro applied), removing this struct entirely could cause many spurious resolution errors. Invoking the attribute macro might lead to better (but still invalid) tokens being produced, leading to cleaner error messages.

All 11 comments

If I remember correctly, we should still be able to use cargo-bisect-rustc to bisect a regression in 1.46.0.
@rustbot ping cleanup

Hey Cleanup Crew ICE-breakers! This bug has been identified as a good
"Cleanup ICE-breaking candidate". In case it's useful, here are some
[instructions] for tackling these sorts of bugs. Maybe take a look?
Thanks! <3

cc @AminArria @camelid @chrissimpkins @contrun @DutchGhost @elshize @ethanboxx @h-michael @HallerPatrick @hdhoang @hellow554 @imtsuki @kanru @KarlK90 @LeSeulArtichaut @MAdrianMattocks @matheus-consoli @mental32 @nmccarty @Noah-Kennedy @pard68 @PeytonT @pierreN @Redblueflame @RobbieClarken @RobertoSnap @robjtede @SarthakSingh31 @senden9 @shekohex @sinato @spastorino @turboladen @woshilapin @yerke

Cc @petrochenkov

This is caused by https://github.com/rust-lang/rust/pull/73345 - since we no longer pass a Nonterminal to an attribute macro, the pretty-print/reparse check get used (and fails), which causes us to lose spans.

This is a very interesting corner case - the span_regression error is testing the behavior of a proc-macro called on a syntactically invalid Item. It looks like this relies on the parser managing to recover an Item, despite the fact that the body is syntactically invalid.

I think there are two possible ways to fix this:

  1. Disable the pretty-print/reparse check when we recovered from parse errors during the attribute target - this would restore the 1.45.0 behavior of the span_regression repository. However, this would effectively mean that the parser recovery behavior is now part of the stable surface of the compiler, instead of just a trick for producing more meaningful error messages. Any future changes to parser recovery could cause us to start invoking an attribute macro, if we were able to parse enough of the target to produce an AST struct.
  2. Skip invoking attribute macros if there were any errors during the parsing of the attribute target, regardless of recovery behavior. Since attributes produce an arbitrary TokenStream as output, this would require us to remove the target item entirely (since we can't know what the attribute macro would have produced). Effectively, custom attribute macros would disable parser recovery.

I'm leaning towards option 2 (assuming that it doesn't cause significant regressions in practice). Bang-proc-macros are the way to invoke a proc-macro on an arbitrary TokenStream - however, attributes macros are (syntactically) just another attribute, so it doesn't seem to make sense to invoke them on something that's not a syntactically valid attribute target. While the span_regression example is 'good enough' for recovery to be able to kick in, I think going with approach 1 would cause us to end up with an unprincipled list of a special cases where we invoke an attribute macro on a syntactically invalid target.

On the other hand, it's possible that option 1 could lead to a large number of spurious errors in practice - if the user introduces a syntax error when modifying a commonly-used struct (which has an attribute macro applied), removing this struct entirely could cause many spurious resolution errors. Invoking the attribute macro might lead to better (but still invalid) tokens being produced, leading to cleaner error messages.

@ahl Thanks for providing such a nicer reproducer!

@Aaron1011 thanks for the quick diagnosis.

I agree with your assessment that both options have downsides (I think in your final paragraph you're describing option 2 rather than option 1 vis-a-vis removing the struct entirely).

Option 1 seems like a bit of a mess; option 2 seems more "philosophically pure" while being a little less pragmatic. I agree that option 2 is probably preferable and that a crisp semantic like "only valid items will cause the invocation of attribute macros" is a very clear delineation between proc_macro and proc_macro_attribute.

@Aaron1011 I agree with your assessment and think that the second option is the obviously correct choice for us. This might cause some weird errors for proc macros with pseudo DSLs that are no longer turned into valid code, but given that we are already recovering the parse most of the body will be treated as empty anyways, so it should be fine™.

Could we also add something where the compiler emits one of its “this is a bug” messages if an error doesn’t show span information? Or is that not feasible?

@camelid: Right now, we deliberately create these 'dummy spans' due to issue https://github.com/rust-lang/rust/issues/43081. PR https://github.com/rust-lang/rust/pull/76130 makes progress towards eliminating these cases, but there's still more work to be done.

In the future, I think it would definitely make sense to emit a warning, and direct users to open a bug report. However, I don't think we'd want to emit an actual ICE - it would just create noise, since the query stack and call stack aren't useful for debugging missing spans.

@Aaron1011 Okay, thanks!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

nikomatsakis picture nikomatsakis  Â·  236Comments

withoutboats picture withoutboats  Â·  211Comments

withoutboats picture withoutboats  Â·  202Comments

Centril picture Centril  Â·  382Comments

alexcrichton picture alexcrichton  Â·  240Comments