Rustfmt: allow optional programmer discretion on chain length/function arguments per line

Created on 3 Jul 2020  路  18Comments  路  Source: rust-lang/rustfmt

When lines hit max_width (or its derivatives), rustfmt will break chains and function calls with one call or argument per line. We have found this to result in code that is less clear, and no more consistent with Rust's style than the code that it replaces. We would like to see an option whereby these chains and function calls would be left to programmer discretion as long as they observe the constraints of indentation and max_width.

To make this more concrete, we have experienced this most recently in the embedded Rust world, where hardware-abstracting crates make extensive use of types to constrain context (viz. svd2rust's API).

The upshot of this is that we end up with long chains that are asymmetric in that different positions in the chain actually denote different states. Often, the programmer will use the line break to help clarify what's going on; some (real) examples:

fn func() {
    p.RCC.pllcfgr.modify(|_, w| {
        w.pll1vcosel().wide_vco()
            .pll1rge().range8()
            .divp1en().enabled()
            .divq1en().enabled()
            .divr1en().enabled()
    });

    p.RCC.d1cfgr.write(|w| {
        w.d1cpre().div1()
            .d1ppre().div1()
            .hpre().div1()
    });
}

In this case, the programmer has clearly grouped the chain to indicate the type states -- and to the reader, both uses are clear. rustfmt, however, doesn't like either of these expressions; they get reformatted as:

fn func() {
    p.RCC.pllcfgr.modify(|_, w| {
        w.pll1vcosel()
            .wide_vco()
            .pll1rge()
            .range8()
            .divp1en()
            .enabled()
            .divq1en()
            .enabled()
            .divr1en()
            .enabled()
    });

    p.RCC
        .d1cfgr
        .write(|w| w.d1cpre().div1().d1ppre().div1().hpre().div1());
}

In both cases, meaning has been lost -- even though the code as written is consistent with Rust's style. We can of course wrap these constructs in #[rustfmt skip] -- but they come up often enough that we would very much rather fix it in rustfmt. And to be clear, we don't want these to be compressed either (as raised in #4146 and #2010); we seek to have them be optionally left to the programmer's discretion, within the bounds of the configured style.

Thank you in advance for your consideration and your work on this essential (if underappreciated!) part of the Rust ecosystem!

cc: @cbiffle @steveklabnik @davepacheco @ahl

poor-formatting

Most helpful comment

@joshtriplett FWIW I'll just throw in my 2c that annotations like you describe can really grow to encumber code bases with more "noise" than code. Once a formatter (or tool) starts to go down this path, it's a slippery slope.
If it's possible to avoid adding annotations, by adding smarts that works as-desired 9 times out of 10 (leave the blocks as specified by the programmer, split into one-call-per-line if necessary?) then personally I'd rather live with that 1 time out of 10 or 20 being annoying, instead of always having to deal with manual annotations.

All 18 comments

So, rustfmt doesn't really have the concept of "programmer discretion"; it's mostly intended to produce the same output given the same parse tree, regardless of whitespace. There are exceptions to that; for instance, rustfmt has the concept of "paragraph breaks", where a blank line between blocks of code will be preserved, on the assumption that it was intended to break up code into logical chunks.

I can see how this would be similar. On the other hand, what happens if one of those .x().y().z() blocks becomes longer than the configured line width? Would you expect that one to get split into one-call-per-line, with the others remaining untouched? What happens if the arguments to a function are long?

I do think rustfmt could do a much better job with chains, in a variety of ways. For example, I'd love to see rustfmt consider putting simple calls on the same line of the ) or }) from a complex call:

foo.bar(
    big_arg1,
    big_arg2,
})?.build()?;

foo.bar(BigStruct {
    ...
}).await?

And for the case you're describing, it looks like there is a common pattern as well, and I'm wondering if it could be easily expressed in a way rustfmt could learn to handle:

#[rustfmt::chain(start_line)]
fn pll1rge(...)

#[rustfmt::chain(end_line)]
fn range8(...)

#[rustfmt::chain(start_line)]
fn divp1en(...)

#[rustfmt::chain(end_line)]
fn enabled(...)

Suppose that you put something like those attributes on the methods, and then rustfmt automatically implemented the paired-calls pattern you're using? rustfmt::chain(start_line) would mean "if there's more than one of these in the chain, always break this onto a new line even if there's enough room", and rustfmt::chain(end_line) would mean "it's OK to put this on the end of a line that already has a call in it".

That would probably be easier to implement in rustfmt than attempting to preserve semantics implied by line breaks in the original code. And it'd mean that you get a uniform style across your codebase in the manner you want. Thoughts?

+1, to me this is the most desired change pending for rustfmt

@joshtriplett

I can see how this would be similar. On the other hand, what happens if one of those .x().y().z() blocks becomes longer than the configured line width? Would you expect that one to get split into one-call-per-line, with the others remaining untouched? What happens if the arguments to a function are long?

I was about to ask this, if max_width gets exceeded again in the lines below, there are two options:

  1. Wrap exceeding methods to the line below in the same way that rustfmt's _normalize_comments = true_ works (if they overflow, send to line below).
  2. Break the line that's exceeding max_width into N new lines, where N is the number of method calls (one line for each call).

In my opinion, the second option is less chaotic when line splitting is needed, it's easier to regroup things in a semantic way.

@joshtriplett FWIW I'll just throw in my 2c that annotations like you describe can really grow to encumber code bases with more "noise" than code. Once a formatter (or tool) starts to go down this path, it's a slippery slope.
If it's possible to avoid adding annotations, by adding smarts that works as-desired 9 times out of 10 (leave the blocks as specified by the programmer, split into one-call-per-line if necessary?) then personally I'd rather live with that 1 time out of 10 or 20 being annoying, instead of always having to deal with manual annotations.

@ericsampson I realize that it would add complexity to the formatter. But having ways to automatically format codebases to match a more nuanced formatting style would help greatly, especially if such annotations can occur in crates that provide such libraries. Consider clap, for instance, the configuration of which often involves a massive nested set of chained calls. It'd be nice to improve the formatting of that, without people having to hand-tweak it.

@joshtriplett Thanks for the thoughtful response, and more generally for everything you've done for Rust!

I think one of my more general frustrations is that the Rust coding style itself actually defers quite a bit to programmer discretion (frequently using language like "prefer" rather than "must"), but -- as you point out -- that discretion is not as reflected in rustfmt itself. I would really like to see rustfmt embrace some of the deliberate deference to judgement in Rust's coding style, and optionally allow for these "prefers" and "shoulds" to be interpreted as written in the style rather than as rigid absolutes. I understand that this isn't really rustfmt's model -- which wants to take the AST and re-render it -- but I would argue that that model is far more rigid that the style that it purports to enforce.

We should also remember that, ultimately, the programmer does have discretion: when and where rustfmt insists on being overly autocratic or inflexible (or, as in this case, results in decidedly less readable/maintainable code), the programmer will elect to turn it off via #[rustfmt::skip] -- an outcome that all can agree is undesirable for myriad reasons!

@joshtriplett of course it's the maintainer's choice 110% so do whatever you think works best, I just wanted to give a little food for thought re the annotations path based on some experience from other languages. Cheers!!

I can see how this would be similar. On the other hand, what happens if one of those .x().y().z() blocks becomes longer than the configured line width? Would you expect that one to get split into one-call-per-line, with the others remaining untouched? What happens if the arguments to a function are long?

Given how it's accomplished in some other languages/formatters, the simplest solution would probably be to bless a rustfmt configuration or annotation which results in this format remaining stable unless a line is too long. (In which case, only that line would get broken up.)

fn func() {
    p.RCC.pllcfgr.modify(|_, w| {
        w.pll1vcosel().wide_vco() //
            .pll1rge().range8() //
            .divp1en().enabled() //
            .divq1en().enabled() //
            .divr1en().enabled()
    });

    p.RCC.d1cfgr.write(|w| {
        w.d1cpre().div1() //
            .d1ppre().div1() //
            .hpre().div1()
    });
}

@bcantrill -

Sorry, have been meaning to comment on this one for a while.

Though implicitly indicated previously by the inclusion of this issue in a milestone, I want to state explicitly that we're going to try to support this; just a matter of figuring how best to do so.

I understand that this isn't really rustfmt's model -- which wants to take the AST and re-render it

This is spot on, and I want to emphasize it to provide context since I suspect this thread is likely to see some increased traffic due to the reference from your post (which was a very enjoyable read by the way!)

rustfmt is indeed the AST-driven/pretty-printer type of code reformatter. This means that rustfmt has to process and write something for each and every AST node from the parsed program, and there has to be something that defines how rustfmt rewrites each node. The Style Guide is what provides those definitions, and those definitions are what rustfmt uses by default when reformatting. Yes, the Style Guide has language that varies between "must", "should", etc., but rustfmt by design uses and applies all the Style Guide prescriptions regardless of the strength of wording so that it has a consistent target formatting to work with.

I think this is a powerful and useful feature, as it allows users to easily ensure their code is fully compliant with _all_ of the Style Guide prescriptions (regardless of "should" vs "must"), with zero configuration. I don't think we'd want to remove that feature, just as similarly there are places where the style guide uses "must", but rustfmt also has configuration options that will let users do the exact opposite, and I wouldn't want to remove those either (e.g. derives guidelines and merge_derives config).

I've personally come to prefer these types of code formatters. While I don't absolutely love every aspect of the emitted formatting either, I do love being able to defer to an automated tool that provides predictable and deterministic styling and eliminates back and forth debates on subjective topics that in my experience typically prove to be rather unproductive. There's definitely still occasions where I'll give rustfmt's output a cross look, but overall the benefits of automated reformatting, and doing so according the Style Guide, outweigh the instances of formatting that I dislike. I think there's a lot of users and contexts where that value holds true as well.

However, I know those tradeoffs don't weigh out the same for everyone, and I 100% agree there's cases where the emitted formatting is so painful that users are forced to consider polluting their code with myriad skip attributes or not using rustfmt altogether. The snippets in the description are good examples of this in my opinion.

While I do think reformatting according to the Style Guide should continue to be rustfmt's default modus operandi, I also think we should strive to provide options that at a minimum allow users to avoid these types of specific cases that a sizeable portion of the community indeed finds particularly painful.

Close observers of recent rustfmt releases may have noticed early hints of this, such as configuration options that have a Preserve variant, where rustfmt will take the existing formatting into account as part of the heuristics in reformatting. There are of course other tradeoffs with this tact, but it seems to be something users want, and something we want to be able to support.

The challenge is that it's not always easy to incorporate this within rustfmt. We just don't know at this point whether we could support this via some new config options (e.g. fn_call_layout = Preserve), or whether we would need to add an entirely new formatting mode to rustfmt. The former, at least in incremental pieces, could likely be delivered more quickly, but it will also require figuring out answers to the accompanying "what if" questions (several of which were noted above in this thread).

Prettier is also an AST/pretty-printer type of formatter, and it is able to support preserving existing formatting in some cases, so I have to think we've got opportunities to do so as well.

At some point in the near future I'm going to be doing some refactoring of the chain handling code to address an unrelated long-standing issue. While I'm in there, I'll also be thinking through if/how we could do this and what questions we'd need to get answered before attempting to implement some type of Preserve option.

Suppose that you put something like those attributes on the methods, and then rustfmt automatically implemented the paired-calls pattern you're using? rustfmt::chain(start_line) would mean "if there's more than one of these in the chain, always break this onto a new line even if there's enough room", and rustfmt::chain(end_line) would mean "it's OK to put this on the end of a line that already has a call in it".

That would probably be easier to implement in rustfmt than attempting to preserve semantics implied by line breaks in the original code. And it'd mean that you get a uniform style across your codebase in the manner you want. Thoughts?

I'm not keen on this, primarily for two reasons. First, I know this was offered as an alternative, but I feel like the ask was for something non-attribute based, at least implicitly if not explicitly. Users can already toss #[rustfmt::skip] in there but I think that's proving too ineffective especially in these types of cases. I know these attributes would be more powerful, but my sense is that folks would still prefer something that took current formatting into account (though I do agree that formatting chains via those attributes would probably be easier for us to implement)

Secondly, and somewhat selfishly :smile:, I fear this could snowball pretty quickly and we'll end up with dozens and dozens of attributes. I worry this would add a lot of burden and complexity within the rustfmt codebase, but I also think it could make code noisy and messy for end users.

On Mon, Oct 12, 2020 at 07:43:00PM -0700, Caleb Cartwright wrote:

I'm not keen on this, primarily for two reasons. First, I feel like the ask was for something non-attribute based (at least implicitly if not explicitly). Users can already toss #[rustfmt::skip] in there but I think that's proving too ineffective especially in these types of cases.
...
Secondly, and somewhat selfishly :smile:, I fear this could snowball pretty quickly and we'll end up with dozens and dozens of attributes. I worry this would add a lot of burden and complexity within the rustfmt codebase, but I also think it could make code noisy and messy for end users.

The difference is that these attributes would appear on method
definitions, and would then improve formatting for all users of those
methods. One method definition, many many callers. That seems worth a
few annotations for cases where rustfmt will never be able to
understand otherwise.

The difference is that these attributes would appear on method definitions, and would then improve formatting for all users of those methods. One method definition, many many callers. That seems worth a few annotations for cases where rustfmt will never be able to understand otherwise.

Would would mean decisions about the formatting of a codebase could be deferred to a dependency's author, rather than in the control of the developers of the actual code in question? That doesn't seem like a good shift in responsibility.

@calebcartwright Thank you for writing up a great summary.

The difference is that these attributes would appear on method
definitions, and would then improve formatting for all users of those
methods. One method definition, many many callers. That seems worth a
few annotations for cases where rustfmt will never be able to
understand otherwise.

This means rustfmt needs to be able to resolve names, which is currently (and will be, sadly) outside the scope of rustfmt.

I sometimes think one formating rule or the other would better be different, but what i really love about rustfmt, is that the same code always looks the same. If you allow programmer discretion in formating every third programmer would do much worse.
So i would much rather have bad but standardized formater than programmer discretion, thank you very much!
And the i don't think rustfmt is bad at all.

If you allow __programmer discretion__ in formating every third programmer would do much worse. So i would much rather have bad but standardized formater than programmer discretion

While I agree on this point as a general trend, I have to admit that it does fail in the aforementioned examples. In those cases, strickly sticking to it is __valuating standardization over readability__, which I think is a bad idea as well. I think in this very cases, human flexibility is better than blind procedural formatting.

I also think that examples like the ones given above are both rare enough that special-casing them is the right solution, and frequent enough that falling back on #[rustfmt::skip] is undesirable. Thus my full support :)

While I agree on this point as a general trend, I have to admit that it does fail in the aforementioned examples. In those cases, strickly sticking to it is valuating standardization over readability, which I think is a bad idea as well. I think in this very cases, human flexibility is better than blind procedural formatting.

Fully agreed! rustfmt overall does a great job, but there are some cases where some programmers like to put more nuance into their formatting to convey additional information or to increase readability by symmetry, and then rustfmt can sometimes remove that information and likewise remove the symmetry.

This issue is an example of this; my own pet peeve here is rustfmt's handling of match, which suffers from similar issues (https://github.com/rust-lang/rustfmt/issues/3995, https://github.com/rust-lang/rustfmt/issues/4004, https://github.com/rust-lang/rustfmt/issues/4005). It might make sense to consider this more broadly instead of focusing on each separate case (or maybe they truly do need individual treatment, not sure).

On the extreme other end, you might be able to solve the original example with language:

trait W {
  fn pll1rge_range8(self) -> Self;
  fn divp1en_enabled(self) -> Self;
  ...
}

impl W for SomeW {
  fn pll1rge_range8(self) -> Self { self.pll1rge().range8() }
  ...
}

fn func() {
    p.RCC.pllcfgr.modify(|_, w| {
        w.pll1vcosel_wide_vco()
            .pll1rge_range8()
            .divp1en_enabled()
            .divq1en_enabled()
            .divr1en_enabled()
    });
}

I'll admit this is probably overkill (and may be impossible in some cases), but there may be situations where this might be useful for cases beyond simply formatting, and a nice format would just be a bonus.

Thinking about a solution that works 9 out of 10 cases, can't there be some setting that when splitting, would allow chains to be not 1 per line, but some other specified number?

I expect the typical code would be quite symmetric, but in other parts of the code regular 1 per line call makes more sense.

@joshtriplett I think the chain start/end annotations, even if nice for rustfmt, are going to be a "crate service" provided to the users which might not happen or work nice in practice. And when the chain formatting doesn't happen for the user, the source of the issue is going to be very obscure, unless you're familiar with this particular formatting choice.

Also, will these chain attributes take precedence over the defaults? Won't that mean that depending on the crates you draw from, different parts of the same code would look different?

I've hit this problem in a code doing parsing with pest and have a similar problem to @bcantrill, but in my case the end of chain is a local choice that makes sense from the perspective of the operations I'm doing.

In the test code I have these:

parse.unwrap()
    .next().unwrap().into_inner() // inside typedef
    .next().unwrap().into_inner() // inside rhs_typedef_complex
    .next().unwrap().into_inner() // inside complex_type_def
    .next().unwrap().into_inner() // inside rhs_typedef_simple

But in main.rs, depending on the context it might make more sense to have the break after unwrap():

pair
    .into_inner().next().unwrap()
    .into_inner().next().unwrap()
Was this page helpful?
0 / 5 - 0 ratings

Related issues

cramertj picture cramertj  路  4Comments

ozkriff picture ozkriff  路  4Comments

thomaseizinger picture thomaseizinger  路  3Comments

torkleyy picture torkleyy  路  5Comments

alatiera picture alatiera  路  4Comments