Rustfmt: make sure we work with unicode

Created on 9 Mar 2015 · 8 Comments · Source: rust-lang/rustfmt

In particular, we use byte positions where we should use char positions in many places. Furthermore, when we do use char positions, we don't check the 'physical' width of the character.
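To make the three measurements concrete, here is a minimal sketch comparing byte length, char count, and an approximate display width for a few lines. `approx_display_width` is a hypothetical helper written for this illustration, not rustfmt code; it only treats a handful of East Asian ranges as two columns wide and ignores combining marks and emoji. A real implementation would more likely rely on a dedicated crate such as `unicode-width`.

```rust
/// Rough display-width estimate: 2 columns for a few common East Asian
/// wide ranges, 1 column for everything else.
/// Hypothetical helper for illustration only; it ignores combining marks,
/// emoji, and most of the Unicode width tables.
fn approx_display_width(s: &str) -> usize {
    s.chars()
        .map(|c| match c as u32 {
            0x1100..=0x115F        // Hangul Jamo initial consonants
            | 0x3040..=0x30FF      // Hiragana and Katakana
            | 0x4E00..=0x9FFF      // CJK Unified Ideographs
            | 0xAC00..=0xD7A3      // Hangul syllables
            | 0xFF01..=0xFF60 => 2, // fullwidth forms
            _ => 1,
        })
        .sum()
}

fn main() {
    for s in ["let x = 1;", "let μ = 1;", "let s = \"日本語\";"] {
        println!(
            "{:<20} bytes={:>2} chars={:>2} approx_width={:>2}",
            s,
            s.len(),           // UTF-8 byte length
            s.chars().count(), // number of Unicode scalar values
            approx_display_width(s)
        );
    }
}
```

For pure ASCII all three numbers agree. They diverge as soon as a line contains multi-byte or wide characters, which is the case this issue is about.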

p-high

All 8 comments

Byte positions _should_ be fine, and they're also more performant than character indices.

I think rustfmt should use char positions at least for computing the width of a line. Otherwise, writing error messages in one's native language makes it necessary to set max_width = 180; without that, the line is simply not processed.

When using many unicode characters in a line I get a series of:

Rustfmt failed at process.rs:894: line exceeded maximum length (sorry)

Even though my line is only 71 characters long.
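A report like this is what one would expect if the length check counts UTF-8 bytes. The sketch below assumes a limit of 100 (the report does not say which limit was in effect) and uses a made-up line of 71 Cyrillic letters; `too_long_by_bytes` and `too_long_by_chars` are hypothetical names, not rustfmt functions.

```rust
/// Assumed limit for illustration; not taken from the report above.
const MAX_WIDTH: usize = 100;

/// Length check based on the UTF-8 byte length of the line.
fn too_long_by_bytes(line: &str) -> bool {
    line.len() > MAX_WIDTH
}

/// Length check based on the number of Unicode scalar values.
fn too_long_by_chars(line: &str) -> bool {
    line.chars().count() > MAX_WIDTH
}

fn main() {
    // 71 Cyrillic letters, each encoded as 2 bytes in UTF-8.
    let line = "ж".repeat(71);
    println!("chars = {}, bytes = {}", line.chars().count(), line.len());
    println!("too long by bytes: {}", too_long_by_bytes(&line));
    println!("too long by chars: {}", too_long_by_chars(&line));
}
```

The line has 71 chars but 142 bytes, so only the byte-based check rejects it.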

It appears that rustfmt replaces many non-ASCII chars with a space. (Test case: 'μ' will be converted to ' '.)

@czipperz Can you share an example where this happens?

Yes. https://github.com/czipperz/rust-comp/blob/b1e3df1f7a04f99e0c0a7bac7f97d715c43ab187/rust-comp-front/src/pos.rs#L78 . It's possible it's a problem with emacs. When I run M-x rust-fmt-buffer it replaces unicode characters with spaces. But cargo fmt --all works fine.

@czipperz It does look Emacs-related. If you can find out how Emacs calls rustfmt, we can try to reproduce the bug and see whether it really is on the rustfmt side.

The code in this gist was formatted with rustfmt in the Rust playground. As can be seen, the comments behind the string containing (non-ASCII) unicode characters seem rather haphazardly "aligned". I believe this to be related to this issue, and likely the cause is use of byte lengths instead of unicode string lengths, but do correct me if I'm wrong. (Yes, this particular example might be a bit of a niche case, but I can imagine more legitimate situations where similar issues would arise)
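If byte lengths are indeed the cause, the effect can be reproduced with a small sketch. `align_comment`, `byte_len`, and `char_len` are hypothetical helpers for illustration and do not reflect how rustfmt aligns comments internally; the sample lines are made up.

```rust
/// Pads `code` with spaces up to `column`, then appends `comment`.
/// `measure` decides how the current length of `code` is counted.
fn align_comment(code: &str, comment: &str, column: usize, measure: fn(&str) -> usize) -> String {
    let pad = column.saturating_sub(measure(code));
    format!("{}{}{}", code, " ".repeat(pad), comment)
}

/// UTF-8 byte length.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode scalar values.
fn char_len(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let lines = [
        ("\"abc\",", "// ascii"),
        ("\"äöü\",", "// umlauts"),
        ("\"→→→\",", "// arrows"),
    ];

    println!("padding computed from byte length:");
    for (code, comment) in lines {
        println!("{}", align_comment(code, comment, 12, byte_len));
    }

    println!("padding computed from char count:");
    for (code, comment) in lines {
        println!("{}", align_comment(code, comment, 12, char_len));
    }
}
```

With byte lengths, lines containing multi-byte characters receive too little padding, so their comments start further left than on the ASCII line. Counting chars lines them up for these samples, though wide characters would still need a display-width measure.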
