Roslyn: Feature request: allow digit separator after 0b or 0x

Created on 22 Jul 2016 · 8 Comments · Source: dotnet/roslyn

This was previously requested in a comment on #216 (and I independently viewed that thread precisely to see if it was already valid).

With VS15 Preview 3, we have:

Valid:

var x = 0b1010_0000;
var y = 0x1234_abcd;

Not valid:

var x = 0b_1010_0000;
var y = 0x_1234_abcd;

I find the latter more readable than the former. While I can see why a digit separator in front of bare digits isn't valid (e.g. _1 is already a valid identifier), the leading 0x or 0b already prevents the token from being an identifier.
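The identifier argument can be checked with a small sketch. This is Python used purely for illustration, not Roslyn's lexer; the ASCII-only identifier pattern and the helper name `could_be_identifier` are assumptions made for the example.

```python
import re

# Simplified, ASCII-only identifier shape (an assumption for this sketch;
# real C# identifiers also allow Unicode letters and an @ prefix).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def could_be_identifier(token: str) -> bool:
    """Return True if the whole token has the shape of an identifier."""
    return IDENTIFIER.fullmatch(token) is not None


# A leading underscore followed by digits is identifier-shaped.
print(could_be_identifier("_1"))            # True
# Anything starting with the digit 0 cannot be an identifier.
print(could_be_identifier("0x_1234_abcd"))  # False
print(could_be_identifier("0b_1010_0000"))  # False
```

Because an identifier can never begin with a digit, a token that starts with 0x or 0b is already committed to being a numeric literal before the underscore is reached.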

// cc @zippec

Area-Compilers Area-Language Design Feature Request Resolution-External

jskeet

👍18

All 8 comments

This is a language request, not a compiler request. The compiler is behaving per its specification.

gafter on 1 Aug 2016

/cc @khyperia FYI

gafter on 1 Aug 2016

I implemented this a while back, so I figured I'd chime in with what I know. While I'm not sure where the exact spec is hiding (it likely is equivalent to the compiler), this is what the compiler does:

Any "string of digits" (e.g. 0-9 for decimals, plus fullwidth for VB) in any literal (decimal, hex, binary, float, double) can contain any number of underscores at any place between the first and last digit (i.e. the digit string can neither start nor end with an underscore).
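As a rough model of that rule, here is a Python regex sketch restricted to unsuffixed integer literals. It is not the compiler's code; the names `_digits`, `CURRENT_RULE`, and `is_valid_today` are hypothetical, and floats, type suffixes, and VB's fullwidth digits are left out.

```python
import re


def _digits(cls: str) -> str:
    """Digit string over character class `cls`: underscores are allowed
    anywhere between the first and last digit, never at either end."""
    return rf"[{cls}](?:[{cls}_]*[{cls}])?"


# Hex, binary, and decimal integer literals (no suffixes, no floats).
CURRENT_RULE = re.compile(
    rf"0[xX]{_digits('0-9a-fA-F')}|0[bB]{_digits('01')}|{_digits('0-9')}"
)


def is_valid_today(literal: str) -> bool:
    """Return True if the literal fits the rule as described in the thread."""
    return CURRENT_RULE.fullmatch(literal) is not None


for text in ["0b1010_0000", "0x1234_abcd", "1__000",
             "0b_1010_0000", "0x_1234_abcd", "0x2_"]:
    print(text, is_valid_today(text))
```

The first three inputs are accepted and the last three are rejected, which matches the valid and not-valid examples in the original report.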

There are additional cases that might be interesting to consider when discussing whether 0x_2 should be allowed, mostly relating to floats ("reasonable" means "easy to design without breaking changes"). I've also listed cases that cannot be parsed, or are difficult to parse, without a breaking change. (All of these are impossible under today's rules.)

0x_2 -- the original proposal
0b_10 -- same
0x2_ -- reasonable
_1.2e3 -- might be technically possible, but involves lookahead to see a digit or the e (and breaks in unintuitive ways)
1_.2e3 -- reasonable
1._2e3 -- same as earlier, but even more unintuitive (e.g. 1._2 is impossible to be resolved in the parser, it needs the exponent syntax to be possible)
1.2_e3 -- reasonable
1.2e_3 -- reasonable (this is an odd one - prefixing an underscore to the digit sequence isn't simple to do in the other two cases)
1.2e3_ -- reasonable

Additionally, 0_x2 might be considered, but I don't see how that makes sense at all.

Note that if any rule is changed, we would also want to update VB, and possibly F# as well. I helped out with a PR implementing digit separators in F#, and they ended up following the same rules.

Edit from half a year later (2017-02-11): I don't know what I was thinking when I said _1.2e3 or 1._2e3 might be technically possible to resolve in the parser; they're definitely not. My personal opinion is that 0x_2 is the only truly useful change, but I figured I'd correct the above for potential future discussion.

khyperia on 3 Aug 2016

1._2e3 -- same as earlier, but even more unintuitive (e.g. 1._2 is impossible to be resolved in the parser, it needs the exponent syntax to be possible)

So the compiler will have to wait until it knows whether _2e3 is a valid extension method on int or not to choose between 1 and 1.2e3? I don't know if that's worth it.
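To make the ambiguity concrete, here is a minimal tokenizer sketch in Python. It is a simplified model, not Roslyn's lexer: it only knows decimal integers, real literals, ASCII identifiers, and the dot, and the names `DIGITS`, `TOKEN`, and `tokenize` are assumptions for this example.

```python
import re

# Digit string under today's rule: no leading or trailing underscore.
DIGITS = r"[0-9](?:[0-9_]*[0-9])?"

# Alternatives are tried in order: real literal, integer, identifier, dot.
TOKEN = re.compile(
    rf"(?P<real>(?:{DIGITS})?\.{DIGITS}(?:[eE][+-]?{DIGITS})?"
    rf"|{DIGITS}[eE][+-]?{DIGITS})"
    rf"|(?P<int>{DIGITS})"
    r"|(?P<ident>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<dot>\.)"
)


def tokenize(text: str) -> list[tuple[str, str]]:
    """Split text into (kind, lexeme) pairs using the simplified rules."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("1.2e3"))   # [('real', '1.2e3')]
print(tokenize("_1.2e3"))  # [('ident', '_1'), ('real', '.2e3')]
print(tokenize("1._2e3"))  # [('int', '1'), ('dot', '.'), ('ident', '_2e3')]
```

In this model both problem cases already have a meaning: _1.2e3 starts with the identifier _1, and 1._2e3 reads as a member access on the integer 1. Treating either as a single float literal would therefore change how existing token sequences are read.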

orthoxerox on 3 Aug 2016

I think the proposed grammar was
Literal ::= Prefix ( Sep? Digit )* Digit
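A literal transcription of that production into Python regexes shows what it would and would not accept. This is a sketch under assumptions: Prefix is taken to be 0x or 0b, Sep is a single underscore, and the names `HEX`, `BIN`, and `matches_grammar` are hypothetical.

```python
import re

# Literal ::= Prefix ( Sep? Digit )* Digit, transcribed term by term.
HEX = re.compile(r"0[xX](?:_?[0-9a-fA-F])*[0-9a-fA-F]")
BIN = re.compile(r"0[bB](?:_?[01])*[01]")


def matches_grammar(literal: str) -> bool:
    """Return True if the literal fits the production as written."""
    return bool(HEX.fullmatch(literal) or BIN.fullmatch(literal))


for text in ["0x_12", "0b_10", "0x_1234_abcd", "0x12",
             "0x_2", "0x1_2", "0x2_", "0x1__2"]:
    print(text, matches_grammar(text))
```

The first four inputs match and the last four do not. Read literally, the trailing Digit has no optional Sep in front of it, so a separator directly before the final digit (0x_2, 0x1_2) falls outside the production, as do trailing and doubled separators. Presumably the intent was to allow a separator before any digit, which something like Prefix ( Sep? Digit )+ would express.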

AdamSpeight2008 on 3 Aug 2016

I'm closing this and letting the LDM decide before doing any work. See https://github.com/dotnet/csharplang/issues/65

gafter on 11 Feb 2017

Note:

I'd be very wary about allowing a prefix of _ as _1 is already a legal identifier.

CyrusNajmabadi on 11 Feb 2017

👍3

@CyrusNajmabadi The prefixes meant here are those defined by the language specification.

Prefix ::= HexPrefix | BinPrefix | ...
HexPrefix ::= "0x" | "0X"
BinPrefix ::= "0b" | "0B"

If the prefix is missing, then the character sequence can match an identifier (possibly a legal one) when the underscore separator comes first. That is unlikely to matter here, though, because the prefix is required in this context of digit separators.

AdamSpeight2008 on 11 Feb 2017
