Julia: make numeric literal juxtaposition less brittle

Created on 1 Apr 2016 · 15 Comments · Source: JuliaLang/julia

The most obvious two changes would be:

disallow 0 as a juxtaposed numeric literal coefficient: 0n would be a syntax error.
disallow juxtaposition of floating-point numeric literals with trailing .: 1.x would be a syntax error.

There may be others, but these two strike me as clearly good ideas. One nice thing about disallowing 0 as a juxtaposed numeric literal coefficient is that it frees up as many 0x1234-style syntaxes as one might ever want, so it future-proofs us there. Another nice thing is that instead of being surprised when 0x fails after 0y just worked, people will get a warning as soon as they try 0y and can immediately learn that 0 doesn't work as a juxtaposed numeric literal coefficient.

Labels: breaking, parser, speculative

StefanKarpinski

👍4
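
The two rules proposed in the issue can be sketched as a small token classifier. The Python below is only an illustration under stated assumptions: `classify` and its regexes are hypothetical and are not Julia's actual lexer, and the sketch models only hex literals and simple decimal coefficients (no exponents, binary, or octal forms).

```python
import re

# Hypothetical patterns for illustration; not taken from Julia's parser.
HEX = re.compile(r"0x[0-9a-fA-F]+\Z")              # e.g. 0x1234
ZERO_COEFF = re.compile(r"0[A-Za-z_]")             # literal 0 directly followed by an identifier
TRAILING_DOT = re.compile(r"\d+\.[A-Za-z_]")       # float such as "1." directly followed by an identifier
COEFF = re.compile(r"(\d+(?:\.\d+)?)([A-Za-z_]\w*)\Z")  # ordinary juxtaposition, e.g. 2x

def classify(token: str) -> str:
    """Classify a token under the two proposed restrictions."""
    if HEX.match(token):
        return "hex literal"
    if ZERO_COEFF.match(token):
        return "syntax error: 0 cannot be a juxtaposed coefficient"
    if TRAILING_DOT.match(token):
        return "syntax error: trailing '.' before an identifier"
    m = COEFF.match(token)
    if m:
        return f"{m.group(1)}*{m.group(2)}"
    return "other"

for tok in ["0x1234", "0n", "1.x", "2x", "1.5x"]:
    print(tok, "->", classify(tok))
```

Under these assumed rules, 0n and 1.x are rejected, while 2x and 1.5x still read as products.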

All 15 comments

See also #5246, #11529, #10920, ... for related problems with literals and operators etc.

stevengj on 1 Apr 2016

👍1

I think I would generally be in favor of deprecating float literals with trailing . but that's another matter.

StefanKarpinski on 1 Apr 2016

👎1

This is a pretty simple change and we would benefit from doing it sooner rather than later.

StefanKarpinski on 9 May 2016

👍3

Bump – @JeffBezanson, could you do this?

StefanKarpinski on 12 May 2016

Number parsing changes are never simple :)

Working on this. First interesting issue I hit is 0im, which seems useful to allow.

JeffBezanson on 12 May 2016

Ah, that's an interesting one.

StefanKarpinski on 13 May 2016

Maybe a better alternative to disallowing 0n is to require 0x1234 syntaxes to use upper case letters. Then we could also support many 0x style syntaxes via r"0[a-z][A-Z0-9]+". This would mean that 0xa is 0*xa, 0xA is 0x0A, and 0im would still work.

omus on 13 May 2016
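
A minimal sketch of the uppercase-only rule described in the comment above, written in Python for illustration. The first regex is the one quoted in the comment, anchored to the whole token; the `classify` function, the tuple shapes it returns, and the fallback pattern for identifiers are assumptions, not part of any real parser.

```python
import re

# Pattern from the comment: 0, one lowercase prefix letter, then uppercase letters or digits.
PREFIXED = re.compile(r"0([a-z])([A-Z0-9]+)\Z")
# Assumed fallback: 0 juxtaposed with an ordinary identifier means 0 * identifier.
ZERO_TIMES = re.compile(r"0([A-Za-z_]\w*)\Z")

def classify(token):
    """Return a tuple describing how the token would be read under this rule."""
    m = PREFIXED.match(token)
    if m:
        return ("literal", m.group(1), m.group(2))   # prefix letter, digits
    m = ZERO_TIMES.match(token)
    if m:
        return ("product", "0", m.group(1))          # 0 * identifier
    return ("other", token)

for tok in ["0xa", "0xA", "0im", "0x1234"]:
    print(tok, "->", classify(tok))
```

With these assumptions, 0xa and 0im come out as products and 0xA as a prefixed literal, matching the three cases listed in the comment.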

How about allowing 0x to mean 0*x and disallowing 0 juxtaposed with any identifier that starts with a single non-digit followed by a digit (and anything after that)? That way 0x and 0im are allowed, and 0x1 would be a hex value while 0y1 would be illegal. It's slightly more brittle than I would like, but at least the rule doesn't care about a specific list of leading characters.

StefanKarpinski on 13 May 2016
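
The "single non-digit followed by a digit" heuristic from the comment above can be sketched in a few lines of Python. Everything here is hypothetical: `classify` is an illustrative name, "non-digit" is narrowed to letters and underscore as an assumption, and 0x is the only prefix treated as a defined literal.

```python
import re

# 0, exactly one leading non-digit character, then a digit: reserved for literal syntaxes.
RESERVED = re.compile(r"0([A-Za-z_])\d")
# Otherwise, 0 juxtaposed with an identifier is read as a product.
ZERO_TIMES = re.compile(r"0([A-Za-z_]\w*)\Z")

def classify(token):
    """Classify a token that starts with 0 under the assumed heuristic."""
    m = RESERVED.match(token)
    if m:
        # Only the x prefix has a meaning in this sketch; other letters are rejected.
        return "hex literal" if m.group(1) == "x" else "syntax error"
    m = ZERO_TIMES.match(token)
    if m:
        return "0*" + m.group(1)
    return "other"

for tok in ["0x", "0im", "0x1", "0y1", "0xabcd"]:
    print(tok, "->", classify(tok))
```

Because the rule looks only at whether the second character after the 0 is a digit, a token such as 0xabcd falls through to the product branch in this sketch.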

@omus: Unfortunately, using the lowercase form is by far more common and also looks better.

StefanKarpinski on 13 May 2016

@StefanKarpinski I agree that forcing uppercase letters looks worse.

I like your suggestion but unfortunately the non-digit following a digit heuristic means that 0x1234 is a hexadecimal while 0xabcd would not be.

Could we not just try to parse r"0x\S+" as a hexadecimal and allow for 0x as 0*x? Where I see this breaking down is if we decide to make 0i a special prefix then 0im would be parsed as a special value.

omus on 13 May 2016

> I like your suggestion but unfortunately the non-digit following a digit heuristic means that 0x1234 is a hexadecimal while 0xabcd would not be.

smacks forehead

StefanKarpinski on 14 May 2016

Honestly, disallowing float literals with a trailing . seems like a more pressing issue. (Disallowing it in Julia source code, that is; functions that parse CSV files and the like should continue to allow a trailing . in their input.)

stevengj on 15 May 2016

👎1

I agree. Shall we merge #16339?

JeffBezanson on 18 May 2016

👍2

Was #16339 the biggest part of this that we wanted to do prior to 0.5? For the rest should we remove the milestone?