Antlr4: release 4.7

Created on 14 Mar 2017 · 32Comments · Source: antlr/antlr4

Ok guys, @antlr/antlr-targets, can we start discussion about releasing 4.7? The biggest change is the unicode from @bhamiltoncx but this is a big fix to lots of stuff. Can you guys look through the issues for anything we really should fix for 4.7?

build

Source

parrt

Most helpful comment

@parrt, I think we can we finally close it. :)

KvanTTT on 1 May 2018

👍2

All 32 comments

I'd love some suggestions on how to best test the perf of the new Unicode logic. Once PR #1757 lands, I'd like to write a perf test using a complex Unicode grammar parser like this:

https://github.com/bhamiltoncx/grammars-v4/commit/665f75563f7b01bb84a9cf872346fcc2007135f7

I'm guessing that we'll need to fix #1613 at least to make parsing Unicode performant with the Java runtime (we currently hard-code a cache to only support Unicode values up to U+007F).

bhamiltoncx on 14 Mar 2017

👍1

I wonder if testing performance can be done using the existing descriptor object. We encode a grammar such as you have there and then we have the test rig report timing?

We can hold off on #1613 until we know that Unicode is definitely a problem.

parrt on 14 Mar 2017

One thing about releases that have bugged me is that I sometimes, by mistake, upgrade the IntelliJ plugin that generates code, but my runtime library (Cpp, etc) is out of date. In the best case, I get a compile error (but they can be fairly obscure), at worst, I suppose nasty runtime bugs can happen.

I wonder if it's possible to add some sort of early detection of generated parser <-> runtime mismatch that works for all targets?

ghost on 14 Mar 2017

👍1

That already exists, at least for Java, C# and others. not sure C++. you get a warning message when parsers start.

parrt on 14 Mar 2017

@mike-lischke I have not noticed such warnings in C++. Have I missed it, or is it something that should be added?

ghost on 14 Mar 2017

If such a warning would be in place I'd have it ported. I know there is a version number in the source files...

About the release: what happened with 4.6.1? You redeclared that to 4.7 @parrt? I have actually 2 serious bug reports for the C++ target, but I guess I won't be able to fix them soon, so they shouldn't block the release.

mike-lischke on 15 Mar 2017

@mike-lischke which 2?

ghost on 15 Mar 2017

1592 and #1753, you commented in both

mike-lischke on 15 Mar 2017

@pureconfig You need to replace your use of the IDE plugin for code generation with a build step that generates anything out-of-date when your project is built. It is the only way to ensure your output is always correct. Any other method of generation should be considered "for experimental purposes only" and should be expected to produce invalid, outdated and/or changes in output from time to time.

For my fork releases I've made it extra difficult for users to manually generate code outside of a build integration. "If they're doing it wrong, it should feel wrong."

sharwell on 15 Mar 2017

@sharwell Thanks for the input. I do have a build system that generates everything, but during development I generate from the plugin when I change the grammar.

As for versioning, I would strongly prefer that each release had a Listener/abstract Visitor method called

 void version46();

in version 4.6, and - you guessed it

void version47();

in version 4.7. Etc.

Making a mistake would be impossible, and the compiler would tell me exactly what is going on.

ghost on 15 Mar 2017

@mike-lischke @pureconfig Please see how Java gives the appropriate warning: https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/RuntimeMetaData.java#L144

    /**
     * This method provides the ability to detect mismatches between the version
     * of ANTLR 4 used to generate a parser, the version of the ANTLR runtime a
     * parser was compiled against, and the version of the ANTLR runtime which
     * is currently executing.

parrt on 15 Mar 2017

@mike-lischke "You redeclared that to 4.7". Yep. incorporation of widespread unicode changes made a point release too minor.

parrt on 15 Mar 2017

That code exists also in the C++ target, however first checkVersion is never called and second it logs any error to the console. If you have a normal application with a graphical user interface that text won't be seen.

mike-lischke on 15 Mar 2017

Hmm...yeah, generated code has to call it; e.g.,

    static { RuntimeMetaData.checkVersion("4.7", RuntimeMetaData.VERSION); }

We used to throw exception but that prevented from running. I made the decision to change to just a msg in log file (if gui); not a perfect solution either way.

parrt on 15 Mar 2017

@parrt A cute solution for Java and recent C++ compilers *) is to have a Version class in the runtime where a static field is added for each version, but the old ones are kept deprecated.

@Deprecated
public static final int VERSION_45 = 0;

@Deprecated
public static final int VERSION_46 = 0;

public static final int VERSION_47 = 1;

And the generator would put a field in the base listener/visitor:

int version = VERSION_46;

...which in this case would lead to a big fat "VERSION_46 has been deprecated". Since the valid version is = 1, this can also be used to test for version at runtime, although that seems less useful.

WDYT?

*) for C++, something like __declspec(deprecated) can be used. For C# and other targets, who knows.

ghost on 16 Mar 2017

what's wrong with current solution? This doesn't solve runtime issue either.

parrt on 16 Mar 2017

@parrt I'm intent on cleaning up the Go runtime so that it has decent godoc output (mentioned here: https://github.com/antlr/antlr4/issues/1374) Right now, it's pretty dismal. :( This would involve two major operations:

Moving a number of types and variables to be package private. In Go, this means basically making them lower case. This will need to be done carefully to align with the other runtimes, though. I don't want to accidentally hide features.
Moving documentation from the Java runtime over to the Go runtime. My intent here is to just do a direct copy/paste job and change them minimally to match the godoc standard.

What are you thinking for 4.7? Next week? Next month? I can plan my work around that. I think both of these operations could be done incrementally. It's such a large API that I do not want to make this block a 4.7 release.

pboyer on 16 Mar 2017

Since it's just documentation we can inject that at any time. I'm hoping to finish up 4.7 within a week or so so probably too soon to finish all of that playing around :)

parrt on 16 Mar 2017

Ok, we're getting closer. @bhamiltoncx has made a nice generic factory interface for getting input char streams and will hopefully have time to make PR you folks can sniff for your targets.

parrt on 19 Mar 2017

Will you guys be publishing the javascript runtime to npm at the time that you release this? I'm waiting on the fix for #1639 and reference the javascript runtime from this repository as npm doesn't support subdirectories.

rozza2058 on 20 Mar 2017

Yes we will.

Le 20 mars 2017 à 21:45, rozza2058 notifications@github.com a écrit :

Will you guys be publishing the javascript runtime to npm at the time that you release this? I'm waiting on the fix for #1639 https://github.com/antlr/antlr4/pull/1639 and reference the javascript runtime from this repository as npm doesn't support subdirectories.

—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub https://github.com/antlr/antlr4/issues/1756#issuecomment-287763038, or mute the thread https://github.com/notifications/unsubscribe-auth/ADLYJD9NKNb7j3ZAeLsUKxMKKJZHMWvNks5rnoMMgaJpZM4Mc44O.

ericvergnaud on 20 Mar 2017

1753 looks like a blocker

ghost on 23 Mar 2017

I don't think @mike-lischke will have time to review this before we go out...though we are still discussing new char stream API so you never know! :)

parrt on 23 Mar 2017

The problem with #1753 and #1592 is that they require potentially a lot of time to be analyzed and fixed. This is something I can only start when I know I have at least a day just for that (which is unfortunately very rare).

mike-lischke on 24 Mar 2017

After much thought concerning the "create stream" interface, I have decided on the following end result: C++, Python, Go, and Swift APIs didn't need any changes to support full Unicode code points, so I decided to leave those alone. Java, C#, and JavaScript runtimes required changes. Rather than gutting and changing the interface of the previous ANTLRFileStream etc..., I have deprecated those and introduced a CharStreams.fromXXX factory style interface for the new stuff. Motivation and so on explained here: https://github.com/antlr/antlr4/blob/master/doc/unicode.md I have updated the target docs slightly to show the new interface.

Any last little cleanup before 4.7? I'm working on the release notes.

parrt on 30 Mar 2017

Wow that was a long ride, giving all the discussions and tests and optimizations. Great accomplishment and big kudos to @bhamiltoncx!

@parrt We need to talk about unbuffered streams again. They don't make problems in parse trees either, as long as they have random access to the underlying storage (e.g. file, in memory strings). :-D

mike-lischke on 30 Mar 2017

Yeah, I agree after 4.7 we should do a proper random access stream that
doesn't buffer (except to decode bytes to Unicode code points).

That means we will need to avoid InputStream and Reader on the Java side,
as those only support reading forward.

On Thu, Mar 30, 2017, 2:23 AM Mike Lischke notifications@github.com wrote:

Wow that was a long ride, giving all the discussions and tests and
optimizations. Great accomplishment and big kudos to @bhamiltoncx
https://github.com/bhamiltoncx!

@parrt https://github.com/parrt We need to talk about unbuffered
streams again. They don't make problems in parse trees either, as long as
they have random access to the underlying storage (e.g. file, in memory
strings). :-D

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/antlr/antlr4/issues/1756#issuecomment-290324851, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AApAU32Ma4-KtWmhdLl3NRU-iPqhtMexks5rq1h2gaJpZM4Mc44O
.

bhamiltoncx on 30 Mar 2017

@mike-lischke I'm perfectly fine with on-demand loading from files in principle, but I usually let the operating system do that with virtual memory. ;) Hence, I just buffer everything up. I can see using memory mapped I/O instead, for those languages that support it.

parrt on 30 Mar 2017

Ok, @antlr/antlr-targets, all tests report green on my box and waiting on a final build from CI sites. I'm guessing we are good to launch 4.7!

parrt on 30 Mar 2017

👍1