Ok guys, @antlr/antlr-targets, can we start discussion about releasing 4.7? The biggest change is the unicode from @bhamiltoncx but this is a big fix to lots of stuff. Can you guys look through the issues for anything we really should fix for 4.7?
I'd love some suggestions on how to best test the perf of the new Unicode logic. Once PR #1757 lands, I'd like to write a perf test using a complex Unicode grammar parser like this:
https://github.com/bhamiltoncx/grammars-v4/commit/665f75563f7b01bb84a9cf872346fcc2007135f7
I'm guessing that we'll need to fix #1613 at least to make parsing Unicode performant with the Java runtime (we currently hard-code a cache to only support Unicode values up to U+007F
).
I wonder if testing performance can be done using the existing descriptor object. We encode a grammar such as you have there and then we have the test rig report timing?
We can hold off on #1613 until we know that Unicode is definitely a problem.
One thing about releases that have bugged me is that I sometimes, by mistake, upgrade the IntelliJ plugin that generates code, but my runtime library (Cpp, etc) is out of date. In the best case, I get a compile error (but they can be fairly obscure), at worst, I suppose nasty runtime bugs can happen.
I wonder if it's possible to add some sort of early detection of generated parser <-> runtime mismatch that works for all targets?
That already exists, at least for Java, C# and others. not sure C++. you get a warning message when parsers start.
@mike-lischke I have not noticed such warnings in C++. Have I missed it, or is it something that should be added?
If such a warning would be in place I'd have it ported. I know there is a version number in the source files...
About the release: what happened with 4.6.1? You redeclared that to 4.7 @parrt? I have actually 2 serious bug reports for the C++ target, but I guess I won't be able to fix them soon, so they shouldn't block the release.
@mike-lischke which 2?
@pureconfig You need to replace your use of the IDE plugin for code generation with a build step that generates anything out-of-date when your project is built. It is the only way to ensure your output is always correct. Any other method of generation should be considered "for experimental purposes only" and should be expected to produce invalid, outdated and/or changes in output from time to time.
For my fork releases I've made it extra difficult for users to manually generate code outside of a build integration. "If they're doing it wrong, it should feel wrong."
@sharwell Thanks for the input. I do have a build system that generates everything, but during development I generate from the plugin when I change the grammar.
As for versioning, I would strongly prefer that each release had a Listener/abstract Visitor method called
void version46();
in version 4.6, and - you guessed it
void version47();
in version 4.7. Etc.
Making a mistake would be impossible, and the compiler would tell me exactly what is going on.
@mike-lischke @pureconfig Please see how Java gives the appropriate warning: https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/RuntimeMetaData.java#L144
/**
* This method provides the ability to detect mismatches between the version
* of ANTLR 4 used to generate a parser, the version of the ANTLR runtime a
* parser was compiled against, and the version of the ANTLR runtime which
* is currently executing.
@mike-lischke "You redeclared that to 4.7". Yep. incorporation of widespread unicode changes made a point release too minor.
That code exists also in the C++ target, however first checkVersion is never called and second it logs any error to the console. If you have a normal application with a graphical user interface that text won't be seen.
Hmm...yeah, generated code has to call it; e.g.,
static { RuntimeMetaData.checkVersion("4.7", RuntimeMetaData.VERSION); }
We used to throw exception but that prevented from running. I made the decision to change to just a msg in log file (if gui); not a perfect solution either way.
@parrt A cute solution for Java and recent C++ compilers *) is to have a Version class in the runtime where a static field is added for each version, but the old ones are kept deprecated.
@Deprecated
public static final int VERSION_45 = 0;
@Deprecated
public static final int VERSION_46 = 0;
public static final int VERSION_47 = 1;
And the generator would put a field in the base listener/visitor:
int version = VERSION_46;
...which in this case would lead to a big fat "VERSION_46 has been deprecated". Since the valid version is = 1, this can also be used to test for version at runtime, although that seems less useful.
WDYT?
*) for C++, something like __declspec(deprecated) can be used. For C# and other targets, who knows.
what's wrong with current solution? This doesn't solve runtime issue either.
@parrt I'm intent on cleaning up the Go runtime so that it has decent godoc output (mentioned here: https://github.com/antlr/antlr4/issues/1374) Right now, it's pretty dismal. :( This would involve two major operations:
What are you thinking for 4.7? Next week? Next month? I can plan my work around that. I think both of these operations could be done incrementally. It's such a large API that I do not want to make this block a 4.7 release.
Since it's just documentation we can inject that at any time. I'm hoping to finish up 4.7 within a week or so so probably too soon to finish all of that playing around :)
Ok, we're getting closer. @bhamiltoncx has made a nice generic factory interface for getting input char streams and will hopefully have time to make PR you folks can sniff for your targets.
Will you guys be publishing the javascript runtime to npm at the time that you release this? I'm waiting on the fix for #1639 and reference the javascript runtime from this repository as npm doesn't support subdirectories.
Yes we will.
Le 20 mars 2017 à 21:45, rozza2058 notifications@github.com a écrit :
Will you guys be publishing the javascript runtime to npm at the time that you release this? I'm waiting on the fix for #1639 https://github.com/antlr/antlr4/pull/1639 and reference the javascript runtime from this repository as npm doesn't support subdirectories.
—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub https://github.com/antlr/antlr4/issues/1756#issuecomment-287763038, or mute the thread https://github.com/notifications/unsubscribe-auth/ADLYJD9NKNb7j3ZAeLsUKxMKKJZHMWvNks5rnoMMgaJpZM4Mc44O.
I don't think @mike-lischke will have time to review this before we go out...though we are still discussing new char stream API so you never know! :)
The problem with #1753 and #1592 is that they require potentially a lot of time to be analyzed and fixed. This is something I can only start when I know I have at least a day just for that (which is unfortunately very rare).
After much thought concerning the "create stream" interface, I have decided on the following end result: C++, Python, Go, and Swift APIs didn't need any changes to support full Unicode code points, so I decided to leave those alone. Java, C#, and JavaScript runtimes required changes. Rather than gutting and changing the interface of the previous ANTLRFileStream etc..., I have deprecated those and introduced a CharStreams.fromXXX factory style interface for the new stuff. Motivation and so on explained here: https://github.com/antlr/antlr4/blob/master/doc/unicode.md I have updated the target docs slightly to show the new interface.
Any last little cleanup before 4.7? I'm working on the release notes.
Wow that was a long ride, giving all the discussions and tests and optimizations. Great accomplishment and big kudos to @bhamiltoncx!
@parrt We need to talk about unbuffered streams again. They don't make problems in parse trees either, as long as they have random access to the underlying storage (e.g. file, in memory strings). :-D
Yeah, I agree after 4.7 we should do a proper random access stream that
doesn't buffer (except to decode bytes to Unicode code points).
That means we will need to avoid InputStream and Reader on the Java side,
as those only support reading forward.
On Thu, Mar 30, 2017, 2:23 AM Mike Lischke notifications@github.com wrote:
Wow that was a long ride, giving all the discussions and tests and
optimizations. Great accomplishment and big kudos to @bhamiltoncx
https://github.com/bhamiltoncx!@parrt https://github.com/parrt We need to talk about unbuffered
streams again. They don't make problems in parse trees either, as long as
they have random access to the underlying storage (e.g. file, in memory
strings). :-D—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/antlr/antlr4/issues/1756#issuecomment-290324851, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AApAU32Ma4-KtWmhdLl3NRU-iPqhtMexks5rq1h2gaJpZM4Mc44O
.
@mike-lischke I'm perfectly fine with on-demand loading from files in principle, but I usually let the operating system do that with virtual memory. ;) Hence, I just buffer everything up. I can see using memory mapped I/O instead, for those languages that support it.
Ok, @antlr/antlr-targets, all tests report green on my box and waiting on a final build from CI sites. I'm guessing we are good to launch 4.7!
oops, looks like 4.7 is out. we can put this in 4.7.1 i guess.
@parrt, I think we can we finally close it. :)
Most helpful comment
@parrt, I think we can we finally close it. :)