Antlr4: Migrate JavaScript runtime to TypeScript

Created on 20 Sep 2016 · 30Comments · Source: antlr/antlr4

The JavaScript ANTLR4 runtime is getting to a size where it's difficult to maintain in JavaScript. (One data point on this is that JavaScript > 10 KLOC effectively becomes read-only and fixing bugs becomes difficult.) One possible approach to mitigating this would be to migrate the codebase to the TypeScript language (a superset of JavaScript that adds static type checking and modern JavaScript constructs while transpiring into plain JavaScript.) The tools support for TypeScript is good and getting stronger.

I've made a couple of test migration experiments, and think I'm ready to do this for real. I've already found a number of bugs in the runtime through static type checking. A couple of questions before I too far into it?

Would this get support from the ANTLR4 community? Its not a one-man effort. Any volunteers?
After several experiments, I think it probably should be made a replacement for the existing JavaScript runtime. I think this will best allow comparison of the original and generated .js file versions. Sound OK?

javascript improvement

Source

BurtHarris

👍14

Most helpful comment

I submitted several pull requests that should resolve the majority of the feature gap remaining in tunnelvisionlabs/antlr4ts. I still need to make improvements to the distribution strategy (reduce the time it takes to get updates published to npmjs), but overall it seems to be getting to a good state.

sharwell on 26 Nov 2018

👍3 ❤1

All 30 comments

I think that's a great idea. I did just use ANTLR for the first time in a TypeScript project and must say it's been complicated=)

I've never been deep in the actual ANTLR runtime, but a language conversion should be doable. I'd be glad to help.

I guess the most recent version is in your fork on the typescript branch?

GeorgDangl on 21 Sep 2016

@BurtHarris
Hi Burt, FYI the initial runtime was a 1 man effort (mine).
I certainly encourage a Typescript target.
However I see it as a new target to start with, eventually derived from the javascript one, in order to avoid tying all users to Typescript (It's hard to decide at this point how JS will evolve: ES6, jsx, Typescript...).
And if the JS runtime gets generated from Typescript one, great!
Eric
n.b. be careful with your wording, a "bug" in Typescript might not be a bug in JS. My "definition" of a bug is a piece of code for which a test can be written which demonstrates incorrect behavior. At this point, all the tests pass, so there is no "bug" according to this definition (obviously the test coverage is far from 100%, so this is just a theoretical argument).

ericvergnaud on 21 Sep 2016

@BurtHarris I sent you a message in Gitter :smile:

sharwell on 21 Sep 2016

As background Eric, Typescript includes up-to-date implementations of both ES6 (now officially called ECMAScript 2015) and JSX, adding to them the static type checking. Its not very hard to predict where JavaScript is going over the next year or two...

I try to be careful with regard labeling things bugs. But as I see it, running code through static analysis is a form of test not previously applied to the ANTLR JavaScript runtime. Code reviews are another perfectly valid form of testing.

Let me illustrate just _a few_ suspected bugs identified by static type checking, reviewed by inspection and by comparison against the Java sources. @ericvergnaud do you want to you tell me if you consider them bugs?

In parser.js line 334, ParseTreePatternMatcher is referenced but never defined.
In class IntervalSet, reference to the undefined member intervalslength should probably be intervals.length. See intervalSet.js line 94
Class LexerATNSimulator declares a static property debug at LexerATNSimulator.js line 104 but accesses it like an instance property (this.debug) at line 148 and line 163.
getOffendingToken() is referenced at recognizer.js line 105 and ErrorStrategy.js line 169, etc; but it is never defined...

It is possible that one of the above is a false positive, bug, but it's a fair amount of work to back-reference these to the original source code lines, something I'd rather not continue doing.

BurtHarris on 21 Sep 2016

👍1

@GeorgDangl, to be clear the branches currently in my repository are all still 'learning experiments', I haven't committed to a plan yet, and when I do would probably execute it on yet another branch. Of the branches up there, tsMigration is probably the one you might want to look at if interested. I just updated it.

BurtHarris on 21 Sep 2016

@BurrHarris some of these are indeed bugs (intervalslength),
others are more probably dead code, blindly converted from Java but never called in the current implementation.
As per JavaScript future, I would avoid making predictions. I believe a dedicated TypeScript target is a safer path.

Envoyé de mon iPhone

Le 22 sept. 2016 à 01:44, Burt Harris [email protected] a écrit :

As background Eric, Typescript includes up-to-date implementations of both ES6 (now officially called ECMAScript 2015) and JSX, adding to them the static type checking. Its not very hard to predict where JavaScript is going over the next year or two...

I try to be am careful with regard labeling things bugs. But as I see it, running code through static analysis is a form of test not previously applied to the ANTLR JavaScript runtime Code reviews are another perfectly valid form of testing.

Let me illustrate just a few suspected bugs identified by static type checking, by inspection and by comparison against the Java sources. @ericvergnaud do you want to you tell me if you consider them bugs?

In parser.js line 334, ParseTreePatternMatcher is referenced but never defined.

In class IntervalSet, reference to the undefined member intervalslength should probably be intervals.length. See intervalSet.js line 94

Class LexerATNSimulator declares a static variable debug at LexerATNSimulator.js line 104 but accesses it like an instance variable (this.debug) at line 148 and line 163.

getOffendingToken() is referenced in several locations including recognizer.js line 105 and ErrorStrategy.js line 169, but never defined...

Its possible that one of the above is a false positive, bug, but it's a fair amount of work to back-reference these to the original source code lines, something I'd rather not continue doing.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

ericvergnaud on 22 Sep 2016

Is there some way I can help?

Phmager on 4 Nov 2016

Certainly, you could:

create a TypeScript runtime by cloning the JavaScript one.
implement the TypeScript target in the tests generator so you have a good place to start, and automatically validate your progress.
implement the TypeScript target in the antlr tool, with the corresponding target language templates (.stg files).
enhance the yaml file for Travis CI.
Once all this runs with the existing Javascript code, you could:
modify the generated listeners and visitors
modify the generated parsers
modify the high level client API
I would focus on the above only, which would expose the existing JavaScript runtime as a TypeScript one. From there we'll see what happens...

ericvergnaud on 4 Nov 2016

What about @BurtHarris fork and branch tsMigration? https://github.com/BurtHarris/antlr4/tree/tsMigration

Phmager on 4 Nov 2016

@Phmager I believe @BurtHarris has made substantial progress on this in the past month but hasn't published an update yet. More information should be available soon.

sharwell on 4 Nov 2016

Yes @Phmager, the branch above should be considered dead, but we are making progress on an alternative approach I think you will like. More info soon.

BurtHarris on 4 Nov 2016

@Phmager A new TypeScript target is now available here. We have a bit of work to do to make the API more "TypeScript-like", but we do have it working and an initial build published to NPM.
https://github.com/tunnelvisionlabs/antlr4ts

sharwell on 2 Dec 2016

👍1

Nice! I didn't have time yet to look at the project in detail. But I will definitely test it.

Phmager on 2 Dec 2016

I'm actively using this new target already in my node.js module: https://www.npmjs.com/package/antlr4-graps and haven't found anything not working yet, except for the profiler.

mike-lischke on 12 Jan 2017

@mike-lischke @sharwell That's awesome, but I don't want to use a target not officially supported by Team Antlr. Otherwise, bitrot will eventually ensue.

ghost on 27 Aug 2017

@notsonotso I'm also contributing to the TypeScript port. Some family issues have limited the time I have available, but it will be supported.

BurtHarris on 29 Aug 2017

Just wondering, are there plans (or reasons not) to eventually merge this effort back into upstream?

erikkemperman on 20 May 2018

I doubt it. Development on this target is extremely slow and there were no new releases since months (and it's still in alpha). I'm still thinking if it wouldn't make more sense to start fresh (antlr4ts also has a number of differences compared to the Java runtime, which sometimes make it more complicated than necessary to work with it).

mike-lischke on 20 May 2018

Okay .. unfortunately the efforts to build a TS runtime in ANTLR (and also the antlr4ts project) seem to have declined :/

@mike-lischke @sharwell When starting now (as a ANTLR consumer) a project with a TypeScript based parser (which is expected to have a long life, i.e. needs to be maintained for a long time), would you either recommend to

use the JavaScript ANTLR4 runtime in TS with --allowJs (and accept the missing generated typings, but have an "officially supported" and up-to-date ANTLR parser), or
use antlr4ts (with better typing, but the risk that this project won't be maintained in the future)

ulrichb on 21 Nov 2018

I prefer TS over JS but I see your point. It would be much easier to decide if the community knew what to expect from the TS target at all in the future.

On the other hand, if I'd depend on a good TS runtime for my products (I'm using the TS runtime exclusively for my antlr4 vscode extension - a hobby) I would probably start a new TS port.

mike-lischke on 21 Nov 2018

@mike-lischke Thanks for your reply.

ulrichb on 21 Nov 2018

Is there a simple way to produce pure JS ES 5 code from TS?
If so, I’m happy to explore a path where the JS target is generated from a TS one.
But that requires having a TS person involved sharing the same thinking.

Envoyé de mon iPhone

Le 21 nov. 2018 à 16:41, Mike Lischke notifications@github.com a écrit :

I prefer TS over JS but I see your point. It would be much easier to decide if the community knew what to expect from the TS target at all in the future.

On the other hand, if I'd depend on a good TS runtime for my products (I'm using the TS runtime exclusively for my antlr4 vscode extension - a hobby) I would probably start a new TS port.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

ericvergnaud on 21 Nov 2018

The TS transpiler can generate ES5 code, yes (and that would be a good idea actually). But the main problem is the stalled TS runtime implementation.

mike-lischke on 21 Nov 2018

As @mike-lischke already said this is no problem. Just use the --target ES5 option when invoking tsc, the TypeScipt compiler.

So in theory it would be possible to replace (change) the ANTLR JS target to a TS target and invoke tsc afterwards to get both, a TS and JS output. This would have the advantage that ANTLR would only need to maintain the TS target and get JS "for free" and both outputs would be exactly identical in the means of behavior.

A disadvantage would be that ANTLR would need to bundle (or maybe better depend on an existing) Node.js runtine (on the dev machines) for JS output (which IMO is okay, it's just like the dependency on the JVM). It would also need to bundle (or depend) on the TS compiler, but there is a completely self-contained npm package.

Maybe (I don't know the inner workings of the JS target, so I'm guessing) an alternative would be to add a flag to the JS target ("emit TS types"), which conditionally adds types to the JS code (TS is just a typed super set of JS), e.g. prepend a : MyAstNode after getMyAstNode(). This has the advantage to avoid the dependency on Node and the TS compiler, _but_ of course would increase the complexity of the JS emitter (a lot of conditional emitting).

ulrichb on 21 Nov 2018

End of the day, the purpose of the JS runtime is to run in browsers. So I’m not gonna tell existing JS users that they suddenly need to recompile the runtime from TS to JS in order to use it.
Rather there would be 2 targets, with the JS runtime generated from the TS one. It’s not ‘free’ because the JS templates and the TS ones would be different, plus the recompile might introduce issues.
But as already mentioned, the problem is not theory, it’s people.
I’m the author and sole maintainer of the JS and Python runtimes. In that respect, ‘the ANTLR team’ is just me, doing for free all the JS stuff + python and c# (c# originally written by @sharwell). I’m supportive of the TS->JS concept but I have no personal need for it.
So unless someone raises their hand, nothing will happen.

Envoyé de mon iPhone

Le 21 nov. 2018 à 19:31, ulrichb notifications@github.com a écrit :

As @mike-lischke already said this is no problem. Just use the --target ES5 option when invoking tsc, the TypeScipt compiler.

So in theory it would be possible to replace (change) the ANTLR JS target to a TS target and invoke tsc afterwards to get both, a TS and JS output. This would have the advantage that ANTLR would only need to maintain the TS target and get JS "for free" and both outputs would be exactly identical in the means of behavior.

A disadvantage would be that ANTLR would need to bundle (or maybe better depend on an existing) Node.js runtine (on the dev machines) for JS output (which IMO is okay, it's just like the dependency on the JVM). It would also need to bundle (or depend) on the TS compiler, but there is a completely self-contained npm package.

Maybe (I don't know the inner workings of the JS target, so I'm guessing) an alternative would be to add a flag to the JS target ("emit TS types"), which conditionally adds types to the JS code (TS is just a typed super set of JS), e.g. prepend a : MyAstNode after getMyAstNode(). This has the advantage to avoid the dependency on Node and the TS compiler, but of course would increase the complexity of the JS emitter (a lot of conditional emitting).

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

ericvergnaud on 21 Nov 2018

[...] the purpose of the JS runtime is to run in browsers. So I’m not gonna tell existing JS users that they suddenly need to recompile the runtime from TS to JS in order to use it.

Maybe there was a misunderstanding. The TS->JS compilation is something ANTLR can/should do (not the user) and it could be even transparent for the ANTLR user if Node + TSC would be bundled.

What I meant with "JS for free" is that it wouldn't be necessary to maintain two targets. Just the TS target and the JS target would generated (compiled) from the TS target. (Again TS is "only" a typed super set of JS and 100 % run-time compatible.)

BTW: This could also be implemented gradually (iteratively): In a version 1 of the TS target it could only generate a part of the types (and just emit emit the any type for not-yet supported members) and then the typing could be improved in latter versions step by step. This is a strategy which is also used when porting large JS code bases to TS.

BTW 2: Many thanks for your "JS stuff" !

ulrichb on 21 Nov 2018

Hello folks. Part of the stalled TS runtime falls on me, I've gotten focused on some non-code matters that seem close to an end. I've also sort of lost-touch with @sharwell who really was at the heart of the effort. Sam are you here?

BurtHarris on 23 Nov 2018

@BurtHarris I left you a message in Gitter 😄

sharwell on 24 Nov 2018

sharwell on 26 Nov 2018

👍3 ❤1

Any chance of adding links on the antlr website to support external projects that implement antlr for other languages or promote the upstream of those projects to antrl organization?

I think the projects would benefit from some marketing.