Theia: [Ericsson team] Items for April

Created on 10 Apr 2018 · 27 Comments · Source: eclipse-theia/theia

Here are the items we plan to work on in April (some are already started).

One theme that emerged is that we want to address some shortcomings that impede us using Theia to develop Theia. A secondary theme is to make Theia applications easier to troubleshoot at runtime.

  • Make TSLint work in Theia: https://github.com/theia-ide/theia/pull/1662
  • Create improved Theia HelloWorld extension
  • Perform a usability assessment, fix the low-hanging fruit, report the rest
  • Electron front-end, using remote Theia backend. (https://github.com/theia-ide/theia/issues/174)
  • Prototype Multi-root workspaces support (https://github.com/theia-ide/theia/issues/1660)
  • Investigate improved language coloring (for TypeScript at first). Could be semantic coloring, but there may be an alternative that is simpler and almost as good.
  • "open workspace": add a preference to decide whether the workspace should open in a new tab or in the same one (issue TBD)
  • Prototype adding dynamic tracing in Theia (try using a framework like wtf or Catapult, or maybe just "trace" level logs?)
  • Allow changing log levels of different loggers independently (https://github.com/theia-ide/theia/pull/1624)
epic

All 27 comments

FYI, the ideal would really be to generate trace files in the Trace Event Format introduced by Google, which is supported by the Chrome Trace Viewer, Trace Compass, and so on.
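To make the target concrete, here is a minimal sketch of what emitting Trace Event Format data could look like. It only covers "complete" events (`ph: "X"`, timestamps in microseconds) wrapped in the JSON object form with a `traceEvents` array. `TraceRecorder` and `addSpan` are hypothetical names invented for this illustration, not part of Theia or any tracing library, and the 55 ms span is just a sample value.

```typescript
// Sketch only: TraceRecorder is a hypothetical helper, not an existing library.

interface CompleteEvent {
    name: string; // label shown in the trace viewer
    cat: string;  // comma-separated categories
    ph: "X";      // "X" marks a complete event (start timestamp plus duration)
    ts: number;   // start timestamp in microseconds
    dur: number;  // duration in microseconds
    pid: number;  // process id
    tid: number;  // thread id
}

class TraceRecorder {
    private readonly events: CompleteEvent[] = [];
    private readonly pid: number;
    private readonly tid: number;

    constructor(pid: number, tid: number) {
        this.pid = pid;
        this.tid = tid;
    }

    /** Records one span; inputs are in milliseconds and are converted to microseconds. */
    addSpan(spanName: string, category: string, startMs: number, endMs: number): void {
        this.events.push({
            name: spanName,
            cat: category,
            ph: "X",
            ts: Math.round(startMs * 1000),
            dur: Math.round((endMs - startMs) * 1000),
            pid: this.pid,
            tid: this.tid,
        });
    }

    /** Serializes to the JSON object form, which wraps the events in "traceEvents". */
    serialize(): string {
        return JSON.stringify({ traceEvents: this.events });
    }
}

// Sample usage with an illustrative 55 ms startup span.
const recorder = new TraceRecorder(1, 1);
recorder.addSpan("backend-startup", "theia", 0, 55);
const traceJson = recorder.serialize();
```

Writing `traceJson` to a file should give something that chrome://tracing can load, assuming the viewer accepts this minimal subset of fields.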

Investigate Semantic Coloring

It doesn't necessarily need to be 'semantic'. VS Code (like TS Server) does not provide any semantic coloring. Instead, its lexical coloring uses a more complex grammar that identifies certain tokens as type names, variables, etc. But that is purely syntactic. Also, it is not always correct, but that doesn't seem to bother anyone :-).

Also, there is ongoing work on the Red Hat side: https://www.youtube.com/watch?v=_pLDXlndgXA

After trying out some tracing libraries, I suggest we put this on hold for now and further refine what exactly our needs are. Some things I noted while trying things out:

  • Web Tracing Framework

    • Tracing framework from Google with custom WTF trace format
    • Lots of undocumented API, and even the viewer has some TODO widgets with WIP stuff, so it's not really usable.
    • Originally made for the frontend by injecting the library at runtime, but it also supports node experimentally
    • Last commits are from 2017, with pretty much no activity since then (no issues, no PRs, no commits)
    • No TypeScript typings, and no answer to an issue I submitted to see if they were interested in them
    • I managed to put some scope events in it while prototyping in the backend-application and saw that the startup time was 55ms on my machine. That is nice info which could be used for performance testing in the future (though I would consider simpler, different libraries for this)
  • Catapult Project

    • Very active, seems to me like Google might have switched to this project for everything tracing
    • Contains lots of tools, from the chrome://tracing viewer to some Android tracing stuff
    • Doesn't seem to supply a client library like wtf where you can put tracepoints in your code. AFAIK you use the tools to trace the browser (Chrome) or an Android device. It doesn't seem to be the tool for tracing a node.js app or a specific web page.
    • Uses the Trace Event Format, I believe, which is more standardized than the wtf format.
  • node-trace-event

    • This was very promising, but unfortunately it was only a WIP by Joyent (the people who made the bunyan logger) and it was clearly abandoned afterwards.
    • Also supports the Trace Event Format
    • Easy to use
    • Generates a trace that opens easily in both Chrome and TraceCompass (with the help of a plugin).
    • No TypeScript typings
    • Doesn't support all the events defined in the Trace Event Format, which is a bummer. It was quite easy to modify the js code to add them though.
    • Supposedly has a bunyan integration (although, given the big lack of documentation, I couldn't really figure out how to use it)
  • njsTrace

    • This seemed like a really cool library to instrument an application (function calls, scope durations, counters, etc.)
    • Unfortunately, it was last updated 4 years ago, with some pending PRs that were supposed to fix some serious issues.
    • For performance testing, I think we should look at something similar (but maybe more modern) and only use information like scope durations, so that we could easily start documenting things like Theia startup times and whatnot.
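As a rough idea of how little is needed to collect scope durations without any of the libraries above, here is a minimal sketch. `measureScope`, `ScopeTiming`, and `recordedTimings` are hypothetical names made up for this example, and the clock is injectable only so the helper can be checked deterministically.

```typescript
// Sketch only: all names below are hypothetical, not an existing Theia API.

interface ScopeTiming {
    scope: string;      // label of the measured scope
    durationMs: number; // elapsed time in milliseconds
}

const recordedTimings: ScopeTiming[] = [];

/**
 * Runs `work`, records how long it took under `scope`, and returns its result.
 * The duration is recorded even if `work` throws.
 */
function measureScope<T>(scope: string, work: () => T, now: () => number = Date.now): T {
    const start = now();
    try {
        return work();
    } finally {
        recordedTimings.push({ scope, durationMs: now() - start });
    }
}

// Sample usage: time a small synchronous computation.
const sumResult = measureScope("sum-to-1000", () => {
    let acc = 0;
    for (let i = 1; i <= 1000; i++) {
        acc += i;
    }
    return acc;
});
```

The recorded entries could then be logged or dumped to a file and compared across builds. Asynchronous scopes would need a promise-aware variant, which this sketch does not cover.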

Again, I believe we should clearly define what we want to do with this, @marcdumais-work. Do we want performance regression testing? Do we want to generate CTF traces from the backend? Do we want to provide an API for extension developers to trace their extensions? In that case we don't want to clash with the current bunyan logger, as that may already be more than enough to test things. We might even be able to do performance regression testing with bunyan logs alone (although I'm not sure how accurate it would be).

So basically we could use prismjs. It seems like this would greatly reduce the complexity, as we might not have to supply grammars like with textmate. Did I get that right? Let me try to prototype this.

@svenefftinge the only issue I see with using prismjs is that we cannot really have an extension add support for a language. Looking at the download page, it seems you have to bundle core plus the language grammars/definitions you're interested in. Maybe we could bundle core in one extension and the definitions in other extensions, but I'm not sure yet. Otherwise we could select them all (thus adding a ~250kb .js to the frontend's initial load). That might be okay too.

Edit: Never mind, it seems that we can define our own definitions. So we can probably do that per extension.

Update: It seems even prismjs wouldn't provide much more than the basic monaco editor coloring. vscode-textmate really sounds like the appropriate solution. For example:

let fn = () => {

}

fn();

In vscode, both the fn declaration and the call would be colored the same, as there is a state machine that checks whether the token appears again later in the file. prismjs, on the other hand, seems to only check with basic regexes, i.e. it doesn't know that fn declared like this is a function, so it doesn't color the declaration the same way as the call fn().
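To illustrate the limitation, here is a toy regex-only classifier applied to the snippet above. This is not Prism's actual grammar, just a sketch of the general approach under the assumption that an identifier is called a function only when it is directly followed by an opening parenthesis. `classifyIdentifiers` and the token kinds are made-up names.

```typescript
// Toy example only: this is NOT Prism's real grammar or API.

type TokenKind = "keyword" | "function" | "identifier";

interface ToyToken {
    text: string;
    kind: TokenKind;
    offset: number;
}

const TOY_KEYWORDS = new Set(["let", "const", "var", "function", "return"]);

/** Classifies each identifier by looking only at the text right after it. */
function classifyIdentifiers(source: string): ToyToken[] {
    const tokens: ToyToken[] = [];
    const identifierPattern = /[A-Za-z_$][\w$]*/g;
    let match: RegExpExecArray | null;
    while ((match = identifierPattern.exec(source)) !== null) {
        const text = match[0];
        const rest = source.slice(match.index + text.length);
        let kind: TokenKind = "identifier";
        if (TOY_KEYWORDS.has(text)) {
            kind = "keyword";
        } else if (/^\s*\(/.test(rest)) {
            // Purely local rule: "followed by (" means function call.
            kind = "function";
        }
        tokens.push({ text, kind, offset: match.index });
    }
    return tokens;
}

const sampleSource = "let fn = () => {\n\n}\n\nfn();";
const toyTokens = classifyIdentifiers(sampleSource);
// The first `fn` (declaration) is classified as "identifier",
// while the second `fn` (call) is classified as "function".
```

Because the rule only looks at the immediate context, the two occurrences of fn end up with different kinds, which is the inconsistency described above.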

For this reason, I believe we should really use vscode-textmate. The only catch is that so far it can only run on node because of its dependency on oniguruma, which is a native node library for regexes, so we would need a json-rpc ws communication to get syntax classes from the backend (or tokens, if we process the classes in the frontend). The problem with the POC from Red Hat is that it's a lot of copy-pasted code from vscode (which is fine for a POC), but I think we should start with something much simpler for now, while still keeping a similar architecture of having vscode-textmate in the backend.

This is the working branch with prismjs if you want to see it for yourself

@epatpol true, it's not perfect. But it's still at least an improvement over what we have today. Do you have an idea how prismjs and vscode-textmate might compare, performance-wise? What about the range of supported languages?

I tried your branch. Overall, I like the result for the light theme. But I noticed a problem when using the dark theme, that makes some characters difficult to read: for example: < > = -

Java, dark theme. Left: master, right: this PR:
image

With the light theme it's better:

Java, light theme. Left: master, right: this PR:
image

C++, light theme. Left: master, right: this PR:
image

TypeScript, light theme. Left: master, right: this PR:
image

Go, light theme. Left: master, right: this PR (one has an LS, the other doesn't; unrelated to the PR):
image

I wonder if we could tweak prismjs so that it matches the default colors we are already used to, e.g. comments in green instead of gray?

@simark opinions about this?

We should probably use the same tokens as monaco to keep the styling consistent with the theme, I think. That way we probably wouldn't need the prism css file. However, @elaihau mentioned that the tokens generated in vscode (mtk1, mtk2, mtk3) seem to be dynamic per language, so it may be a bit more complex to do.

@marcdumais-work I'm surprised Go worked though, as it seems I only included some basic language grammars in the prism.js file here.

@svenefftinge WDYT? Should we go with prismjs frontend coloring (which seems to offer just a bit more than the basic monaco one), or do as the Red Hat POC does (rewritten from scratch in a simpler way, imo) and use the vscode-textmate library in the backend? The advantage of the latter is that it's a bit smarter. It can't run in the browser, however, so we would really have to use websockets for this (which might be fine).

We took another look with @epatpol. One issue is that the theme used doesn't do full justice to the prismjs engine. Sometimes the engine recognizes some words as a particular token, but the theme colours multiple tokens with the same colour (whereas vscode doesn't), which makes it look like prismjs didn't understand a particular construct. So we can improve the prismjs experience a little by tweaking the theme. But prismjs seems to work just by recognizing some predefined keywords and applying some regexes. It will never be as good as something that actually parses the program and colours things based on semantics.

Here's a comparison of:

  • Monaco colouring
  • PrismJS with some tweaking
  • VScode (vscode-textmate)

[Three screenshots: base, after, vscode]

I'd say that vscode-textmate > PrismJS > Monaco. Even vscode-textmate is not perfect: I think it would make sense to make "callback3" yellow and to make "MyClass" at the bottom the same colour as the other instances of "MyClass".

As @epatpol pointed out, the problem of vscode-textmate is its dependency on node, which means we have to run it on the backend. The prototype transfers the whole file every time we want to run the colourizer, which is not ideal. We could implement some incremental syncing to avoid transferring the whole file. But then it looks like we are just re-implementing a language server...
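For reference, the incremental syncing mentioned here could be as simple as sending offset-based edits and applying them to a backend-side copy of the document. The sketch below assumes that approach. `OffsetEdit` and `DocumentMirror` are hypothetical names, not part of the prototype or of the LSP.

```typescript
// Sketch only: OffsetEdit and DocumentMirror are hypothetical names.

interface OffsetEdit {
    start: number;   // offset of the first replaced character
    end: number;     // offset just past the last replaced character
    newText: string; // replacement text (empty string for a pure deletion)
}

/** Backend-side copy of a document, kept up to date by applying small edits. */
class DocumentMirror {
    private content: string;
    private version = 0;

    constructor(initialContent: string) {
        this.content = initialContent;
    }

    /** Applies one edit instead of receiving the whole file again. */
    applyEdit(edit: OffsetEdit): void {
        if (edit.start < 0 || edit.end < edit.start || edit.end > this.content.length) {
            throw new RangeError("edit range is outside the document");
        }
        this.content =
            this.content.slice(0, edit.start) + edit.newText + this.content.slice(edit.end);
        this.version++;
    }

    getText(): string {
        return this.content;
    }

    getVersion(): number {
        return this.version;
    }
}

// Sample usage: rename `fn` to `callback` by sending only the changed range.
const mirror = new DocumentMirror("let fn = () => {};");
mirror.applyEdit({ start: 4, end: 6, newText: "callback" });
```

This is essentially what a language server already does with incremental document changes, which is the point made above about re-implementing one.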

So, if we must go to the backend anyway to get good semantic colouring, we might as well query the language server. Given that we are planning to work on adding support for semantic colouring to the LSP this year anyway, maybe it would just make sense to go that route.

Have you considered wasm-oniguruma? Just asking. I don't know how it performs.

"So, if we must go to the backend anyway to get good semantic colouring, we might as well query the language server. Given that we are planning to work on adding support for semantic colouring to the LSP this year anyway, maybe it would just make sense to go that route."
+1 :-)

"Have you considered wasm-oniguruma? Just asking. I don't know how it performs."

Nope I had not seen that, thanks!

And yet another attempt to bring oniguruma to the browser: https://github.com/NeekSandhu/onigasm

Should we add this topic to next week's meeting agenda?

"So, if we must go to the backend anyway to get good semantic colouring, we might as well query the language server. Given that we are planning to work on adding support for semantic colouring to the LSP this year anyway, maybe it would just make sense to go that route."

I agree that for languages we deeply care about, that's the way forward. It will be more work, but worth it. However, something like PrismJS could still be useful for other languages that may not get an LS that supports semantic coloring anytime soon.

"So, if we must go to the backend anyway to get good semantic colouring, we might as well query the language server. Given that we are planning to work on adding support for semantic colouring to the LSP this year anyway, maybe it would just make sense to go that route."

We need both. Semantic coloring is always a little bit slower, because it needs to wait for parsing and linking. So what IDEs like Eclipse or IntelliJ do is have a fast lexical strategy that is triggered as you type and a slower semantic coloring that updates with some latency (usually not on every keystroke, but when the user stops typing). Together with position tracking, i.e. color annotations that move with the text as you modify it, this provides a good experience.

VS Code, however, doesn't do any semantic coloring and relies solely on fancy syntactic coloring (textmate). This seems to be a good-enough experience for many developers, although if you look closer there are many 'incorrect' colors applied. But it's colorful :-)
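The position-tracking part can be sketched as follows: between two semantic passes, existing color annotations are shifted so they keep pointing at the same text. This is only an illustration of the idea, not how Eclipse or IntelliJ implement it. `ColorAnnotation` and `shiftAnnotations` are hypothetical names, and dropping overlapping annotations is one possible policy among others.

```typescript
// Sketch only: ColorAnnotation and shiftAnnotations are hypothetical names.

interface ColorAnnotation {
    start: number;      // offset of the first coloured character
    end: number;        // offset just past the last coloured character
    tokenClass: string; // e.g. "variable" or "function"
}

/**
 * Adjusts annotations after `removedLength` characters at `editOffset`
 * were replaced by `insertedLength` characters.
 * Annotations before the edit are kept, annotations after it are shifted,
 * and annotations overlapping it are dropped until the next semantic pass.
 */
function shiftAnnotations(
    annotations: ColorAnnotation[],
    editOffset: number,
    removedLength: number,
    insertedLength: number
): ColorAnnotation[] {
    const editEnd = editOffset + removedLength;
    const delta = insertedLength - removedLength;
    const shifted: ColorAnnotation[] = [];
    for (const annotation of annotations) {
        if (annotation.end <= editOffset) {
            shifted.push(annotation);
        } else if (annotation.start >= editEnd) {
            shifted.push({
                ...annotation,
                start: annotation.start + delta,
                end: annotation.end + delta,
            });
        }
        // Otherwise the annotation overlaps the edit and is dropped.
    }
    return shifted;
}

// Sample usage on "let fn = 1; fn();": insert 3 characters at offset 8.
const semanticAnnotations: ColorAnnotation[] = [
    { start: 4, end: 6, tokenClass: "variable" },
    { start: 12, end: 14, tokenClass: "function" },
];
const afterInsert = shiftAnnotations(semanticAnnotations, 8, 0, 3);
```

With this in place, the slower semantic pass only has to replace the annotation list when it finishes, while the fast lexical pass keeps running on every keystroke.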

We definitely need a lexical coloring pass that runs as fast as possible after the user types. I don't think a server roundtrip is acceptable (or needed) here.

"We definitely need a lexical coloring pass that runs as fast as possible after the user types. I don't think a server roundtrip is acceptable (or needed) here."

That's what I was wondering. In a case like this, prismjs might be more acceptable than communicating with the backend when coloring has to be as fast as possible.

"That's what I was wondering. In a case like this, prismjs might be more acceptable than communicating with the backend when coloring has to be as fast as possible."

The question is: we already have some simple syntax highlighting for free in monaco, that kind of covers this part of the field. Is it worth introducing another library to cover approximately the same part?

Good question. I think we should aim for textmate support, not only because of what it can do, but because of the amount of existing grammars. So ideally someone looks into running vscode-textmate with e.g. https://github.com/NeekSandhu/onigasm.

It doesn't look that easy to do afaik: https://github.com/Microsoft/monaco-editor/issues/171

Seems like the simplest is really a backend communication with vscode-textmate.

"It doesn't look that easy to do afaik: Microsoft/monaco-editor#171"

I don't see any new information that hasn't been discussed before.
This looks promising, too: https://github.com/Microsoft/vscode-textmate/pull/44

"I don't see any new information that hasn't been discussed before.
This looks promising, too: Microsoft/vscode-textmate#44"

Indeed, I can try using this PR in the frontend as a POC for now, thanks!

Closing, because April is over.
