Xterm.js: Improve testing and tracking of performance critical components

Created on 14 Sep 2018 · 7 comments · Source: xtermjs/xterm.js

Rendering performance has regressed by over 100% since 3.3.0 (https://github.com/xtermjs/xterm.js/issues/1677); we should improve how we test and track this. I'd love to hear ideas on how to go about this in a good way, but this is what I think we want:

  • A benchmark test suite which prints numbers for things like

    • Filled line render time

    • Empty line render time

    • Full viewport render time

    • Buffer write/read

    • Fill buffer

    • etc.

  • Eventually a dashboard or some way to track this periodically
Labels: area/performance · help wanted · type/debt

All 7 comments

We need an APM for this; let's stall xterm.js development for 3 years and make the performance tooling first (or gather some bucks and at least 20 highly skilled C++/JS developers to get the job done). :laughing:

There are a few profiling tools that will cover parts of your list with reliable results (especially components that can be tested in Node.js with a high-resolution synchronous timer). As soon as the browser engine gets involved, we are stuck with the nerfed timer introduced because of Spectre. Since in the end all that counts is user-perceived performance, the latter is still testable by doing "full runs" with typical actions (like my current ls benchmark) and comparing the numbers from the integrated profilers. Those numbers are less reliable though and contain noise; IMHO Chrome and Firefox use a statistical approach, peeking into the JS call stack periodically. This testing could be done in a Selenium env; maybe Electron allows additional interaction there. Last but not least, Chrome exposes many debug switches that would help with tracing tasks.

Since you wrote this issue from the canvas perf regression perspective: IMHO this is even more tricky to test in a reliable manner, since it heavily relies on system specifics like the OS and the installed GPU, and might even be driver-version dependent. Under such circumstances a "once and for all" optimal solution does not exist.

TL;DR

  • We could test core components in isolation with standard Node.js tooling.
  • We could test the end-user experience with typical actions in Selenium envs, or with Electron and the built-in profilers.
  • No clue how to get reliable numbers for GPU driven stuff, lol.

Edit: This might come in handy: https://github.com/ChromeDevTools/timeline-viewer. It even has a compare mode.

Edit 2: For in-browser tests we can use https://github.com/paulirish/automated-chrome-profiling. With this we can run test cases in Chrome and grab the profiling data. From there it's only a small step to some dashboard thingy tracking changes over time. To get something like this running, we will need decent cloud storage (the profile data tends to get really big).

Here is a proof-of-concept perf tool that gets the timeline data from Chrome: https://github.com/jerch/perf-test. To run it, edit the options in example.js to your needs, start the xterm.js demo and run the example. It talks to Chrome via the debugging protocol; I was not able to get the data via WebDriver (the timeline data was removed from the Selenium chromedriver several versions ago).

Current plan:

  1. Improve chrome-timeline https://github.com/jerch/chrome-timeline/issues
  2. Create xtermjs/xterm-benchmark, which integrates with chrome-timeline and adds features like creating a baseline to compare against and cleaning up baselines
  3. Create some reasonably reliable/consistent benchmarks
  4. Use xterm-benchmark when we're testing perf changes in PRs/versions
  5. Integrate with CI to run a baseline on the PR's base branch against the PR change and comment on the PR (only when a benchmark label is present, to reduce noise?)
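The baseline-comparison idea in steps 2 and 5 boils down to a simple check: flag any case whose current timing regresses past a tolerance. A sketch, with illustrative names only (not the actual xterm-benchmark API):

```javascript
// Baseline comparison sketch: given {caseName: ms} maps for a baseline
// and a current run, return the cases that regressed beyond tolerance.
function compareToBaseline(baseline, current, tolerancePct = 10) {
  const regressions = [];
  for (const [name, baseMs] of Object.entries(baseline)) {
    const curMs = current[name];
    if (curMs === undefined) continue; // case missing from current run
    const deltaPct = ((curMs - baseMs) / baseMs) * 100;
    if (deltaPct > tolerancePct) {
      regressions.push({ name, baseMs, curMs, deltaPct });
    }
  }
  return regressions;
}

// Example: 12 ms vs. a 10 ms baseline is a 20% regression,
// which exceeds the default 10% tolerance.
const regs = compareToBaseline({ 'filled line': 10 }, { 'filled line': 12 });
console.log(regs);
```

A CI job would run the suite on the base branch, run it again on the PR head, and post `regressions` (if any) as a PR comment.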

@Tyriar
chrome-timeline should now work from npm. There are a few changes:

  • timeline now returns summaries for traces
  • by default, trace data is not written to disk; this can be changed via tracingStartOptions or tracingEndOptions
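For a rough idea of what such a trace summary involves: Chrome trace dumps are arrays of events with a name, a phase, and durations in microseconds, and a summary aggregates them per event name. The event shape below matches the Chrome trace format; the summary shape is my assumption, not chrome-timeline's actual output:

```javascript
// Illustrative trace-summary sketch: aggregate complete ('X' phase)
// Chrome trace events into total duration (ms) per event name.
// The output format is hypothetical, not chrome-timeline's real API.
function summarizeTrace(events) {
  const totals = {};
  for (const e of events) {
    if (e.ph !== 'X' || typeof e.dur !== 'number') continue; // complete events only
    totals[e.name] = (totals[e.name] || 0) + e.dur / 1000; // µs -> ms
  }
  return totals;
}

// Example with synthetic events: two Paint events of 2 ms and 1 ms.
const totals = summarizeTrace([
  { name: 'Paint', ph: 'X', ts: 0, dur: 2000 },
  { name: 'Paint', ph: 'X', ts: 5000, dur: 1000 },
  { name: 'FunctionCall', ph: 'B', ts: 0 }, // begin event, ignored here
]);
console.log(totals); // { Paint: 3 }
```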

@Tyriar

Off-topic: I already found a rather big perf regression in the parser. Remember those numbers here: https://github.com/xtermjs/xterm.js/pull/1399#issuecomment-386107171 - print has dropped to 50 MB/s :scream: . Others also dropped, but only slightly. Not sure yet what causes it; IMHO only small fixes were made to the code after those numbers were taken.

Which leads to a more on-topic question: I have those benchmark data files and scripts from the parser (also used to get the numbers here: https://github.com/xtermjs/xterm.js/pull/1731#issuecomment-428696799). I think we can use them for some first systematic perf regression testing. But where to put them? Into xterm-benchmark? Some subfolder in xterm.js for now, until we get xterm-benchmark properly set up and integrated? What are your plans with xterm-benchmark?

To get the ball rolling a few ideas from my side:

  • similar layout to test cases:
    IMHO the perf case layout should work like test cases: perf cases live in dedicated perf files and use helpers exported by xterm-benchmark to make writing them easier (pretty much like mocha/jasmine do).
  • provide a cmdline interface:
    Maybe less important for a start, but xterm-benchmark itself could provide a cmdline interface to run those perf files and do its magic (like tracking stats over several branches and reporting regressions).
  • data storage:
    To spot regressions, xterm-benchmark would also need some persistent storage to aggregate the data and compare it with previous runs. The easiest way is IMHO the common pattern of a dedicated subfolder in the source repo. Not sure yet how to store the data efficiently; maybe some JSON files will do. I don't want to pull in DB stuff from the beginning.
  • no test case mixing:
    When writing chrome-timeline, my initial goal was a nice integration with mocha test cases. Well, that's a bad idea: the debug settings are likely to skew the performance numbers. At least worth a note to not mix test and perf cases (unless someone really wants to test the debugging performance, lol).

Made some progress: https://github.com/xtermjs/xterm-benchmark

  • basic working cli
  • mocha like perf case file creation with preparation and cleanup functions
  • extensible perf case classes via mixins
  • baseline creation
  • eval runs against a given baseline with automated tests
  • configurable tolerance settings for eval runs

There are a few early xterm.js tests. Those are currently hard-linked against an existing xterm.js checkout (just check out the repo next to the xterm.js repo folder).
Run the cli by:

#> cd xterm-benchmark
#> npm install
#> node lib/cli.js --help

For more, see the https://github.com/xtermjs/xterm-benchmark/blob/master/README.md.
It's still pre-alpha, so don't expect everything to work as intended.
Enjoy :smile_cat:

We've done lots of work on this and can now run benchmarks via npm scripts.
