Mdx: Babel or estree?

Created on 15 Dec 2020  ·  6 Comments  ·  Source: mdx-js/mdx

Subject of the discussion

With https://github.com/mdx-js/mdx/pull/1382, we now have a JavaScript syntax tree.

The tree starts out as estree: because markdown and mdx.js are parsed simultaneously, I needed a JavaScript parser in micromark-extension-mdxjs, and I chose a small and fast one, acorn, which produces estree.
Acorn is small, 30kb minzipped. acorn-jsx is 4kb. astring (a generator) is also 4kb.

Previously, in this project, we used Babel for plugins.
Babel is giant. @babel/core, which has methods to run Babel plugins, is about 220kb minzipped. @babel/generator is 63kb. @babel/parser is 60kb. @babel/traverse is 165kb (it includes both the parser and the generator).

Estree has the drawback of being a fragmented ecosystem: there are no nice parsers that support comments; there are no tree-walkers or compilers that support JSX. And importantly, as we use JSX, we’d want to turn JSX into function calls (React/preact/vue), but those are all Babel plugins. We could use estree but then users would still need to run Babel afterwards.

Babel has the drawback of being giant and slow. But the good thing is that the JSX -> JS compilers all live there.

Problem

What should we go with?
We can’t turn JSX -> JS unless we’re using Babel (well, we could: the Babel plugin to turn JSX -> _jsx() / React.createElement is about 800 lines). Most users probably want to use Babel plugins to turn their fancy features into whatever.
An estree-only system as a base for MDX would be ✨✨✨. @mdx-js/runtime is now 350kb minzipped. That could go down to 100kb or less?

💬 type/discussion 🥂 status/merged

All 6 comments

Estree has the drawback of being a fragmented ecosystem: there are no nice parsers that support comments; there are no tree-walkers or compilers that support JSX

ESLint's parser and walker have solid ESTree + Comment + JSX support
https://github.com/eslint/espree
https://github.com/eslint/eslint-visitor-keys

Prettier has espree with Comment + JSX support for code gen https://github.com/prettier/prettier/blob/902d524d2f1776efe0b110c1a24813d4d7fcb9d0/src/language-js/printer-estree.js
escodegen is close to having ESTree + JSX support https://github.com/estools/escodegen/pull/391

I'm coming from the perspective of personally using MDX more as a build tool than as a runtime component, and I like using both proposals and TypeScript features.
I'm drawn more towards Babel: the ability to parse new syntax, the option to support TypeScript syntax, and the broad support for Babel within Node/JavaScript tools are all draws.
Because I mostly use it as a build tool, bundle size is less of a priority for me.

If we have to pick just one, I'd lean babel.

That being said, do we need to pick just one?
Could the JavaScript parsing strategy be made pluggable?
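As a rough illustration of what "pluggable" could mean here, the sketch below passes the parser in as an option and only checks that it returns an estree `Program`. Everything in it is hypothetical: `createProcessor`, `parseJs`, and `tinyIdentifierParser` are invented names for illustration, not MDX APIs, and the stand-in parser exists only so the example runs without dependencies.

```javascript
// Hypothetical factory: the JavaScript parser is supplied by the caller.
// `options.parse` is assumed to be a function `(source) => Program node`.
function createProcessor(options = {}) {
  const parse = options.parse

  if (typeof parse !== 'function') {
    throw new TypeError('Expected `options.parse` to be a function')
  }

  return {
    // Parse embedded JavaScript and verify the result looks like estree.
    parseJs(source) {
      const tree = parse(source)

      if (!tree || tree.type !== 'Program') {
        throw new Error('Expected the parser to return a `Program` node')
      }

      return tree
    }
  }
}

// Stand-in parser (an assumption, for illustration only): it accepts a single
// identifier and returns a minimal estree `Program`.
function tinyIdentifierParser(source) {
  const name = source.trim()

  if (!/^[A-Za-z_$][\w$]*$/.test(name)) {
    throw new SyntaxError('This stand-in parser only handles one identifier')
  }

  return {
    type: 'Program',
    sourceType: 'module',
    body: [
      {type: 'ExpressionStatement', expression: {type: 'Identifier', name}}
    ]
  }
}

const processor = createProcessor({parse: tinyIdentifierParser})
const program = processor.parseJs(' props ')
console.log(program.body[0].expression.name)
```

With acorn installed, the stand-in could presumably be swapped for something like `(source) => acorn.parse(source, {ecmaVersion: 2020})`; a parser that emits a different tree shape would need an adapter first.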

Another consideration, if bundle size is the primary goal: acorn may not be the smallest option. Wasm can pack smaller than JS, for example https://bundlephobia.com/result?p=@swc/core@1.2.40
and still allows for custom transforms if needed https://swc.rs/docs/usage-plugin
There are also other estree-like, JavaScript-based parsers such as https://github.com/meriyah/meriyah and https://github.com/KFlash/seafox

/cc @ChristopherBiscardi since this approach has some potential tie ins to https://github.com/mdx-js/rust


Edit (correction): bundlephobia ignores wasm. The library may be faster, but it is not smaller https://unpkg.com/browse/@swc/wasm@1.2.40/

Thanks for all this research folks! I'd lean towards something smaller than Babel but I'm not very opinionated there. There are lots of client-side usages of MDX that won't go away, and Babel is pretty huge and pretty slow in comparison to other options. Considering we're mostly only using Babel for internals we could port it away without users really needing to know the difference.

Also, with wooorm's new JSX parsing, we can drop a bunch of the internals we use and manipulate the AST directly!

@ChristianMurphy I definitely wouldn't hold up any changes here based on the work in /rust. If our priority is small, then wasm is probably not the answer at the moment. swc is what I'm planning to use for /rust's js parsing and we could invest there more in the future but it's not a solution for today's in-browser use cases IMO.

that said, swc is hella faster than babel in my experience from working with it in toast (via the Rust APIs), and will work well for node-backed stuff if we're looking for a speed boost at some point in the future (TBD, caveats apply, /rust is an experiment, etc)

ESLint's parser and walker have solid ESTree + Comment + JSX support
[...]
escodegen is close to having ESTree + JSX support [...]
— @ChristianMurphy

espree seems to be a tiny wrapper around acorn and acorn-jsx 🤔
And a year-old stalled PR is not really “close” 😅
Those visitor keys are great btw! Especially as espree is ± the same AST as acorn + acorn-jsx!
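To show why a visitor-keys table is useful, here is a minimal sketch of a tree walker driven by one. The `visitorKeys` object is a small hand-written subset for illustration (the eslint-visitor-keys package covers far more node types), and `walk` is a hypothetical helper, not an API of any package mentioned here. The tree is written by hand in the shape acorn + acorn-jsx is expected to produce for `<a href="#">{name}</a>`.

```javascript
// Hand-written subset of a visitor-keys table (an assumption for this sketch):
// node type -> fields that hold child nodes, in visiting order.
const visitorKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
  JSXElement: ['openingElement', 'children', 'closingElement'],
  JSXOpeningElement: ['name', 'attributes'],
  JSXClosingElement: ['name'],
  JSXAttribute: ['name', 'value'],
  JSXExpressionContainer: ['expression'],
  JSXIdentifier: [],
  Identifier: [],
  Literal: []
}

// Depth-first walk that calls `enter(node, parent)` for every node.
// Node types missing from the table are treated as leaves.
function walk(node, enter, parent = null) {
  enter(node, parent)

  for (const key of visitorKeys[node.type] || []) {
    const child = node[key]

    if (Array.isArray(child)) {
      for (const item of child) {
        if (item) walk(item, enter, node)
      }
    } else if (child && typeof child.type === 'string') {
      walk(child, enter, node)
    }
  }
}

// Hand-written tree for `<a href="#">{name}</a>`.
const tree = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'JSXElement',
        openingElement: {
          type: 'JSXOpeningElement',
          name: {type: 'JSXIdentifier', name: 'a'},
          attributes: [
            {
              type: 'JSXAttribute',
              name: {type: 'JSXIdentifier', name: 'href'},
              value: {type: 'Literal', value: '#'}
            }
          ],
          selfClosing: false
        },
        children: [
          {
            type: 'JSXExpressionContainer',
            expression: {type: 'Identifier', name: 'name'}
          }
        ],
        closingElement: {
          type: 'JSXClosingElement',
          name: {type: 'JSXIdentifier', name: 'a'}
        }
      }
    }
  ]
}

const seen = []
walk(tree, (node) => seen.push(node.type))
console.log(seen.join(' > '))
```

Because the walker only relies on the table, JSX support comes down to adding the JSX node types to it, which is what makes shared visitor keys attractive.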

Porting our internals from Babel to estree is not a lot of work. Three small plugins: https://github.com/mdx-js/mdx/blob/68ff02c8129e2922f48b59bf51f4b967d248f397/packages/mdx/mdx-hast-to-jsx.js#L6-L8.

For a nice JSX serializer, we could look into adding that to escodegen, astring, or whatever else is nice.
But as we’re thinking of compiling JSX away, that’s not needed. Rather, forking babel-helper-builder-react-jsx-experimental for estree seems to be the way to go (not sure about Vue though...).
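To make "compiling JSX away" on estree concrete, here is a heavily simplified sketch that turns a JSX element node into a `React.createElement(...)` call expression. It is not a port of the Babel helper: `toCall`, `toTag`, `toProps`, and `toChild` are hypothetical names, only plain element names and non-spread attributes are handled, and text is simply trimmed rather than following the real JSX whitespace rules. The input is a hand-written node for `<a href="#">{name}</a>`.

```javascript
// Small builders for plain estree nodes.
const identifier = (name) => ({type: 'Identifier', name})
const literal = (value) => ({type: 'Literal', value})

// Lowercase names become strings ('a'); other names stay references (Box).
function toTag(name) {
  if (name.type !== 'JSXIdentifier') {
    throw new Error('Only simple element names are handled in this sketch')
  }

  return /^[a-z]/.test(name.name) ? literal(name.name) : identifier(name.name)
}

// Attributes become an object expression, or `null` when there are none.
function toProps(attributes) {
  if (attributes.length === 0) return literal(null)

  const properties = attributes.map((attribute) => {
    if (attribute.type !== 'JSXAttribute') {
      throw new Error('Spread attributes are not handled in this sketch')
    }

    let value = attribute.value
    if (value == null) value = literal(true) // `<a hidden />`
    else if (value.type === 'JSXExpressionContainer') value = value.expression

    // String keys keep names such as `data-id` valid.
    return {
      type: 'Property',
      kind: 'init',
      key: literal(attribute.name.name),
      value,
      computed: false,
      method: false,
      shorthand: false
    }
  })

  return {type: 'ObjectExpression', properties}
}

// Children become extra call arguments; `null` means "drop this child".
function toChild(child) {
  if (child.type === 'JSXText') {
    const value = child.value.trim() // simplification of JSX whitespace rules
    return value ? literal(value) : null
  }

  if (child.type === 'JSXExpressionContainer') {
    return child.expression.type === 'JSXEmptyExpression' ? null : child.expression
  }

  if (child.type === 'JSXElement') return toCall(child)

  throw new Error('Unhandled child type: ' + child.type)
}

function toCall(element) {
  const opening = element.openingElement
  const args = [toTag(opening.name), toProps(opening.attributes)]

  for (const child of element.children) {
    const result = toChild(child)
    if (result) args.push(result)
  }

  return {
    type: 'CallExpression',
    callee: {
      type: 'MemberExpression',
      object: identifier('React'),
      property: identifier('createElement'),
      computed: false,
      optional: false
    },
    arguments: args,
    optional: false
  }
}

// Hand-written node for `<a href="#">{name}</a>`.
const jsxElement = {
  type: 'JSXElement',
  openingElement: {
    type: 'JSXOpeningElement',
    name: {type: 'JSXIdentifier', name: 'a'},
    attributes: [
      {
        type: 'JSXAttribute',
        name: {type: 'JSXIdentifier', name: 'href'},
        value: {type: 'Literal', value: '#'}
      }
    ],
    selfClosing: false
  },
  children: [
    {type: 'JSXExpressionContainer', expression: {type: 'Identifier', name: 'name'}}
  ],
  closingElement: {type: 'JSXClosingElement', name: {type: 'JSXIdentifier', name: 'a'}}
}

const call = toCall(jsxElement)
console.log(JSON.stringify(call.arguments.map((argument) => argument.type)))
```

The result contains no JSX nodes, so a plain estree generator such as astring should be able to serialize it, which is the point made above about not needing a JSX serializer.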
