size, bloat, install
The typescript package is large, and it is only getting larger.
Version 3.1.3 is a whopping 40MB.
TypeScript is used in many contexts.
A TypeScript formatter (e.g. prettier) does not need an entire compiler. It only needs a parser. And a 45MB scripted parser is orders of magnitude larger than one would normally expect. (For reference, the installed npm package for Esprima -- the most compatible and compliant ES parser in the ecosystem -- is a mere 0.3MB.)
Optionally, there could be separate packages for typescript-config and typescript-i18n.
There is a lot of code duplication between
Don't duplicate the code.
My suggestion meets these guidelines:
Some backstory in #23339.
Interesting reading, thanks.
TypeScript [2.9.0] has doubled in size since v2.0.0 - now 35 MB
It was "fixed" by #25901, released in 3.1.1, which was 40MB. :slightly_frowning_face:
It won't be hard at all to shrink the package size. For example, lib/tsserver.js and lib/tsserverlibrary.js are 98% identical.
$ du -b node_modules/typescript/lib/tsserver.js node_modules/typescript/lib/tsserverlibrary.js
7290127 node_modules/typescript/lib/tsserver.js
7308140 node_modules/typescript/lib/tsserverlibrary.js
$ comm -12 <(sort node_modules/typescript/lib/tsserver.js) <(sort node_modules/typescript/lib/tsserverlibrary.js) | wc -c
7207205
And 99% of lib/typescript.js is identical to those.
$ du -b node_modules/typescript/lib/typescript.js
6859801 node_modules/typescript/lib/typescript.js
$ comm -12 <(sort node_modules/typescript/lib/tsserver.js) <(sort node_modules/typescript/lib/typescript.js) | wc -c
6850490
And lib/typescriptServices.js is byte-for-byte identical to that.
$ sha1sum node_modules/typescript/lib/typescript.js node_modules/typescript/lib/typescriptServices.js
0cff9734eba3d721a7ba3c72026e16f267610e24 node_modules/typescript/lib/typescript.js
0cff9734eba3d721a7ba3c72026e16f267610e24 node_modules/typescript/lib/typescriptServices.js
And 99% of lib/typingsInstaller.js is identical to that.
$ du -b node_modules/typescript/lib/typingsInstaller.js
5285788 node_modules/typescript/lib/typingsInstaller.js
$ comm -12 <(sort node_modules/typescript/lib/typingsInstaller.js) <(sort node_modules/typescript/lib/typescriptServices.js) | wc -c
5246999
And 80% of lib/tsc.js is identical to that.
$ du -b node_modules/typescript/lib/tsc.js
3912404 node_modules/typescript/lib/tsc.js
$ comm -12 <(sort node_modules/typescript/lib/typingsInstaller.js) <(sort node_modules/typescript/lib/tsc.js) | wc -c
3219205
That's nearly 30MB of duplication just in those few files (and this doesn't even include declaration files).
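As a rough cross-check of that figure, here is a minimal TypeScript sketch; it is not taken from the TypeScript build. `sharedLineChars` is a hypothetical helper that mimics `comm -12 <(sort a) <(sort b) | wc -c` by counting the lines present in both inputs (with multiplicity). It counts UTF-16 characters rather than bytes, so it matches `wc -c` only for ASCII text. The tally at the end adds up the overlap numbers quoted above, counting typescriptServices.js in full because it is byte-for-byte identical to typescript.js. That is one plausible way to reach the total, assumed here rather than confirmed by the author.

```typescript
// Split text into lines, ignoring the empty element left by a trailing newline.
function toLines(text: string): string[] {
  const lines = text.split("\n");
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return lines;
}

// Hypothetical helper: size of the lines common to both texts, as a multiset
// intersection (the same pairing that `comm -12` performs on sorted input).
function sharedLineChars(a: string, b: string): number {
  const remaining = new Map<string, number>();
  for (const line of toLines(b)) {
    remaining.set(line, (remaining.get(line) ?? 0) + 1);
  }
  let shared = 0;
  for (const line of toLines(a)) {
    const count = remaining.get(line) ?? 0;
    if (count > 0) {
      remaining.set(line, count - 1);
      shared += line.length + 1; // +1 for the newline that `wc -c` also counts
    }
  }
  return shared;
}

// Overlap figures copied from the command output quoted in this thread.
const overlaps: Array<[string, number]> = [
  ["tsserver.js vs tsserverlibrary.js", 7207205],
  ["tsserver.js vs typescript.js", 6850490],
  ["typescript.js vs typescriptServices.js (identical)", 6859801],
  ["typingsInstaller.js vs typescriptServices.js", 5246999],
  ["typingsInstaller.js vs tsc.js", 3219205],
];

const duplicatedBytes = overlaps.reduce((sum, [, bytes]) => sum + bytes, 0);
const duplicatedMB = (duplicatedBytes / 1e6).toFixed(1);
// duplicatedBytes is 29383700, i.e. roughly 29.4 MB
```

Counting lines rather than diffing is crude (reordered code still counts as shared), but it is enough to show the order of magnitude.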
I can't begin to guess at the kinds of design decisions that produce this (or what kind of compatibilities the TS team needs to support), but I trust there is a solution the maintainers would be happy with.
> I can't begin to guess at the kinds of design decisions that produce this (or what kind of compatibilities the TS team needs to support), but I trust there is a solution the maintainers would be happy with.
It's done this way so that every file can be used by itself without having to deal with the nastiness of modules in JavaScript. Every file is a functional library/program in itself. I think that is a great thing, at the cost of some disk space.
Reading the linked issue #23339, it appears that the desire is in fact to (eventually) use modules.
https://github.com/Microsoft/TypeScript/issues/23339#issuecomment-380632662
> If we used modules, we'd be able to share each file and avoid this duplication

> it is something we want to do, but no plans for the short term. that is where the majority of savings would come from.
> nastiness of modules in JavaScript
ES module systems in general can be hit-and-miss, but reminder that we're talking specifically about an _npm package._
npm, npm packages, node_modules, package.json, etc. are related to Node.js (or clones), which supports CommonJS. Right?
I have two ideas, but I am not sure which one is better.
1. Split code in the source code.
For now, some common utils or helpers are shared between different components. We could split them by function, e.g. utils.ts -> utils.ts (common), utils.factory.ts (depends on factory), utils.emitter.ts (depends on emitter), etc.
If you want only the factory or the emitter, just create a tsconfig.json file that includes the files it depends on.
2. Analyze and transform the bundled file.
The namespace has been compiled to many IIFEs, each with the namespace instance injected.
We could compile with target esnext and merge those IIFEs, then transform ts.xxx = xxx to export xxx, and then pack them as a normal ESM project and tree-shake it.
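To make the second idea concrete, here is a toy sketch of the `ts.xxx = xxx` to `export` rewrite. It is regex-based and purely illustrative; `namespaceAssignmentsToExports` is a hypothetical name, and a real transform would presumably work on the AST and would first have to merge the IIFEs, since `export` is only valid at the top level of a module.

```typescript
// Hypothetical helper: rewrite lines of the form `ts.name = name;` into
// `export { name };`. Lines where the two names differ are left untouched.
function namespaceAssignmentsToExports(source: string): string {
  // \2 is a backreference, so the right-hand side must repeat the same identifier.
  const pattern = /^([ \t]*)ts\.([A-Za-z_$][\w$]*) = \2;$/gm;
  return source.replace(pattern, "$1export { $2 };");
}

const bundled = [
  "function createNode() {}",
  "ts.createNode = createNode;",
  "ts.other = something;",
  "",
].join("\n");

const rewritten = namespaceAssignmentsToExports(bundled);
```

Only the second line of `bundled` changes; the `ts.other = something;` assignment is kept as-is because it is not a plain re-export of a same-named binding.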
ping @DanielRosenwasser
What do you think about that🧐?
I am skeptical that tree-shaking is useful for shipping our own package because presumably everything we ship is used in some capacity, or is part of our public API - at which point, our consumers would actually be the ones winning from tree-shaking.
Splitting source on its own can help, but practically speaking the larger components like services and TSServer will need the entire core compiler.
I think that converting to modules is the most practical and obvious way to avoid duplicating most of the contents of tsc.js 3+ times.
A simpler solution, inspired by Busybox:
Combine N near-duplicate files into 1 polymorphic file that can do N things based on a parameter passed in.
It would introduce a performance overhead of parsing a tiny % of unnecessary JS code, but can make the tool integration story way simpler. Maybe worth it?
One trivial way to know which feature is expected would be to directly copy the Busybox approach: symlink all the duplicate files and differentiate at runtime based on the __filename. That saves disk space, package size, and bandwidth. There are more interesting options too.
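A minimal sketch of that filename-based dispatch, under some assumptions: `KNOWN_ENTRIES`, `Entry` and `entryFromFilename` are hypothetical names, and the list of entries is just the set of files measured earlier in this thread. The path is passed in as a parameter so the function stays testable; a real entry point would pass `__filename` or similar. One caveat worth checking before relying on this: as far as I know, Node resolves symlinks to their real path when loading modules unless flags such as `--preserve-symlinks` or `--preserve-symlinks-main` are used, so `__filename` may not reflect the symlink name.

```typescript
// Hypothetical list of the "personalities" one combined file could take on.
const KNOWN_ENTRIES = [
  "tsc",
  "tsserver",
  "tsserverlibrary",
  "typescript",
  "typescriptServices",
  "typingsInstaller",
] as const;

type Entry = typeof KNOWN_ENTRIES[number];

// Pick the behavior from the name the file was invoked under, Busybox-style.
function entryFromFilename(filename: string): Entry | undefined {
  // Handle both POSIX and Windows separators without importing "path".
  const lastSeparator = Math.max(
    filename.lastIndexOf("/"),
    filename.lastIndexOf("\\"),
  );
  const base = filename.slice(lastSeparator + 1).replace(/\.js$/, "");
  return KNOWN_ENTRIES.find((entry) => entry === base);
}

const picked = entryFromFilename("node_modules/typescript/lib/tsc.js");
// picked is "tsc"; an unrecognized name yields undefined
```

Returning `undefined` for unknown names leaves it to the caller to decide whether to fall back to a default behavior or fail loudly.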
From speaking with @RyanCavanaugh, it sounded like @orta was interested in working on this.
+1 for splitting up typescript into multiple packages. One major benefit would be that these individual packages (other than the "typescript" package) could use semantic versioning on at least their APIs, so other libraries could just depend on the packages they need. Right now it's kind of a pain to maintain a library that has a peer dependency on the typescript package (without being super strict about the supported version).
Yeah, I'm chatting with folks internally this week, but my goal is roughly:
- typescript: be the same as right now (as removing things would break the world), which provides all tooling

Then have subset packages which are smaller and focused on a specific task:
- @typescript/tsc: for folks who are just doing compilation (e.g. tsc compiles on the server, prettier for the AST)
- @typescript/services: for folks building dev tools like monaco-typescript, or executeprogram etc

I doubt I can offer any useful semver on them, as they link to the main TS version. That'd need the API to actually be classed as "stable" which doesn't look like that's happening soon.
Figuring out how/if we can reduce the main "typescript" is hopefully something I can get an idea about during ^
Removing tools from the package doesn't reduce overall size. Compilation, dev tools etc reuse a lot of the same code that is now copied to multiple commands without changes. The issue is how to share the very duplicated part between the tools, reduce the duplication, or pack the tools into one bundle.
> Yeah, I'm chatting with folks internally this week, but my goal is roughly
Oh, we're generally for it (and have been for years, provided we still provide a services bundle for our (browser) consumers who use it) - we just need an _automated_ way to remap the current namespace-based code layout into modules, this way we can keep a PR doing the migration up to date and not stop development on other things. I have a branch from two years ago that migrated all of src/compiler to modules (by hand) - checker.ts had something like 100 lines of imports on it. And that took quite a while to make. That gave some of us some pause and reduced enthusiasm, but... I'm hoping the final result is still seen as worth it.
With respect to said automation, I _think_ we could probably write a kind of codemod for it using the APIs we have today, but nobody's put in the effort yet.
@orta VS Code is very interested in this work. Right now we consume TypeScript in two ways:
- tsserver.js: used by our JS/TS extension
- typescript.js: used by our html extension

Each of those files is around 8MB on disk. Additionally, we are interested in shipping built-in support for tsc (tsc.js), but that's another 4.5MB and that's difficult for me to justify. It seems to me like all these various TypeScript components should be able to share a lot of code.
Let me know if you would like any additional info about how VS Code consumes TS
As a side note, typingsInstaller.js is pretty huge too (6MB)!! Does it pull in a lot of stuff from TS core?
I brought this up during the most recent design meeting - https://github.com/microsoft/TypeScript/issues/34899
Where the end result was basically, we're meeting about trying to get modules happening again
As mentioned above - all of these files are _basically_ the same but with a bit of flavor difference because they represent different sets of the compiler + services - for example _I think_ you can probably use tsserverlibrary for both the html + JS/TS cases in vscode, but tsc.js doesn't look like it lives in there.
https://github.com/microsoft/TypeScript/pull/35561 is looking like the answer to this, I'll keep my eye on PR to see how things change
> I am skeptical that tree-shaking is useful for shipping our own package because presumably everything we ship is used in some capacity, or is part of our public API
If it's not much effort to add into the build, this could still be a worthy goal. There are a few consumers, like Prettier and the new VS Code JS debugger extension, who ship TypeScript in a bundled form. It would double the size on disk if you shipped both ESM and CommonJS in a single package--maybe it could be a separate set of /typescript.*-esm/ packages?
@connor4312 You'll be able to give it a shot when it's migrated, I'm just saying to temper expectations about the savings you'll see.
npm install [email protected] results in a 60MB node_modules on my Mac (56MB of which is typescript itself). Typescript is by far the largest module in our stack (and we have 146 explicit deps in package.json) – would love to see some reduction here 🙏
Yup. This is the second largest module in my stack. [email protected] is taking up 52M on disk. That's fine for prod, since people typically don't ship typescript in images but only the transpiled js files; still, a reduction in size can impact the dev env significantly.