JS sources seem be shipped as is, converted into a .cc file. They could be shipped minified though. Has there been previous discussions around this?
Obviously publicly visible identifiers must not change. Advantages are faster scanning, lower shipping binary size, and slightly lower memory use due to shorter identifiers. Disadvantage would be worse debuggability, but solvable by making this an option or providing source maps; and different inlining decisions with Crankshaft, which is going to be the past soon though.
V8 does minify its source in a step in the js2c build target. The minifier script is kind of bad though, as it uses regexes for parsing. It serves its purpose okayish, but sometimes parses incorrectly, resulting in confusing bugs.
Another option is to not ship the JS sources at all, but the precompiled bytecode. This likely increases the binary size though. V8's uses about 4x memory in the startup snapshot for precompiled bytecode vs equivalent minified source. That would also increase GC pressure.
I have a faint recollection about this from two years ago, but couldn't find pointers (@indutny?)
I would add an optimizer step (de-deopter).
+11 for source-maps.
I've often considered this but have yet to form a solid opinion on it. The one major immediate impact it would have is on error stack traces, which would force it to be a semver-major change. I'm definitely open to exploring it if it can be done in a way that minimizes any potential for breaking changes.
I'm leaning towards -1. Being able to look up the line number from a stack trace in the original source without any other tooling/resources is important to me.
FWIW I think this should not actually be labelled with "V8 Engine" since the files I would consider for minifying are in lib/
.
But it's the closest? And it's V8 dependent?
I'm leaning towards -1. Being able to look up the line number from a stack trace in the original source without any other tooling/resources is important to me.
@mscdex I partially agree, since in the bigger picture way more stack traces are created for userland code. Internal frames are almost useless to anyone but us. Also it might be solvable? (Easiest/fastest solution is to exclude from debug builds, or even make it flagable)
Internal frames are almost useless to anyone but us.
I disagree. They are useful for everyone.
Everyone? I'm sure you mean that we differ on the ratio of people who they are interested them 😉
From my experience with asynctrace
(my own longjohn
ancestor) the most common request was to filter those out...
FWIW, there's my somewhat related February PR (https://github.com/nodejs/node/pull/11417), but I'm not really sure at this point that it is a good idea, not to mention minifying the sources completely.
Advantages are faster scanning, lower shipping binary size, and slightly lower memory use due to shorter identifiers.
^--- No none of those are an advantage until the values of those numbers are provided.
Disadvantages are obvious. In my experience, no one wants to see node internal line numbers... until they have a problem that leads to node internals, then they can't live without them.
I would suggest:
Although I think that is we have a --no-minify
flag for people to use for debug traces, it's a decent fallback.
@hashseed do you know if source-maps are supported by the Stack-Trace-API
loc: MaybeHandle<Object> ErrorUtils::FormatStackTrace
Worse case we could use Error.prepareStackTrace
No, and I would prefer not using prepareStackTrace
for this. I could imagine a simple C++ callback that provides the source offset and source URL and expects a line/column pair in return.
(everyone has their own favorite 🛠️) Where would you hook it to?
Could also return a triplet with func name
.
@hashseed @nodejs/v8 Do you have any thoughts on adding source map support to V8 itself (#6471)? I think if that were to happen, there wouldn’t be much standing in the way of this, right?
Not sure whether it makes sense for V8 to support source maps internally, or just offer a callback mechanism to allow the embedder implement this (as mentioned in earlier).
I don't see any point in minification of code that isn't being sent to some client web browser. How many bytes, and it would certainly be measured in less than megabytes since the current lib
directory is 1.1MB, do you think will be saved by this proposal? I don't think it would be enough to warrant:
I don't see any point in minification of code that isn't being sent to some client web browser. How many bytes, and it would certainly be measured in less than megabytes since the current lib directory is 1.1MB, do you think will be saved by this proposal? I don't think it would be enough to warrant:
IMHO binary size will not be a see benefit (especially if we add source-maps), but as @hashseed points out "faster scanning" and "slightly lower memory use" will benifit. Also some code "cosmetics" cause deopts (like character count of a function, including comments).
the added build complexity
Writing it once, yes, but manageable. Running it on every build IMHO negligible.
the reduced clarity for users stepping through bugs
This is a blocking issue, but a target to achieve anyway (source-maps support in debugger).
@hashseed Yeah, I’m not sure either, but if Node wants to have source map support in general (and I think we’d want it for this feature, and users have been requesting it independently too), then practically speaking it’s something that all of V8’s major consumers use, right?
like character count of a function, including comments
That’s gone with TF+I, I wouldn’t care about this peculiar thing.
After some internal discussions, I think it's unlikely that V8 will include support for source maps. To prevent XSS attacks, V8 would require, in a browser context, the source map to be the same origin as the script itself. For Node.js this is not an issue at all. I think it's a lot easier to only implement a callback so that the embedder can implement source map support.
Ok, can we help?
Advantages are faster scanning, lower shipping binary size, and slightly lower memory use due to shorter identifiers.
@hashseed, still, what are the numbers here?
I expect to see less than 0.5MiB binary size decrease (somewhere around 1% of the total binary size) and near-zero memory usage change if we don't include the sourcemaps.
I expect to see an _increase_ in distrubution size if we do include the sourcemaps. Not sure about the memory usage — that depends on when the sourcemaps will be loaded, but at the moment when they will be loaded — I expect to see an increase on the memory usage.
Also note that loading sourcemaps will either slow things down by default when errors are generated, or will require some kind of opt-in, and that will make things harder to debug (e.g. for issues reported on this issue tracker, etc.).
My current estimation is that it's simply not worth it, hence -1.
Fair enough. I agree that there is no net win if source maps are mandatory.
Most helpful comment
^--- No none of those are an advantage until the values of those numbers are provided.
Disadvantages are obvious. In my experience, no one wants to see node internal line numbers... until they have a problem that leads to node internals, then they can't live without them.