Bug report or feature request?
bug
uglify-js version (uglifyjs -V)
uglify-js 2.8.14
JavaScript input - ideally as small as possible.
var pc = 0;
function f(x) {
pc = 200;
return 100;
}
function x() {
var t = f();
pc += t;
return pc;
}
console.log(x()); // 300
_(code updated)_
uglifyjs CLI command executed or minify() options used.function f(x){return pc=200,100}function x(){return pc+=f()}var pc=0;console.log(x()); // 100
Got bitten by this bug today because there was something asserting the variable (y above) which was stripped from dist build. So when minifying that build it would make that x += f() which broke the build.
Details: https://qfox.nl/weblog/404
Looks like ES6: https://github.com/mishoo/UglifyJS2#uglifyjs-2
Note: release versions of
uglify-jsonly support ECMAScript 5 (ES5). If you wish to minify ES2015+ (ES6+) code then please use the harmony development branch.
With the stated version of uglify-js and the example input, I get:
Parse error at ..\test.js:1,4
let x = 10;
^
SyntaxError: Unexpected token: name (x)
at stack.get (Function code:87:17)
Ah excuse me, the problem is not related to let.
@alexlamsl Please re-open this.
The problem is the compound operator. I'll update the original description not to use let (used it out of habit)
With the updated example I get:
function f(){return x=20,5}var x=10,y=f();x+=y;
which gives me x === 25 - so I still don't see where the issue is.
Cannot reproduce with default compress options:
$ cat 1631.js
var x = 10;
function f() {
x = 20;
return 5;
}
var y = f(); // x = 20, y = 5
x += y; // "20 + 5", so x = 25
$ bin/uglifyjs -V
uglify-js 2.8.14
md5-b9b03aba29a07228f5bb7978afbc5ddc
$ bin/uglifyjs 1631.js -c
function f(){return x=20,5}var x=10,y=f();x+=y;
Please provide an example that will fail from the command line as per example above.
Fair enough. The case was burried in a large auto-generated build file. Here's the concise test case that breaks:
$ cat z
var pc = 0;
function f(x) {
pc = 200;
return 100;
}
function x() {
var t = f();
pc += t;
return pc;
}
console.log(x()); // 300
$ node_modules/.bin/uglifyjs --version
uglify-js 2.8.14
md5-b9b03aba29a07228f5bb7978afbc5ddc
$ node_modules/.bin/uglifyjs z -c
WARN: Collapsing variable t [z:9,8]
function f(x){return pc=200,100}function x(){return pc+=f()}var pc=0;console.log(x());
md5-a2b243692a31275af9d35e54dd77d4f0
function f(x){return pc=200,100}function x(){return pc+=f()}var pc=0;console.log(x()); // -> 100
Thank you for the working example.
$ node test.js
300
$ uglify-js test.js -c | node
WARN: Collapsing variable t [test.js:8,8]
100
$ uglify-js test.js -c collapse_vars=0 | node
300
@kzc so I think this is related to our discovery back in https://github.com/mishoo/UglifyJS2/pull/1620#issuecomment-287555659, and in this case collapse_vars will need to take compound assignment as effectively using the AST_SymbolRef twice?
I have a fix. Some collapse_vars tests involving assignment will be non-optimal as result.
--- a/lib/compress.js
+++ b/lib/compress.js
@@ -614,6 +614,7 @@ merge(Compressor.prototype, {
|| node instanceof AST_IterationStatement
|| (parent instanceof AST_If && node !== parent.condition)
|| (parent instanceof AST_Conditional && node !== parent.condition)
+ || (node instanceof AST_Assign && node.operator.length > 1)
|| (parent instanceof AST_Binary
&& (parent.operator == "&&" || parent.operator == "||")
&& node === parent.right)
Edit: patch was revised.
Maybe fix can be refined if the assignment RHS does not have side effects.
Trying to construct the test case I noticed that the case was already handled properly in general. It was only when it got wrapped in a local function that it broke.
Unfortunately I'm not up to speed with uglify internals to be of much more help.
Found a similar but different failure mode which https://github.com/mishoo/UglifyJS2/issues/1631#issuecomment-288497566 does not cover:
// test.js
var a = 0, b = 1;
function f() {
a = 2;
return 4;
}
function g() {
var t = f();
b = a + t;
return b;
}
console.log(g());
gives:
$ node test.js
6
$ uglify-js test.js -c | node
WARN: Collapsing variable t [test.js:8,10]
4
$ uglify-js test.js -c collapse_vars=0 | node
6
@alexlamsl Oh boy. That's a problem that is not easily solved. In the expression b = a + t; the variable a is treated as side-effect-free, which it is. However f() has a dependency on a beforehand.
Tracing into each function for possible side-effects is not tenable - full program flow analysis would be required. This could be a show-stopper for collapse_vars.
Perhaps a side-effect-free variable used in the statement (such as a above) cannot be skipped over if it is not in the same scope as the statement under analysis.
Perhaps a side-effect-free variable used in the statement (such as a above) cannot be skipped over if it is not in the same scope as the statement under analysis.
That is probably true, just as with the case of undeclared variables (though for a different reason).
This is a more general fix that should handle both cases:
--- a/lib/compress.js
+++ b/lib/compress.js
@@ -614,6 +614,7 @@ merge(Compressor.prototype, {
|| node instanceof AST_IterationStatement
|| (parent instanceof AST_If && node !== parent.condition)
|| (parent instanceof AST_Conditional && node !== parent.condition)
+ || (node instanceof AST_SymbolRef && node.definition().scope !== self)
|| (parent instanceof AST_Binary
&& (parent.operator == "&&" || parent.operator == "||")
&& node === parent.right)
@alexlamsl PTAL
Bad news 馃槄
// test.js
function g() {
var a = 0, b = 1;
function f() {
a = 2;
return 4;
}
var t = f();
b = a + t;
return b;
}
console.log(g());
gives:
$ node test.js
6
$ uglify-js test.js -c reduce_vars=0 | node
WARN: Collapsing variable t [test.js:8,10]
WARN: Dropping unused variable b [test.js:2,13]
4
But as illustrated, reduce_vars seems to cover this one up (though may not be the case in general)
$ uglify-js test.js -c | node
WARN: Dropping unused function f [test.js:3,11]
WARN: Dropping unused variable b [test.js:2,13]
6
$ uglify-js test.js -c
WARN: Dropping unused function f [test.js:3,11]
WARN: Dropping unused variable b [test.js:2,13]
function g(){var a=0,t=function(){return a=2,4}();return a+t}console.log(g());
if it is not in the same scope as the statement under analysis.
Why would scope make a difference here?
function g() {
var a = 0, b = 1;
function f() {
a = 2;
return 4;
}
var t = f();
b = a + t;
return b;
}
console.log(g());
Same scope, why would this not have the same problem?
Why would scope make a difference here?
A function call can only change the value of a var in different scope.
It would have worked, I think, if collapse_vars scans inside the inner function.
The external function can never alter the inner/same-scope variable, that's true. But the inner function could - but so could the TreeWalker/TreeTransformer be able to reach within and scan for usage.
@kzc https://github.com/mishoo/UglifyJS2/issues/1631#issuecomment-288519869 is tested with https://github.com/mishoo/UglifyJS2/issues/1631#issuecomment-288517793 applied.
It would have worked, I think, if collapse_vars scans inside the inner function.
It would never end - everything within it would have to scanned as well, ad nauseam.
It would never end - everything within it would have to scanned as well, ad nauseam.
... except that inner function is already being scanned, which is why a couldn't be collapsed in b = a + t;
Can we reuse that information somehow?
as illustrated, reduce_vars seems to cover this one up (though may not be the case in general)
I see that but how exactly did reduce_vars prevent the problem in this case?
$ bin/uglifyjs test.js
function g(){var a=0,b=1;function f(){a=2;return 4}var t=f();b=a+t;return b}console.log(g());
$ bin/uglifyjs test.js -c collapse_vars=0
WARN: Dropping unused function f [test.js:3,11]
WARN: Dropping unused variable b [test.js:2,13]
function g(){var a=0,t=function(){return a=2,4}();return a+t}console.log(g());
except that inner function is already being scanned
Every function called within it would have to be scanned as well. It's non-trivial at that point.
I see that but how exactly did reduce_vars prevent the problem in this case?
It optimises function f(){...} var t = f(); into var t = function(){...};, which seems to then suppresses collapse_vars from acting on t.
Every function called within it would have to be scanned as well. It's non-trivial at that point.
Only if they are also function defined within this scope. Any outer functions won't be able to access the inner variable in question.
And any functions within this scope would have been scanned anyway, so no need to trace and re-scan?
The t in statement var a=0,t=function(){return a=2,4}();return a+t; won't be collapsed due to this specific condition:
As in:
function w() {
function x() {y(); a = 1; z(); ...}
var a = 0;
return x();
function y() {...}
}
function z() { ... }
Consider calling x():
z() can't modify a and can be safely ignoredy() can modify a, but processing w() as a scope both x() and y() would have been scanned once already, so no need to trace y() again within x()The
tin statementvar a=0,t=function(){return a=2,4}();return a+t;won't be collapsed due to this specific condition:
Not that we want to optimise this now, but is there a reason why we cannot collapse AST_Lambdas?
is there a reason why we cannot collapse AST_Lambdas?
I just assumed at the time it would lead to problems due to possible captured vars. Function calls are definitely side effects, but function expressions may not be.
Regarding https://github.com/mishoo/UglifyJS2/issues/1631#issuecomment-288527937 - the function in each function call is not necessarily easy to determine. Had the z in z() been a variable then it depends on program analysis to determine.
@alexlamsl I don't see an easy fix for collapse_vars. I recommend putting in the fix from https://github.com/mishoo/UglifyJS2/issues/1631#issuecomment-288517793 as well as the first two tests, and disabling collapse_vars by default until this issue is better understood.
I don't have much time to devote to Uglify unfortunately.
Yeah, I just realised you didn't scan those inner functions to suppress a in my example above - you rely on SymbolDef.references populated by AST_Toplevel.figure_out_scope() instead.
I'll sleep on this one and see if I can come up with a better solution - failing that, I'll take the fallback you've proposed.
@alexlamsl I think I know of a heuristic to determine whether a side-effect-free variable is safe to skip over - all references to that variable must be in the same scope as the statement under analysis. Then we can be sure that no one is changing its value behind our back.
@kzc and we can check that by iterating through SymbolDef.references() to check for that.
FYI, I'm half way through making that PR, i.e. all the new tests are in place.
I think this handles all cases in this Issue.
--- a/lib/compress.js
+++ b/lib/compress.js
@@ -614,12 +614,22 @@ merge(Compressor.prototype, {
|| node instanceof AST_IterationStatement
|| (parent instanceof AST_If && node !== parent.condition)
|| (parent instanceof AST_Conditional && node !== parent.condition)
+ || (node instanceof AST_SymbolRef
+ && !are_references_in_scope(node.definition(), self))
|| (parent instanceof AST_Binary
&& (parent.operator == "&&" || parent.operator == "||")
&& node === parent.right)
|| (parent instanceof AST_Switch && node !== parent.expression)) {
return side_effects_encountered = unwind = true, node;
}
+ function are_references_in_scope(def, scope) {
+ if (def.scope !== scope) return false;
+ var refs = def.references, num_refs = refs.length;
+ for (var i = 0; i < num_refs; i++) {
+ if (refs[i] !== scope) return false;
+ }
+ return true;
+ }
},
function postorder(node) {
if (unwind) return node;
Unfortunately it also causes most of the collapse_vars tests to fail.
if (refs[i] !== scope) return false; should probably read if (refs[i].scope !== scope) return false;
D'oh!
Much better - only 2 (expected) test failures.
@alexlamsl If you have time to look at those 2 failing tests, can you see if there's some other condition we've neglected to consider that we can incorporate in the heuristic to allow them to pass?
This is what I've got as an extra +ve case that makes those two tests pass:
function are_references_in_scope(def, scope) {
+ if (def.orig.length === 1 && def.orig[0] instanceof AST_SymbolDefun) return true;
if (def.scope !== scope) return false;
var refs = def.references;
for (var i = 0, len = refs.length; i < len; i++) {
if (refs[i].scope !== scope) return false;
}
return true;
}
If you can confirm that a function definition cannot be overridden thus this is a safe assumption to make, that would be cool.
@alexlamsl Awesome!
What the heck is def.orig? :-)
It's the definition node of the symbol:
https://github.com/mishoo/UglifyJS2/blob/9e6b128374c62ee9f6238134fdc207ec9dc86284/lib/scope.js#L48
https://github.com/mishoo/UglifyJS2/blob/9e6b128374c62ee9f6238134fdc207ec9dc86284/lib/scope.js#L294
When you have duplicate definitions, they form a nice array:
https://github.com/mishoo/UglifyJS2/blob/9e6b128374c62ee9f6238134fdc207ec9dc86284/lib/scope.js#L299
Nice.
If you're good with the newly revised patch, I'm good with it.
Two mocha failures:
// actual
function foo(o){var n=2*o;print("Foo:",n)}var print=console.log.bind(console);
// expected
function foo(o){print("Foo:",2*o)}var print=console.log.bind(console);
// actual
var print=console.log.bind(console),a=function(n){return 3*n}(3),b=function(n){return n/2}(12);print("qux",a,b),function(n){var o=2*n;print("Foo:",o)}(11);
// expected
var print=console.log.bind(console);print("qux",function(n){return 3*n}(3),function(n){return n/2}(12)),function(n){print("Foo:",2*n)}(11);
Ah, they have the same issue - var print; blocking inner functions from collapsing their variables.
We can live with that I think.
... and because we can't tell var n=2*o from say var n=o() apart, we can't optimise any further.
So yeah, let's accept this sub-optimal case for now.
@alexlamsl Thanks for your help.
@kzc any time! 馃槑
Great work. Thanks!
I've been thinking about the collapse var general case. You don't know what happens underwater with observable (and affectable) side effects. I think you can't really deal with this. Consider this contrived example:
function f() {
var a = 'fail';
function g() {
a = 'passe';
return 's';
}
var o = {
toString: g,
};
var t = '' + o;
a += t;
console.log(a);
}
f(); // "passes"
$ node_modules/.bin/uglifyjs --version
uglify-js 2.8.14
$ node_modules/.bin/uglifyjs z -c
WARN: Collapsing variable t [z:15,7]
WARN: Collapsing variable o [z:14,15]
function f(){function g(){return a="passe","s"}var a="fail";a+=""+{toString:g},console.log(a)}f();
// "fails"
_(Side note: that could have safely been console.log(a +=""+{toString:g}) but perhaps slightly too complex for analysis and too much golfing)_
While the inlining of the object is nice, the point is that the assignment of the function may happen sneakily. The string concat may look innocent but it may not be. I'm not sure you could ever statically detect this case.
There might be a generic workaround by rewriting a += '' + o to a = a + '' + o instead.
On that note, here's another case that you may want to consider;
function f() {
var a = 'fail';
function g() {
a = 'passe';
return 's';
}
var o = {
toString: g,
};
var t = '' + o;
a = a + t;
console.log(a);
}
f();
$ node_modules/.bin/uglifyjs z -c
WARN: Collapsing variable t [z:15,10]
WARN: Collapsing variable o [z:14,15]
function f(){function g(){return a="passe","s"}var a="fail";a+=""+{toString:g},console.log(a)}f();
to be precise; it rewrites the explicit a = a + .. to a += ... which could lead to the same problem as above. This may be considered a different bug with the same outcome.
The toString assignment here is contrived but could easily happen in an abstracted function, a class constructor, or whatever. And dynamic so you can't possibly detect it at parse time.
Now var x = <expr>; y += x; to y += x could be rewritten to y = y + <expr> which is still shorter than the original but safer in terms of the compound assignment problem in this ticket.
Initially I thought there were counter examples to that rewrite but right now I can't think of any. All examples I can think of actually rely on the compound assignment "caching" the value like x++ does.
And https://github.com/mishoo/UglifyJS2/issues/1631#issuecomment-288643900 is going to be an even worse problem for ES6+ code.
Your example works on the latest version of uglify-js:
$ node test.js
passes
$ uglify-js --version
uglify-js 2.8.15
$ uglify-js test.js -c
function f(){function g(){return a="passe","s"}var a="fail",o={toString:g},t=""+o;a+=t,console.log(a)}f();
$ uglify-js test.js -c | node
passes
That's great (I mean that) but do you think it holds in the generic case? This is contrived and the artifact is pretty obvious and only contained in the same scope. Does it actually taint and track tainted variables to conclude x + y is a problem not to be collapsed? There are so many outs that I'm not expecting this to be the case...
If you are asking for a formal mathematical proof on the level of how Regular Expression is computable and correct, no I haven't come up with one yet.
I think we are taking the approach of incremental progress, which means if you can find a corner case which uglify-js generates bad code, please file an issue and I'd be delighted to investigate and fix the problem.
I'm just worried that somewhere a developer is going to cry because his code magically breaks due to a minifier being too eager. However, I can't judge on how safe the heuristics are for applying the collapse-var rewrite and I think you can. So while you don't have to proof something about regular expressions, you could at least let me know what you think about the generic case outlined.
Right now all I see you do is check the explicit test case and bail, like happened when I opened this ticket. It still led to exposing a clear bug.
I'm not asking for sugar. But could you at least give me your thoughts about why this is or isn't going to be a problem for uglifyjs? Rather than just copy-paste-run-shrug, again. There's many syntactical cues that could cause the heuristics in uglifyjs to decide to bail on the collapse. Just like the initial example in this ticket needed some more work before actually exposing a bug. In this case it could be the object literal or the literal appearance of toString.
So please, if you tell me why you think this generic case won't be a problem for uglifyjs, I could be convinced to dig deeper into building a test case that it fails on. I do have some related background on the matter.
FWIW, did some quick testing with 2.8.15 and couldn't produce a failing case. So that's good!
Thanks for the testing. I'm afraid the best I can do is point you to the code, as it is the layout of which we tackle each optimisation.
A good place to start is perhaps understanding what TreeWalker and TreeTransformer do.
@qfox This is a volunteer effort. No one is paid to work on Uglify. We work on it because we find it to be interesting.
Although we try to ensure code correctness with extensive unit tests while balancing minification size and speed, no results are guaranteed. Every new release can potentially inadvertantly introduce a new bug while fixing an old one despite our best efforts. Users are encouraged to lock a version of uglify-js that they've tested against using yarn or npm shrinkwrap. Users are also free to disable minification options if there is an issue. People are welcome to submit bug reports and contribute fixes.
@kzc Well that wasn't necessary. I think you got me wrong there. I'm not getting paid to help out either. But that also means that I don't have the time to dig deep into uglifyjs code and heuristics.
Somebody familiar with the code should be able to point out easily whether or not and why a certain problem would or wouldn't be a problem. That's all I was asking. My offered help here is just as voluntary, sans the credits. Which I don't need.
@qfox No slight intended. Just offering my perspective on the thankless job of helping to maintain open source projects in general.
Any help is welcome - and indeed your bug report brought to light a flaw in collapse_vars.
We could certainly beef up testing. In another post you mentioned you have some experience in code fuzzing. We were hoping to build a fuzz test that executes original and minified code and compares the results. There's already some basic fuzzing in test/mozilla-ast.js just for AST trees, but it does not execute code.
So, it's old, but here's the fuzzer for my ES5 parser: http://qfox.github.io/zeparser2/test/tokenfuzzer.html
It concats random tokens then runs it through eval and through the parser to check whether or not they both pass or fail. It uncovered some obscure bugs in my parser, but also a few super obscure ones in browsers. Good times.
You can do a similar approach for minification testing. Create random code, include the emergency brake boilerplate I mentioned for loops, check the outcome, and match that against minified outcome. Rinse and repeat.
I know OS can be hard and the reward is knowing the software is used in millions of setups.
Btw I would recommend writing a fuzzer from scratch. It's quite fun and rewarding, especially once it starts finding its first bugs. They're surprisingly easy to write, especially once you're at the level of maintaining a minifier anyways. Seeing a fuzzer run is just magic :)
@qfox That's good advice. Thank you.
As @alexlamsl mentioned this minifier doesn't have formal proofs to back it up, just heuristics we think are safe and generally produce smaller code. A fuzzer that executes code would certainly help.
@kzc here's a setup: https://github.com/qfox/uglyfuzzer
So far it didn't find anything, and it's relatively slow. I think something that just calls uglifyjs from within a nodejs script would be much faster. But whatever, I just whipped this up quickly. Should get you going if you want to build on that. If you need more help let me know (would suggest a new ticket for it).
here's a setup: https://github.com/qfox/uglyfuzzer
@qfox - Great stuff. Clearly it's already paying off as evidenced by #1639
If you look at test/run-tests.js you'll see a way to run code in isolation in the same process. Should make it 100 times faster.