Bug report or feature request?
ES input?
No.
Uglify version (uglifyjs -V)
uglify-es 3.1.8
This is a regression since uglifyjs-3.1.5 (non-es version)
JavaScript input
var x = (function() {
function test() {
return "foo";
}
return function b() {
return [1, test];
}
})()
// x()[1] === x()[1]
The uglifyjs CLI command executed or minify() options used.
uglifyjs -c -b -- x.js
JavaScript output or error produced.
var x = function() {
return function() {
return [ 1, function() {
return "foo";
} ];
};
}();
// x()[1] !== x()[1]
This is a huge semantic difference, as the anonymous function returns a different instantiation (i.e. previous !== next) on every call, while the input code returns the same function each time.
This causes a large problem, because React uses function equivalence to check whether the function is the same component. For the minified code it assumes that a completely new component is passed, discarding all state. I can send a specific example where this causes innocent react code to behave completely different / broken for the minified code if needed. It also causes hard to notice performance issues due to the failing reconciliation.
Thanks for the report. Yep, it's a bug.
$ bin/uglifyjs -V
uglify-js 3.1.8
$ cat t2450.js
var x = (function() {
function test() {
return "foo";
}
return function b() {
return [1, test];
}
})()
console.log(x()[1] === x()[1]);
md5-37b8bca55074eeb5a1cff5c667891add
$ cat t2450.js | node
true
md5-37b8bca55074eeb5a1cff5c667891add
$ cat t2450.js | bin/uglifyjs -c | node
false
md5-0bed6d5651cc3f202f7c199a740b8c2d
$ cat t2450.js | bin/uglifyjs -c reduce_vars=false | node
true
[email protected] works fine:
$ bin/uglifyjs -V
uglify-js 3.1.6
$ cat t2450.js | node
true
$ cat t2450.js | bin/uglifyjs -c -b | node
true
The bug was introduced in a8aa28a7a6c0cb415965d055119956d4333de8fa.
I think an extra constraint has to be added to prevent AST_Defun to AST_Function conversion if its symbol is used outside of an AST_Call.expression context.
@alexlamsl On a related note, could we make all AST_Defun to AST_Function conversions subject to the compress inline option being enabled? This way users can disable this conversion with -c inline=false without the need to disable reduce_vars.
@kzc inline concerns function(){return 42}() :arrow_right: 42 which is different, so I think we should just keep the current flag arrangement as is - just a matter of squashing the remaining bugs :hammer_and_wrench:
If not under the "inline" option, it'd be nice if the AST_Defun to AST_Function conversion could be gated by its own option (still subject to reduce_vars). Not unlike like unsafe_arrows or unsafe_methods (minus the unsafe part).
The primary reason is for performance:
$ cat perf.js
function foo(x, y, z) {
return x < y ? x * y + z : x * z - y;
};
var sum = 0;
for (var i = 0; i < 1e8; ++i) {
sum += foo(i, i+1, i*3);
}
console.log(sum);
$ cat perf.js | /usr/bin/time node800
3.333333483334019e+23
1.95 real 2.80 user 0.28 sys
md5-664da5f84f6864e9286ab5c53b4f7a0a
$ cat perf.js | bin/uglifyjs -c -b --toplevel | /usr/bin/time node800
3.333333483334019e+23
7.36 real 11.63 user 0.62 sys
```
@kzc I'll refrain from any "it's 2017 and…" jokes and recognise that as a fair concern.
Any suggestions on the name? We can also do reduce_vars:"nofun" like we do for toplevel and unused.
@kzc actually if that is the only case, we don't need the extra flag at all, because it's a bug:
$ cat test.js
var a;
function f(b) {
console.log(a === b);
a = b;
}
function g() {}
for (var i = 3; --i >= 0;)
f(g);
$ cat test.js | node
false
true
true
$ uglifyjs test.js -c toplevel | node
false
false
false
So is AST_Defun to AST_Function conversion from differing scopes fundamentally unsound or just in this case?
$ bin/uglifyjs -V
uglify-js 3.1.8
$ bin/uglifyjs test.js -c toplevel -b bracketize
for (var a, i = 3; --i >= 0; ) {
!function(b) {
console.log(a === b), a = b;
}(function() {});
}
I think the only thing missing is just the loop exception - otherwise things like https://github.com/mishoo/UglifyJS2/pull/2451/files#diff-25caad7de41e394d6c7ba323e8ec5a6dR3874 is safe.
function f() in https://github.com/mishoo/UglifyJS2/pull/2451/files#diff-25caad7de41e394d6c7ba323e8ec5a6dR3959 is an example of cross-scope replacement that is also safe.
It's not a loop thing - it's a function symbol reference thing outside of AST_Call.expression. That would fix it in a more general way.
Any suggestions on the name? We can also do reduce_vars:"nofun" like we do for toplevel and unused.
I still think it belongs under inline. Think of it from a user's perspective - they don't want to convert function declarations into inlined functions for speed critical code.
It's not a loop thing - it's a function symbol reference thing outside of
AST_Call.expression. That would fix it in a more general way.
As mentioned in https://github.com/mishoo/UglifyJS2/issues/2450#issuecomment-342548624, I don't see how this replacement of g is unsafe:
function f() {
function g() {}
return g;
}
And it certainly isn't of the form AST_Call.expression.
AST_Defun over AST_Function) so I don't see the need for any extra flags.function f() in https://github.com/mishoo/UglifyJS2/pull/2451/files#diff-25caad7de41e394d6c7ba323e8ec5a6dR3959 is an example of cross-scope replacement that is also safe.
The code in question was f(g) where both f and g were AST_Defun. f can be inlined because it's only used in AST_Call.expression contexts in the entire program, but g cannot be because it's a naked function symbol reference.
function f() { function g() {} return g; }
Fair point.
I think the general rule is: do not convert AST_Defun to AST_Function if its symbol is used in a non-AST_Call.expression context unless it is in the same scope.
May be we are talking about the same thing, so let me put it differently.
When function f() is used only once but in a different scope, d.escaped needs to stay false:
https://github.com/mishoo/UglifyJS2/blob/bbedbf4ea03580c5b7aa39f32b92fdda9216c5b4/lib/compress.js#L624-L641
Your stated AST_Call.expression case is covered above by _not_ being matched by any of the conditions listed above, but there are other cases that are safe as well, e.g. typeof f == "function"
Does that satisfy your requirements or do you see something missing in here?
We posted roughly at the same time. What do you think of my suggestion above?
We posted roughly at the same time. What do you think of my suggestion above?
I think we differ in what we allow in the cross-scope case - your suggestion only white-list AST_Call.expression, whereas I have the following blacklist, assuming function f():
x = f;var x = f;x(y, f, z);return f;... which extends to f being part of an array:
x = [y, f, z];var x = [y, f, z];x([y, f, z]);return [y, f, z];... or part of an object:
x = {y: f};var x = {y: f};x({y: f});return {y: f};Does setting the escaped property to true mean it can be substituted?
I think we differ in what we allow in the cross-scope case - your suggestion only white-list AST_Call.expression, whereas I have the following blacklist, assuming function f():
The blacklist is too complicated for me. Excessive rule exceptions lead to errors.
Allow single-use function inlining in these cases only:
AST_Call.expressionin any scopeor can you think of any valid counterexamples?
Does setting the
escapedproperty totruemean it can be substituted?
def.escaped=true _prohibits_ any cross-scope substitution.
or can you think of any valid counter-examples?
Off the top of my head:
typeof f === "function""" + fFWIW mark_escaped() has existed for a long while and is responsible for guarding reduce_var more than just functions, and has been well exercised by test/ufuzz.js.
typeof f === "function"
That's valid. No information leakage possible here. Add it to the whitelist.
"" + f
This one is a tougher call. It would lead to an invalid result if the AST_Defun would be converted to an AST_Function without a name (keep_fnames: false is the default). Mind you, if mangle is in effect the function name would be changed regardless. But since this construct is so rare (and useless) in practice I'd say do not inline it. Not inlining it would also avoid some false positives in the fuzzer.
@alexlamsl What's the verdict on the AST_Defun -> AST_Function option name? Your call.
We can also do reduce_vars:"nofun" like we do for toplevel and unused.
vs:
I still think it belongs under
inline(still subject toreduce_vars). Think of it from a user's perspective - they don't want to convert function declarations into inlined functions for speed critical code.
I think with "" + f it's just a matter of whether other non-function values can be substituted - mark_escaped() isn't specially crafted for AST_Defun, and unless there is a very specific need to deviate from the collective behaviour I'd prefer maximising code reuse.
I can't see a simple whitelist that would pass all the existing test cases. IMO that's just too much chore with loss in compression ratio. It's certainly less work to fix corner cases than to rewrite existing functionality at this point.
What's the verdict on the AST_Defun -> AST_Function option name?
Let's decide on whether to do so and with what form at a later date - I have a feeling we should stabilise the existing features (hoist_props included) before thinking about fringe use cases.
I can't see a simple whitelist that would pass all the existing test cases. IMO that's just too much chore with loss in compression ratio. It's certainly less work to fix corner cases than to rewrite existing functionality at this point.
It's a question of correctness. If we can identify all the corner cases surrounding AST_Defun to AST_Function conversion, sure that'd work, but I'm not sure we have.
Anyway, no point deliberating about it. Once this PR is in I'll see if I can break it. Or just wait for the Internet to test it.
Let's decide on whether to do so and with what form at a later date - I have a feeling we should stabilise the existing features (hoist_props included) before thinking about fringe use cases.
I don't think function substitution resulting in a substantial loss in performance is a fringe case, but sure it can be done in a future PR. Users can use the reduce_vars=false mallet in the meantime if need be.
It's a question of correctness. If we can identify all the corner cases surrounding AST_Defun to AST_Function conversion, sure that'd work, but I'm not sure we have.
While I agree with your sentiment, I'm trying to be practical (read: lazy) here :ghost:
Anyway, no point deliberating about it. Once this PR is in I'll see if I can break it. Or just wait for the Internet to test it.
Thanks for your understanding - if it turns out there are major issues and I need to rewrite that portion of code, I'll go with your suggestion next time round. :wink:
Users can use the
reduce_vars=falsemallet in the meantime if need be.
:+1:
Regarding the perf thing, the loop detection in this PR can be defeated by a level of function indirection...
$ cat perf2.js
function foo(x, y, z) {
return x < y ? x * y + z : x * z - y;
}
function indirect_foo(x, y, z) {
return foo(x, y, z);
}
var sum = 0;
for (var i = 0; i < 1e8; ++i) {
sum += indirect_foo(i, i+1, i*3);
}
console.log(sum);
$ cat perf2.js | time node800
3.333333483334019e+23
1.96 real 2.85 user 0.28 sys
md5-37629952953f9835c836f2e2341630a4
$ cat perf2.js | bin/uglifyjs -c toplevel,passes=9 -b | time node800
3.333333483334019e+23
7.50 real 11.94 user 0.65 sys
```
Noted - just to confirm, correctness is not hampered in this case, because if indirect_foo is instead:
function indirect_foo() {
return foo;
}
Then foo will not be substituted, as the identity is leaked and may otherwise be misused.
correctness is not hampered in thise case
The code is fine from a correctness point of view. It's solely a performance issue.