Ecma262: Infinite loop in the specification.

Created on 9 Jul 2019  ·  16 Comments  ·  Source: tc39/ecma262

I've found what appears to be an infinite loop in the specification.
I've confirmed that the following code fragment puts most JavaScript engines (V8, JavaScriptCore from Safari, SpiderMonkey) into an infinite loop.

The example code is as follows:

var o = /reg/;
Object.defineProperty(o, "global", {value: true});
"regular expression!".match(o);

In the specification of "@@match", "global" is loaded from the corresponding property of the regular expression object (Note: https://tc39.es/ecma262/#sec-regexp.prototype-@@match, step 4).
Thus, "global" will be true, i.e. ToBoolean(true).

Then, it enters the loop at step 6.f.
In each iteration, it calls RegExpExec (at step 6.f.i).

However, the specification of RegExpBuiltinExec (called by RegExpExec)
consults the [[OriginalFlags]] internal slot of the regular expression object instead.
Thus, here the value of "global" will be "false" because [[OriginalFlags]] is an empty string.
(Note: https://tc39.es/ecma262/#sec-regexpbuiltinexec, step 5 and 6)
Then, it sets "lastIndex" to 0 because global is false (at step 8).

As a result, in every iteration of the loop ("@@match", step 6.f), "lastIndex" is 0, RegExpExec never returns "null", and the loop runs forever.
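
The mismatch can be observed without hanging the engine by replaying the global branch of "@@match" by hand with an iteration cap. This is only a rough sketch: `boundedMatch` and its `maxIterations` parameter are hypothetical names introduced here, and the loop is a simplified stand-in for the specification steps (for instance, it ignores the unicode-aware AdvanceStringIndex).

```javascript
// Hypothetical helper: a simplified, capped replay of the global branch of
// RegExp.prototype[@@match]. It is not the specification algorithm itself.
function boundedMatch(rx, str, maxIterations) {
  const matches = [];
  rx.lastIndex = 0;
  for (let i = 0; i < maxIterations; i++) {
    // The built-in exec decides "global" from [[OriginalFlags]],
    // not from the (possibly redefined) "global" property.
    const result = rx.exec(str);
    if (result === null) {
      return { matches, terminated: true };
    }
    matches.push(result[0]);
    if (result[0] === "") {
      rx.lastIndex += 1; // simplified AdvanceStringIndex
    }
  }
  // The cap was reached; the uncapped loop would keep going.
  return { matches, terminated: false };
}

const o = /reg/;
Object.defineProperty(o, "global", { value: true });

const outcome = boundedMatch(o, "regular expression!", 5);
console.log(outcome.terminated, outcome.matches.length, o.lastIndex);
```

With the redefined "global" property the cap is always reached and "lastIndex" stays at 0, whereas a regex created as /reg/g advances "lastIndex" and makes the helper terminate.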

This is clearly a bug in the JavaScript specification, and I believe that it should be fixed for security reasons because it can break the availability property of a program.

Thank you.

question

All 16 comments

This is clearly a bug in the JavaScript specification, and I believe that it should be fixed for security reasons because it can break the availability property of a program.

I consider this more of a footgun for developers than a security issue. If you can execute code, there's plenty of other, easier ways to enter infinite loops.

I'd also tend towards keeping the [[OriginalFlags]] access in RegExpBuiltinExec for efficiency (it's likely cheaper for the VM to implement than a Get call).

I agree. I don't think this is a security issue at all.

Maybe sub-classing regexps was a mistake after all? :)

I consider this more of a footgun for developers than a security issue. If you can execute code, there's plenty of other, easier ways to enter infinite loops.

I believe the problem with this semantic bug is that it provides a way to put an infinite loop at a specific program point.

In the short term, it doesn't seem like a serious problem, because most people believe that JavaScript is not robust and is easily manipulated by injected code.

However, we are still trying to find a way to deploy robust JavaScript programs. One good example is Defensive JavaScript (link: https://www.defensivejs.com/usenixsec13.pdf). I'm currently working on extending the subset.

And this bug provides a way to put an infinite loop in the middle of robust code, which breaks the availability property.

Thus, in the long term, if we want to support robust JavaScript programming, I don't think it is a good idea to keep this semantic bug.

If you ask me why we need robust JavaScript programs, I would say that JavaScript is now everywhere, including server-side applications, standalone applications, and even IoT devices.

@mir597 if you Object.freeze(Object.prototype) before untrusted code runs, as SES does, then this couldn't happen - have you looked into using SES to create a frozen realm?

@ljharb That won't help in this particular case, because it doesn't prevent the adventurous programmer from trying var o = Object.defineProperty(/reg/, "global", {value: "idk"});.

@ljharb First of all, freezing the prototype cannot avoid this problem, because it happens when the regular expression object itself has a 'global' property with a wrongly typed value. This problem is not related to the prototype object. If you want to avoid this issue, you have to freeze the regular expression objects that you want to use. Maybe that makes sense, because regular expression objects are immutable. But we still have a bug in the semantics.
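
Freezing the individual regular expression object, as suggested above, could look roughly like this. It is a minimal sketch that only covers own properties of a non-global regex; the variable names are illustrative.

```javascript
const o = Object.freeze(/reg/);

let rejected = false;
try {
  // On a frozen object, adding an own "global" property is a TypeError.
  Object.defineProperty(o, "global", { value: true });
} catch (e) {
  rejected = e instanceof TypeError;
}

// "global" is still read from RegExp.prototype, so match takes the
// non-global path and returns after a single exec.
const result = "regular expression!".match(o);
console.log(rejected, result[0]);
```

One caveat: freezing also makes "lastIndex" non-writable, so this approach presumably does not carry over to regexes that really have the g or y flag, since matching those needs to update "lastIndex".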

Here is a question. Do you like SES? It does not allow many existing JavaScript programs.
Oops. Sorry. You are the one related to this project somehow. :)

I mean, a valid JS program is while (true) {} - is that a bug?

@ljharb No, that code is not a bug. You can inject such code and make another program stop.
However, when you can inject your code at a specific program point, you can make the program stop at that specific program point.

For example, let's assume that there is a function 'f' which performs very important jobs.
And I will consider ECMAScript 5.1 here.

var o = /reg/;
var g = (function() {
  var match = String.prototype.match;
  var tos = Object.prototype.toString;
  return function(k) { if (k === 0) return match; else if (k === 1) return tos; };
})();
function f() {
  // assume that 'match' and 'tos' are the corresponding native built-in functions and have the native 'Function.prototype.apply' as its own immutable property 'apply'.
  var match = g(0);
  var tos = g(1);

  // do some important thing 1
  if (tos.apply(o) === '[object RegExp]')
    // if we consider a higher version of ECMAScript than 5.1, we should check whether the value of properties 'global', 'unicode', 'lastIndex' are getter functions or not because it means the following code can call untrusted function.
    match.apply("string", [o]);
  // do some important thing 2
}
// assume that 'this' points to the global object.
Object.defineProperty(this, "f", {value: f, writable: false, configurable: false});
Object.defineProperty(this, "g", {value: g, writable: false, configurable: false});
// 'any_input1' and 'any_input2' stand for arbitrary attacker-supplied source text.
eval(any_input1);
f();
eval(any_input2);

By injecting any program code into 'any_input1' or 'any_input2', can you put an infinite loop in the middle of the function f? All you can do is either execute the function 'f' or not execute it.

Let's assume that 'do some important thing 1' is the operation that opens a door and 'do some important thing 2' is the operation that closes it. Any malicious user can leave the door open by manipulating the regular expression object through the semantic bug.

However, without the semantic bug, the function f is robust:

  • the function "tos" only reads the internal property [[Class]] (unless the object is a badly designed host object, this value cannot be manipulated), and
  • the function "match" only reads some inherent properties which, per the specification, cannot be manipulated (although this has changed in recent versions of ECMAScript).

Thus, even if you put 'getter' or 'setter' functions into the regular expression object 'o', you cannot manipulate the execution of the function 'f'.

@mir597 If you are serious about both allowing arbitrary dangerous code any_input1 and any_input2, and striving to keep the integrity of f even when the world is collapsing, you will quarantine the execution of the former in a distinct Realm, and, above all, you will not allow them to directly fiddle with objects used by the latter.

To be more concrete, if I want to sabotage f, instead of writing:

Object.defineProperty(o, "global", {value: "idk"});

I'd rather try:

Object.defineProperty(o, "global", { get() { 
    while (true) try { alert("pwnd!"); } catch { };
} });

So, no, it is not a security issue: if someone else is allowed to corrupt your own objects, you are already doomed, even in the absence of the issue raised in this thread.

@claudepache Oh. Sorry. You got back to the original example code.

If you consider a robust version of JavaScript, you should check whether the regular expression object contains malicious code or not.
Before executing the "match" function, we can indeed check whether the object has 'getter' or 'setter' functions as properties. If a 'getter' or 'setter' is untrusted code, we can assume that this property access can cause problems.
Basically, at the JavaScript level, we can check whether this program execution is robust or not.
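
Such a check before calling "match" might be sketched as follows. `hasOnlyBuiltinOwnProperties` is a hypothetical helper, not part of any library. It inspects own properties only, so it does not cover changes made to RegExp.prototype, and it uses Reflect.ownKeys, which is newer than the ECMAScript 5.1 baseline assumed in the earlier example.

```javascript
// Hypothetical guard: true only when the regex carries no own properties
// other than the built-in "lastIndex" data property.
function hasOnlyBuiltinOwnProperties(rx) {
  const keys = Reflect.ownKeys(rx);
  if (keys.length !== 1 || keys[0] !== "lastIndex") {
    return false;
  }
  const desc = Object.getOwnPropertyDescriptor(rx, "lastIndex");
  return "value" in desc; // a data property, not an accessor
}

const clean = /reg/;
const tampered = /reg/;
Object.defineProperty(tampered, "global", { value: true });

console.log(hasOnlyBuiltinOwnProperties(clean));
console.log(hasOnlyBuiltinOwnProperties(tampered));
```

A guard of this kind would reject the object from the original report, as well as objects carrying own getters or an own Symbol.match method.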

But this one is purely a JavaScript semantic bug.
Do you mean that we should assume any built-in function can loop forever if its input is not proper?
I thought that many people use JavaScript because it does not break the execution even if there is a type error, which is not true anymore.

... because it does not break the execution even if there is a type error, which is not true anymore.

I think, this has never been true.

Browsers already take various smart measures, not included in the language semantics, to protect coders against themselves, e.g. clamping timeouts to prevent bad code from draining batteries and ultimately contributing to global warming. But trust me, they will always find creative ways to shoot themselves in the foot.


About the issue itself. My opinion is that it is an interesting issue, but an edge case, like many other ones that necessarily arise in a language like JS where you can mutate and redefine virtually everything at runtime.

@schuay @hashseed @ljharb @claudepache Thank you for the good opinions on this issue.
I just hope that this issue will be resolved in a future version of ECMAScript, whether or not committee members consider it a security bug. (Maybe we can always use [[OriginalFlags]]. It is obviously not consistent.)

By the way, I've got the final version of the example code:

var o = /a/;
Object.defineProperty(o, "global", {value: true});
"a".match(o);

I don't believe this can be changed without impacting the current design for subclassing regexps, which explicitly intends to allow a regexp subclass to ignore its RegExp slots by overriding all of the lookup points - like .global.

In other words, there are many ways a misbehaving regex can create problems - o[Symbol.match] = function () { while(true); }; is another - so I'm not sure why this one is unique.
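
To illustrate that lookup point without actually hanging anything, here is a small sketch in which the looping function is swapped for a terminating stand-in. The `calls` counter and the returned array are made up for the demonstration.

```javascript
const o = /reg/;
let calls = 0;

// A terminating stand-in for the looping function mentioned above.
o[Symbol.match] = function (str) {
  calls += 1;
  return ["intercepted: " + str];
};

// String.prototype.match looks up Symbol.match on its argument and calls it,
// so the built-in regex matching never runs here.
const result = "regular expression!".match(o);
console.log(calls, result[0]);
```

Whatever the own Symbol.match method does, including never returning, becomes the behavior of "match" for that object.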

There have always been many ways a programmer can write code that results in an infinite loop. It has never been a goal of TC39 to make infinite loops impossible and only in a few places (I'm thinking of setting [[Prototype]]) have we tried to mitigate the possibility.

What is an important goal is that the spec. is precise enough that it clearly describes, for any snippet of source code, whether an infinite loop will happen.

@allenwb Thank you for the comment. Now it is very clear.

@mir597 is it ok to close this as answered?
