Ecma262: Access the `.next()` method once, at the beginning, in the iteration protocol

Created on 16 Aug 2017 · 21Comments · Source: tc39/ecma262

Note: all of the following applies to Arrays, Strings, Sets, and Maps but I'm using Arrays as an example for simplicity.

Edit: I'm withdrawing my first proposal in favor of the second. I'm leaving striked out for posterity.

Since %ArrayIteratorPrototype%.next does not specify any attributes, it is currently not-writable but is configurable. From an implementation standpoint, the fact that the next function can change each iteration of a loop makes it substantially harder to optimize for-of loops over Arrays. In particular, this makes writing optimizations harder. While it's not impossible to work around it substantially complicates the implementation.

I would propose changing the %ArrayIteratorPrototype%.next property to be non-configurable and non-writable. I would argue that changing the %ArrayIteratorPrototype%.next property adds little value as all syntactic iteration of Arrays would provide an opaque object anyway. A much simpler way to change the iteration of Arrays is to just change the Symbol.iterator property. In fact, I don't believe there is any loss of functionality by making %ArrayIteratorPrototype%.next property non-configurable and non-writable.

Another, perhaps more controversial, option would be to change how the syntactic iteration works to fetch the next property exactly once, before starting the loop. This is more likely to be a breaking change but would likely have the benefit of speeding up start up times for iterators. Like before I don't think any functionality is limited by getting the next property exactly once since it's possible to simulate changing the next function inside the next function itself.

Thoughts?

Source

kmiller68

👍3 🎉1 👎1

Most helpful comment

1 . We make breaking changes all the time. There is a policy of making them assuming there is sufficient value. Now, you might disagree there is sufficient value here but I have heard much evidence to the contrary. I know web developers have told me they are not permitted by style rules to use iterators in their code because it regresses their page load time v.s. indexed loops. Keep in mind the place where using iterators makes a major difference is not in the highest tier of the optimizing compiler, there is practically no difference there. In fact, in JSC we will not even allocate the iterator object or the result object most of the time. The place where this does matter is in the interpreter and/or non-optimizing JIT, which is where most of the execution time on page load is spent. As far as I know, almost every major website regularly tracks the performance of their page load time.

2 . By the same token, the way that Map/Set constructors were implemented then would be just as bad... Also, I highly doubt anyone would reasonably consider changing the next function on each iteration of the for-of loop so this feels like a red herring. Additionally, if for-of does not observe the fact that they changed the next function then why would anyone assume that it should change on each iteration? The only reason they would is because we define it that way. That's not how you defined the Map/Set constructors, for example.

4 . My goal here is to improve the performance of iterators in the interpreter not in optimizing compliers. While the spec should not favor builtin iterators at the expense of custom iterators, the reality is that builtin objects are use disproportionately in most JS applications. Any VM that did not make special optimizations for builtin operations, not even just limited to the interpreter, would be at a practical disadvantage to one that does. Keep in mind that all VMs try to optimize custom operations as much as possible. For what it's worth, I agree that in the long run it's actually to a VM's advantage to try to build a general solution to most problems. My goal here, ultimately, is to make iterating Arrays/Strings/Sets/Maps in our interpreter as fast as possible. Right now, without special optimizations in the JSC interpreter, iterating an Array/String is 6-7x slower than using an indexed loop. In the case of Arrays/Strings, ideally, it would be faster to use an iterator than using an indexed loop. Not making some change here makes that task much harder and takes time from making other, more general, optimizations.

Overall, I agree with your perspective that it's important to keep the language as internally consistent as possible. I disagree, however, that consistency should be the only factor that matters. I actually don't even think this change is internally inconsistent; @ajklein already has provided examples where in the JS spec differs from your claim of consistency. I think it's important for TC-39 to operate under the idea that as long as developer economics are not appreciably impacted it's valuable to enable as many optimizations outside optimizing compilers as possible. It's also important to make sure the language is as useful as possible. Historically, in the case of iterators, the death sentence for them has been that they are much slower than an indexed loop.

kmiller68 on 18 Aug 2017

👍3

All 21 comments

Both are potentially breaking changes. Of the two, I strongly prefer the second. Existing monkey-patches (aka shims, polyfills) of primordials are likely broken by the first and not broken by the second. There monkey patches generally do a static initialize-time replacement and do not change the methods dynamically, or indeed at all after initialization.

Note that the second resembles the proxy optimization that Tom and I proposed, that everyone claimed would not help implementations optimize.

erights on 16 Aug 2017

👍3

@erights I totally agree that both are potentially breaking changes that we cannot do if things actually break. I actually prefer the [[get]] of the next property once more than freezing the next property too. Also, that option works better in the general case for an interpreter.

kmiller68 on 16 Aug 2017

👍1

Besides the issue of potentially breaking existing code (which is probably not a big deal in this case), I generally like the idea. I suppose every engine has to jump through some hoops to guard against reconfiguration of the next methods at the moment (or just don't optimize aggressively).

bmeurer on 17 Aug 2017

👍2

cc @bzbarsky

annevk on 17 Aug 2017

I should point out that these issues were all discussed within TC39 during the development of ES6 and the current design represents the consensus decisions. At a minimum any proposal to change the design should review the past discussion (see, meeting notes and es-discuss)

allenwb on 17 Aug 2017

@allenwb As far as I can tell from greping es-discuss / meeting notes for "next" there's no mention of the next method being swizzled mid-stream. The closest thing are code examples happen to show the current semantics but they appear in unrelated discussions to what's being proposed here. There are, however, lots of discussions about the next methods returning new objects but that's unrelated. Do you recall a specific conversation about either of these proposals?

kmiller68 on 17 Aug 2017

I don't recall details of any specific conversation. It probably would have been about ArrayIterator prototype rather than specifically about the next methods. Might also have been on bugs.ecmascrpt.org (currently offline, but I recently heard that there is a chance of it coming back).

The pros/cons of mutable built-in prototypes have been discussed many times over the last 20 years. The conclusion has always been to maintain consistency with the status quo established by the early design of JS. In particular we (eg, the TC39 consensus) has generally be to avoid special case deviations from established patterns if they are solely motivated by speculation about performance or ease of optimization. (imagine if all sorts of accommodations and irregularities had been made in 2005 motivated by the state of the art of JS engines at that time).

In this particular case, how is the the optimization hazard of a mutable %ArrayIteratorPrototype% any different from the hazard of a mutable Array.prototype or String.prototype method used in a hot while loop or for that matter any prototype access that an implementation may wish to optimize. At the highest level of optimization complex statements such as for-of are decomposed into primitive operations to which the actual optimization analysis/techniques are applied. User defined objects and the built-ins are always mutable by default so optimizations must always guard against and deal with mutations that invalidate caches or observations that drive dynamic specializations.

A similar related argument: for-of may be used with user defined iterators as well as built-in automatically generated iterators. Why should the built-in iterators be handled any differently from user defined iterators. Why should they perform better? Rather than adding special case runtime checks to see if an iterator is an Array Iterator isn't it better for an implementation to develop techniques that work for all iterators.

allenwb on 17 Aug 2017

@kmiller68 said:

Since %ArrayIteratorPrototype%.next does not specify any attributes, it is currently not-writable but is configurable.

This is incorrect. Clause 17 says:

Every other data property described in clauses 18 through 26 and in Annex B.2 has the attributes { [[Writable]]: true, [[Enumerable]]: false, [[Configurable]]: true } unless otherwise specified.

%ArrayIteratorPrototype%.next is a data property defined in clause 22.1.5.2.1 and does not explicitly state any deviations from the normal built-in method/data property attribute settings.

allenwb on 17 Aug 2017

@allenwb your argument above seems addressed at one of @kmiller68's proposed changes (making next non-configurable) but not the other (only fetching next once at the start of a for-of loop). I don't believe there's any deep JS history about that question; do you recall discussion about that issue?

An advantage of that approach is that it also makes it easier to optimize the author-defined iterators (addressing your concern in your last paragraph). There's also some precedent in ES2015 for such behavior, e.g., the Map and Set constructors, which only lookup the set or add methods once before looping over the passed-in iterable.

ajklein on 17 Aug 2017

@kmiller68 said:

As far as I can tell from greping es-discuss / meeting notes for "next" there's no mention of the next method being swizzled mid-stream.

Pretty much everything in JS is swizzle-able across any operation that has the potential of invoking a function (pretty much any operation, given the existence of Proxy). This was generally understood by TC39 members during the development of ES2015 and since the possibility of swizzling is the norm it would be unremarkable for it to not be explicitly mentioned. If there was a discussion (and I pretty sure there was) it would of been about "should we try to prevent such swizzling" rather `should we enable such swizzling".

@ajklein I don't see how making it non-configurable helps since as I pointed out above, %ArrayIteratorPrototype%.next is writable.

So how does changing the attributes of %ArrayIteratorPrototype%.next make it easier to optimize author-defined iterators?

the Map and Set constructors, which only lookup the set or add methods once before looping over the passed-in iterable.

Ah, but they lookup next in the iterator each time through the loop. Admittedly, that is an internal inconsistency. Generally, for built-ins such as these constructors I assumed that they might be implemented in a manner that was opaque to aggressive interprocedural optimization. So I did the obvious optimizations at the spec. levels. I probably didn't do it for next because it would have required inlining two levels of iterator related abstract operations which would have generally obscured what was going on.

allenwb on 17 Aug 2017

@allenwb The conversation I think we aught to have is about whether or not we look up "next" once or on every iteration. Your previous comments seem to be speaking more to the proposal of making "next" non-configurable, non-writable, on builtin iterator prototypes. I agree that such a change would generally be bad as it would be inconsistent with the rest of JS. But I think looking up "next" only once is not bad, and worth a serious discussion, even if it were implicit in earlier conversations.

saambarati on 17 Aug 2017

I agree with Saam. At this point I think the consensus is that everyone prefers the look up of next once proposal over changing the builtin next properties. Are you opposed to considering the lookup next once spec change @allenwb?

Ah, but they lookup next in the iterator each time through the loop. Admittedly, that is an internal inconsistency. Generally, for built-ins such as these constructors I assumed that they might be implemented in a manner that was opaque to aggressive interprocedural optimization. So I did the obvious optimizations at the spec. levels. I probably didn't do it for next because it would have required inlining two levels of iterator related abstract operations which would have generally obscured what was going on.

I think I would have disagreed with making the optimization of pulling next out of the loop if it didn't match what for-of and spread did, for internal consistency reasons. FWIW, if we did the lookup next once change in for-of / spread I would also change the Map and Set constructors. I made a rough diff of what the spec changes could look like here.

kmiller68 on 17 Aug 2017

On Thu, Aug 17, 2017 at 10:45 AM, Keith Miller notifications@github.com
wrote:

@allenwb https://github.com/allenwb As far as I can tell from greping
es-discuss / meeting notes for "next" there's no mention of the next method
being swizzled mid-stream.

What do people in this thread mean by "swizzle"? (It is certainly not what
I thought the word meant)

The closest thing are code examples happen to show the current semantics
but they appear in unrelated discussions to what's being proposed here.
There are, however, lots of discussions about the next methods returning
new objects but that's unrelated. Do you recall a specific conversation
about either of these proposals?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/tc39/ecma262/issues/976#issuecomment-323144663, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAQtzKyd_DsDTRUVDp5LA5QRfo3dlWMsks5sZHwdgaJpZM4O5UR7
.

--
Cheers,
--MarkM

erights on 17 Aug 2017

@erights By "swizzle" I meant change the value of a builtin method. Specifically, changing the iterator's next property during a for-of loop to alter the way the next value is produced for the next iteration of the loop.

kmiller68 on 17 Aug 2017

I agree with Saam. At this point I think the consensus is that everyone prefers the look up of next once proposal over changing the builtin next properties. Are you opposed to considering the lookup next once spec change @allenwb?

Yes I oppose such a change.

1) It is a breaking change. Consider:

let itr = {
  [Symbol.iterator]() {return this}, 
  next() {
   console.log("original next");
   return {value:undefined};
 }
};
let ctr=0;
for (let x of itr){
    if (ctr++ >= 5) itr.next = () => {
     console.log("replacement next");
     return {done:true}
    }
}
console.log("loop done");

Currently (I tested it) the loop runs 6 times, outputting "original next" 5 times and "replacement next" once. With the suggest spec. change this is an infinite loop, outputting "original next" each iteration.

2) Regardless of whether this particular code snippet is good style or not very useful, it is understandable with only a basic understanding of iterables and the for-of statement. Basic understanding: for-of calls the next method of the iterator before each loop iteration and terminates if the next returns an object whose done property has the value __true__. The proposed change would make it impossible to understand the above code within having a much deeper understanding of detailed for-of semantics (the next method of the iterator is only accessed once and its value is cached until the loop terminates.

3) It is a refactoring hazard that must be considered anytime a for-of loop is transformed to/from any other control abstraction (perhaps an application defined iteration function) that uses an iterator.

4) Are there any measurements that suggest that this change would make a difference in any real, non-synthetic benchmark code.

5) Multiple property lookups of the next method should be optimizable using a PIC. It is the same situation as any other property access on an object with a stable shape and stable method values. Implementations should focus on optimizing the general case of such property accesses rather than worry about this one specific occurrence.

The above is basically a sample of the sort of thing that was discussed many times about many different details of the existing language design. Basically: keep things internally consistent, don't be tricky, do the obvious thing, leave optimization to the engines.

allenwb on 18 Aug 2017

kmiller68 on 18 Aug 2017

👍3

I think this should be advanced as a proposal or needs-consensus PR. Any volunteers to write one up?

bterlson on 1 Sep 2017

👍2

I think that making something non-configurable / non-writable can cause a bad precedent which will cause a serious problem for polyfills in the future. If you wanna optimize for-of - you could check not only that iterator is an instance of ArrayIterator / [Symbol.iterator] of iterable is Array#values, but also that .next iterator method is original ArrayIterator#next method / ArrayIterator#next without changes. 1 additional property access is not something critical.

zloirock on 1 Sep 2017

I would much prefer the property be accessed once and cached, than made unchangeable - for polyfill reasons particularly.

I think the likelihood of "changing the next method mid-iteration" is significantly low, because it's so unergonomic to do so.

Is there any data as to who would be broken by either change? My assumption is that "grab .next once" won't actually break any real code, even if it breaks contrived examples.

ljharb on 1 Sep 2017

Again reusing ideas from https://github.com/tvcutsem/es-lab/issues/21 , a similar change that would be almost unobservable is:

If the .next property is a non-writable non-configurable data property and either
* it is an own property, or
* it is inherited by a non-extensible inheritance chain that does not override it,
then the for-of loop safely fetches it only once.

This change would only be observable if the involved objects are proxies are extremely exotic. Otherwise it could not be observed, and so should break essentially no code.

erights on 2 Sep 2017

👍1

This is addressed by #988/#1021.

bterlson on 9 Jul 2018

Was this page helpful?

0 / 5 - 0 ratings