Ckeditor5: Generators considered harmful - get rid of those that are not needed

Created on 19 Mar 2020  路  5Comments  路  Source: ckeditor/ckeditor5

Provide a description of the task

From time to time I stumble upon a bug which seems weird at first and then it becomes clear that it is a problem with a generator.

Let's face it. Our code has a lot of methods that return generators instead of arrays without specific reason. For example, DocumentSelection#getRanges returns a generator. There are a lot of getters like that in the code.

What are the problems with them?

Well, a lot of times it is not obvious that you deal with them if you haven't looked into documentation or you didn't get deep into code debugging. Why? For two reasons: first - you simply expect to be returned an array. We are all used to working on arrays. Second - in the most common case - single loop through values - they behave the same.

What problems do we have with generators?

1. They are finished after one loop. Consider following snippet:

for ( const range of ranges ) {
    for ( const selectionRange of selectionRanges ) {
        doSomething( range, selectionRange );
    }
}

You'd assume that doSomething will be called for each pair of range and selectionRange. Wrong! If selectionRanges is a generator, you will iterate through it only once. So, doSomething will be called only for the first element in ranges. Of course, you can go with ... of selection.getRanges() but the problem starts when the ranges are passed from a function to function. Or you had other reason to assign it to a variable.

2. Generator is "live" -- the array is not "cached".

If you use a generator to loop through some items and in the loop you do something that have an impact on those items, the generator "gets updated". For example, let's say that we want to remove certain children from an element:

for ( const child of element.getChildren() ) {
    if ( condition ) {
        writer.remove( child );
    }
}

If the generator does not cache values itself, we might have a problem. For example, we might skip some values.

3. Generators do not have a lot of useful methods that are available in arrays or even sets.

The two above points and this one leads to using Array.from() everywhere in the code. We convert generators to arrays most of the time anyway, so why use them... It makes the code more obfuscated.

It's also funny how difficult it is to get the first or the last value from a generator...

4. Efficiency gain is negligble.

Because usually we convert to arrays anyway and a lot of features need to loop through all the items and because our collections are relatively small, I think that any efficiency gain here is negligble.

5. When to use generators?

Generators are good for some scenarios. One of them is having a stream or an infinitely large data source. You can't "cast" that to an array. You can use generator then to stop working on that data source when needed.

You can also use them to implement iterable interfaces in your classes.

And probably there are some sophisticated usages. For example, in one of our algorithms we "merge together" two iterators - two tree walkers. We get values from them alternating - one value from the first one, one value from the other one. The tree walkers have different amount of values. And, in this case, we might actually deal with a lot of elements. So, this is a good place to use generators.

dx dx refactor task

Most helpful comment

I believe that it makes some sense that a generator is used in a TableWalker or TreeWalker but in a lot of cases like .getItems() or .getRanges() it brings more harm than good (well it doesn't bring any good).

I mean, when you use walker you kind of feel that you are iterating over a "living" structure but when you do .getRanges() you expect to get something "stable".

All 5 comments

At this very moment I am changing a handful of selection.getRanges() from used as a generator to used as an array because of some unexpected results in track changes (selection is being changed while ranges are processed -- for ( const range of Array.from( selection.getRanges() ) works correctly).

Of course, it is hard to figure out from just looking at the code because selection is changed somewhere in a function two or three levels deeper than the loop. And of course, at the first sight, you have no idea that you are iterating over a variable that may change (not a stable array).

I've just figured out that the same problem was/is in TableWalker - you have to remember taking a snapshot with Array.from() when modifying table structure using walker.

I believe that it makes some sense that a generator is used in a TableWalker or TreeWalker but in a lot of cases like .getItems() or .getRanges() it brings more harm than good (well it doesn't bring any good).

I mean, when you use walker you kind of feel that you are iterating over a "living" structure but when you do .getRanges() you expect to get something "stable".

MarkerCollection#getMarkerGroup is a generator for no reason.

Again, I had to fix a test because selection.getRanges() was used.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

devaptas picture devaptas  路  3Comments

Reinmar picture Reinmar  路  3Comments

benjismith picture benjismith  路  3Comments

MCMicS picture MCMicS  路  3Comments

metalelf0 picture metalelf0  路  3Comments