With a data set of 2 million rows, it took the Refine server 50 seconds to compute the 65K text facet choices, but then my browser took over 4 minutes to render the list containing them.
I'm pretty sure this is at least O(n**2) if not higher. One immediate problem I see is that the list is re-sorted every time a new entry is added, rather than once at the end.
Removing the facet with 65K choices took 11 minutes so there's obviously another problem on that end of things.
We might want to consider doing the sorting server side for large facets, but that would require re-architecting some stuff, so for right now I'll just try to fix the most egregious performance issues.
Pushing to post-2.6
Seems like it's possible that some of the insertion from jQuery is done live in the DOM instead of creating a dom fragment or otherwise building the nodes outside of the dom and doing a single insert.
I've done some optimization work on this (and would appreciate additional help from any jQuery pros), but have come to the conclusion that a fundamentally different approach is required. It's been a topic of interest with a couple different clients, but hasn't reached the threshold of pain to actually get implemented, but I've got an approach in mind that I'll implement when I've got a few spare cycles.
Or just don`t render them at all, filter and show only the top 10, etc
This isn't actually a bug, just a performance issue, so removing the tag.
Following the report on #2302 I had a look at some of the code here but didn't make a huge amount of progress, but just to make a note here:
https://github.com/OpenRefine/OpenRefine/blob/03c860d9612f052d2506eb572183c0a5df690e5f/main/webapp/modules/core/scripts/facets/list-facet.js#L328-L329
Based on https://jsperf.com/innerhtml-vs-removechild/15 this can be replaced with
while (this._elmts.bodyInnerDiv.lastChild) {
this._elmts.bodyInnerDiv.removeChild(this._elmts.bodyInnerDiv.lastChild);
}
or similar and this would substantially improve performance - some basic testing suggests does work (although not sure if there are browser compatibility implications)
However, this doesn't resolve the overall performance issues as there seems to be another problem with line 399:
this._elmts.bodyInnerDiv.html(html.join(''));
which also seems to take a substantial amount of time. I did look at performance issues around jquery .html()[1] but I'm not sure this is the issue
It feels like performance improvements are possible here - but not sure how at the moment, and whether worth doing outside a more substantial re-working of how the UI does facets.
Based on https://jsperf.com/innerhtml-vs-removechild/15 this can be replaced with
while (this._elmts.bodyInnerDiv.lastChild) { this._elmts.bodyInnerDiv.removeChild(this._elmts.bodyInnerDiv.lastChild); }
That jsperf shows that the proposed update is 30X faster (!) so the 18 second time for 13K facets would go to less than a second. That's a pretty dramatic improvement that it seems like we should take while looking for other improvements.
Most helpful comment
That jsperf shows that the proposed update is 30X faster (!) so the 18 second time for 13K facets would go to less than a second. That's a pretty dramatic improvement that it seems like we should take while looking for other improvements.