Plotly.js: Improve implementation of computeFrames

Created on 12 Oct 2016 · 5 comments · Source: plotly/plotly.js

extendDeepNoArrays was a simple way to prevent the frame logic from doing a deep extend on data arrays. The assumption was that array attributes were a negligible concern. Transforms are a notable exception and should be deep-extended when computing frames. That can be handled with a special case. If this turns out to be a common situation, though, then the attributes should be extended with proper knowledge of the plot schema.
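To make the failure mode concrete, here's a minimal sketch (the merged output shown in the comments is my reading of how a no-arrays extend behaves on this input, not output from the actual frame code):

```js
var Lib = require('../lib');  // inside the plotly.js source tree

var base = {
    x: [1, 2, 3],  // data array: swapping by reference is exactly what we want
    transforms: [{type: 'filter', operation: '>', value: 0}]
};
var update = {
    transforms: [{value: 5}]  // intent: tweak a single attribute of transform 0
};

// extendDeepNoArrays treats *every* array as opaque, so `transforms` is
// assigned wholesale rather than merged per item:
var merged = Lib.extendDeepNoArrays(Lib.extendDeepNoArrays({}, base), update);
// merged.transforms -> [{value: 5}]       (type and operation are lost)
// desired           -> [{type: 'filter', operation: '>', value: 5}]
```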

@monfera, I know you've dealt with very similar issues when implementing transforms. Is merging attributes with knowledge of the plot schema a problem you've already solved elsewhere?

cc: @etpinard

Labels: discussion needed, bug, maintenance

All 5 comments

groupBy uses the schema crawler, which got reworked lightly to fit the purpose. Now the callback function, which @etpinard moved into coerce.js, tells attributes apart based on a simple condition.

If I understand, you're planning an extendDeepExceptSomeArrays which doesn't simply do an Array.isArray check but uses the above logic. (Side note: Array.isArray(new Float32Array(10)) === false, but we don't have typed arrays yet except in pointcloud.)

Would it work or did I perhaps misunderstand what you seek?
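(A concrete illustration of that side note, plus one standard way a predicate could catch typed arrays; `isDataArray` is just a hypothetical helper name:)

```js
var buf = new Float32Array(10);

Array.isArray(buf);       // false -- typed arrays are not Array instances
ArrayBuffer.isView(buf);  // true  -- matches Float32Array, Uint8Array, etc.

// So an extendDeepExceptSomeArrays predicate that must survive typed arrays
// can't rely on Array.isArray alone:
function isDataArray(v) {
    return Array.isArray(v) || ArrayBuffer.isView(v);
}
```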

Other thoughts:

The groupBy I iterated on wasn't particularly efficient in that there's a three-level loop nesting, partly because there's unnecessary reiteration per group over the array (it could be done in one pass for all groups), i.e. it would benefit from an efficiency refactoring. But it's useful to consider what's most efficient overall, for the entire system, before doing peephole optimization.

Continuing with this question: if the array size is small, i.e. up to the OOM 0.1k-1k items the DOM can handle, efficiency doesn't matter much. If the array size is large, chances are we're in WebGL territory, where arrays can be large indeed: up to around 1m vertices, i.e. ndarray buffer lengths in the millions. As we're discussing and playing around with this topic, including your fancy version, I think we all agree that GL animation is ideally done in the shaders, although another example shows that an array buffer update can also work, albeit with more jank and a lower FPS.

So, given that we're heading toward faster initialization times to solve other outstanding issues, there's an eventual plan to let other WebGL charts accept direct typed array input; and the animation data (e.g. the target state) may eventually also be uploaded into a vertex buffer, similar to the marker tweening example above or the regl-tween you mentioned. If we want to be fast, avoiding array copies and minimizing new array allocations help a lot. It's purely speculative, but we could take the mentioned dual vertex buffer approach for tweening, or even upload the entire sequence and control where we are in the animation with uniforms. For example, there could be N keyframes, and the shader code could smoothly interpolate between the two currently adjacent keyframes at any time, using smootherstep, Catmull-Rom, a choice of easing functions or a primitive physics model.
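A rough sketch of that last idea, with plain JavaScript standing in for the shader code (the keyframe layout here is invented for illustration):

```js
// Smootherstep easing (Perlin's 6t^5 - 15t^4 + 10t^3): zero first and second
// derivatives at t = 0 and t = 1, so keyframe transitions have no visible kink.
function smootherstep(t) {
    return t * t * t * (t * (t * 6 - 15) + 10);
}

// Sample one vertex component from N keyframes at global time u in [0, 1].
// In the shader version, `keyframes` would live in a vertex buffer (or
// uniforms) and `u` would be a single uniform updated per animation tick.
function sampleKeyframes(keyframes, u) {
    var n = keyframes.length - 1;
    var i = Math.min(Math.floor(u * n), n - 1);  // lower of the two adjacent keyframes
    var t = smootherstep(u * n - i);             // eased local position between them
    return keyframes[i] * (1 - t) + keyframes[i + 1] * t;
}

sampleKeyframes([0, 10, 5], 0.75);  // -> 7.5, halfway (eased) between frames 1 and 2
```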

In such a case, the larger (10k..1m OOM) arrays would ideally use either positional encoding of the frames (e.g. [indexFrom..indexTo] denotes one specific keyframe, while all keyframe geometry lives in the same single array) or attribute encoding, e.g. a separate byte array of the same length as the geometry array that identifies which keyframe each piece of geometry belongs to. Either way, there's no splitting or copying of the input array, and it's still just one gl.bufferData transfer call.
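For the positional-encoding variant, a sketch of what a single upload might look like (the buffer sizes and attribute location are made up for illustration; only the gl.* calls are real WebGL API):

```js
var gl = document.createElement('canvas').getContext('webgl');

var vertexCount = 1e5;              // OOM where copying starts to hurt
var numFrames = 8;
var frameLength = 2 * vertexCount;  // x, y per vertex, say

// All keyframes concatenated: frame k occupies
// [k * frameLength, (k + 1) * frameLength) -- no splitting, no copies.
var allFrames = new Float32Array(numFrames * frameLength);
// ... fill allFrames from the input data ...

// One transfer call for the whole animation sequence:
var buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, allFrames, gl.STATIC_DRAW);

// Per tick, select a keyframe by byte offset instead of re-uploading:
var positionLoc = 0;  // from gl.getAttribLocation in real code
var k = 3;            // current keyframe index
gl.vertexAttribPointer(positionLoc, 2, gl.FLOAT, false, 0,
    k * frameLength * Float32Array.BYTES_PER_ELEMENT);
```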

Sorry if this all sounds like a bit of a rehash, as we've chatted about these topics before. You're probably looking for a more immediate solution that fits into what you've already developed for animations, in which case maybe the 'Other thoughts' section isn't relevant to your question.

> If I understand, you're planning an extendDeepExceptSomeArrays which doesn't simply do an Array.isArray check but uses the above logic.

Correct! Goal: avoid deep merges of data arrays but permit deep merges of things like arrays of transforms.

And thanks for the additional thoughts! As a result, and before going further, I'm going to turn my extendDeepNoArrays into an extendDeepSureWhyNotArraysToo and check the performance. There's a good chance that this deep extend, though unsatisfyingly inefficient, won't be a limiting factor until long after the number of data points has pushed SVG rendering past 16ms/frame.
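(Something like the following quick-and-dirty check, assuming Lib.extendDeep is in scope; the 1e5-point trace is an arbitrary size for the experiment:)

```js
var Lib = require('../lib');  // inside the plotly.js source tree

// Build a trace big enough that a full deep copy could plausibly matter:
var trace = {x: [], y: [], transforms: [{type: 'filter', operation: '>', value: 5}]};
for(var i = 0; i < 1e5; i++) {
    trace.x.push(i);
    trace.y.push(Math.sin(i));
}

// extendDeepSureWhyNotArraysToo is just extendDeep; time it against the
// 16ms/frame budget mentioned above:
console.time('deep extend incl. arrays');
Lib.extendDeep({}, trace);
console.timeEnd('deep extend incl. arrays');
```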

A general principle I've been aiming for is to avoid adding _any_ additional iteration through the full data, but if WebGL needs much more careful handling anyway, then avoiding iteration here is probably just poorly-done premature optimization.

I don't know the answer yet, but thanks for bringing up these concerns!

Love the name extendDeepSureWhyNotArraysToo, and one reason to like it (or a distinct name in general) is that when working on certain client-driven tickets covering initial display and Plotly.relayout updates, we were forced to use extendDeepNoArrays to get the desired performance characteristics. For example, when emitting an event once a (re)plot is done, the data are passed along in the event, and deep-copying huge arrays became the hot spot after a couple of rounds of other optimizations.

(which, to be clear, is just extendDeep once you strip out the snarkiness 😄)

The tentative conclusion is that a less general solution is probably best here. I wanted to avoid handling lots of special cases, but after talking with @etpinard, it sounds like transforms is currently the only container array in traces, so it's probably not accruing much technical debt to stay with extendDeepNoArrays and manually recurse into transforms. Other cases will likely arise, but this shouldn't be difficult to maintain if clearly annotated, and it's an order of magnitude easier than a more general solution.
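For the record, one shape the special case might take (a sketch of the manual recursion, not the actual patch; it assumes Lib.extendDeepNoArrays as the base helper and `extendTraceFrame` is a hypothetical name):

```js
var Lib = require('../lib');  // inside the plotly.js source tree

// Merge a frame's trace update into a destination trace: shallow-swap all
// arrays (data arrays included), then redo `transforms` -- the one known
// container array -- with a per-item deep merge.
function extendTraceFrame(destTrace, srcTrace) {
    var prevTransforms = destTrace.transforms || [];

    Lib.extendDeepNoArrays(destTrace, srcTrace);

    // After the pass above, destTrace.transforms is srcTrace.transforms by
    // reference; rebuild it so partial updates keep existing attributes:
    if(Array.isArray(srcTrace.transforms)) {
        destTrace.transforms = srcTrace.transforms.map(function(transform, i) {
            var out = Lib.extendDeepNoArrays({}, prevTransforms[i] || {});
            return Lib.extendDeepNoArrays(out, transform);
        });
    }
    return destTrace;
}
```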
