The ordering of a categorical axis is currently determined by the order of values supplied in traces. It would be useful if we could override this ordering in the layout.
I primarily need this for our ggplot2 converter, as we have no way to guarantee ordering of the axis when there are missing values in the first trace (as in https://github.com/ropensci/plotly/issues/242, https://github.com/ropensci/plotly/issues/399, https://github.com/ropensci/plotly/issues/379)
Right, I was thinking about adding two axis attributes: categorymode and categories:
categorymode would be an enumerated value with possible values 'alpha increasing', 'alpha decreasing', 'array' and maybe more ... With 'array', users could then set a fully custom category order with categories (from @chriddyp):
eventually the min-max -> alphabetical -> natural sorting would be a really nice interactive addition to the bar charts
Looking into this.
Here are my initial thoughts in the form of (obviously, currently failing) test cases. Basically, the category order spec would override the order in which the points were supplied. Glad to receive any feedback on whether it's the right track. https://github.com/plotly/plotly.js/compare/master...monfera:189-ordinal-scale-domain-item-ordering?diff=unified&expand=1&name=189-ordinal-scale-domain-item-ordering
To clarify on ordering: for simplicity, my plan is to rearrange the data points according to the category order spec (if supplied), arriving at a trace order different from what would follow from the user-supplied point series.
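The rearrangement described here could look roughly like the sketch below. It is only an illustration: `reorderPoints` is a hypothetical helper, not a plotly.js function, and sending categories that are missing from the order spec to the end is an assumption.

```javascript
// Hypothetical helper (not part of plotly.js): reorder one trace's points
// so that they follow a supplied category order.
function reorderPoints(x, y, categoryOrder) {
  // Rank of each category in the requested order.
  const rank = new Map(categoryOrder.map((c, i) => [c, i]));
  // Assumption: categories absent from the spec are placed after all listed ones.
  const rankOf = (c) => (rank.has(c) ? rank.get(c) : categoryOrder.length);
  // Pair x with y so that both move together when sorting.
  const points = x.map((xi, i) => ({ x: xi, y: y[i] }));
  // Array.prototype.sort is stable, so ties keep their supplied order.
  points.sort((p, q) => rankOf(p.x) - rankOf(q.x));
  return { x: points.map((p) => p.x), y: points.map((p) => p.y) };
}

const reordered = reorderPoints(['b', 'a', 'c'], [2, 1, 3], ['a', 'b', 'c']);
console.log(reordered.x, reordered.y);
```

Because x and y are sorted as pairs, the Y values stay attached to their categories, which is the part that a sort of the category list alone does not give you.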
However, if adjacent points are linked via lines or curves, then in general, there is a difference between
It might eventually be desirable to reorder the columns without impacting the tracing order. As a visual analogy, consider something like the snail trail chart, e.g. http://southpoint.frbatlanta.org/.a/6a011572565d3f970b0120a540feff970b-pi
Again, I'm not planning to make this distinction in this CR as I see no requirement for it and its utility is questionable, but wanted to expose the issue for future consideration.
@monfera
First, categorymode should be a cartesian axis attribute. That way axes that host several traces with overlapping categories can order the categories as a whole.
Resolving the issue should:
categorymode and categories. categorymode should be an _enumerated_ value type with possible values 'alpha increasing', 'alpha decreasing' and 'array'. Here is an example of enumerated value type.
axis_defaults.js if and only if the axis type is set or auto-typed to 'date'. The defaults will have to be _smart_ similar to how tickmode, tickvals and ticktext interact.
categorymode and categories attributes.
@etpinard yes, I started off with them as seen in the test cases. Added a couple more categorymodes: https://github.com/monfera/plotly.js/blob/189-ordinal-scale-domain-item-ordering/src/plots/cartesian/layout_attributes.js#L449-L474
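As a rough sketch of how the three proposed categorymode values might resolve into a final category list: `orderCategories` is a hypothetical name, the fallback to encounter order mirrors the current behaviour described at the top of the thread, and appending unlisted categories after the 'array' list is an assumption rather than an agreed rule.

```javascript
// Hypothetical sketch (not plotly.js code): resolve the category order
// for an axis from the categories seen in its traces.
function orderCategories(seen, categorymode, categories) {
  switch (categorymode) {
    case 'alpha increasing':
      // Default sort compares strings by UTF-16 code units.
      return seen.slice().sort();
    case 'alpha decreasing':
      return seen.slice().sort().reverse();
    case 'array': {
      const list = categories || [];
      // Assumption: listed categories come first, in the given order,
      // followed by any seen-but-unlisted categories in encounter order.
      return list.concat(seen.filter((c) => list.indexOf(c) === -1));
    }
    default:
      // No mode given: keep the order in which categories were encountered.
      return seen.slice();
  }
}

console.log(orderCategories(['b', 'a', 'c'], 'alpha decreasing'));
console.log(orderCategories(['b', 'a', 'c'], 'array', ['c', 'a']));
```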
Step 3 hasn't achieved the desired effect. Doing some insertion sort at the suggested point makes ax.d2c return incorrect array indices (as subsequent insertions may cause array element shifts, making previously returned indices out of sync). As an alternative, I sorted right before the loop ax.d2c is called from, but it doesn't reorder the Y values into the corresponding order, i.e. the wrong Y values are plotted. I'm in the process of identifying what the best point is for the sort, it might need to be an earlier point. I'll also consider the corresponding Y value reordering in ax.makeCalcdata as it's closest in the call stack to the point you specified.
Currently, the use of ax.d2c() 'locks in' the index corresponding to the categorical value (which it can do now, as [].push() doesn't disturb previous point indices), and this little function is being passed around subsequently. However, unless told otherwise, we need to handle situations like this:
Plotly.newPlot('embedded-graph', [
    // first trace has no 'c' and no 'g'
    {x: ['a', 'b', 'd', 'e', 'f'], y: [100, 110, 130, 140, 150]},
    // second trace has no 'b' and no 'd'
    {x: ['a', 'c', 'e', 'f', 'g'], y: [101, 121, 141, 151, 161]}
], {
    xaxis: {type: 'category'}
});
I'll figure out something that represents a later mapping from values-to-indices than what seems to be implemented now, because we can no longer rely on indices found with [].indexOf() to remain unchanged. A single additional line overlay can insert a new category and the previous order indices become invalid.
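One way to picture such a later mapping is a two-pass approach: first collect every category from all traces, then fix the order, and only then hand out indices. The sketch below uses the two traces from the snippet above; `buildCategoryIndex` and `toIndex` are hypothetical names and this is not how ax.d2c is implemented.

```javascript
// Hypothetical two-pass sketch (not plotly.js code).
function buildCategoryIndex(traces, compare) {
  // Pass 1: collect categories from every trace in encounter order.
  const seen = [];
  traces.forEach((trace) => {
    trace.x.forEach((v) => {
      if (seen.indexOf(v) === -1) seen.push(v);
    });
  });
  // Pass 2: fix the final order once, after all traces are known.
  const categories = compare ? seen.slice().sort(compare) : seen;
  const index = new Map(categories.map((c, i) => [c, i]));
  // Indices are handed out only now, so later traces cannot shift them.
  return { categories: categories, toIndex: (v) => index.get(v) };
}

const traces = [
  { x: ['a', 'b', 'd', 'e', 'f'], y: [100, 110, 130, 140, 150] },
  { x: ['a', 'c', 'e', 'f', 'g'], y: [101, 121, 141, 151, 161] }
];
const alphabetical = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

const encounter = buildCategoryIndex(traces);
const sorted = buildCategoryIndex(traces, alphabetical);
// 'd' moves from position 2 to position 3 once 'c' from the second trace is sorted in.
console.log(encounter.toIndex('d'), sorted.toIndex('d'));
```

The shift of 'd' is exactly the kind of index change that makes an index locked in while reading the first trace unreliable.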
Btw. the above snippet illustrates how the tick order and trace order are two separate concepts even after this CR is done (this image is obv. the current situation: tick order is in order of encountering new categories; trace order is whatever is specified in the x/y arrays):

We could either apply the same sorting for the ticks and the line tracing, or (eventually) allow independent configuration for each. A decision needs to be made on this as, in my understanding, the very point of this CR is to give ordering guarantees even if the order of the supplied input is arbitrary.
:arrow_double_up: I have a PoC for categorymode === 'array' - it's easier because the final ordering is known ahead of time. It renders as expected:

Minor comment: an old commit explicitly deletes an axis layout attribute called categories:
delete ax.categories; // replaced by _categories
For safety, and to avoid confusing it with _categories, I renamed the new attribute categorylist - if the deletion line is obsolete, it's an easy switch back.
@monfera Looks like you're already pretty far along. Nice :beers:
Would you mind making a PR to this repo with what you've got already? This would make the comment & review process easier. Thanks!
@etpinard Here's changeset https://github.com/monfera/plotly.js/pull/1 with these two parts:
categorymode = 'array'. The 'array' mode implementation is simple, and it alone might already be an acceptable solution to the problem described by @cpsievert in which case I'm glad to progress it into a PR (coercing isn't added and test cases don't work yet), which could be followed by a subsequent PR for the other options which are a bit trickier. Looking forward to guidance and input. Glad to hop on a call or chat if that's more efficient.
@etpinard I wrote on the distinction between tracing order and (X) axis tick order. Just found that one of the most popular public charts demonstrates an analogy to it:
https://plot.ly/~RhettAllain/1487/foot-and-head-trajectories/

This specific example uses linear axis scales, but in theory it's possible to put together a chart that has categories on one axis, yet is of this snail trail type. Just showing this example to be explicit about how the above PR is only about the axis tick order and does NOT reorder the traces themselves. The CR description from @cpsievert mentioned the problem with gaps, rather than a problem with totally unpredictably ordered input points in the trace.
Personally, I don't need anything that fancy (yet?). At this point, I would settle for a way to do the following:
category_tuples = [(0, 'first_name'), (1, 'second_name'), (2, 'third_name')]
I'd happily do my plots with [x[0] for x in category_tuples] for the order & spacing, if I can then swap in or overlay [x[1] for x in category_tuples] for the labels.
Right now I don't think it's even possible to do that.
Having said that, it would be great if it were also possible to do discontinuous axis spacing with this, i.e. if my x-axis is going to look like this:
[-50, -40, -30, -20, -10, 0,1,2,3,4,5,6,7,8]
and then swap in labels for display purposes after the ticks are positioned.
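A possible approximation of this request, sketched in JavaScript for consistency with the rest of the thread: plot against the numeric positions on a linear axis and swap in labels through the tickmode, tickvals and ticktext axis attributes mentioned earlier. `axisFromTuples` is a hypothetical helper, and whether this covers the use case above is an assumption.

```javascript
// Hypothetical helper: turn [position, label] pairs into axis settings
// that place ticks at the positions and display the labels instead.
function axisFromTuples(categoryTuples) {
  return {
    type: 'linear',
    tickmode: 'array',
    tickvals: categoryTuples.map((t) => t[0]), // where ticks are positioned
    ticktext: categoryTuples.map((t) => t[1])  // what is displayed at each tick
  };
}

const categoryTuples = [[0, 'first_name'], [1, 'second_name'], [2, 'third_name']];
const layout = { xaxis: axisFromTuples(categoryTuples) };
console.log(JSON.stringify(layout));
// The layout could then be passed along with traces whose x values are the
// numeric positions, e.g. Plotly.newPlot('embedded-graph', data, layout).
```

Since the positions are plain numbers, uneven spacing such as -50, -40, ..., 0, 1, 2 would follow directly from the values placed in the first element of each pair.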