Plots.jl: heatmap automatically skipping tick labels

Created on 2 May 2017 · 21 Comments · Source: JuliaPlots/Plots.jl

Hi, I want to have all the tick labels displayed in my heatmap. This didn't happen before, but now some tick labels are skipped once there are 40 columns or rows. Here's a sample.

using Plots
plotlyjs()
default(size=(800,800), leg=true)

n = 40
xs = [string("x",i) for i = 1:n]
ys = [string("y",i) for i = 1:n]
z = float(1:n) * float(1:n)'
heatmap(xs,ys,z,aspect_ratio=1, xrotation=90)

If you change the "n" to 30, then all the ticks are displayed. This is true for pyplot() backend as well. How do I display all tick labels?

Thanks,

All 21 comments

This is not only related to heatmap, but happens for all plot commands

plot(rand(30), xticks = 1:30)

works, while

plot(rand(31), xticks = 1:31)

does not.

@youngjaewoo Do you know when this changed? I think it cannot be that recently, because I suppose that the reason for this behavior can be found here.

@mkborregaard I don't know what the rationale for this check was, but I think that if users want to have 40 ticks, they should be able to.

I agree but I don't think that's currently implemented.

replacing

if length(cv) > 30
    rng = Int[round(Int,i) for i in linspace(1, length(cv), 15)]
    cv[rng], dv[rng]
else
    cv, dv
end

with

cv, dv

here: https://github.com/JuliaPlots/Plots.jl/blob/master/src/axes.jl#L241 should allow any number of user-specified ticks. I was just wondering whether you knew the reason for this if condition.
If you don't see an issue, I can implement the change tonight.

My take on it would be that it is in place 1) in case any of the automatic code ends up generating longer cv (in principle, that won't happen now) or 2) to make it easier to just pass a long vector as ticks and have it work automatically. Whether 30 or 40 is the preferred number may differ based on the size of the plot, but over a certain number it almost never will give a nice result.
I am fine with experimenting with commenting out the if statement.

OK,
1) I think this can happen e.g. if the x input arguments are strings, which are not converted to numbers, as in @youngjaewoo's code above. There, no ticks were specified ... We could still keep the if statement in these cases.
2) IMHO: If I want it to just work automatically, I don't pass ticks at all, and if I specify ticks, I expect these ticks to be shown. I think it's for the user to decide whether the result is "nice".

I think 1. only applies to heatmap and bar, right?
WRT 2., that makes sense - so does the alternative interpretation, though. Imagine you have a bar diagram with 50 bars. You pass the bar names as x, and I would naturally expect the number of names to be slimmed down so they can all fit. Plots is also about being intuitive.
A solution could be to say if length(cv) > 30 && ticks == :auto, which would allow the user to override it by explicitly specifying xticks, or if length(cv) > 30 && !(typeof(ticks) <: NTuple{2}), which would only override it if both tick positions and labels were specified?

I think 1. only applies to heatmap and bar, right?

You can also do something like plot(["a", "b", "c"], rand(3)) with more than 30 values.

Imagine you have a bar diagram with 50 bars. You pass the bar names as x, and I would naturally expect the number of names to be slimmed down so they can all fit.

Yes, but in that case, you did not specify the ticks, but you passed (probably) strings as x values and I suggested to keep the if statement in these cases ... so I don't really see this as an alternative interpretation (I think we are more or less talking about the same thing).

I would go for if length(cv) > 30 && ticks == :auto, because then you would get 31 ticks if you do

plot(rand(31), xticks = 1:31)

as you would probably expect. With

plot([string("x", i) for i in 1:31], rand(31))

on the other hand, the number of ticks would still be reduced.
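The proposed condition can be sketched in plain Julia without loading Plots. This is only an illustration of the idea discussed above, not the actual Plots.jl source: the function name thin_ticks and the keyword arguments max_ticks and n_keep are hypothetical, and range(...) stands in for the linspace call quoted earlier, which newer Julia versions no longer provide.

```julia
# Hypothetical helper (not the Plots.jl implementation): reduce the number of
# ticks only when the user left `ticks` at its default, :auto.
function thin_ticks(cv, dv, ticks; max_ticks = 30, n_keep = 15)
    if length(cv) > max_ticks && ticks == :auto
        # Pick n_keep roughly evenly spaced indices, mirroring the quoted branch.
        idx = unique(round.(Int, range(1, stop = length(cv), length = n_keep)))
        return cv[idx], dv[idx]
    end
    # Explicitly specified ticks (or short tick vectors) pass through unchanged.
    return cv, dv
end

labels = [string("x", i) for i in 1:31]

# String x values and no explicit ticks: the labels are still reduced.
cv_auto, dv_auto = thin_ticks(collect(1:31), labels, :auto)

# Explicit xticks = 1:31: every tick is kept.
cv_user, dv_user = thin_ticks(collect(1:31), string.(1:31), 1:31)
```

With these assumptions, the :auto call keeps 15 of the 31 labels, while the call with explicit ticks returns all 31.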

We agree then

Just an idea. Would providing ticks option as ":auto", ":all", or "user-defined Tuple" be useful?

Here, ":auto" is the current implementation. ":all" would be displaying all tick labels that are in String format as x and y like the following.

plot([string("x", i) for i in 1:31], rand(31), xticks=:all)

And "user-defined Tuple" would be providing labels matching with the x index like the below.

x = 0:2:4
s = ["a", "b", "c"]
plot(rand(3), xticks = (x,s))

I think @daschw's PR fixed this?

No, his PR failed and the fix didn't happen yet.

Ah, I see it is still WIP, though it hasn't failed. Thanks! Reopening the issue.

It is fixed on my machine

plot(rand(31), xticks = 1:31)

returns 31 ticks.

@youngjaewoo Are you on master?

Also to get what you want in your example you have to change the last line to

heatmap(z,aspect_ratio=1, xrotation=90, xticks = (1:40, xs), yticks = (1:40, ys))

ah it's probably because it's my first time being involved in github fixes. How do I get "on" master?

Pkg.update in Julia doesn't apply new edits? Sorry for the trouble.

Pkg.checkout("Plots") to checkout master.
But a new tagged version has been released today, so in this case Pkg.update() should be sufficient ;-)

Perfect! It works just as you mentioned :) Thanks

I don't understand the logic behind the tuple version of xticks. I can write scatter(-5:5, (-5:5).^2, xticks=-3:0.5:3), where the number of ticks is independent of the number of data points. That makes sense. But with categorical data on the x-axis, it becomes scatter(ydata, xticks=(1:length(xdata), xdata))? Why do the x coordinates of my data points end up in xticks? Why not scatter(xdata, ydata, xticks=sort(unique(xdata)))?

EDIT: scatter(["a", "b", "c"], [1,2,3], xticks=(1:3, ["a", "b", "c"])) is broken.

The tuple version is for passing ticks and tick labels like scatter(1:10, yticks = ([2,3,5,7], ["Prime $i" for i in 1:4])).
I'm not really sure I understand the issue here. Why not just scatter(xdata, ydata)? Could you provide a MWE of what you would like to do?

scatter(["a", "b", "c"], [1,2,3], xticks=(0.5:2.5, ["a", "b", "c"])) would return the same plot as scatter(["a", "b", "c"], [1,2,3]).

Ah, sorry, I seem to have misunderstood a comment. xticks makes more sense now.

Why not just scatter(xdata, ydata)

Because it doesn't show all the categorical ticks (I have 100 of them). scatter(["a", "b", "c"], [1,2,3], xticks=(0.5:2.5, ["a", "b", "c"])) should fix the issue, will try tomorrow. Thank you!

... I kinda agree with @holocronweaver's post that displaying all ticks when axes are categorical is a better default (it's what matplotlib does, at least). Most categories don't have a natural ordering; missing ticks make the plot useless in that case. And ordered categorical data rarely has > 30 categories.

Because it doesn't show all the categorical ticks (I have 100 of them).

OK, I see.

... I kinda agree with @holocronweaver's post that displaying all ticks when axes are categorical is a better default (it's what matplotlib does, at least). Most categories don't have a natural ordering; missing ticks make the plot useless in that case. And ordered categorical data rarely has > 30 categories.

I definitely see your point and I don't really have a strong opinion about the default. Maybe adding ticks = :all (like @mkborregaard suggested) and ticks = :some (or whatever name for the current behaviour) would be really a good idea for easy switching.
