Seaborn: Wishlist for sns.scatterplot

Created on 3 Oct 2014 · 6 Comments · Source: mwaskom/seaborn

Open comment thread for functionality that would be good to have in a sns.scatterplot function. Some things that have been mentioned are:

  • Scaling point size off a continuous variable that isn't directly interpretable as a point size (cf. #310)
  • Better default colors (although #293 would also fix this)
  • A better interface for handling continuous point coloring in a FacetGrid context. This currently requires a bit of a hack: color is the fourth positional argument, so you need to pass in a dummy argument for size.
  • Jittering x and y points (currently built into regplot)
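For the size-scaling and jittering items, a minimal sketch of what such helpers might do, written against NumPy only. The function names `scale_sizes` and `jitter`, their parameters, and the default size range are all hypothetical and not part of seaborn.

```python
import numpy as np


def scale_sizes(values, min_size=20.0, max_size=200.0):
    """Linearly map arbitrary continuous values onto a marker-area range.

    Hypothetical helper: the name and defaults are assumptions.
    """
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values equal: fall back to the middle of the size range.
        return np.full(values.shape, (min_size + max_size) / 2.0)
    return min_size + (values - values.min()) / span * (max_size - min_size)


def jitter(values, amount=0.1, rng=None):
    """Add uniform noise in [-amount, amount] to each value.

    Hypothetical helper: the name and defaults are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    return values + rng.uniform(-amount, amount, size=values.shape)


# Made-up example: a variable that is not meaningful as a point size by itself.
quality = [0.2, 0.5, 0.9]
sizes = scale_sizes(quality)
```

The results could then be handed to matplotlib directly, for example `plt.scatter(jitter(x), jitter(y), s=scale_sizes(z))`.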

What else?

plots wishlist

All 6 comments

To follow up on your third bullet (though this is also a bit of a separate issue), I've found that continuous point coloring with a FacetGrid/plt.scatter can be _very_ slow when using the hue argument (e.g., it can easily take seconds), because FacetGrid makes separate plots for every color. A faster interface for this would definitely be appreciated.

Ah, yeah, currently I wouldn't even say that's something that's supposed to work – everything in FacetGrid is structured so that hue is categorical. I would say that if you want to use a continuous variable you're better off binning it yourself outside of seaborn.
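The binning suggested above might look roughly like this with pandas. The data, the column names, and the choice of three equal-width bins are made up for illustration.

```python
import pandas as pd

# Made-up data; "score" stands in for any continuous variable.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.1, 1.9, 3.5, 3.0, 4.2, 4.8],
    "score": [0.1, 0.4, 0.5, 0.9, 0.2, 0.75],
})

# Cut the continuous column into three equal-width bins with readable labels,
# so that it can be treated as a categorical hue variable.
df["score_bin"] = pd.cut(df["score"], bins=3, labels=["low", "mid", "high"])
```

The binned column can then be used as a categorical hue, for instance `sns.FacetGrid(df, hue="score_bin").map(plt.scatter, "x", "y")`, which draws one scatter call per bin rather than one per distinct value.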

What about some interactivity? Printing the index or the Series upon clicking a point is very useful when you need to check the raw data underlying that point. Such functionality is readily available in matplotlib.
This might be even more suitable for sns.pairplot, as that function is particularly handy for quick data exploration.
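As a rough sketch of the matplotlib functionality referred to here, pick events can be used to print the row behind a clicked point. The helper names `rows_for_pick` and `pickable_scatter` and the sample data are hypothetical. The sketch assumes the plotted points are in the same order as the DataFrame rows.

```python
import matplotlib.pyplot as plt
import pandas as pd


def rows_for_pick(df, indices):
    """Return the rows of df at the positional indices reported by a pick event."""
    return df.iloc[list(indices)]


def pickable_scatter(df, x, y):
    """Draw a scatter plot whose points print their source rows when clicked.

    Hypothetical helper, not a seaborn or matplotlib function.
    """
    fig, ax = plt.subplots()
    points = ax.scatter(df[x], df[y], picker=True)

    def on_pick(event):
        # For a scatter collection, event.ind holds the positions of the
        # points that were hit by the click.
        if event.artist is points:
            print(rows_for_pick(df, event.ind))

    fig.canvas.mpl_connect("pick_event", on_pick)
    return fig, ax


# Made-up data with a labeled index, so the printed rows are identifiable.
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "y": [2.0, 1.0, 3.0]},
    index=["a", "b", "c"],
)
```

With an interactive backend, calling `pickable_scatter(df, "x", "y")` followed by `plt.show()` should print the matching row each time a point is clicked.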

One thing I often wish for is the ability to specify a label for each point, that would be shown as text next to the point. This is very useful for scatterplots with a relatively small number of points which you want to identify individually (e.g., countries, company names, specific products), and it's currently a pain with matplotlib because you can't pass a vector of strings to a single text call. Ideally the point labels could then be turned on and off with a quick command.

It would be possible for a seaborn function to accept a vector of strings and draw them, but it would be bound by the same limitations as when you are using matplotlib directly. So each point would have to be a separate object "behind the scenes", and I don't think it would be possible to do what you describe in terms of turning the labels on and off.
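A minimal sketch of the per-point approach discussed in this exchange, using plain matplotlib with one text object per label. The helper names `label_points` and `set_labels_visible`, the offset, and the data are assumptions. Looping over the returned artists is only one possible reading of turning labels "on and off", and may not be the quick command the request had in mind.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt


def label_points(ax, xs, ys, labels, offset=(4, 4)):
    """Annotate each point with its label; returns one artist per point."""
    texts = []
    for x, y, label in zip(xs, ys, labels):
        # One annotate call per point, since a single text call
        # does not accept a vector of strings.
        texts.append(
            ax.annotate(label, (x, y), xytext=offset, textcoords="offset points")
        )
    return texts


def set_labels_visible(texts, visible):
    """Show or hide every label artist created by label_points."""
    for text in texts:
        text.set_visible(visible)


# Made-up data for illustration.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 1.5, 2.5]
names = ["France", "Japan", "Brazil"]

fig, ax = plt.subplots()
ax.scatter(xs, ys)
texts = label_points(ax, xs, ys, names)
```

Calling `set_labels_visible(texts, False)` and redrawing the figure hides the labels, and passing `True` brings them back.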

In general, it'd be great to be able to change marker size/style depending on the data. E.g.

sns.scatterplot(data, x='PC1', y='PC2', hue='Disease', marker='Sex', size='SampleQuality')

It's currently a drag to do this with sns.lmplot.
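Until something like the proposed call exists, one way to approximate it is to loop over groups in matplotlib. This is a sketch only: the data are made up, the column names are borrowed from the example above, and the color, marker, and size mappings are arbitrary choices.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data using the column names from the example call.
df = pd.DataFrame({
    "PC1": [0.1, 0.4, -0.3, 0.8, -0.6, 0.2],
    "PC2": [0.5, -0.2, 0.3, -0.7, 0.1, 0.9],
    "Disease": ["healthy", "healthy", "sick", "sick", "healthy", "sick"],
    "Sex": ["F", "M", "F", "M", "F", "M"],
    "SampleQuality": [0.9, 0.4, 0.7, 0.2, 0.6, 0.8],
})

# Arbitrary mappings from category levels to colors and marker shapes.
colors = dict(zip(sorted(df["Disease"].unique()), ["C0", "C1"]))
markers = dict(zip(sorted(df["Sex"].unique()), ["o", "s"]))

fig, ax = plt.subplots()
for (disease, sex), group in df.groupby(["Disease", "Sex"]):
    ax.scatter(
        group["PC1"],
        group["PC2"],
        color=colors[disease],      # color encodes Disease
        marker=markers[sex],        # marker shape encodes Sex
        s=20 + 180 * group["SampleQuality"].to_numpy(),  # area encodes SampleQuality
        label=f"{disease}, {sex}",
    )
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.legend()
```

A separate scatter call is needed per group because matplotlib accepts only one marker shape per call, which is part of why this is tedious to do by hand.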
