Seaborn: Map_diag(): Also pass column names?

Created on 10 Sep 2018 · 6 Comments · Source: mwaskom/seaborn

In 0.9.0, the behavior of seaborn.PairGrid.map_diag() (https://github.com/mwaskom/seaborn/blob/master/seaborn/axisgrid.py#L1352) changed: it no longer passes pd.DataFrames to the handling function, only np.ndarrays. It would be good to also pass the column name or the index to the function. Would it be possible to pass this as a keyword argument?

In more detail, in our application we have a very simple square grid of plots where every pair of columns is mapped directly to a single subplot. To construct it, we also take into account information that is not contained in x, y, but in another matrix k, which has the same columns as the data DataFrame.

All 6 comments

I'm not sure what change you're talking about but I don't think DataFrames would ever have been used for a 1D plot. It won't work to inject column names into keyword arguments because it would cause plotting functions that aren't expecting them to crash. Without any more information about what you're trying to do, I can't really help, but I'd remind you that you can pass additional keyword arguments yourself to map_diag.
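As a minimal sketch of the last point, extra keyword arguments given to map_diag are forwarded to the plotting function. The names `diag_with_extra` and `k`, and the synthetic data, are assumptions made up for illustration; they do not come from the thread.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data standing in for the real DataFrame (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# Hypothetical side information with the same columns as df.
k = pd.DataFrame(rng.normal(size=(3, 3)), index=df.columns, columns=df.columns)

received = []  # records what the plotting function was given


def diag_with_extra(x, k=None, **kws):
    """Diagonal plotting function that also receives the extra matrix k."""
    received.append(k)
    ax = plt.gca()
    ax.hist(x, color=kws.get("color"))


g = sns.PairGrid(df, vars=list(df.columns))
g.map_diag(diag_with_extra, k=k)  # k is forwarded unchanged on every call
```

This hands the whole of `k` to every call; the function still does not learn which column it is drawing, which is the gap the rest of the thread works around.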

As far as I can see, the latest version calls np.asarray explicitly. Before that change we could use the information in the DataFrame for our purposes, which is no longer possible. In our case we have already found a different solution, though. Thanks for your response!

I also encountered this. Previously I could pass a custom function to map_diag() that could access the name of the pd.Series returned from hue_grouped.get_group(label_k) like so:

def annotate_colname(x, **kws):
    ax = plt.gca()  # the diagonal axes that map_diag is currently drawing on
    ax.annotate(x.name, ...)

but since https://github.com/mwaskom/seaborn/commit/b3981872425fcfb1ca462541920a5e622260c6e7, this is no longer possible, because the pd.Series is cast to an np.ndarray, as @yannikschaelte pointed out.

A workaround would be to annotate the diagonal plots separately:

for ax, col in zip(np.diag(g.axes), data.columns):
    ax.set_title(col)
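A self-contained sketch of this workaround could look as follows. The synthetic DataFrame and the choice of plt.hist and plt.scatter are assumptions for illustration; the column list is passed explicitly through `vars` so that the titles line up with the diagonal.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data standing in for the real DataFrame (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])
cols = list(df.columns)

g = sns.PairGrid(df, vars=cols)
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter)

# Label the diagonal after plotting, without relying on what map_diag
# passes to the plotting function.
for ax, col in zip(np.diag(g.axes), cols):
    ax.set_title(col)
```

Because the labelling happens outside the plotting function, it does not depend on whether x arrives as a pd.Series or an np.ndarray.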

@joelostblom I don't follow your comment. Could you please fix the code below so that it puts the DataFrame column name on the diagonal of the seaborn PairGrid?
```python
import matplotlib.pyplot as plt
import seaborn as sns
iris = sns.load_dataset('iris')

def diagfunc(x, **kws):
    ax = plt.gca()
    ax.annotate(x.name, xy=(0.05, 0.9), xycoords=ax.transAxes)

sns.PairGrid(iris).map_diag(diagfunc)
```

@Gkchandora Use the for loop to iterate over the axes in the PairGrid (g = the PairGrid in my snippet above). You can see a complete example in my second code chunk in this SO answer.

I encountered a similar problem. Fortunately, I found this: https://datascience.stackexchange.com/questions/57673/how-to-put-the-variable-names-of-pandas-data-frame-on-diagonal-of-seaborn-pairgr and used an iterator inside the function passed to map_diag.

So I defined a function like this:

__next_colname = iter(df.columns.tolist()).__next__

and used __next_colname() instead of x.name
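Put together, the iterator approach might look like the sketch below. The synthetic DataFrame is an assumption, and the helper is named `next_colname` (without leading underscores) purely for readability. It assumes no hue variable is set, so the function is called once per diagonal column, in column order.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data standing in for the real DataFrame (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])
cols = df.columns.tolist()

# Each call returns the next column name; raises StopIteration when exhausted.
next_colname = iter(cols).__next__


def annotate_colname(x, **kws):
    """Write the column name into the current diagonal axes."""
    ax = plt.gca()
    ax.annotate(next_colname(), xy=(0.05, 0.9), xycoords=ax.transAxes)


g = sns.PairGrid(df, vars=cols)
g.map_diag(annotate_colname)
```

The iterator is single-use, so it has to be recreated before drawing another grid. If a hue variable is used, the function may be called more than once per column, and the names would then run ahead of the axes.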
