pandas.DataFrame.plot(): Labels do not appear in legend

Created on 24 Feb 2015 · 20Comments · Source: pandas-dev/pandas

The following code plots two lines. The column names appear in the legend.

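import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Build a DataFrame indexed by x with one column per curve.
x = np.linspace(-10, 10, 201)
y, z = np.sin(x), np.cos(x)
x, y, z = pd.Series(x), pd.Series(y), pd.Series(z)
df = pd.concat([x, y, z], axis=1)
df.columns = ['x', 'sin(x)', 'cos(x)']
df = df.set_index('x')
df.plot()  # legend entries come from the column names
plt.show()
plt.clf(); plt.close()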

figure_1

However, the following equivalent code shows None as the legend entry, even though the labels are explicitly set.

ax = df.plot(y='sin(x)', label='sin(x)')
df.plot(y='cos(x)', label='cos(x)', ax=ax)
plt.show()

Of course there are alternative ways to make this work, but it appears to me that passing the labels should suffice. In particular, I don't think their values should be replaced with None, or printed as x-label.

figure_2
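
One of the alternatives alluded to above, as a minimal sketch: suppress the automatic legend and name the lines explicitly through matplotlib's ax.legend(labels). The Agg backend and the way the frame is built here are assumptions for illustration; this sidesteps the label= handling rather than fixing it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import numpy as np
import pandas as pd

x = np.linspace(-10, 10, 201)
df = pd.DataFrame({"sin(x)": np.sin(x), "cos(x)": np.cos(x)},
                  index=pd.Index(x, name="x"))

# Draw both columns on one axes without letting pandas build a legend.
ax = df.plot(y="sin(x)", legend=False)
df.plot(y="cos(x)", legend=False, ax=ax)

# Label the two lines explicitly, in the order they were drawn.
ax.legend(["sin(x)", "cos(x)"])
legend_texts = [t.get_text() for t in ax.get_legend().get_texts()]
```

Because the labels are handed straight to matplotlib, the result does not depend on how a given pandas version treats the label keyword.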

Bug Visualization

JohnNapier

All 20 comments

Is anyone working on this? I could take a stab if not.

schmohlio on 1 Mar 2015

@schmohlio Certainly try!

I think there are two issues that could be fixed:

First, df.plot(y='sin(x)') gives a label "None". This can certainly be regarded as a bug, and maybe a regression.
In previous versions (I tested 0.14.1), this just gave no legend (which is better than a legend with "None").
Second, label is not passed through. It is not a documented keyword in the pandas plot method, but extra keywords are passed through to the matplotlib plotting method, so arguably this should work too. (Indeed, it worked previously in 0.14.1, though only after explicitly calling ax.legend().)
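
The pass-through point can be illustrated with matplotlib alone (a sketch, not pandas internals): a label keyword given to Axes.plot is stored on the line, but no legend is drawn until ax.legend() is called. The Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 201)
fig, ax = plt.subplots()

# The label is attached to the Line2D object...
(line,) = ax.plot(x, np.sin(x), label="sin(x)")
legend_before = ax.get_legend()  # ...but no legend exists yet

# An explicit call builds the legend from the stored labels.
ax.legend()
legend_after = [t.get_text() for t in ax.get_legend().get_texts()]
```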

jorisvandenbossche on 2 Mar 2015

@jorisvandenbossche thanks for the notes!

I think some of the confusion also stems from the purpose of the label= kwargs, which adds labels to the axis. Labels are set in the plot by first checking for an x index, and then overwriting based on the presence of the label arg. In the second graphic, we see that "cos(x)" successfully overwrites the x index name, and then overwrites "sin(x)".

How do we want to handle multiple series? Should the label default to the index if multiple series are plotted, or continue to represent the last label assignment? Should we automatically generate a legend for multiple series, filled with the label kwarg values? Or should we perhaps start by removing the legend altogether?

Thanks!

schmohlio on 2 Mar 2015

Just circling back to your notes, though, it makes sense that the legend handlers should not yield "None", and that the labels should pass through to the legend. I'll be taking a deeper look.

schmohlio on 2 Mar 2015

@schmohlio IIRC, I think we have handling in there for what to do when a second Series is plotted on an axes that already contains a Series.plot (with or without a legend).

I can't look now, but I think my last tweak to this code was fixing a bug where the label wouldn't show up if it was False. That could have introduced a regression, or maybe not. Just ping me if you get stuck anywhere.

TomAugspurger on 2 Mar 2015

Some more observations (and these can then be turned into tests):

df.plot(y='sin(x)') -> gives a legend with label 'None' -> this should give _no_ legend instead (as it plots one series, and then we don't automatically add a legend, see behaviour of df['sin(x)'].plot())
df.plot(y='sin(x)', legend=True) -> gives a legend with label 'None' -> this should of course give a legend with label 'sin(x)' (behaviour as df['sin(x)'].plot(legend=True))
df.plot(y='sin(x)', label='something else', legend=True) -> gives a legend with label 'None' -> should be a legend with label 'something else', since the label kwarg should override the column name.

The above should then also work when adding multiple series to the same ax (as in the original example).
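
The second and third observations can be written as checks, sketched here against a recent pandas release (an assumption; the versions discussed in this thread behave differently). legend_texts is a hypothetical helper, not part of pandas. The first observation is left out because the default legend behaviour is exactly what is under discussion.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import numpy as np
import pandas as pd


def legend_texts(ax):
    """Hypothetical helper: legend entries as strings ([] if there is no legend)."""
    legend = ax.get_legend()
    return [] if legend is None else [t.get_text() for t in legend.get_texts()]


x = np.linspace(-10, 10, 201)
df = pd.DataFrame({"sin(x)": np.sin(x), "cos(x)": np.cos(x)},
                  index=pd.Index(x, name="x"))

# Observation 2: the column name is used as the legend entry.
texts_default = legend_texts(df.plot(y="sin(x)", legend=True))

# Observation 3: an explicit label overrides the column name.
texts_custom = legend_texts(
    df.plot(y="sin(x)", label="something else", legend=True)
)
```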

jorisvandenbossche on 2 Mar 2015

Thanks @jorisvandenbossche.

Looks like the additional kwargs were running through a loop with .pop() more often than desired, and the code was calling self.label, which was always None.

One last question. For your last bulleted test, what should the x axis of the graph be labeled with? Should it be the index ('x') or the label ('something else')? It feels like it should be the index when legend=True and the label when legend defaults to False.

schmohlio on 3 Mar 2015

Nvm. Will maintain behavior of df['sin(x)'].plot() and df['sin(x)'].plot(label='something')

schmohlio on 3 Mar 2015

@jorisvandenbossche -

I modified some tests within the pandas.tests.test_graphics module where series tests did not match dataframe plot tests. I have a working commit (passed all your tests when exploring in a notebook). Also, 5 tests have errors on master, and thus they continue to fail on my branch.

Just submitted a pull request. I think the changes are straightforward.

Thanks,
Matt

schmohlio on 3 Mar 2015

New behavior:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.linspace(-10, 10, 201)
y, z = np.sin(x), np.cos(x)
x, y, z = pd.Series(x), pd.Series(y), pd.Series(z)
df = pd.concat([x, y, z], axis=1)
df.columns = ['x', 'sin(x)', 'cos(x)']
df = df.set_index('x')
# Plot each column on the same axes with an explicit label.
ax = df.plot(y='sin(x)', label='sin(x)')
df.plot(y='cos(x)', label='cos(x)', ax=ax)
plt.show()

This code produces a chart with two series where the legend is properly labeled (sin(x), cos(x)) and the x axis is labeled x.

Series indices are no longer mutated, and labels default to the column names if label= args are not provided. Use legend=False to avoid printing a legend.
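
A small sketch of the two points above, assuming a recent pandas release and the headless Agg backend: passing legend=False leaves the axes without a legend, and plotting leaves the frame's index name untouched.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import numpy as np
import pandas as pd

x = np.linspace(-10, 10, 201)
df = pd.DataFrame({"sin(x)": np.sin(x)}, index=pd.Index(x, name="x"))

ax = df.plot(y="sin(x)", label="sine", legend=False)

has_legend = ax.get_legend() is not None  # no legend was requested
index_name_after = df.index.name          # the original frame is not modified
```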

schmohlio on 16 Mar 2015

Closed by https://github.com/pydata/pandas/pull/9574

TomAugspurger on 31 Mar 2015

@schmohlio @TomAugspurger I think I have identified this issue causing a regression in legend plotting. Today I upgraded pandas via conda to 0.16.1, and the behaviour of my plotting code changed! Up until now I have been setting labels, without problems, like this (based on @schmohlio 's example above, reduced from the real code as much as possible):

fig, axes = plt.subplots(1, 2)
ax_1 = axes[0]
ax_2 = axes[1]
df['sin(x)'].plot(ax=ax_1, label='mylabel_1')
df['cos(x)'].plot(ax=ax_2, label='mylabel_2')
ax_1.legend(loc='best')
ax_2.legend(loc='best')
plt.show()

Expected behaviour: Until now, this plotted the legend entries "mylabel_1" and "mylabel_2", respectively.
Observed behaviour: With 0.16.1, the legend entries are "sin(x)" and "cos(x)". :-( (tested in an IPython notebook)

Question 1: Was this an intended change or an unintended regression of this bug?
Question 2: (most important) How can I get the old behaviour again?
Question 3: Should I report this as a new issue?

bilderbuchi on 1 Jun 2015

👍1

@bilderbuchi Sorry about that. It was a regression. It's fixed by https://github.com/pydata/pandas/pull/10131 so your options are to

revert back to 0.16.0 for now
build pandas from source and use the current master if you're comfortable with that
change your code to use one of the workarounds posted in https://github.com/pydata/pandas/issues/10119.

TomAugspurger on 1 Jun 2015

Thank you. I searched the tracker, but only for open issues >.<.
I have downgraded to 0.16.0 now and confirm that this fixes my problem. I assume the next release (0.16.2? 0.17?) will be fine?

bilderbuchi on 1 Jun 2015

awesome response time, btw!

bilderbuchi on 1 Jun 2015

Yes, we’ll possibly have a 0.16.2 release. Otherwise 0.17 will contain the fix.

TomAugspurger on 1 Jun 2015

I tested @bilderbuchi's code and it works fine. But if I select the DataFrame column with double brackets [[]], the assigned label is ignored and the column name is used in the legend. I think this is not the expected behaviour?
Compare the following two plots. Why do they differ, and could it be fixed?

df[['sin(x)']].plot(ax=ax_1, label='mylabel_1')
df['cos(x)'].plot(ax=ax_2, label='mylabel_2')

elviseno on 14 Nov 2018

Selecting with [[]] will return a DataFrame, not a Series. I don't think this added support for multiple labels.
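
The distinction is easy to check (a minimal sketch with a made-up two-row frame): single brackets return a Series, double brackets return a one-column DataFrame, so the two calls above go through different plotting paths.

```python
import pandas as pd

# Made-up data, only to show the return types.
df = pd.DataFrame({"sin(x)": [0.0, 1.0], "cos(x)": [1.0, 0.0]})

as_series = df["sin(x)"]    # single brackets -> pandas.Series
as_frame = df[["sin(x)"]]   # double brackets -> one-column pandas.DataFrame

print(type(as_series).__name__, type(as_frame).__name__)
```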

TomAugspurger on 14 Nov 2018

Has the multiple-label issue been resolved? I want to set a separate label for each column of my df.

lsbmsb on 22 Apr 2019

Similar to https://github.com/pandas-dev/pandas/issues/9542#issuecomment-438580042, label is ignored when plotting a one-column DataFrame:

import pandas as pd
df = pd.DataFrame({'a': [2, 3]}, index=[0, 1])
df.plot(label='b')

df.plot(label=['b']) also doesn't work. I'm on 0.25.3.
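
Where label is ignored for a DataFrame, one possible workaround (a sketch, not an official recommendation) is to rename the columns before plotting, since the legend entries are taken from the column names. The Agg backend is assumed so the snippet runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import pandas as pd

df = pd.DataFrame({"a": [2, 3]}, index=[0, 1])

# rename() returns a new frame, so the original column names are kept.
ax = df.rename(columns={"a": "b"}).plot()
legend_texts = [t.get_text() for t in ax.get_legend().get_texts()]
```

The same approach extends to several columns by passing a larger mapping to rename().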

MaxGhenis on 13 Dec 2019

👍3
