The following code plots two lines. The column names appear in the legend.
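import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.linspace(-10, 10, 201)
y, z = np.sin(x), np.cos(x)
x, y, z = pd.Series(x), pd.Series(y), pd.Series(z)
df = pd.concat([x, y, z], axis=1)
df.columns = ['x', 'sin(x)', 'cos(x)']
df = df.set_index('x')  # use x as the index so it becomes the x axis
df.plot()  # plots every column; the column names become the legend entries
plt.show()
plt.clf()
plt.close()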

However, the following equivalent code shows None in the legend, even though the labels are explicitly set.
ax=df.plot(y='sin(x)',label='sin(x)')
df.plot(y='cos(x)',label='cos(x)',ax=ax)
plt.show()
Of course there are alternative ways to make this work, but it appears to me that passing the labels should suffice. In particular, I don't think their values should be replaced with None, or printed as x-label.

Is anyone working on this? I could take a stab if not.
@schmohlio Certainly try!
I think there are two issues that could be fixed:
- df.plot(y='sin(x)') gives a label "None". This can certainly be regarded as a bug, and maybe a regression.
- label is not passed through. However, this is not a documented keyword in the pandas plot method. But extra keywords are passed through to the matplotlib plotting method, so in that way this should maybe also work? (And indeed, it worked previously (in 0.14.1), though only after explicitly calling ax.legend().)
@jorisvandenbossche thanks for the notes!
I think some of the confusion also stems from the purpose of the label= kwargs, which adds labels to the axis. Labels are set in the plot by first checking for an x index, and then overwriting based on the presence of the label arg. In the second graphic, we see that "cos(x)" successfully overwrites the x index name, and then overwrites "sin(x)".
How do we want to handle multiple series? Should the label default to the index if multiple series are called, or continue to represent the last label assignment? Should we automatically generate a legend for multiple series filled with label kwargs vals? Or, should we perhaps start by removing the legend altogether?
Thanks!
Just circling back to your notes, though, it makes sense that the legend handlers should not yield "None", and that the labels should pass through to the labels. I'll be taking a deeper look.
@schmohlio IIRC, I think we have handling in there for what to do when a second Series is plotted on an axes that already contains a Series.plot (with or without a legend).
I can't look now, but I think my last tweak to this code was fixing a bug where the label wouldn't show up if it was False. That could have introduced a regression, or maybe not. Just ping me if you get stuck anywhere.
Some more observations (and these can then be turned into tests):
- df.plot(y='sin(x)') -> gives a legend with label 'None' -> this should give _no_ legend instead (as it plots one series, and then we don't automatically add a legend, see behaviour of df['sin(x)'].plot())
- df.plot(y='sin(x)', legend=True) -> gives a legend with label 'None' -> this should of course give a legend with label 'sin(x)' (behaviour as df['sin(x)'].plot(legend=True))
- df.plot(y='sin(x)', label='something else', legend=True) -> gives a legend with label 'None' -> should be a legend with label 'something else', as we want the label kwarg to overwrite the column name.
- And the above should then also work when adding multiple series to the same ax (as in the original example).
Thanks @jorisvandenbossche.
Looks like the addtl kwargs were running through a loop with .pop() more than desired, and calling self.label which was always None.
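To illustrate the kind of pitfall described above, here is a minimal pure-Python sketch. It is not pandas' actual plotting code; the function names `labels_buggy` and `labels_fixed` are hypothetical and exist only to show how calling `.pop()` inside a loop loses the keyword after the first pass.

```python
# Hypothetical illustration of the pop()-in-a-loop pitfall; not pandas internals.

def labels_buggy(columns, **kwds):
    labels = []
    for _ in columns:
        # pop() removes the key, so only the first iteration sees the value;
        # every later iteration falls back to None.
        labels.append(kwds.pop("label", None))
    return labels


def labels_fixed(columns, **kwds):
    # Read the keyword exactly once, then fall back to the column name.
    label = kwds.pop("label", None)
    return [label if label is not None else col for col in columns]


print(labels_buggy(["sin(x)", "cos(x)"], label="mylabel"))
print(labels_fixed(["sin(x)", "cos(x)"], label="mylabel"))
print(labels_fixed(["sin(x)", "cos(x)"]))
```

The second function pops the keyword once before iterating, which is one plausible way to avoid the `None` label; the actual fix in the pull request may differ.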
One last question. For your last bulleted test, what should the x axis of the graph be labeled with? Should it be the index ('x') or the label ('something else')? It feels like it should be the index when legend=True and the label when legend defaults to False.
Nvm. Will maintain behavior of df['sin(x)'].plot() and df['sin(x)'].plot(label='something')
@jorisvandenbossche -
I modified some tests within the pandas.tests.test_graphics module where series tests did not match dataframe plot tests. I have a working commit (passed all your tests when exploring in a notebook). Also, 5 tests have errors on master, and thus they continue to fail on my branch.
Just submitted a pull request. I think the changes are straightforward.
Thanks,
Matt
New behavior:
x = np.linspace(-10,10,201)
y, z = np.sin(x), np.cos(x)
x, y, z= pd.Series(x), pd.Series(y), pd.Series(z)
df = pd.concat([x, y, z], axis=1)
df.columns = ['x', 'sin(x)', 'cos(x)']
df = df.set_index('x')
ax = df.plot(y='sin(x)', label='sin(x)')
df.plot(y='cos(x)', label='cos(x)', ax=ax)
plt.show()
This code produces a chart with two series where the legend is properly labeled (sin(x), cos(x)) and the x axis is labeled x.
Series indices are no longer mutated, and labels default to column names if label= args are not provided. Use legend=False to avoid printing a legend.
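The behaviour described above can also be checked programmatically rather than by eye. The sketch below assumes a pandas version that includes this fix and uses the non-interactive Agg backend so it runs without a display. The helper `legend_labels` is a hypothetical name introduced here, not part of pandas or matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def legend_labels(ax):
    """Return the legend entries of an Axes as a list, or None if no legend."""
    legend = ax.get_legend()
    if legend is None:
        return None
    return [text.get_text() for text in legend.get_texts()]


x = np.linspace(-10, 10, 201)
df = pd.DataFrame(
    {"sin(x)": np.sin(x), "cos(x)": np.cos(x)},
    index=pd.Index(x, name="x"),
)

# Plot the two columns one after the other on the same Axes.
ax = df.plot(y="sin(x)", label="sin(x)")
df.plot(y="cos(x)", label="cos(x)", ax=ax)

combined = legend_labels(ax)
print(combined)
print(ax.get_xlabel())
plt.close("all")
```

Reading the legend texts back from the Axes makes it easy to turn the expected labels into a regression test.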
@schmohlio @TomAugspurger I think I have identified this issue causing a regression in legend plotting. Today I upgraded pandas via conda to 0.16.1, and the behaviour of my plotting code changed! Up until now I have been setting labels, without problems, like this (based on @schmohlio 's example above, reduced from the real code as much as possible):
fig, axes = plt.subplots(1, 2)
ax_1 = axes[0]
ax_2= axes[1]
df['sin(x)'].plot(ax=ax_1, label='mylabel_1')
df['cos(x)'].plot(ax=ax_2, label='mylabel_2')
ax_1.legend(loc='best')
ax_2.legend(loc='best')
plt.show()
Expected behaviour: Until now, this plotted the legend entries "mylabel_1" and "mylabel_2", respectively.
Observed behaviour: With 0.16.1, the legend entries are "sin(x)" and "cos(x)". :-( (tested with an IPython notebook)
Question 1: Was this an intended change or an unintended regression of this bug?
Question 2: (most important) How can I get the old behaviour again?
Question 3: Should I report this as a new issue?
@bilderbuchi Sorry about that. It was a regression. It's fixed by https://github.com/pydata/pandas/pull/10131 so your options are to
- downgrade to 0.16.0 for now
Thank you. I searched the tracker, but only for open issues >.<.
I have downgraded to 0.16.0 now and confirm that this fixes my problem. I assume the next release (0.16.2? 0.17?) will be fine?
awesome response time, btw!
Yes, we’ll possibly have a 0.16.2 release. Otherwise 0.17 will contain the fix.
I tested @bilderbuchi 's code and it's fine. But if I select the DataFrame column with double brackets [[]], the assigned label is ignored and the legend shows the column name instead. I think this is not the expected behaviour?
Just compare the following two plots. What's the difference, and could it be fixed?
df[['sin(x)']].plot(ax=ax_1, label='mylabel_1')
df['cos(x)'].plot(ax=ax_2, label='mylabel_2')
Selecting with `[[]]` returns a DataFrame, not a Series. I don't think this fix added support for multiple labels.
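A quick way to see the difference between the two selections is to inspect the returned types. This is a minimal sketch with made-up data; only the column names are taken from the thread.

```python
import pandas as pd

# Small made-up frame; the values are placeholders.
df = pd.DataFrame({"sin(x)": [0.0, 1.0], "cos(x)": [1.0, 0.0]})

single = df["sin(x)"]    # single brackets select one column as a Series
double = df[["sin(x)"]]  # double brackets select a list of columns as a DataFrame

print(type(single).__name__)
print(type(double).__name__)
print(list(double.columns))
```

Because the second selection is still a DataFrame, its legend entries come from its column names rather than from a single label.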
Has the multiple-label issue been resolved? I want to set a separate label for each column of my df.
Similar to https://github.com/pandas-dev/pandas/issues/9542#issuecomment-438580042, label is ignored when plotting a one-column DataFrame:
import pandas as pd
df = pd.DataFrame({'a': [2, 3]}, index=[0, 1])
df.plot(label='b')

df.plot(label=['b']) also doesn't work. I'm on 0.25.3.
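One possible workaround, sketched here as a suggestion and not as a fix confirmed in this thread, is to rename the columns before plotting. The legend of a DataFrame plot is built from the column names, so the renamed names appear as labels. The same approach extends to several columns by passing more entries in the mapping.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"a": [2, 3]}, index=[0, 1])

# rename() returns a new frame, so the original column names are untouched.
ax = df.rename(columns={"a": "b"}).plot()

labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(labels)
print(list(df.columns))
plt.close("all")
```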