This works fine:
In [13]: df = pd.DataFrame(np.random.randn(300), columns=['a'])
In [14]: df['dtime'] = pd.date_range(start='2014-01-01', freq='h', periods=300).time
In [15]: df.plot(x='dtime', y='a')
Out[15]: <matplotlib.axes._subplots.AxesSubplot at 0x118fd17f0>
This raises a KeyError:
In [17]: df.plot(x='dtime', y='a', kind='scatter')
We call df._get_numeric_data() which excludes datetimes.
May happen for other kinds too.
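To illustrate why the column disappears, here is a minimal sketch. It uses the public select_dtypes as a stand-in for the private _get_numeric_data (that substitution is an assumption, the two are not guaranteed to behave identically), and the column names are made up for the example.

```python
import datetime as dt

import pandas as pd

# Hypothetical frame: one float column, one datetime64 column,
# and one object column holding datetime.time values.
df = pd.DataFrame({
    "a": [0.1, 0.2, 0.3],
    "when": pd.date_range("2014-01-01", periods=3),
})
df["t"] = [dt.time(0), dt.time(1), dt.time(2)]

# Numeric-only selection (stand-in for _get_numeric_data):
# both "when" and "t" are dropped, so a later lookup by name fails.
numeric_only = df.select_dtypes(include=["number"])

# Asking for datetimes explicitly keeps "when"; "t" is still dropped
# because datetime.time values are stored with object dtype.
with_datetimes = df.select_dtypes(include=["number", "datetime"])

print(list(numeric_only.columns))
print(list(with_datetimes.columns))
```

If the plotting code looks up x in the numeric-only frame, a KeyError on the dropped column is what you would expect to see.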
I'll fix this; just have to decide how much refactoring.
Matplotlib is ok with datetime.time values, it chokes on datetime values.
Quick update: this will take a bit of work. @jreback what's the current thinking on adding new dtypes for date and time columns (separate from datetime)? I'd be OK with adding them if you think it would be useful. I think they might be more useful now that we have the .dt accessors.
Once I have that I can use df.select_dtypes(include=['number', 'bool', 'datetime', 'date', 'time']) to get them. (select_dtypes is awesome btw. Great work @cpcloud)
can u give an example of what that would return?
u can certainly interpret date/time just not sure what that would mean here
adding a date and/or time dtype would be pretty tricky and not sure a lot of benefit for it
FYI you can infer an object dtype with lib.infer_dtype (it has to scan the data so can be somewhat time consuming)
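A small sketch of that inference, using the public pandas.api.types.infer_dtype (as far as I know this is the same function as the internal lib.infer_dtype mentioned above, but treat that as an assumption). The sample values are made up.

```python
import datetime as dt

import pandas as pd
from pandas.api.types import infer_dtype

# Both Series are stored as plain object dtype...
times = pd.Series([dt.time(9, 0), dt.time(17, 30)])
dates = pd.Series([dt.date(2014, 1, 1), dt.date(2014, 1, 2)])

# ...but scanning the values tells them apart.
print(times.dtype, infer_dtype(times))  # object time
print(dates.dtype, infer_dtype(dates))  # object date
```

Plotting code could use a check like this to decide whether an object column is safe to pass through, at the cost of scanning the values.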
For a line plot it won't make much sense, but I think a scatter plot would work. That way you can see how y varies through the day, across different days. Something like this works fine
In [25]: df = tm.makeTimeDataFrame().reset_index().rename(columns={'index': 'datetime'})
In [26]: df['day'] = df.datetime.dt.day
In [27]: df.plot(x='day', y='A', kind='scatter')
Out[27]: <matplotlib.axes._subplots.AxesSubplot at 0x1181ca550>
Since .dt.day returns an int. I'd like to do the same thing, but for a date or time. I'll look into lib.infer_dtype
you could easily make a DatetimeIndex method to return what u need (like date/time) and just add it to .dt
The problem now is that the plotting code calls df._get_numeric_data() which drops all datetime/objects. I've already got the date/time data stored, I just need to be able to select it cleanly.
ahah maybe add a way to include it then? or some method of coercing ?
I'm going to close this for now. matplotlib is able to handle it if everything on the axis is the same type (dates or times), but it gets confused when there's a mix.
It might be helpful documentation to add a comment to this (closed) issue indicating how to do this in matplotlib (plot_date ?).
There is still a bug in both pandas and matplotlib, I think, that a datetime.time against datetime.time scatter plot does not work. If that's a different issue, perhaps linking them would be good. (This issue is the closest I've found.)
@JeffAbrahamson By any chance did you find a way to plot time vs time scatterplots or histograms? (I tried, but failed, and Google brought me here and to this SO question.) Of course, other than doing it manually or sticking to seconds since epoch.
I did not figure it out. My use case had smallish times (I was visualizing members' split times in a race) and so I eventually plotted integer seconds against integer seconds (numbers were all in the range from 450 to 1350).
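For anyone wanting to do the same, a minimal sketch of the seconds conversion described above. The helper name and the sample split times are hypothetical.

```python
from datetime import time


def seconds_since_midnight(t: time) -> int:
    """Convert a datetime.time to whole seconds since midnight."""
    return t.hour * 3600 + t.minute * 60 + t.second


# Hypothetical split times: 7 min 30 s and 22 min 30 s.
splits = [time(0, 7, 30), time(0, 22, 30)]
xs = [seconds_since_midnight(t) for t in splits]
print(xs)  # [450, 1350]
```

The resulting integers can be handed to any scatter plot directly; the trade-off is that the axis labels are raw seconds rather than formatted times.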
Hi @TomAugspurger and @jreback! I think it might be worth re-opening this issue; @jaclynweiser (with a few others) and I have been surprised recently by things like this:
from datetime import datetime
import pandas as pd
df = pd.DataFrame({'x': [datetime.now() for _ in range(10)], 'y': range(10)})
df.plot(x='x', y='y', kind='scatter')
This gives KeyError: 'x'.
Interestingly, you _do_ get a plot with just df.plot(x='x', y='y'); it seems like if you can make a line graph, you should be able to make a scatterplot too.
What do you think? Is there a good work-around for this? If so, what? It's surprising to me that a datetime scatterplot isn't possible with pandas.
Agreed that it's surprising. Right now time series plots (datetime x axis) are completely separate from everything else. I've had refactoring all that to integrate with all our other plotting code on my todo list for a while.
Best workaround right now is probably df.plot(x=x, y=y, style=".")
Thanks @TomAugspurger!
Yup, found this rather surprising as well, not to be able to scatterplot a datetime object against a numeric object. (If I've followed the thread correctly, I should be okay with a datetime.time object, but it seems pd.to_datetime naturally gives a datetime, and pandas doesn't give me an equivalent method to get a datetime.time?)
Isn't df.plot(x, y, style=".") very misleading? I mean, scatter plots are meant to show the underlying data, which, for instance, need not be sampled at a regular interval. style="." is just doing a line plot with dots instead of dashes, right? This would give a misleading impression about both the regularity and frequency of sampling (which is usually my main motivation for plotting time series as a scatterplot instead of a line plot in the first place). It seems that explicitly converting the dates to decimals would be a better work-around, though it results in much less pretty axis labels (at least without a bit more plot-magic than I can quickly drum up).
(Also, apologies if I'm off the mark here, am relatively new to pandas. Thanks for considering).
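One possible version of the "convert the dates to decimals" idea, sketched with matplotlib's date2num and xaxis_date so the tick labels stay readable. This is only an illustration under assumptions: the data is made up, and the Agg backend is chosen just so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, an assumption for this sketch

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical, irregularly sampled data.
df = pd.DataFrame({
    "x": pd.to_datetime(["2016-01-01 00:00", "2016-01-01 06:00", "2016-01-03 12:00"]),
    "y": [1.0, 3.0, 2.0],
})

# date2num turns datetimes into floats measured in days.
x_num = mdates.date2num(df["x"].to_numpy())

fig, ax = plt.subplots()
ax.scatter(x_num, df["y"])
ax.xaxis_date()       # render the float x values as dates on the axis
fig.autofmt_xdate()   # rotate the date labels so they do not overlap
```

Because the x values are real positions in time, irregular sampling shows up as uneven spacing between the points.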
Any update on this? As of 0.18, pandas still gives a key error when plot is called with kind='scatter' and a datetime column.
@colin-svds this is a closed issue (from quite a while ago). you can open a new one if you would like. but pls read this one for work-arounds. I don't know if @TomAugspurger has anything more.
Nope I haven't ever made progress on it.
I've reopened it for now if anyone wants to take a shot. It might be as simple as reworking the plotting methods to use something other than ._get_numeric_data, or it could be harder.
FWIW, till this issue is fixed people can still use this directly:
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
df = pd.DataFrame({'x': [datetime.now() for _ in range(10)], 'y': range(10)})
plt.scatter(df.x.dt.to_pydatetime(), df.y)
plt.show()

Any solution for this as of now @TomAugspurger @jreback ?
It's still open. Let us know if you're interested in working on it.
I think this issue can be closed now. @TomAugspurger @jreback
Regarding:
Matplotlib is ok with datetime.time values, it chokes on datetime values.
Tom tested this long ago (the issue was opened 5 years ago) and matplotlib has changed its behaviour since then, so now the situation is the opposite: matplotlib is ok with datetime values but chokes on datetime.time values.
As for datetime values, I tried on master, and it looks okay to me:

The x axis looks different, but this is because pandas has its own datetime formatter, which is slightly different from matplotlib's. That is a separate issue to me.
@charlesdong1991 can u add a test for the above
@jreback I think the test I used in #30434 is almost identical to this one except for the freq. See below, which is from the test case in #30434:
from datetime import date
import numpy as np
import pandas as pd
dates = pd.date_range(start=date(2019, 1, 1), periods=12, freq="W")
vals = np.random.normal(0, 1, len(dates))
df = pd.DataFrame({"dates": dates, "vals": vals})
The reason I initially used xref (rather than closes) for this issue was that the values in the example above were datetime.time. I just tested and found that datetime.time is no longer supported by matplotlib, but datetime is. Therefore, I think this could be closed directly.

Do you still want to have the same test for this? I could add one if you prefer this way, though a bit duplicated compared to the existing one.
can u add the test and assert that error; that would be enough to close this i think
ok @jreback see #30602