Pandas: Can't plot pd.Timedelta (or numpy.timedelta64) versus time

Created on 26 Dec 2014 · 12 Comments · Source: pandas-dev/pandas

Hello,

I did

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

idx = pd.date_range('20140101', '20140201')
df = pd.DataFrame(index=idx)
df['col0'] = np.random.randn(len(idx))
s_idx = pd.Series(idx, index=idx) # need to do this because we can't shift index
diff_idx = (s_idx-s_idx.shift(1)).fillna(pd.Timedelta(0))
df['diff_dt'] = diff_idx
df['diff_dt'].plot()

but it raises Empty 'Series': no numeric data to plot

In [78]: df.dtypes
Out[78]:
col0               float64
diff_dt    timedelta64[ns]
dtype: object


In [79]: type(df['diff_dt'][1])
Out[79]: pandas.tslib.Timedelta

I don't understand whether the data inside the diff_dt column are numpy.timedelta64 or pd.Timedelta

df['diff_dt'].map(lambda x: x.value)

raises AttributeError: 'numpy.timedelta64' object has no attribute 'value'

but it seems that I can get value for a single element (let's say row 10)

In [97]: df['diff_dt'][10].value
Out[97]: 86400000000000

I don't understand why...
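A minimal sketch of what seems to be going on, assuming a reasonably recent pandas: the column is stored with a timedelta64 dtype, while scalar access boxes each element into a pd.Timedelta, which is why .value works on a single element. The names diff_dt and secs are only illustrative, and .dt.total_seconds() is used here as one unit-independent way to get plain numbers.

```python
import pandas as pd

idx = pd.date_range('20140101', '20140201')
s_idx = pd.Series(idx, index=idx)
diff_dt = (s_idx - s_idx.shift(1)).fillna(pd.Timedelta(0))

# The Series reports a timedelta64 storage dtype...
print(diff_dt.dtype)

# ...but a single element comes back boxed as pd.Timedelta,
# which exposes .value (an integer count of nanoseconds).
element = diff_dt.iloc[10]
print(type(element), element.value)

# Vectorised conversion to plain floats (seconds), no per-element map needed.
secs = diff_dt.dt.total_seconds()
print(secs.iloc[10])
```

The resulting float Series is ordinary numeric data, so it can be passed to the usual plotting methods.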

But I also don't understand how I could plot the diff_dt column without doing an ugly:

df['diff_dt'].map(lambda x: x/np.timedelta64(1, 'ns')).plot()

But I don't know how I could get this np.timedelta64(1, 'ns') unit automatically.

Maybe pandas could plot Timedelta (or np.timedelta64) out of the box?
That would be useful, for example, to check whether the sampling period is constant.

Kind regards

Duplicate Timedelta Visualization

All 12 comments

dupe of #8711

@jreback I don't think this is a dupe of #8711. That's about formatting, this one is about getting an exception when you try to plot timedelta64 data. Or rather, the first code snippet and subsequent exception is not yet solved, and still present in 0.20.3

pandas doesn't support plotting timedeltas out of the box

ok. thanks for the quick response.

Is there an easy/best practice workaround?

I tried with astype('timedelta64[m]') and it worked; other units could be used there. There's an SO question about the exception. Care to answer it? (I will if you're not keen.)

@drevicko -- try (x5.task_a / np.timedelta64(1, 'h')).plot.kde()? This would be hours on x-axis distribution, use np.timedelta64(1, 'm') for minutes.
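For readers following along, here is a small sketch of that division trick. The Series name durations and its values are made up for illustration (x5.task_a from the comment above is not available here); dividing a timedelta Series by a one-unit timedelta yields plain floats in that unit.

```python
import numpy as np
import pandas as pd

# Hypothetical durations, standing in for a column such as x5.task_a.
durations = pd.Series(pd.to_timedelta([90, 120, 45, 210], unit='min'))

# Dividing by a one-hour timedelta gives float hours; nothing is rounded.
hours = durations / np.timedelta64(1, 'h')
print(hours.tolist())

# pd.Timedelta works as the divisor too, here for minutes.
minutes = durations / pd.Timedelta(minutes=1)
print(minutes.tolist())

# The float Series can then be plotted as usual, for example:
# hours.plot.kde()   # needs matplotlib and scipy installed
```

Because the result is an ordinary float64 Series, any numeric plot kind should accept it.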

@pratapvardhan Thanks, ~though for me, x5.task_a.astype('timedelta64[h]') is a bit tidier but it rounds the conversion - perhaps not as big a problem if you use minutes as units ('timedelta64[m]')~. Rounding actually perturbs the resulting plot/histogram, and can give you an incorrect idea about the data!
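To illustrate the rounding concern, here is a rough sketch with made-up durations. It uses np.floor on the exact ratio as a stand-in for the whole-unit conversion (an assumption made for portability, since recent pandas versions may not accept hour or minute units in astype).

```python
import numpy as np
import pandas as pd

# Hypothetical durations of 50, 70, 110 and 130 minutes.
durations = pd.Series(pd.to_timedelta([50, 70, 110, 130], unit='min'))

# Exact conversion: four distinct fractional hour values.
exact_hours = durations / np.timedelta64(1, 'h')

# Whole-hour truncation: 70 and 110 minutes collapse into the same value,
# so a histogram built from these numbers hides the spread.
whole_hours = np.floor(exact_hours)

print(exact_hours.round(2).tolist())
print(whole_hours.tolist())
```

With coarse units the truncated values pile up on integers, which is the distortion described above; dividing keeps the fractional part.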

I like plotting with the kernel density estimate .plot.kde(). For others that come here, you can also do a histogram with x5.task_a.astype('timedelta64[m]').hist() and the usual matplotlib hist() parameters if you don't like the defaults.

@jreback https://github.com/pandas-dev/pandas/issues/8711 was merged for 0.20.0, so we do support timedelta plotting, no?

Actually, this works on master now (only the formatting of the axis is not that informative (uses the integer values)).

Suppose this was closed by #17430

@scls19fr @drevicko could you try on master to verify ?

So closing as duplicate of https://github.com/pandas-dev/pandas/issues/16953 (which was actually also opened by @scls19fr).

I didn't remember this 2014 bug when I opened https://github.com/pandas-dev/pandas/issues/16953. I'm sorry about this duplicate.
But I'd like to say that plotting only integer values is not very informative (not informative enough, I should probably say).

Ah sorry, didn't see this issue was from 2014, I thought it was opened today :-)

You are certainly correct about it not being very informative. So PR https://github.com/pandas-dev/pandas/pull/15067 added better formatting, but I suppose only for the x axis. Do you want to open an issue to generalize this better timedelta formatting to all cases ?

I'm currently building the latest pandas (on a MacBook Air)... so it may take a while.
If I open an issue to generalize this better timedelta formatting to all cases, I'd prefer to do it with the latest pandas version (master) so I can show some screenshots.
