This raises
s = Series(range(5),pd.timedelta_range('1day',periods=5))
s.plot()
This will show the timedeltas with a formatted (albeit string) index
s.index = s.index.format()
s.plot()
I wonder if we can just register a converter somehow, like in #8614?
I don't think matplotlib already has a converter for datetime.timedelta, so just registering our Timedelta type will not be enough. E.g. plt.plot(s.index.to_pytimedelta(), s) also fails.
But writing a basic converter should not be that difficult I think (and if it also works for datetime.timedelta it could maybe also be pushed upstream to matplotlib)
Timedelta is a subclass of datetime.timedelta
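Since Timedelta subclasses datetime.timedelta, the numeric half of such a converter could serve both types. A minimal sketch, assuming seconds as the axis unit; the helper names timedelta_to_float and float_to_label are hypothetical, and hooking them into matplotlib's unit registry is not shown:

```python
from datetime import timedelta


def timedelta_to_float(td):
    """Map a timedelta to a float number of seconds (the assumed axis unit)."""
    return td.total_seconds()


def float_to_label(seconds):
    """Turn an axis position in seconds back into a readable tick label."""
    return str(timedelta(seconds=seconds))


# Hypothetical data: five daily steps, like the index in the report above.
index = [timedelta(days=i + 1) for i in range(5)]
positions = [timedelta_to_float(td) for td in index]
labels = [float_to_label(p) for p in positions]
```

The positions are plain floats that any plotting call accepts; the labels are what a tick formatter would need to produce from them.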
I just encountered a MemoryError when attempting to plot a TimedeltaIndex!
pd.Series(range(15), pd.timedelta_range(0, freq='H', periods=15)).plot()
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-113-e9a2d53dcace> in <module>()
----> 1 pd.Series(range(15), pd.timedelta_range(0, freq='H', periods=15)).plot()
/Users/shoyer/dev/pandas/pandas/tools/plotting.pyc in plot_series(data, kind, ax, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, label, secondary_y, **kwds)
2516 yerr=yerr, xerr=xerr,
2517 label=label, secondary_y=secondary_y,
-> 2518 **kwds)
2519
2520
/Users/shoyer/dev/pandas/pandas/tools/plotting.pyc in _plot(data, x, y, subplots, ax, kind, **kwds)
2322 plot_obj = klass(data, subplots=subplots, ax=ax, kind=kind, **kwds)
2323
-> 2324 plot_obj.generate()
2325 plot_obj.draw()
2326 return plot_obj.result
/Users/shoyer/dev/pandas/pandas/tools/plotting.pyc in generate(self)
925 self._make_legend()
926 self._post_plot_logic()
--> 927 self._adorn_subplots()
928
929 def _args_adjust(self):
/Users/shoyer/dev/pandas/pandas/tools/plotting.pyc in _adorn_subplots(self)
1058 ax.set_xticklabels(xticklabels)
1059 self._apply_axis_properties(ax.xaxis, rot=self.rot,
-> 1060 fontsize=self.fontsize)
1061 self._apply_axis_properties(ax.yaxis, fontsize=self.fontsize)
1062 elif self.orientation == 'horizontal':
/Users/shoyer/dev/pandas/pandas/tools/plotting.pyc in _apply_axis_properties(self, axis, rot, fontsize)
1069
1070 def _apply_axis_properties(self, axis, rot=None, fontsize=None):
-> 1071 labels = axis.get_majorticklabels() + axis.get_minorticklabels()
1072 for label in labels:
1073 if rot is not None:
/Users/shoyer/miniconda/envs/rapid/lib/python2.7/site-packages/matplotlib/axis.pyc in get_majorticklabels(self)
1166 def get_majorticklabels(self):
1167 'Return a list of Text instances for the major ticklabels'
-> 1168 ticks = self.get_major_ticks()
1169 labels1 = [tick.label1 for tick in ticks if tick.label1On]
1170 labels2 = [tick.label2 for tick in ticks if tick.label2On]
/Users/shoyer/miniconda/envs/rapid/lib/python2.7/site-packages/matplotlib/axis.pyc in get_major_ticks(self, numticks)
1295 'get the tick instances; grow as necessary'
1296 if numticks is None:
-> 1297 numticks = len(self.get_major_locator()())
1298 if len(self.majorTicks) < numticks:
1299 # update the new tick label properties from the old
/Users/shoyer/dev/pandas/pandas/tseries/converter.pyc in __call__(self)
901 vmin, vmax = vmax, vmin
902 if self.isdynamic:
--> 903 locs = self._get_default_locs(vmin, vmax)
904 else: # pragma: no cover
905 base = self.base
/Users/shoyer/dev/pandas/pandas/tseries/converter.pyc in _get_default_locs(self, vmin, vmax)
882
883 if self.plot_obj.date_axis_info is None:
--> 884 self.plot_obj.date_axis_info = self.finder(vmin, vmax, self.freq)
885
886 locator = self.plot_obj.date_axis_info
/Users/shoyer/dev/pandas/pandas/tseries/converter.pyc in _daily_finder(vmin, vmax, freq)
505 Period(ordinal=int(vmax), freq=freq))
506 span = vmax.ordinal - vmin.ordinal + 1
--> 507 dates_ = PeriodIndex(start=vmin, end=vmax, freq=freq)
508 # Initialize the output
509 info = np.zeros(span,
/Users/shoyer/dev/pandas/pandas/tseries/period.pyc in __new__(cls, data, ordinal, freq, start, end, periods, copy, name, tz, **kwargs)
637 else:
638 data, freq = cls._generate_range(start, end, periods,
--> 639 freq, kwargs)
640 else:
641 ordinal, freq = cls._from_arraylike(data, freq, tz)
/Users/shoyer/dev/pandas/pandas/tseries/period.pyc in _generate_range(cls, start, end, periods, freq, fields)
651 raise ValueError('Can either instantiate from fields '
652 'or endpoints, but not both')
--> 653 subarr, freq = _get_ordinal_range(start, end, periods, freq)
654 elif field_count > 0:
655 subarr, freq = _range_from_fields(freq=freq, **fields)
/Users/shoyer/dev/pandas/pandas/tseries/period.pyc in _get_ordinal_range(start, end, periods, freq)
1317 dtype=np.int64)
1318 else:
-> 1319 data = np.arange(start.ordinal, end.ordinal + 1, dtype=np.int64)
1320
1321 return data, freq
MemoryError:
> /Users/shoyer/dev/pandas/pandas/tseries/period.py(1319)_get_ordinal_range()
1318 else:
-> 1319 data = np.arange(start.ordinal, end.ordinal + 1, dtype=np.int64)
1320
Working on this. Doesn't look too bad.
As an update, it's a bit worse than I thought. I think it was @changhiskhan who put in a ton of heuristics for figuring out what resolution to draw when plotting datetimes. I wasn't sure if we'd need that for timedeltas, and then I got busy with other things. My branch is here
As a workaround, the following works with master:
plt.plot(s.index,s.values)
I don't think freq adjustment for different timedeltas is mandatory for an initial version. If that's OK, I'll give it a try.
Coming here from #10650, and adding a little more info just in case it can help. In my case, the bug manifests in _get_ordinal_range's end parameter having a huge ordinal. This means the following line:
data = np.arange(start.ordinal, end.ordinal + 1, mult, dtype=np.int64)
allocates a gigantic array. To be specific, when doing:
pd.Series(np.random.randn(4), index=pd.timedelta_range('0:00:00', periods=4, freq='min')).plot()
the values of start.ordinal and end.ordinal are 0 and 180000000000, respectively.
@lucas-eyer is the mult parameter on that line appropriate, or is it some very small number? That might be the source of the issue...
I don't know what appropriate would be, but it's 1 (one).
Edit: pip freeze | grep pandas gives pandas==0.17.0.
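For scale, the arithmetic behind the ordinals reported above can be checked with the standard library alone. This sketch assumes the index values reach the axis as nanoseconds and that each int64 element takes 8 bytes; the variable names are illustrative only:

```python
from datetime import timedelta

NS_PER_SECOND = 10**9
BYTES_PER_INT64 = 8

# Last value of a 4-point index that starts at 0 and steps by one minute.
last = timedelta(minutes=3)
end_ordinal = (last // timedelta(seconds=1)) * NS_PER_SECOND
start_ordinal = 0
step = 1  # the reported value of mult

# Number of elements an arange over [start, end] with this step would hold.
n_elements = len(range(start_ordinal, end_ordinal + 1, step))
n_bytes = n_elements * BYTES_PER_INT64
```

With a step of 1 the array would need roughly 1.44 TB, which is consistent with the MemoryError.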
I also just ran into this issue on 0.17.1. I'm not very familiar with the code, but it appears the issue is in pandas.tseries.converter.
The issue is that vmin and vmax as specified in the call to _get_default_locs in the get_major_locator function are in nanoseconds as returned from XAxis.get_view_interval:
def __call__(self):
    'Return the locations of the ticks.'
    # axis calls Locator.set_axis inside set_m<xxxx>_formatter
    vi = tuple(self.axis.get_view_interval())  # THIS IS IN NANOS
    if vi != self.plot_obj.view_interval:
        self.plot_obj.date_axis_info = None
    self.plot_obj.view_interval = vi
    vmin, vmax = vi
    if vmax < vmin:
        vmin, vmax = vmax, vmin
    if self.isdynamic:
        locs = self._get_default_locs(vmin, vmax)  # VMIN AND VMAX ARE IN NANOS
    else:  # pragma: no cover
        base = self.base
        (d, m) = divmod(vmin, base)
        vmin = (d + 1) * base
        locs = lrange(vmin, vmax + 1, base)
    return locs
But downstream in _daily_finder the freq parameter is used, which means that the system is interpreting the deltas in terms of minutes/hours/etc. rather than nanos:
def _daily_finder(vmin, vmax, freq):
    periodsperday = -1
    if freq >= FreqGroup.FR_HR:
        if freq == FreqGroup.FR_NS:
            periodsperday = 24 * 60 * 60 * 1000000000
        # ETC MAPPING periodsperday
        # .....
    # save this for later usage
    vmin_orig = vmin
    # NOW THESE ARE INTERPRETED AS MINUTES (or whatever freq)
    (vmin, vmax) = (Period(ordinal=int(vmin), freq=freq),
                    Period(ordinal=int(vmax), freq=freq))
Replacing the final line above with
(vmin, vmax) = (Period(ordinal=int(vmin), freq='N'), Period(ordinal=int(vmax), freq='N'))
appears to fix the issue.
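To make the mismatch concrete, here is a small sketch of what happens when a nanosecond view limit is read as an ordinal in a coarser frequency, and what rescaling it first would give. The table NS_PER_UNIT and the function ns_to_ordinal are hypothetical helpers, not part of pandas; the change proposed above instead builds the Period objects with a nanosecond frequency.

```python
# Nanoseconds per unit for a few fixed-length frequencies (illustrative table).
NS_PER_UNIT = {
    "ns": 1,
    "us": 10**3,
    "ms": 10**6,
    "s": 10**9,
    "min": 60 * 10**9,
    "h": 3600 * 10**9,
    "D": 86400 * 10**9,
}


def ns_to_ordinal(value_ns, freq):
    """Rescale a nanosecond axis value to a whole count of `freq` units."""
    return value_ns // NS_PER_UNIT[freq]


vmax_ns = 180_000_000_000  # three minutes, as the locator sees it

# Misread: the raw number is taken as a count of minutes.
misread_span_years = vmax_ns / (60 * 24 * 365.25)

# Rescaled: the same limit expressed in minutes.
rescaled = ns_to_ordinal(vmax_ns, "min")
```

Read as minutes, the raw value spans hundreds of thousands of years; rescaled, it is the expected 3.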
@Liam3851 glad you have tracked this down! Any chance you're interested in making a pull request with the fix? :)
Sure, I just have to figure out how to do it lol. Longtime pandas user but kinda new on this github thingy. I'll head over to the FAQ.
Great! Give it a try and let us know if you have any questions :).
Lots of love from me too @Liam3851!
Hmm, ok, still slightly more complicated. I was testing the fix: the bounds are now right and the graphs themselves look correct, but the axis labels don't always work properly (sometimes they disappear), probably because of how the labels are interpreted. I'm busy these next few days but I'll try to get around to making the fix sound.
Just guessing, but you could be hitting what I ran into. I can't remember how much progress if any I made on that.
@TomAugspurger Hmm.. I'll try your version to see what it does. From the diff it looks like we're taking slightly different paths. It looks like you were building a TimedeltaConverter that worked parallel to DatetimeConverter and TimeConverter; I've been trying to fix the codepath the timedeltas are currently taking (through DatetimeConverter). But it's entirely possible that getting it to look just right will require going down your path.
I’d say getting it somewhat functional is good enough for now. Hopefully you don’t have to go down that rabbit hole.
Hello. I am using pandas version 0.19.0 and matplotlib version 1.5.3 with Python 3 and this issue is still there: if I try to plot a DataFrame whose index is a timedelta, I get a MemoryError. I am working around this by calling plt.plot(df.index, df.values), but it would be nice if there was a proper fix for this...
@sam-cohan As you can see, the issue is still open, so it's indeed not yet solved. But any help is certainly welcome!
Sorry I was looking at the wrong "Closed" :)
Really wish this was fixed. I'm using datetime as a work around but stringing along 1970-01-01 to do time deltas is not fun.
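The datetime workaround mentioned here amounts to offsetting every timedelta from a fixed origin. A small sketch using only the standard library; the 1970-01-01 origin matches the comment, while the helper names are hypothetical:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def as_datetime(td):
    """Shift a timedelta onto the datetime axis by anchoring it at the epoch."""
    return EPOCH + td


def as_timedelta(dt):
    """Undo the shift to recover the original timedelta."""
    return dt - EPOCH


# Hypothetical data: three hourly steps starting at zero.
deltas = [timedelta(hours=h) for h in range(3)]
stamps = [as_datetime(td) for td in deltas]
```

The cost is the one described: every tick label carries a 1970 date that has to be hidden or explained.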
@TomAugspurger does your branch with a first attempt still exist? (the link above is not working anymore)
So the issue here is that we are using Int64Index as the base class for TimedeltaIndex, but we are trying to use the plotting routines for PeriodIndex, which rely on DatetimeIndex (matplotlib.dates) underneath. matplotlib.dates scales the view interval to the selected frequency; Int64Index does not, which explains the issues above.
@jgoppert you should take a look at pandas/tseries/converter.py and the TimeConverter and DatetimeConverter classes. A possible way forward is to make a new TimedeltaConverter similar to those.
@jorisvandenbossche I did consider that approach, but I think having a separate matplotlib plotting function is cleaner and will require less maintenance. We also won't have to worry about ever seeing Jan 1970 on the timedelta plot like we do on the PeriodIndex-based plots now. It seems pretty robust and I have added nanosecond-level precision labels.
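A label formatter of the kind described can be sketched without matplotlib. This is an illustration only, assuming tick positions arrive as non-negative integer nanoseconds; format_timedelta_tick is a hypothetical name and is not claimed to match the code in the PR:

```python
def format_timedelta_tick(value_ns, n_decimals=9):
    """Format nanoseconds as '[D days ]HH:MM:SS[.fraction]'.

    Assumes value_ns >= 0 and 0 <= n_decimals <= 9.
    """
    seconds, ns = divmod(int(value_ns), 10**9)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    label = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    if n_decimals > 0:
        # Keep only the leading n_decimals digits of the nanosecond part.
        fraction = ns // 10 ** (9 - n_decimals)
        label += f".{fraction:0{n_decimals}d}"
    if days:
        label = f"{days} days {label}"
    return label
```

Because the label is built from the raw offset, no calendar date such as Jan 1970 can appear in it.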
@TomAugspurger does your branch with a first attempt still exist? (the link above is not working anymore)
Seems like I deleted that branch when I was cleaning up my fork. I didn't get far beyond the TimedeltaConverter, which is pretty straightforward. IIRC the difficult part was getting the dynamic relabeling to work like datetimes do (which can be a separate fix from fixing the memory error).
@TomAugspurger can you take a look at my PR. Totally different approach but seems to work for me.
Does this mean the fix for this will be in next release? If so, what is the timeline for that? Thanks in advance.
@sam-cohan yes it will be in 0.20.0
I think we are still about 1 month away from an rc.