pandas version: 0.16.2
matplotlib version: 1.4.3 (an older version produced a different error message)
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(10,2))
df_o = df.astype(np.object)
df_o.hist()
ValueError Traceback (most recent call last)
<ipython-input-1-26253737011d> in <module>()
4 df = pd.DataFrame(np.random.rand(10,2))
5 df_o = df.astype(np.object)
----> 6 df_o.hist()
/usr/local/lib/python2.7/dist-packages/pandas/tools/plotting.pyc in hist_frame(data, column, by, grid, xlabelsize, xrot, ylabelsize, yrot, ax, sharex, sharey, figsize, layout, bins, **kwds)
2764 fig, axes = _subplots(naxes=naxes, ax=ax, squeeze=False,
2765 sharex=sharex, sharey=sharey, figsize=figsize,
-> 2766 layout=layout)
2767 _axes = _flatten(axes)
2768
/usr/local/lib/python2.7/dist-packages/pandas/tools/plotting.pyc in _subplots(naxes, sharex, sharey, squeeze, subplot_kw, ax, layout, layout_type, **fig_kw)
3244
3245 # Create first subplot separately, so we can share it if requested
-> 3246 ax0 = fig.add_subplot(nrows, ncols, 1, **subplot_kw)
3247
3248 if sharex:
/usr/local/lib/python2.7/dist-packages/matplotlib/figure.pyc in add_subplot(self, *args, **kwargs)
962 self._axstack.remove(ax)
963
--> 964 a = subplot_class_factory(projection_class)(self, *args, **kwargs)
965
966 self._axstack.add(key, a)
/usr/local/lib/python2.7/dist-packages/matplotlib/axes/_subplots.pyc in __init__(self, fig, *args, **kwargs)
62 raise ValueError(
63 "num must be 0 <= num <= {maxn}, not {num}".format(
---> 64 maxn=rows*cols, num=num))
65 if num == 0:
66 warnings.warn("The use of 0 (which ends up being the "
ValueError: num must be 0 <= num <= 0, not 1
We (only?) plot numeric types (df._get_numeric_data, IIRC).
What's your use case here that you're getting object dtypes? Integer NaNs? You'll typically want to avoid object dtypes, since they're much slower for numeric operations.
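A minimal sketch of why nothing is left to plot, reusing the object-dtype frame from the original report. It uses the public select_dtypes method as a stand-in for the private df._get_numeric_data mentioned above; treating the two as equivalent here is an assumption, not a statement about the pandas internals. The conversion with pd.to_numeric is one possible workaround and not a fix proposed in this thread.

```python
import numpy as np
import pandas as pd

# Same setup as the report: numeric values stored in object-dtype columns.
df_o = pd.DataFrame(np.random.rand(10, 2)).astype(object)

# The default string representation hides this; dtypes makes it visible.
print(df_o.dtypes)

# Keeping only numeric columns leaves zero columns, so there are no axes to draw.
numeric_only = df_o.select_dtypes(include="number")
print(numeric_only.shape)

# Possible workaround: convert each column back to a numeric dtype before plotting.
df_fixed = df_o.apply(pd.to_numeric)
print(df_fixed.dtypes)
```

Checking df.dtypes first is usually the quickest way to tell whether a frame has fallen into this case.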
I agree with what you're saying, and I think a suitable fix would include a more explicit check for the dtypes and show an error. As it is, I spent some time trying to figure out what the issue was, especially because the string representation of the DataFrame doesn't show the dtypes.
I am using floats and integers, but by accident, when I constructed the DataFrame, all entries were NaN objects, and then I populated the DataFrame in a loop.
An error message like "hist method requires numerical columns, nothing to plot" or anything clearer would be useful.
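The check suggested here could be sketched roughly as follows. The helper name require_numeric_columns is hypothetical and exists only for illustration; it is not part of pandas, and the use of select_dtypes to find numeric columns is an assumption about how such a check might be written.

```python
import numpy as np
import pandas as pd


def require_numeric_columns(df):
    """Hypothetical helper: return the numeric columns or raise a clear error."""
    numeric = df.select_dtypes(include="number")
    if numeric.shape[1] == 0:
        # Message text taken from the suggestion in this comment.
        raise ValueError("hist method requires numerical columns, nothing to plot")
    return numeric


df_o = pd.DataFrame(np.random.rand(10, 2)).astype(object)

try:
    require_numeric_columns(df_o)
except ValueError as err:
    message = str(err)
    print(message)
```

Calling such a helper before df.hist() would surface the dtype problem directly instead of the matplotlib "num must be ..." error.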
Any progress on this? It took me quite a few hours to realize it was a dtype error.
Still open. Interested in submitting a PR to fix it?
@TomAugspurger i get a nonsense error here from a dataframe with 2 columns, but not when histogramming them one at a time as a series:
ValueError: num must be 1 <= num <= 0, not 1
traceback:
ValueError Traceback (most recent call last)
<ipython-input-41-7cfbfac10616> in <module>()
1 dfi['ml_data'][
----> 2 ['duration_', 'duration__']].dropna().astype('timedelta64[D]').astype(float).hist(bins=20)
/usr/local/lib/python3.6/dist-packages/pandas/plotting/_core.py in hist_frame(data, column, by, grid, xlabelsize, xrot, ylabelsize, yrot, ax, sharex, sharey, figsize, layout, bins, **kwds)
2176 fig, axes = _subplots(naxes=naxes, ax=ax, squeeze=False,
2177 sharex=sharex, sharey=sharey, figsize=figsize,
-> 2178 layout=layout)
2179 _axes = _flatten(axes)
2180
/usr/local/lib/python3.6/dist-packages/pandas/plotting/_tools.py in _subplots(naxes, sharex, sharey, squeeze, subplot_kw, ax, layout, layout_type, **fig_kw)
235
236 # Create first subplot separately, so we can share it if requested
--> 237 ax0 = fig.add_subplot(nrows, ncols, 1, **subplot_kw)
238
239 if sharex:
/usr/local/lib/python3.6/dist-packages/matplotlib/figure.py in add_subplot(self, *args, **kwargs)
1072 self._axstack.remove(ax)
1073
-> 1074 a = subplot_class_factory(projection_class)(self, *args, **kwargs)
1075
1076 self._axstack.add(key, a)
/usr/local/lib/python3.6/dist-packages/matplotlib/axes/_subplots.py in __init__(self, fig, *args, **kwargs)
62 raise ValueError(
63 "num must be 1 <= num <= {maxn}, not {num}".format(
---> 64 maxn=rows*cols, num=num))
65 self._subplotspec = GridSpec(rows, cols)[int(num) - 1]
66 # num - 1 for converting from MATLAB to python indexing
ValueError: num must be 1 <= num <= 0, not 1
colab notebook:
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.33+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0
numpy: 1.14.5
matplotlib: 2.1.2
@denfromufa do you want to fix it?
Here you have the general documentation on how to do it: https://pandas.pydata.org/pandas-docs/stable/contributing.html
The fix should be easy, just checking the type and raising an exception with a useful message.
@datapythonista i don't agree on this solution - it should just work. why raise an exception when histogram works for each series? however i did not have a chance to debug this yet.
if you make it work even better, feel free to send a PR for it.
I am struggling with a similar issue. I am building data frames from various sources, each with 2690 rows; from one source I can get the histogram to work, from the other I get the error reported above (and below). My plan was to convert both data frames to a dict using df.to_dict() and back to a data frame using pd.DataFrame.from_dict(), to explore more and reproduce the issue here in case someone could point out what the problem was. But when I do that, they both plot just fine. E.g.
dic=df1.to_dict()
df1=pd.DataFrame.from_dict(dic)
When I try to examine the original data frames for NaNs, etc, I cannot tell a difference. Any idea why converting my dfs to dict and back solves this issue?
df0:
print(df0.sort_values('rate',ascending=False).head(5))
print(pd.isnull(df).sum())
df1:
print(df1.sort_values('rate',ascending=False).head(5))
print(pd.isnull(df).sum())
ValueError Traceback (most recent call last)
9 print(df.loc[:,['obnme','rate','total_et','acres']].sort_values('obnme',ascending=False).to_dict())
10 print(pd.isnull(df).sum())
---> 11 df.hist('rate',bins=np.arange(df['rate'].min(),df['rate'].max(),0.25))
12 plt.title('ET rate for all WR')
13 plt.xlabel('ET rate (ft/yr)')
C:\conda3x64\envs\p3x64\lib\site-packages\pandas\plotting\_core.py in hist_frame(data, column, by, grid, xlabelsize, xrot, ylabelsize, yrot, ax, sharex, sharey, figsize, layout, bins, **kwds)
2406 fig, axes = _subplots(naxes=naxes, ax=ax, squeeze=False,
2407 sharex=sharex, sharey=sharey, figsize=figsize,
-> 2408 layout=layout)
2409 _axes = _flatten(axes)
2410
C:\conda3x64\envs\p3x64\lib\site-packages\pandas\plotting\_tools.py in _subplots(naxes, sharex, sharey, squeeze, subplot_kw, ax, layout, layout_type, **fig_kw)
236
237 # Create first subplot separately, so we can share it if requested
--> 238 ax0 = fig.add_subplot(nrows, ncols, 1, **subplot_kw)
239
240 if sharex:
C:\conda3x64\envs\p3x64\lib\site-packages\matplotlib\figure.py in add_subplot(self, *args, **kwargs)
1237 self._axstack.remove(ax)
1238
-> 1239 a = subplot_class_factory(projection_class)(self, *args, **kwargs)
1240 self._axstack.add(key, a)
1241 self.sca(a)
C:\conda3x64\envs\p3x64\lib\site-packages\matplotlib\axes\_subplots.py in __init__(self, fig, *args, **kwargs)
65 raise ValueError(
66 ("num must be 1 <= num <= {maxn}, not {num}"
---> 67 ).format(maxn=rows*cols, num=num))
68 self._subplotspec = GridSpec(
69 rows, cols, figure=self.figure)[int(num) - 1]
ValueError: num must be 1 <= num <= 0, not 1
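One plausible explanation for the to_dict() round trip described above, assuming the failing frame stores its numbers in object-dtype columns (the thread does not confirm this): rebuilding the frame from a plain dict lets pandas infer the dtypes again, so the values come back as float64. The sample values below are made up for illustration; only the column name 'rate' is taken from the comment.

```python
import pandas as pd

# Assumed starting point: numeric values held in an object-dtype column.
df1 = pd.DataFrame({"rate": [1.0, 2.5, 4.0]}, dtype=object)
before = df1.dtypes["rate"]

# The round trip from the comment above.
dic = df1.to_dict()
df1 = pd.DataFrame.from_dict(dic)
after = df1.dtypes["rate"]

# Checking for NaNs does not reveal the difference; comparing dtypes does.
print(before, after)
```

If the two original frames show different output for df.dtypes, that would support this explanation.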
Anyone working on this? If not, I'd like to take this up as my first issue.
all yours @matsmaiwald, thanks