pandas.Series.dt.total_seconds() documentation confusing

Created on 30 Oct 2017 · 9 Comments · Source: pandas-dev/pandas

Code Sample, a copy-pastable example if possible

from datetime import datetime

import pandas as pd

series = pd.Series(datetime(2017, 10, 30))

# Try with regular datetime Series (will fail)
series.dt.total_seconds()
Output: 
AttributeError: 'DatetimeProperties' object has no attribute 'total_seconds'



[second code sample missing from the source]

Output:
0    86400.0
dtype: float64

Problem description

The documentation for pd.Series.dt.total_seconds() is a bit confusing. Reading it, one would expect the method to work on any Series with the dt accessor. It does not: the API reference for datetimelike properties makes clear that the method is only available on timedelta Series.
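To make the distinction concrete, here is a minimal sketch, assuming a recent pandas release. The variable names and the helper `total_seconds_or_none` are hypothetical and only illustrate one way to guard the call; they are not part of pandas.

```python
import pandas as pd

# A datetime64 Series: its .dt accessor does not offer total_seconds().
datetimes = pd.Series(pd.to_datetime(["2017-10-30", "2017-10-31"]))

# Subtracting a timestamp yields a timedelta64 Series, where the method exists.
deltas = datetimes - pd.Timestamp("2017-10-29")
seconds = deltas.dt.total_seconds()
print(seconds.tolist())  # [86400.0, 172800.0]


def total_seconds_or_none(series):
    """Hypothetical guard: return seconds for timedelta Series, else None."""
    if pd.api.types.is_timedelta64_dtype(series):
        return series.dt.total_seconds()
    return None


print(total_seconds_or_none(datetimes))  # None
```

Checking the dtype first avoids the AttributeError from the code sample, at the cost of one extra call.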

It might help to state explicitly in the documentation for pd.Series.dt.total_seconds() that the method is only available on timedelta Series. Most people reach the page by googling "Pandas total_seconds" or something similar, so they rarely see where it falls in the API reference.

Is there any precedent for these kinds of heads-ups? I'm happy to make the documentation change and would like to follow any existing conventions (if they exist).

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels: Docs, Timedelta, Timeseries, good first issue

All 9 comments

Annotating the supported data type in the docstring for each of these methods would be helpful.

@TomAugspurger Do you know if this kind of data type annotation is already done elsewhere in the API? If so, could you provide an example? I'd like to make sure I keep things consistent (if applicable).

I do not know of any other examples. It could be as simple as a

------
This method applies to only Series with timedelta64 dtype
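As a rough illustration, such a note might look like this inside a docstring. Placing it under a numpydoc-style "Notes" heading is an assumption, and the function below is a stand-in for illustration only, not the pandas implementation.

```python
def total_seconds(self):
    """
    Total duration of each element expressed in seconds.

    Notes
    -----
    This method applies only to Series with timedelta64 dtype.
    """
    # Stand-in body: only the docstring matters for this sketch.
    raise NotImplementedError("docstring illustration only")


print(total_seconds.__doc__)
```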

Now that I'm looking into this more, this isn't going to be as straightforward as I expected. It appears Series.dt borrows attributes/methods (and by extension docstrings) from DatetimeIndex and TimedeltaIndex. Documenting the data types in this context would be confusing, I think.
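One way to see the borrowing in practice is to inspect what Series.dt resolves to for each dtype. This is a small sketch, assuming a recent pandas release. The accessor class names it prints are internal details that may differ between versions.

```python
import pandas as pd

datetime_series = pd.Series(pd.to_datetime(["2017-10-30"]))
timedelta_series = pd.Series(pd.to_timedelta(["1 days"]))

# Series.dt resolves to a different accessor class depending on the dtype.
datetime_accessor = type(datetime_series.dt).__name__
timedelta_accessor = type(timedelta_series.dt).__name__
print(datetime_accessor, timedelta_accessor)

# Only the timedelta accessor exposes total_seconds().
has_on_datetime = hasattr(datetime_series.dt, "total_seconds")
has_on_timedelta = hasattr(timedelta_series.dt, "total_seconds")
print(has_on_datetime, has_on_timedelta)  # False True
```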

Thoughts?

Hmm, what about just slipping the type in the short description then? Maybe:

     def total_seconds(self):
         """
-        Total duration of each element expressed in seconds.
+        Total duration of each timedelta expressed in seconds.
         """
         return Index(self._maybe_mask_results(1e-9 * self.asi8),
                      name=self.name)

That works for some and doesn't work for others. For example, many of the DatetimeIndex methods refer to the fact that they work on/return DatetimeIndexes specifically (see Series.dt.tz_localize() for a good example).

While I could make them all more general to fit with their use in Series.dt, that doesn't feel right; they're attributes/methods on the DatetimeIndex class, not generic functions.

I think the issue here comes from these methods being "borrowed" by Series.dt. Is there a cleaner way this can be done that allows for expressive, helpful, and unique docstrings for Series.dt and DatetimeIndex/TimedeltaIndex?

Taking the tz_localize example, at this point we would probably turn to a shared_doc docstring and replace DatetimeIndex with %(klass)s, which would substitute in DatetimeIndex or Series depending on what it's being attached to. Unfortunately, I don't know how shared docs interact with accessors. Our current setup may not be capable of handling that.
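The substitution idea can be sketched in plain Python. Everything here is hypothetical: the template wording, the `with_shared_doc` decorator, and the two stand-in classes are illustrative only and are not pandas' actual shared-doc machinery.

```python
# Hypothetical template; the wording is illustrative, not pandas' real docstring.
TZ_LOCALIZE_DOC = "Localize tz-naive %(klass)s to tz-aware %(klass)s."


def with_shared_doc(template, **substitutions):
    """Fill a %-style template and attach the result as the docstring."""
    def decorator(func):
        func.__doc__ = template % substitutions
        return func
    return decorator


class FakeDatetimeIndex:
    """Stand-in for the class that owns the method."""

    @with_shared_doc(TZ_LOCALIZE_DOC, klass="DatetimeIndex")
    def tz_localize(self):
        pass


class FakeSeriesAccessor:
    """Stand-in for an accessor that borrows the same method."""

    @with_shared_doc(TZ_LOCALIZE_DOC, klass="Series")
    def tz_localize(self):
        pass


print(FakeDatetimeIndex.tz_localize.__doc__)
print(FakeSeriesAccessor.tz_localize.__doc__)
```

In this sketch each owner gets its own wording from a single template. Whether the same approach can be wired through the accessor delegation is the open question raised in the comment above.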

That said, a PR fixing the ones that can be done easily would be beneficial on its own. After that, you're welcome to spend as much time as you want digging into the shared docs and accessors :)
