Pandas: to_datetime should support ISO week year

Created on 5 Jun 2017  ·  15 Comments  ·  Source: pandas-dev/pandas

to_datetime does not currently seem to support ISO week year like strptime does:

In [38]: datetime.date(2016, 1, 1).strftime('%G-%V')
Out[38]: '2015-53'

In [39]: datetime.datetime.strptime(datetime.date(2016, 1, 1).strftime('%G-%V')+'-1', '%G-%V-%u')
Out[39]: datetime.datetime(2015, 12, 28, 0, 0)

In [41]: pd.to_datetime(datetime.date(2016, 1, 1).strftime('%G-%V')+'-1', format='%G-%V-%u')
        ---------------------------------------------------------------------------
        TypeError                                 Traceback (most recent call last)
        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
            443             try:
        --> 444                 values, tz = tslib.datetime_to_datetime64(arg)
            445                 return DatetimeIndex._simple_new(values, name=name, tz=tz)

        pandas/_libs/tslib.pyx in pandas._libs.tslib.datetime_to_datetime64 (pandas/_libs/tslib.c:33275)()

        TypeError: Unrecognized value type: <class 'str'>

        During handling of the above exception, another exception occurred:

        ValueError                                Traceback (most recent call last)
        <ipython-input-41-7ce30c959690> in <module>()
        ----> 1 pd.to_datetime(datetime.date(2016, 1, 1).strftime('%G-%V')+'-1', format='%G-%V-%u')

        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
            516         result = _convert_listlike(arg, box, format)
            517     else:
        --> 518         result = _convert_listlike(np.array([arg]), box, format)[0]
            519 
            520     return result

        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
            445                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
            446             except (ValueError, TypeError):
        --> 447                 raise e
            448 
            449     if arg is None:

        /Users/Robin/.pyenv/versions/3.6.1/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
            412                     try:
            413                         result = tslib.array_strptime(arg, format, exact=exact,
        --> 414                                                       errors=errors)
            415                     except tslib.OutOfBoundsDatetime:
            416                         if errors == 'raise':

        pandas/_libs/tslib.pyx in pandas._libs.tslib.array_strptime (pandas/_libs/tslib.c:63124)()

        pandas/_libs/tslib.pyx in pandas._libs.tslib.array_strptime (pandas/_libs/tslib.c:63003)()

        ValueError: 'G' is a bad directive in format '%G-%V-%u'
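Until to_datetime handles these directives, one possible workaround is to parse with the standard library first. This is only a sketch, assuming Python 3.6+ (where `datetime.strptime` accepts `%G`, `%V`, and `%u`, as shown in the session above). The helper name `parse_iso_week` is made up for illustration and is not part of pandas.

```python
import datetime


def parse_iso_week(text, weekday=1):
    """Parse an ISO 'year-week' string such as '2015-53' into a datetime.

    `weekday` is the ISO weekday to resolve to (1 = Monday ... 7 = Sunday).
    Hypothetical helper for illustration only; not a pandas API.
    """
    # strptime needs a weekday alongside %G and %V to pin down a single day.
    return datetime.datetime.strptime(f"{text}-{weekday}", "%G-%V-%u")


parsed = parse_iso_week("2015-53")
print(parsed)  # 2015-12-28 00:00:00
```

The resulting `datetime` objects should then be usable as input to `pd.to_datetime` without a `format` argument, since they no longer need string parsing.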

INSTALLED VERSIONS

commit: None

pandas: 0.20.1
pytest: 3.1.0
pip: 9.0.1
setuptools: 28.8.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.10
pymysql: None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Compat Timeseries

All 15 comments

sure could be added.

pull-requests are welcome.

@jreback can you help me start working on this pull request? Like where to look to fix it?

The relevant part should be in array_strptime in /_libs/tslib.pyx.

A good ref impl is datetime's strptime: https://github.com/python/cpython/blob/6f0eb93183519024cb360162bdd81b9faec97ba6/Lib/_strptime.py#L321
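For anyone picking this up, the date arithmetic behind `%G-%V-%u` can be sketched in plain Python. This is not copied from the CPython code linked above and is not the pandas implementation; it relies only on the ISO 8601 rule that January 4 always falls in week 1. The function name `iso_to_gregorian` is an assumption made for this example.

```python
import datetime


def iso_to_gregorian(iso_year, iso_week, iso_weekday):
    """Convert an ISO (year, week, weekday) triple to a calendar date.

    Illustrative sketch only; the name and signature are hypothetical.
    """
    # January 4 is always in ISO week 1, so step back to that week's Monday.
    jan4 = datetime.date(iso_year, 1, 4)
    week1_monday = jan4 - datetime.timedelta(days=jan4.isoweekday() - 1)
    # Then move forward by whole weeks and by the weekday offset (1 = Monday).
    return week1_monday + datetime.timedelta(weeks=iso_week - 1, days=iso_weekday - 1)


print(iso_to_gregorian(2015, 53, 1))  # 2015-12-28
```

A quick sanity check is to round-trip dates through `date.isocalendar()` and confirm the same date comes back.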

@buyology Could I be assigned this task so that I can work on its pull request?

@rosygupta feel free to work on this :-)

@buyology This will be my first attempt to make a PR here. Would you say this is the right task to take on?

@rosygupta it involves some cython, but should be pretty straightforward as long as you get the environment up and running properly. otherwise, if you want something more lightweight, go and look for novice-labeled issues :rocket:

@buyology Seems achievable. Where do I look for the tests for this particular piece to check my code?

@rosygupta this is a new feature request, so there aren't existing tests for it. Similar tests to what you would need to add are in https://github.com/pandas-dev/pandas/blob/73930c58e8eac4031608bb8c4bf624d77e1d1dcb/pandas/tests/indexes/datetimes/test_tools.py

Hey, I'm not sure why the tests of all classes are not being executed when testing. Only 6 classes are being executed. Can someone clarify?

@buyology @TomAugspurger Opened a PR. Need some guidance.

Hey, I'm curious what's happening with this issue. I took a look at the referenced PR @rosygupta made and it looks like the PR fixes the issue; it just needs to be rebased / have any merge conflicts fixed now? Any update on getting it merged?

> just needs to be rebased / fix any merge conflicts now?

you are welcome to do this

Sure I can tackle this.

So the datetime code has been moved out of _libs/tslib.pyx making a rebase hard to do. Is it better to move her changes out and reapply them manually?
