cuDF: [BUG] read_csv gives wrong values for datetimes after year 2037

Created on 27 Aug 2020 · 1 Comment · Source: rapidsai/cudf

Describe the bug
cudf.read_csv returns wrong values for dates later than '2038-01-19 03:14:07'.
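
The cutoff quoted above coincides with the largest value that a signed 32-bit count of seconds since the Unix epoch can hold. The sketch below uses only the Python standard library to show that coincidence and what a wrap-around would produce. Whether cuDF's reader actually overflows in this way is an assumption, not something this report confirms, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT32_MAX = 2**31 - 1  # largest signed 32-bit integer


def from_epoch_seconds(seconds):
    """Convert a count of seconds since 1970-01-01 to a naive datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(value):
    """Simulate storing `value` in a signed 32-bit integer (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


# Last second that fits in 32 bits, and the second after it once wrapped.
last_ok = from_epoch_seconds(INT32_MAX)
wrapped = from_epoch_seconds(wrap_int32(INT32_MAX + 1))

print(last_ok)  # 2038-01-19 03:14:07
print(wrapped)  # 1901-12-13 20:45:52
```

If the values printed by the reproducer for the later dates land near 1901, that would be consistent with this kind of overflow. Other wrong values would point to a different cause.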

Code to reproduce bug

import pandas as pd
import cudf
from io import StringIO

# Build a small CSV with dates on both sides of 2038-01-19 03:14:07.
dates = pd.Series(['1970-1-1', '2037-12-31', '2038-01-19 03:14:07', '2038-01-19 03:14:08', '12/31/2040'])
s = pd.DataFrame({'d': dates}).to_csv()
print(s)

df = pd.read_csv(StringIO(s), parse_dates=['d'])     # works
gdf = cudf.read_csv(StringIO(s), parse_dates=['d'])  # wrong result
print(gdf)

Expected behavior
read_csv should parse all of these dates correctly, since none of them falls outside the supported range.
With nanosecond resolution, the maximum representable value is Timestamp('2262-04-11 23:47:16.854775807').
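
As a rough check of the range argument, the upper bound quoted above can be derived from the largest signed 64-bit count of nanoseconds since the Unix epoch. The sketch below uses only the standard library and assumes the timestamps are stored as 64-bit nanosecond counts, which is how pandas represents them. The sample dates are copied from the reproducer.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer

# datetime only has microsecond resolution, so convert the whole seconds
# and carry the nanosecond remainder separately.
seconds, nanos = divmod(INT64_MAX, 10**9)
upper = EPOCH + timedelta(seconds=seconds)
upper_text = f"{upper:%Y-%m-%d %H:%M:%S}.{nanos:09d}"
print(upper_text)  # 2262-04-11 23:47:16.854775807

# The dates from the reproducer, written out as datetime objects.
samples = [
    datetime(1970, 1, 1),
    datetime(2037, 12, 31),
    datetime(2038, 1, 19, 3, 14, 7),
    datetime(2038, 1, 19, 3, 14, 8),
    datetime(2040, 12, 31),
]

# Every sample is well below the nanosecond upper bound.
in_range = [d <= upper for d in samples]
print(in_range)  # [True, True, True, True, True]
```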

Environment overview

  • Environment location: Bare-metal
  • Method of cuDF install: from source

Labels: bug, cuIO, libcudf

Most helpful comment

Clearly RAPIDS is saying the world ends in 2037...
