Describe the bug
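After overwriting one upc_nbr value with 681131184420 and writing the DataFrame with to_orc, reading the file back with cudf.read_orc returns 2526351652 for that entry.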
Steps/Code to reproduce bug
>>> import cudf
>>> df = cudf.read_orc('to_orc_bug.orc')
>>> df.upc_nbr[(df.visit_nbr == 14600028) & (df.store_nbr == 47)] = 681131184420
>>> df.to_orc('to_orc_bug.orc', compression='snappy')
>>> df2 = cudf.read_orc('to_orc_bug.orc')
>>> df2.upc_nbr[(df2.visit_nbr == 14600028) & (df2.store_nbr == 47)]
999786 2526351652
Expected behavior
Returned value should be 681131184420, not 2526351652.
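One arithmetic observation that may help narrow this down: the value read back matches the low 32 bits of the value that was written. The snippet below only checks that arithmetic with the two numbers from this report. Whether it actually points to a 32-bit truncation somewhere in the ORC read path is an assumption, not something confirmed here.

```python
# Values taken from the report above.
expected = 681131184420  # value assigned before to_orc
observed = 2526351652    # value returned by cudf.read_orc

# Keep only the low 32 bits of the expected value.
low_32_bits = expected & 0xFFFFFFFF

print(low_32_bits)
print(low_32_bits == observed)
```

If the comparison holds, checking other rows whose values exceed 2**32 would be a cheap way to see whether the pattern is general or specific to this entry.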
cc @devavret in case #5324 is related
I tried reading the file written by cuDF with pyarrow, and the value comes back correctly.
import cudf
import pyarrow.orc as orc
df = cudf.read_orc("to_orc_bug.orc")
df.upc_nbr[(df.visit_nbr == 14600028) & (df.store_nbr == 47)] = 681131184420
df.to_orc("to_orc_bug2.orc", compression="snappy")
pdf = orc.ORCFile("to_orc_bug2.orc").read().to_pandas()
print(pdf.upc_nbr[(pdf.visit_nbr == 14600028) & (pdf.store_nbr == 47)])
999786 6.811312e+11
Name: upc_nbr, dtype: float64
Since pyarrow reads the cuDF-written file correctly, this seems to be an issue in the reader rather than the writer.
Relabeled the issue accordingly.
This was an easy fix, but I'm still trying to figure out how to properly add tests for this or, more generally, for anything in cuIO.