SITUATION
conda create -n rapids16 -c rapidsai -c nvidia -c conda-forge -c defaults rapids=0.16 python=3.7 cudatoolkit=10.2
CODE
import cudf
gdf = cudf.read_csv('data.csv')
gdf.head()
ERROR
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-2-301086af0e1f> in <module>
1 #gdf = cudf.read_csv('unused.csv')
----> 2 gdf = cudf.read_csv('data.csv')
3 gdf.head()
~/anaconda3/envs/rapids16/lib/python3.7/contextlib.py in inner(*args, **kwds)
72 def inner(*args, **kwds):
73 with self._recreate_cm():
---> 74 return func(*args, **kwds)
75 return inner
76
~/anaconda3/envs/rapids16/lib/python3.7/site-packages/cudf/io/csv.py in read_csv(filepath_or_buffer, lineterminator, quotechar, quoting, doublequote, header, mangle_dupe_cols, usecols, sep, delimiter, delim_whitespace, skipinitialspace, names, dtype, skipfooter, skiprows, dayfirst, compression, thousands, decimal, true_values, false_values, nrows, byte_range, skip_blank_lines, parse_dates, comment, na_values, keep_default_na, na_filter, prefix, index_col, **kwargs)
84 na_filter=na_filter,
85 prefix=prefix,
---> 86 index_col=index_col,
87 )
88
cudf/_lib/csv.pyx in cudf._lib.csv.read_csv()
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
SIMILAR ISSUES
Can you provide a sample/subset csv?
If you cannot provide the csv, can you confirm whether reading the csv via pandas and then converting the pandas object to cudf works without any memory error:
import cudf
import pandas as pd
pdf = pd.read_csv('data.csv')
gdf = cudf.from_pandas(pdf)
print(gdf.head())
If I run the above code in Jupyter Lab, it runs for a long time, then the loading star next to the code cell just disappears, with no error message.
Terminal
[I 10:51:38.008 LabApp] Starting buffering for long_ID
[I 10:53:36.705 LabApp] Saving file at /folder/notebookname.ipynb
[I 10:55:27.150 LabApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel long_ID restarted
kernel long_ID restarted
[I 10:55:42.032 LabApp] Starting buffering for long_ID
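A kernel restart with no error message, like the one in the log above, may mean the process ran out of memory while loading the whole file at once, though nothing in the thread confirms this. One thing that could be tried is reading the file in pieces through the `byte_range` parameter that appears in the `cudf.read_csv` signature in the traceback. The sketch below only computes the `(offset, size)` pairs; the helper name `byte_ranges` and the 256 MiB chunk size are assumptions for illustration, not part of cudf.

```python
def byte_ranges(total_size, chunk_size):
    """Yield (offset, size) pairs that cover total_size bytes in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while offset < total_size:
        # The last chunk may be shorter than chunk_size.
        size = min(chunk_size, total_size - offset)
        yield offset, size
        offset += size


# Hypothetical usage, commented out because it needs a GPU and cudf:
# import os
# import cudf
# file_size = os.path.getsize('data.csv')
# for offset, size in byte_ranges(file_size, 256 * 1024 * 1024):
#     part = cudf.read_csv('data.csv', byte_range=(offset, size))
#     print(len(part))
```

How the header row and column names are handled for chunks after the first should be checked against the cudf documentation before relying on this. If even a small first chunk fails with the same `MemoryError`, the GPU memory is probably exhausted by something other than the file itself.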