Describe the bug
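Error after loading files (possibly an issue with RMM): the file is read and printed correctly, but the process then aborts with an RMM deallocation error.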
Steps/Code to reproduce bug
1) Download this file https://github.com/rapidsai/cudf/blob/branch-0.11/python/cudf/cudf/tests/data/orc/TestOrcFile.test1.orc
2) Run this script
import cudf
path = "/code/cudf/python/cudf/cudf/tests/data/orc/TestOrcFile.test1.orc"
gdf = cudf.read_orc(path)
print(gdf)
3) See results at the end:
boolean1 byte1 short1 int1 long1 ... list.int1 list.string1 map map.int1 map.string1
0 False 1 1024 65536 9223372036854775807 ... 4 chani 5 chani
1 True 100 2048 65536 9223372036854775807 ... 2147483647 mauddib 1 mauddib
[2 rows x 16 columns]
terminate called after throwing an instance of 'thrust::system::system_error'
what(): rmm_allocator::deallocate(): RMM_FREE: initialization error
Aborted
Expected behavior
Should be able to read the file without exceptions.
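Because the abort happens while the interpreter is shutting down, a `try`/`except` around `cudf.read_orc` will not see it. One way to check for it automatically is to run the reproduction in a child interpreter and inspect the exit status. The following is only a sketch: the helper names `run_in_child` and `aborted_on_exit` are made up for illustration, it assumes a POSIX system (where a process killed by SIGABRT is reported with a negative return code), and `REPRO` simply repeats the script above with the reporter's local path.

```python
import signal
import subprocess
import sys


def run_in_child(source, timeout=120.0):
    """Run Python source in a fresh interpreter.

    Returns (returncode, stdout, stderr) of the child process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def aborted_on_exit(returncode):
    """True if the child was killed by SIGABRT (POSIX: negative signal number)."""
    return returncode == -signal.SIGABRT


# The reproduction script from this report (path is the reporter's local path).
REPRO = """
import cudf
path = "/code/cudf/python/cudf/cudf/tests/data/orc/TestOrcFile.test1.orc"
gdf = cudf.read_orc(path)
print(gdf)
"""
```

Calling `run_in_child(REPRO)` and passing the return code to `aborted_on_exit` would then distinguish a clean exit from the `terminate called ...` abort, and the captured stderr should contain the `RMM_FREE` message if the problem is present.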
Environment overview (please complete the following information)
Environment details
Please run and paste the output of the cudf/print_env.sh script here, to gather any other relevant environment details:
`(n103) percy@pctabz:~/Blazing/projects/code/cudf$ ./print_env.sh
***git***
commit 64588be99736aef7b3f84eba0698a009e1e815e6 (HEAD -> branch-0.10, origin/branch-0.10, origin/HEAD)
Merge: 335f10f 2c3cfd3
Author: Ray Douglass
Date: Fri Oct 11 14:53:50 2019 -0400
Merge pull request #3058 from beckernick/docs/udf-doc-markdown-formatting
[REVIEW] Fix UDF doc markdown formatting
***git submodules***
-b165e1fb11eeea64ccf95053e40f2424312599cc thirdparty/cub
-63f644be44201467e3938d59ed9d89cc8725c35d thirdparty/jitify
***OS Information***
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.4 LTS"
NAME="Ubuntu"
VERSION="16.04.4 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.4 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
Linux pctabz 4.13.0-36-generic #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
***GPU Information***
Fri Oct 11 16:12:37 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 105... Off | 00000000:01:00.0 Off | N/A |
| N/A 45C P8 N/A / N/A | 133MiB / 4042MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 10587 C ./testing-libgdf 123MiB |
+-----------------------------------------------------------------------------+
***CPU***
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Stepping: 9
CPU MHz: 2800.000
CPU max MHz: 3800,0000
CPU min MHz: 800,0000
BogoMIPS: 5616.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti retpoline intel_pt rsb_ctxsw tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
***CMake***
/home/percy/Applications/anaconda/conda/envs/n103/bin/cmake
cmake version 3.15.4
CMake suite maintained and supported by Kitware (kitware.com/cmake).
***g++***
/usr/bin/g++
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
***nvcc***
***Python***
/home/percy/Applications/anaconda/conda/envs/n103/bin/python
Python 3.7.4
***Environment Variables***
PATH : /home/percy/Applications/anaconda/conda/envs/n103/bin:/home/percy/Applications/anaconda/conda/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/home/percy/Applications/anaconda/conda/bin
LD_LIBRARY_PATH :
NUMBAPRO_NVVM :
NUMBAPRO_LIBDEVICE :
CONDA_PREFIX : /home/percy/Applications/anaconda/conda/envs/n103
PYTHON_PATH :
***conda packages***
/home/percy/Applications/anaconda/conda/condabin/conda
# packages in environment at /home/percy/Applications/anaconda/conda/envs/n103:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
alsa-lib 1.1.5 h516909a_1001 conda-forge
arrow-cpp 0.14.1 py37h5ac5442_4 conda-forge
blazingdb-protocol 1.0 dev_0 <develop>
blazingsql-toolchain 0.4.4 0 blazingsql
bokeh 1.3.4 py37_0 conda-forge
boost-cpp 1.70.0 h8e57a91_2 conda-forge
brotli 1.0.7 he1b5a44_1000 conda-forge
bzip2 1.0.8 h516909a_1 conda-forge
c-ares 1.15.0 h516909a_1001 conda-forge
ca-certificates 2019.9.11 hecc5488_0 conda-forge
certifi 2019.9.11 py37_0 conda-forge
chardet 3.0.4 pypi_0 pypi
click 7.0 py_0 conda-forge
cloudpickle 1.2.2 py_0 conda-forge
cmake 3.15.4 hf94ab9c_0 conda-forge
cudatoolkit 10.0.130 0
cudf 0.10.0a191011 py37_2577 rapidsai-nightly/label/cuda10.0
curl 7.65.3 hbc83047_0
cython 0.29.13 py37he1b5a44_0 conda-forge
cytoolz 0.10.0 py37h516909a_0 conda-forge
dask 2.5.2 py_0 conda-forge
dask-core 2.5.2 py_0 conda-forge
dask-cudf 0.10.0a191011 py37_2577 rapidsai-nightly/label/cuda10.0
distributed 2.5.2 py_0 conda-forge
dlpack 0.2 he1b5a44_1 conda-forge
double-conversion 3.1.5 he1b5a44_1 conda-forge
et-xmlfile 1.0.1 pypi_0 pypi
expat 2.2.5 he1b5a44_1003 conda-forge
fastavro 0.22.5 py37h516909a_0 conda-forge
flatbuffers 1.11 pypi_0 pypi
fontconfig 2.13.1 h86ecdb6_1001 conda-forge
freetype 2.10.0 he983fc9_1 conda-forge
fsspec 0.5.2 py_0 conda-forge
gflags 2.2.2 he1b5a44_1001 conda-forge
giflib 5.1.7 h516909a_1 conda-forge
gitdb2 2.0.6 pypi_0 pypi
gitpython 3.0.3 pypi_0 pypi
glog 0.4.0 he1b5a44_1 conda-forge
gmock 1.10.0 0 conda-forge
grpc-cpp 1.23.0 h18db393_0 conda-forge
gtest 1.10.0 hc9558a2_0 conda-forge
heapdict 1.0.1 py_0 conda-forge
icu 64.2 he1b5a44_1 conda-forge
idna 2.8 pypi_0 pypi
jdcal 1.4.1 pypi_0 pypi
jinja2 2.10.3 py_0 conda-forge
jpeg 9c h14c3975_1001 conda-forge
krb5 1.16.1 h173b8e3_7
lcms2 2.9 h2e4bb80_0 conda-forge
libblas 3.8.0 13_openblas conda-forge
libcblas 3.8.0 13_openblas conda-forge
libcudf 0.10.0a191011 cuda10.0_2577 rapidsai-nightly/label/cuda10.0
libcurl 7.65.3 h20c2e04_0
libedit 3.1.20181209 hc058e9b_0
libevent 2.1.10 h72c5cf5_0 conda-forge
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libiconv 1.15 h516909a_1005 conda-forge
liblapack 3.8.0 13_openblas conda-forge
libnvstrings 0.10.0a191011 cuda10.0_2577 rapidsai-nightly/label/cuda10.0
libopenblas 0.3.7 h6e990d7_1 conda-forge
libpng 1.6.37 hed695b0_0 conda-forge
libprotobuf 3.8.0 h8b12597_0 conda-forge
librmm 0.10.0a191011 cuda10.0_169 rapidsai-nightly/label/cuda10.0
libssh2 1.8.2 h22169c7_2 conda-forge
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.0.10 h57b8799_1003 conda-forge
libuuid 2.32.1 h14c3975_1000 conda-forge
libuv 1.32.0 h516909a_0 conda-forge
libxcb 1.13 h14c3975_1002 conda-forge
libxml2 2.9.9 hee79883_5 conda-forge
llvmlite 0.29.0 py37hfd453ef_1 conda-forge
locket 0.2.0 py_2 conda-forge
lz4-c 1.8.3 he1b5a44_1001 conda-forge
markupsafe 1.1.1 py37h14c3975_0 conda-forge
maven 3.6.0 0 conda-forge
msgpack-python 0.6.2 py37hc9558a2_0 conda-forge
ncurses 6.1 he6710b0_1
numba 0.45.1 py37hb3f55d8_0 conda-forge
numpy 1.17.2 py37h95a1406_0 conda-forge
nvstrings 0.10.0a191011 py37_2577 rapidsai-nightly/label/cuda10.0
olefile 0.46 py_0 conda-forge
openjdk 8.0.192 h14c3975_1003 conda-forge
openpyxl 3.0.0 pypi_0 pypi
openssl 1.1.1c h516909a_0 conda-forge
packaging 19.2 py_0 conda-forge
pandas 0.24.2 py37hb3f55d8_0 conda-forge
parquet-cpp 1.5.1 2 conda-forge
partd 1.0.0 py_0 conda-forge
pillow 6.2.0 py37h34e0f95_0
pip 19.2.3 py37_0
psutil 5.6.3 py37h516909a_0 conda-forge
pthread-stubs 0.4 h14c3975_1001 conda-forge
py4j 0.10.7 py_1 conda-forge
pyarrow 0.14.1 py37h8b68381_2 conda-forge
pyblazing 0.1 dev_0 <develop>
pydrill 0.3.4 pypi_0 pypi
pymysql 0.9.3 pypi_0 pypi
pynvml 8.0.3 pypi_0 pypi
pyparsing 2.4.2 py_0 conda-forge
pyspark 2.4.3 py_0 conda-forge
python 3.7.4 h265db76_1
python-dateutil 2.8.0 py_0 conda-forge
pytz 2019.3 py_0 conda-forge
pyyaml 5.1.2 py37h516909a_0 conda-forge
rapidjson 1.1.0 he1b5a44_1002 conda-forge
re2 2019.09.01 he1b5a44_0 conda-forge
readline 7.0 h7b6447c_5
requests 2.22.0 pypi_0 pypi
rhash 1.3.6 h14c3975_1001 conda-forge
rmm 0.10.0a191011 py37_169 rapidsai-nightly/label/cuda10.0
setuptools 41.4.0 py37_0
six 1.12.0 py37_1000 conda-forge
smmap2 2.0.5 pypi_0 pypi
snappy 1.1.7 he1b5a44_1002 conda-forge
sortedcontainers 2.1.0 py_0 conda-forge
sqlite 3.30.0 h7b6447c_0
tblib 1.4.0 py_0 conda-forge
thrift-cpp 0.12.0 hf3afdfd_1004 conda-forge
tk 8.6.8 hbc83047_0
toolz 0.10.0 py_0 conda-forge
tornado 6.0.3 py37h516909a_0 conda-forge
uriparser 0.9.3 he1b5a44_1 conda-forge
urllib3 1.25.6 pypi_0 pypi
wheel 0.33.6 py37_0
xorg-fixesproto 5.0 h14c3975_1002 conda-forge
xorg-inputproto 2.3.2 h14c3975_1002 conda-forge
xorg-kbproto 1.0.7 h14c3975_1002 conda-forge
xorg-libx11 1.6.9 h516909a_0 conda-forge
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xorg-libxext 1.3.4 h516909a_0 conda-forge
xorg-libxfixes 5.0.3 h516909a_1004 conda-forge
xorg-libxi 1.7.10 h516909a_0 conda-forge
xorg-libxrender 0.9.10 h516909a_1002 conda-forge
xorg-libxtst 1.2.3 h14c3975_1002 conda-forge
xorg-recordproto 1.14.2 h14c3975_1002 conda-forge
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge
xorg-xextproto 7.3.0 h14c3975_1002 conda-forge
xorg-xproto 7.0.31 h14c3975_1007 conda-forge
xz 5.2.4 h14c3975_4
yaml 0.1.7 h14c3975_1001 conda-forge
zict 1.0.0 py_0 conda-forge
zlib 1.2.11 h7b6447c_3
zstd 1.4.0 h3b9ef0a_0 conda-forge
`
CUDA 10.0
FWIW, I'm seeing the same thing when running test_parquet.py from the cudf Python tests in branch-0.11, so it is not specific to ORC.
Yep, same on batched reads of a big CSV. Seems to happen on Python exit of an otherwise ok kernel:
import cudf

# infile, outfile, step, offset and col are assumed to be set earlier in the script (not shown)
combined_dfs = None
cols = [str(c) for c in cudf.io.read_csv(infile, nrows=1).columns]
print('columns: %s' % cols)
while True:
    print("Step", str(offset))
    if offset > 100000:
        print('EARLY EXIT FOR 1M+ ROWS')
        break
    # read the next batch of rows
    df = cudf.io.read_csv(infile, nrows=step, skiprows=offset, names=cols)
    if combined_dfs is None:
        combined_dfs = df
    else:
        # append the batch and drop rows duplicated on col
        combined_dfs = cudf.concat([combined_dfs, df]).drop_duplicates([col])
    print('combined after lines: %s' % len(df))
    offset += step  # advance to the next batch (not in the snippet as posted)
print('done, saving to disk')
combined_dfs.to_pandas().to_csv(outfile, index=False, chunksize=100000, header=False)
print('done, exiting.')
=>
...
Step 200001
EARLY EXIT FOR 1M+ ROWS
done, saving to disk
done, exiting.
terminate called after throwing an instance of 'thrust::system::system_error'
what(): rmm_allocator::deallocate(): RMM_FREE: initialization error
real 0m33.327s
user 0m0.028s
sys 0m0.027s
This is on rapidsai-nightly (Docker) on a P100, and the CSV was only taking 20% of GPU RAM. It happens on both cuda9.2-ubuntu16.04 and cuda10.0-ubuntu18.04.
(There is no error on rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04.)
Should be fixed in branch-0.10 by PR #3061 (but currently blocked by PR #3089 in branch-0.11)
Great! Hope we can have the fix soon in the nightly release for 0.10.
I can confirm the issue is gone! Thank you!