cuDF: [BUG] Illegal memory access when writing several large chunks of data with ParquetWriter.write_table

Created on 27 Jun 2020 · 9 Comments · Source: rapidsai/cudf

Describe the bug
I am filing this bug at the suggestion of a contributor to the original library. The original issue is NVIDIA/NVTabular#102.

While using cudf._lib.parquet.ParquetWriter.write_table() to write several large chunks of data into one .parquet file, I encountered the following error.

RuntimeError: CUDA error encountered at: /cudf/cpp/src/io/parquet/writer_impl.cu:341: 700 cudaErrorIllegalAddress an illegal memory access was encountered

This issue occurs when running optimize_criteo.ipynb. Each DataFrame was loaded from part of a large CSV file by cudf.read_csv() with the byte_range option (https://github.com/NVIDIA/NVTabular/blob/master/nvtabular/io.py#L220-L226). The stack trace is below, and the cuda-memcheck result is in the original issue (https://github.com/NVIDIA/NVTabular/issues/102#issuecomment-648154896). In addition, the program sometimes hangs with 100% GPU usage.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-2af48634df31> in <module>
     47         del gdf
     48 path_out = '/data/criteo/parquet/'
---> 49 file_to_pq(train_set, 'csv', output_folder=path_out, cols=cols, dtypes=dtypes)

<ipython-input-1-2af48634df31> in file_to_pq(target_files, file_type, output_folder, cols, dtypes)
     43         if file_path != old_file_path:
     44             writer = ParquetWriter(path)
---> 45         writer.write_table(gdf)
     46         old_file_path = file_path
     47         del gdf

cudf/_lib/parquet.pyx in cudf._lib.parquet.ParquetWriter.write_table()

RuntimeError: CUDA error encountered at: /cudf/cpp/src/io/parquet/writer_impl.cu:341: 700 cudaErrorIllegalAddress an illegal memory access was encountered
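
The chunked loading described above, where each DataFrame comes from cudf.read_csv() with a byte_range, can be sketched roughly as follows. This is only an illustration and not NVTabular's actual code: byte_ranges and read_chunks are hypothetical helper names, and the chunk size is up to the caller. The byte_range argument is passed as (offset, size), as in the cudf-only repro later in this thread.

```python
def byte_ranges(file_size, chunk_size):
    """Yield (offset, size) pairs that cover file_size bytes in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, file_size, chunk_size):
        # The last chunk may be shorter than chunk_size.
        yield offset, min(chunk_size, file_size - offset)


def read_chunks(path, file_size, chunk_size, **read_csv_kwargs):
    """Lazily read a CSV file in byte-range chunks (needs cudf and a GPU)."""
    import cudf  # imported here so byte_ranges() can be used without a GPU

    for offset, size in byte_ranges(file_size, chunk_size):
        yield cudf.read_csv(path, byte_range=(offset, size), **read_csv_kwargs)
```

Each DataFrame produced this way is then passed to writer.write_table(), which is where the error above is raised.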

Steps/Code to reproduce bug

Note that, according to the bug analysis in the original repo, this issue happens only on Pascal GPUs. It does not happen on Volta GPUs.

Expected behavior

No error and no hang.

Environment overview (please complete the following information)

  • Environment location: Docker
  • Method of cuDF install: Docker

    • I used NVTabular's docker container and it uses conda to install cudf.

    • sudo docker run --gpus=all --rm -it -v $(pwd):/ws -v /path/to/data/:/data -p 8888:8888 -p 8797:8787 -p 8796:8786 --ipc=host --cap-add SYS_PTRACE nvcr.io/nvidia/nvtabular:0.1 /bin/bash

Environment details


 ***git***
 Not inside a git repository

 ***OS Information***
 DISTRIB_ID=Ubuntu
 DISTRIB_RELEASE=18.04
 DISTRIB_CODENAME=bionic
 DISTRIB_DESCRIPTION="Ubuntu 18.04.4 LTS"
 NAME="Ubuntu"
 VERSION="18.04.4 LTS (Bionic Beaver)"
 ID=ubuntu
 ID_LIKE=debian
 PRETTY_NAME="Ubuntu 18.04.4 LTS"
 VERSION_ID="18.04"
 HOME_URL="https://www.ubuntu.com/"
 SUPPORT_URL="https://help.ubuntu.com/"
 BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
 PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
 VERSION_CODENAME=bionic
 UBUNTU_CODENAME=bionic
 Linux f15fcfd05d6b 4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

 ***GPU Information***
 Sat Jun 27 02:03:54 2020
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  TITAN X (Pascal)    On   | 00000000:06:00.0 Off |                  N/A |
 | 23%   32C    P8    16W / 250W |    122MiB / 12192MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   1  TITAN X (Pascal)    On   | 00000000:0A:00.0 Off |                  N/A |
 | 23%   25C    P8     8W / 250W |     12MiB / 12196MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+

 +-----------------------------------------------------------------------------+
 | Processes:                                                       GPU Memory |
 |  GPU       PID   Type   Process name                             Usage      |
 |=============================================================================|
 +-----------------------------------------------------------------------------+

 ***CPU***
 Architecture:        x86_64
 CPU op-mode(s):      32-bit, 64-bit
 Byte Order:          Little Endian
 CPU(s):              12
 On-line CPU(s) list: 0-11
 Thread(s) per core:  2
 Core(s) per socket:  6
 Socket(s):           1
 NUMA node(s):        1
 Vendor ID:           GenuineIntel
 CPU family:          6
 Model:               79
 Model name:          Intel(R) Core(TM) i7-6850K CPU @ 3.60GHz
 Stepping:            1
 CPU MHz:             1200.191
 CPU max MHz:         4000.0000
 CPU min MHz:         1200.0000
 BogoMIPS:            7195.06
 Virtualization:      VT-x
 L1d cache:           32K
 L1i cache:           32K
 L2 cache:            256K
 L3 cache:            15360K
 NUMA node0 CPU(s):   0-11
 Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear flush_l1d

 ***CMake***
 /conda/envs/rapids/bin/cmake
 cmake version 3.17.0

 CMake suite maintained and supported by Kitware (kitware.com/cmake).

 ***g++***
 /usr/bin/g++
 g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
 Copyright (C) 2017 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 ***nvcc***
 /usr/local/cuda/bin/nvcc
 nvcc: NVIDIA (R) Cuda compiler driver
 Copyright (c) 2005-2019 NVIDIA Corporation
 Built on Wed_Oct_23_19:24:38_PDT_2019
 Cuda compilation tools, release 10.2, V10.2.89

 ***Python***
 /conda/envs/rapids/bin/python
 Python 3.7.6

 ***Environment Variables***
 PATH                            : /conda/envs/rapids/bin:/conda/condabin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/conda/bin
 LD_LIBRARY_PATH                 : /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib
 NUMBAPRO_NVVM                   : /usr/local/cuda/nvvm/lib64/libnvvm.so
 NUMBAPRO_LIBDEVICE              : /usr/local/cuda/nvvm/libdevice/
 CONDA_PREFIX                    : /conda/envs/rapids
 PYTHON_PATH                     :

 ***conda packages***
 /conda/condabin/conda
 # packages in environment at /conda/envs/rapids:
 #
 # Name                    Version                   Build  Channel
 _libgcc_mutex             0.1                 conda_forge    conda-forge
 _openmp_mutex             4.5                      1_llvm    conda-forge
 absl-py                   0.9.0                    pypi_0    pypi
 aiohttp                   3.6.2                    pypi_0    pypi
 alabaster                 0.7.12                     py_0    conda-forge
 appdirs                   1.4.3                      py_1    conda-forge
 arrow-cpp                 0.15.0           py37h090bef1_2    conda-forge
 astor                     0.8.1                    pypi_0    pypi
 async-timeout             3.0.1                    pypi_0    pypi
 attrs                     19.3.0                     py_0    conda-forge
 babel                     2.8.0                      py_0    conda-forge
 backcall                  0.1.0                      py_0    conda-forge
 beautifulsoup4            4.9.0                    pypi_0    pypi
 black                     19.10b0                  py37_0    conda-forge
 blas                      1.0                         mkl
 bleach                    3.1.5              pyh9f0ad1d_0    conda-forge
 blis                      0.4.1                    pypi_0    pypi
 bokeh                     1.4.0                    pypi_0    pypi
 boost-cpp                 1.70.0               h8e57a91_2    conda-forge
 bottleneck                1.3.2                    pypi_0    pypi
 brotli                    1.0.7             he1b5a44_1001    conda-forge
 brotlipy                  0.7.0           py37h8f50634_1000    conda-forge
 bzip2                     1.0.8                h516909a_2    conda-forge
 c-ares                    1.15.0            h516909a_1001    conda-forge
 ca-certificates           2020.1.1                      0
 cachetools                4.1.0                    pypi_0    pypi
 catalogue                 1.0.0                    pypi_0    pypi
 certifi                   2020.4.5.1               py37_0
 cffi                      1.14.0           py37hd463f26_0    conda-forge
 cfgv                      3.1.0                      py_0    conda-forge
 chardet                   3.0.4           py37hc8dfbb8_1006    conda-forge
 clang                     8.0.1                hc9558a2_2    conda-forge
 clang-tools               8.0.1                hc9558a2_2    conda-forge
 clangxx                   8.0.1                         2    conda-forge
 click                     7.1.2              pyh9f0ad1d_0    conda-forge
 cloudpickle               1.4.1                      py_0    conda-forge
 cmake                     3.17.0               h28c56e5_0    conda-forge
 cmake_setuptools          0.1.3                      py_0    rapidsai
 commonmark                0.9.1                      py_0    conda-forge
 coverage                  5.1                      pypi_0    pypi
 cryptography              2.9.2            py37hb09aad4_0    conda-forge
 cudatoolkit               10.2.89              hfd86e86_1
 cudf                      0.14.0a0+4291.g6bae03d40          pypi_0    pypi
 cudnn                     7.6.5                cuda10.2_0
 cupy                      7.4.0            py37h940342b_2    conda-forge
 cycler                    0.10.0                   pypi_0    pypi
 cymem                     2.0.3                    pypi_0    pypi
 cython                    0.29.17          py37h3340039_0    conda-forge
 cytoolz                   0.10.1           py37h516909a_0    conda-forge
 dask                      2.16.0+8.g3573b2dd          pypi_0    pypi
 dask-cudf                 0.14.0a0+4291.g6bae03d40          pypi_0    pypi
 decorator                 4.4.2                      py_0    conda-forge
 defusedxml                0.6.0                      py_0    conda-forge
 distributed               2.16.0+2.gf899a994          pypi_0    pypi
 dlpack                    0.2                  he1b5a44_1    conda-forge
 docutils                  0.16             py37hc8dfbb8_1    conda-forge
 double-conversion         3.1.5                he1b5a44_2    conda-forge
 editdistance              0.5.3            py37h3340039_0    conda-forge
 entrypoints               0.3             py37hc8dfbb8_1001    conda-forge
 expat                     2.2.9                he1b5a44_2    conda-forge
 fastai                    1.0.61                   pypi_0    pypi
 fastavro                  0.23.3           py37h8f50634_0    conda-forge
 fastprogress              0.2.3                    pypi_0    pypi
 fastrlock                 0.4             py37h3340039_1001    conda-forge
 flake8                    3.8.1              pyh9f0ad1d_0    conda-forge
 flatbuffers               1.12.0               he1b5a44_0    conda-forge
 freetype                  2.10.2               he06d7ca_0    conda-forge
 fsspec                    0.7.3                      py_0    conda-forge
 future                    0.18.2           py37hc8dfbb8_1    conda-forge
 gast                      0.2.2                    pypi_0    pypi
 gflags                    2.2.2             he1b5a44_1002    conda-forge
 glog                      0.4.0                h49b9bf7_3    conda-forge
 gmp                       6.2.0                he1b5a44_2    conda-forge
 google-auth               1.14.3                   pypi_0    pypi
 google-auth-oauthlib      0.4.1                    pypi_0    pypi
 google-pasta              0.2.0                    pypi_0    pypi
 grpc-cpp                  1.23.0               h18db393_0    conda-forge
 grpcio                    1.24.3                   pypi_0    pypi
 h5py                      2.10.0                   pypi_0    pypi
 heapdict                  1.0.1                      py_0    conda-forge
 hypothesis                5.13.1                     py_0    conda-forge
 icu                       64.2                 he1b5a44_1    conda-forge
 identify                  1.4.15             pyh9f0ad1d_0    conda-forge
 idna                      2.9                        py_1    conda-forge
 imagesize                 1.2.0                      py_0    conda-forge
 importlib-metadata        1.6.0            py37hc8dfbb8_0    conda-forge
 importlib_metadata        1.6.0                         0    conda-forge
 intel-openmp              2020.1                      217
 ipykernel                 5.2.1            py37h43977f1_0    conda-forge
 ipython                   7.14.0           py37hc8dfbb8_0    conda-forge
 ipython_genutils          0.2.0                      py_1    conda-forge
 isort                     4.3.21           py37hc8dfbb8_1    conda-forge
 jedi                      0.17.0           py37hc8dfbb8_0    conda-forge
 jinja2                    2.11.2             pyh9f0ad1d_0    conda-forge
 jpeg                      9c                h14c3975_1001    conda-forge
 json5                     0.9.4                      py_0
 jsonschema                3.2.0            py37hc8dfbb8_1    conda-forge
 jupyter-server-proxy      1.4.0                    pypi_0    pypi
 jupyter_client            6.1.3                      py_0    conda-forge
 jupyter_core              4.6.3            py37hc8dfbb8_1    conda-forge
 jupyterlab                1.2.6              pyhf63ae98_0
 jupyterlab-nvdashboard    0.2.1                    pypi_0    pypi
 jupyterlab_server         1.1.1                      py_0
 keras-applications        1.0.8                    pypi_0    pypi
 keras-preprocessing       1.1.1                    pypi_0    pypi
 kiwisolver                1.2.0                    pypi_0    pypi
 krb5                      1.17.1               h2fd8d38_0    conda-forge
 ld_impl_linux-64          2.34                 h53a641e_0    conda-forge
 libcurl                   7.69.1               hf7181ac_0    conda-forge
 libedit                   3.1.20170329      hf8c457e_1001    conda-forge
 libevent                  2.1.10               h72c5cf5_0    conda-forge
 libffi                    3.2.1             he1b5a44_1007    conda-forge
 libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
 libgfortran-ng            7.3.0                hdf63c60_5    conda-forge
 libllvm8                  8.0.1                hc9558a2_0    conda-forge
 libopenblas               0.3.9                h5ec1e0e_0    conda-forge
 libpng                    1.6.37               hed695b0_1    conda-forge
 libprotobuf               3.8.0                h8b12597_0    conda-forge
 librmm                    0.14.0a200513      cuda10.2_340    rapidsai-nightly
 libsodium                 1.0.17               h516909a_0    conda-forge
 libssh2                   1.8.2                h22169c7_2    conda-forge
 libstdcxx-ng              9.2.0                hdf63c60_2    conda-forge
 libtiff                   4.1.0                hfc65ed5_0    conda-forge
 libuv                     1.34.0               h516909a_0    conda-forge
 llvm-openmp               10.0.0               hc9558a2_0    conda-forge
 llvmlite                  0.31.0           py37h5202443_1    conda-forge
 locket                    0.2.0                      py_2    conda-forge
 lz4-c                     1.8.3             he1b5a44_1001    conda-forge
 markdown                  3.2.2                    pypi_0    pypi
 markupsafe                1.1.1            py37h8f50634_1    conda-forge
 matplotlib                3.2.1                    pypi_0    pypi
 mccabe                    0.6.1                      py_1    conda-forge
 mistune                   0.8.4           py37h8f50634_1001    conda-forge
 mkl                       2020.1                      217
 mkl-service               2.3.0            py37he904b0f_0
 mkl_fft                   1.0.15           py37ha843d7b_0
 mkl_random                1.1.0            py37hd6b4f25_0
 more-itertools            8.2.0                      py_0    conda-forge
 msgpack-python            1.0.0            py37h99015e2_1    conda-forge
 multidict                 4.7.5                    pypi_0    pypi
 murmurhash                1.0.2                    pypi_0    pypi
 mypy_extensions           0.4.3            py37hc8dfbb8_1    conda-forge
 nbconvert                 5.6.1            py37hc8dfbb8_1    conda-forge
 nbformat                  5.0.6                      py_0    conda-forge
 nbsphinx                  0.7.0              pyh9f0ad1d_0    conda-forge
 nccl                      2.6.4.1              hc6a2c23_0    conda-forge
 ncurses                   6.1               hf484d3e_1002    conda-forge
 ninja                     1.9.0            py37hfd86e86_0
 nodeenv                   1.3.5                      py_0    conda-forge
 nodejs                    10.13.0              he6710b0_0
 notebook                  6.0.3            py37hc8dfbb8_0    conda-forge
 numba                     0.48.0           py37hb3f55d8_0    conda-forge
 numexpr                   2.7.1                    pypi_0    pypi
 numpy                     1.18.1           py37h4f9e942_0
 numpy-base                1.18.1           py37hde5b4d6_1
 numpydoc                  0.9.2                      py_0    conda-forge
 nvidia-ml-py3             7.352.0                  pypi_0    pypi
 nvstrings-cuda102         0.0.0.dev0               pypi_0    pypi
 nvtabular                 0.1.dev0                  dev_0    <develop>
 oauthlib                  3.1.0                    pypi_0    pypi
 olefile                   0.46                       py_0    conda-forge
 onnx                      1.7.0                    pypi_0    pypi
 openssl                   1.1.1g               h7b6447c_0
 opt-einsum                3.2.1                    pypi_0    pypi
 packaging                 20.1                       py_0    conda-forge
 pandas                    0.25.3           py37hb3f55d8_0    conda-forge
 pandoc                    1.19.2                        0    conda-forge
 pandocfilters             1.4.2                      py_1    conda-forge
 parquet-cpp               1.5.1                         2    conda-forge
 parso                     0.7.0              pyh9f0ad1d_0    conda-forge
 partd                     1.1.0                      py_0    conda-forge
 pathspec                  0.8.0              pyh9f0ad1d_0    conda-forge
 pexpect                   4.8.0            py37hc8dfbb8_1    conda-forge
 pickleshare               0.7.5           py37hc8dfbb8_1001    conda-forge
 pillow                    7.1.2            py37h718be6c_0    conda-forge
 pip                       20.1               pyh9f0ad1d_0    conda-forge
 plac                      1.1.3                    pypi_0    pypi
 pluggy                    0.13.1           py37hc8dfbb8_1    conda-forge
 pre-commit                2.4.0            py37hc8dfbb8_0    conda-forge
 pre_commit                2.4.0                         0    conda-forge
 preshed                   3.0.2                    pypi_0    pypi
 prometheus_client         0.7.1                      py_0    conda-forge
 prompt-toolkit            3.0.5                      py_0    conda-forge
 protobuf                  3.11.3                   pypi_0    pypi
 psutil                    5.7.0            py37h8f50634_1    conda-forge
 ptyprocess                0.6.0                   py_1001    conda-forge
 py                        1.8.1                      py_0    conda-forge
 pyarrow                   0.15.0           py37h8b68381_1    conda-forge
 pyasn1                    0.4.8                    pypi_0    pypi
 pyasn1-modules            0.2.8                    pypi_0    pypi
 pycodestyle               2.6.0              pyh9f0ad1d_0    conda-forge
 pycparser                 2.20                       py_0    conda-forge
 pyflakes                  2.2.0              pyh9f0ad1d_0    conda-forge
 pygments                  2.6.1                      py_0    conda-forge
 pynvml                    8.0.4                    pypi_0    pypi
 pyopenssl                 19.1.0                     py_1    conda-forge
 pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
 pyrsistent                0.16.0           py37h8f50634_0    conda-forge
 pysocks                   1.7.1            py37hc8dfbb8_1    conda-forge
 pytest                    5.4.2            py37hc8dfbb8_0    conda-forge
 pytest-cov                2.8.1                    pypi_0    pypi
 python                    3.7.6           h8356626_5_cpython    conda-forge
 python-dateutil           2.8.1                      py_0    conda-forge
 python_abi                3.7                     1_cp37m    conda-forge
 pytorch                   1.5.0           py3.7_cuda10.2.89_cudnn7.6.5_0    pytorch
 pytz                      2020.1             pyh9f0ad1d_0    conda-forge
 pyyaml                    5.3.1            py37h8f50634_0    conda-forge
 pyzmq                     19.0.1           py37hac76be4_0    conda-forge
 rapidjson                 1.1.0             he1b5a44_1002    conda-forge
 re2                       2020.05.01           he1b5a44_0    conda-forge
 readline                  8.0                  hf8c457e_0    conda-forge
 recommonmark              0.6.0                      py_0    conda-forge
 regex                     2020.5.13        py37h8f50634_0    conda-forge
 requests                  2.23.0             pyh8c360ce_2    conda-forge
 requests-oauthlib         1.3.0                    pypi_0    pypi
 rhash                     1.3.6             h14c3975_1001    conda-forge
 rmm                       0.14.0a200513          py37_314    rapidsai-nightly
 rsa                       4.0                      pypi_0    pypi
 scipy                     1.4.1                    pypi_0    pypi
 send2trash                1.5.0                      py_0    conda-forge
 setuptools                46.3.0           py37hc8dfbb8_0    conda-forge
 simpervisor               0.3                      pypi_0    pypi
 six                       1.14.0                     py_1    conda-forge
 snappy                    1.1.8                he1b5a44_1    conda-forge
 snowballstemmer           2.0.0                      py_0    conda-forge
 sortedcontainers          2.1.0                      py_0    conda-forge
 soupsieve                 2.0                      pypi_0    pypi
 spacy                     2.2.4                    pypi_0    pypi
 spdlog                    1.5.0                hc9558a2_0    conda-forge
 sphinx                    3.0.3                      py_0    conda-forge
 sphinx-markdown-tables    0.0.14                   pypi_0    pypi
 sphinx_rtd_theme          0.4.3                      py_0    conda-forge
 sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
 sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
 sphinxcontrib-htmlhelp    1.0.3                      py_0    conda-forge
 sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
 sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
 sphinxcontrib-serializinghtml 1.1.4                      py_0    conda-forge
 sphinxcontrib-websupport  1.2.2              pyh9f0ad1d_0    conda-forge
 sqlite                    3.30.1               hcee41ef_0    conda-forge
 srsly                     1.0.2                    pypi_0    pypi
 streamz                   0.5.3                    pypi_0    pypi
 tblib                     1.6.0                      py_0    conda-forge
 tensorboard               2.1.1                    pypi_0    pypi
 tensorflow-estimator      2.1.0                    pypi_0    pypi
 tensorflow-gpu            2.1.0                    pypi_0    pypi
 termcolor                 1.1.0                    pypi_0    pypi
 terminado                 0.8.3            py37hc8dfbb8_1    conda-forge
 testpath                  0.4.4                      py_0    conda-forge
 tfdlpack-gpu              0.1.3                    pypi_0    pypi
 thinc                     7.4.0                    pypi_0    pypi
 thrift-cpp                0.12.0            hf3afdfd_1004    conda-forge
 tk                        8.6.10               hed695b0_0    conda-forge
 toml                      0.10.0                     py_0    conda-forge
 toolz                     0.10.0                     py_0    conda-forge
 torchvision               0.6.0                py37_cu102    pytorch
 tornado                   6.0.4            py37h8f50634_1    conda-forge
 tqdm                      4.46.0                   pypi_0    pypi
 traitlets                 4.3.3            py37hc8dfbb8_1    conda-forge
 typed-ast                 1.4.1            py37h516909a_0    conda-forge
 typing_extensions         3.7.4.2                    py_0    conda-forge
 uriparser                 0.9.3                he1b5a44_1    conda-forge
 urllib3                   1.25.9                     py_0    conda-forge
 virtualenv                16.7.5                     py_0    conda-forge
 wasabi                    0.6.0                    pypi_0    pypi
 wcwidth                   0.1.9              pyh9f0ad1d_0    conda-forge
 webencodings              0.5.1                      py_1    conda-forge
 werkzeug                  1.0.1                    pypi_0    pypi
 wheel                     0.34.2                     py_1    conda-forge
 wrapt                     1.12.1                   pypi_0    pypi
 xz                        5.2.5                h516909a_0    conda-forge
 yaml                      0.2.4                h516909a_0    conda-forge
 yarl                      1.4.2                    pypi_0    pypi
 zeromq                    4.3.2                he1b5a44_2    conda-forge
 zict                      2.0.0                    pypi_0    pypi
 zipp                      3.1.0                      py_0    conda-forge
 zlib                      1.2.11            h516909a_1006    conda-forge
 zstd                      1.4.3                h3b9ef0a_0    conda-forge

Additional info:

  • Docker version: Docker version 19.03.8, build afacb8b7f0

Additional context

N/A.

bug cuIO

All 9 comments

I've tested on a P100 and an RTX 2060. The issue reproduces only on the P100, so it looks strongly related to the GPU generation.

From the location of the fault, it looks like the GPU fault is happening either in gpu::BuildChunkDictionaries() or gpu::InitEncoderPages(). Adding a stream synchronize between the two calls should narrow it down, though it's most likely the former (if it's a race condition, it may be more likely to show up on Pascal than on Volta).

Here's a cudf only repro:

import cudf
from cudf.io.parquet import ParquetWriter

cont_names = ["I" + str(x) for x in range(1, 14)]
cat_names = ["C" + str(x) for x in range(1, 27)]
cols = ["label"] + cont_names + cat_names

df = cudf.read_csv("day_0", sep="\t", names=cols, byte_range=(0, 840000000))
df = df.drop(columns=cont_names)

writer = ParquetWriter("/tmp/baremetal_out_0.parquet")
writer.write_table(df)

Also, I added a device synchronize after each call in init_page_fragments, gather_fragment_statistics, and build_chunk_dictionaries, and now it seems to break reliably after BuildChunkDictionaries.
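
To spell out why synchronizing after each call narrows things down: kernel launches are asynchronous, so a fault is only reported at a later synchronization point, and without intermediate synchronization it cannot be attributed to a particular launch. The following is a toy, CPU-only model of that reasoning. FakeDevice and first_failing_step are made-up names for illustration, not CUDA or cuDF APIs, and the faulty flag on build_chunk_dictionaries is assumed purely for the demo.

```python
class FakeDevice:
    """Toy stand-in for an asynchronous GPU stream (not a real CUDA API)."""

    def __init__(self):
        self._pending_error = None

    def launch(self, name, faulty=False):
        # A launch returns immediately; a fault is recorded, not raised.
        if faulty and self._pending_error is None:
            self._pending_error = f"illegal memory access in {name}"

    def synchronize(self):
        # Errors from earlier launches surface at the next synchronization.
        if self._pending_error is not None:
            error, self._pending_error = self._pending_error, None
            raise RuntimeError(error)


def first_failing_step(device, steps):
    """Synchronize after every launch; return the first step that faults."""
    for name, faulty in steps:
        device.launch(name, faulty=faulty)
        try:
            device.synchronize()
        except RuntimeError:
            return name
    return None


# Step names are taken from the thread; the fault location is an assumption.
STEPS = [
    ("init_page_fragments", False),
    ("gather_fragment_statistics", False),
    ("build_chunk_dictionaries", True),
    ("init_encoder_pages", False),
]
```

Calling first_failing_step(FakeDevice(), STEPS) returns "build_chunk_dictionaries" in this model, whereas launching all four steps and synchronizing once at the end would only show that something failed somewhere before that point.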

I've been able to fix this, and I have an idea why the fix works, but I'm finding it hard to confirm.

The fix is adding a __syncthreads() after this https://github.com/rapidsai/cudf/blob/4c9efa37a88b3beea9bbb9d459507200ba1e44d8/cpp/src/io/parquet/page_enc.cu#L185-L194

And I think this was broken because the value in scratch_red used in nz_pos += s->scratch_red[(t - 32) >> 5] is written by the previous warp which may reach https://github.com/rapidsai/cudf/blob/4c9efa37a88b3beea9bbb9d459507200ba1e44d8/cpp/src/io/parquet/page_enc.cu#L169-L172 in the next iteration of the loop before this one does.

I just don't know how to test this.

Yup, that makes sense, there should be a __syncthreads() before the next iteration of the loop. There was a similar issue in deflate, with a race condition involving two points quite far apart. For some reason, the error tends to show up on Maxwell/Pascal but much less often on Turing/Volta (presumably longer memory latencies).
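
As a rough CPU analogy of the pattern being discussed (this is not the cuDF kernel): Python threads stand in for warps, a list stands in for the shared scratch_red array, and threading.Barrier stands in for __syncthreads(). All names below are made up for illustration. Each "warp" publishes a count, waits, reads the counts of the lower-numbered warps, and then waits again before the next loop iteration. The second wait corresponds to the proposed fix: without it, a fast warp could overwrite its scratch slot for the next iteration before a slower warp has read the current value.

```python
import threading

NUM_WARPS = 4


def prefix_offsets(counts):
    """counts[it][w] is the count warp w publishes in iteration it.

    Returns offsets[it][w], the sum of the counts of warps below w.
    """
    num_iters = len(counts)
    scratch = [0] * NUM_WARPS                 # stands in for shared memory
    barrier = threading.Barrier(NUM_WARPS)    # stands in for __syncthreads()
    offsets = [[0] * NUM_WARPS for _ in range(num_iters)]

    def warp(w):
        for it in range(num_iters):
            scratch[w] = counts[it][w]         # publish this warp's count
            barrier.wait()                     # all counts written before reads
            offsets[it][w] = sum(scratch[:w])  # read lower warps' counts
            barrier.wait()                     # all reads done before next write

    threads = [threading.Thread(target=warp, args=(w,)) for w in range(NUM_WARPS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return offsets
```

With both barriers in place the result is deterministic. Removing the second barrier.wait() makes the result depend on thread scheduling, which is the same kind of timing sensitivity that could explain the bug showing up on some GPU generations and not others.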

> I just don't know how to test this.

If you were able to build a minimal reproducer that triggers sometimes on Pascal / Maxwell then it's worth adding it as a test that someone can run on a loop for example.

If we don't have a minimal reproducer I'd say we just check in a fix without a test and move on.

> If you were able to build a minimal reproducer that triggers sometimes on Pascal / Maxwell then it's worth adding it as a test that someone can run on a loop for example.

I have a minimal reproducer that depends upon this 47 GB file (well, actually only the first 800 MB is enough to trigger this). Plus, tests are supposed to not take too long to run. This will take as long to run as a typical benchmark does.

> I just don't know how to test this.

I mean that I don't have concrete evidence that the fix I suggest will definitely fix this bug, because I couldn't print dict_index data and compare it with and without __syncthreads(). There's too much data to print. I just have a strong suspicion and I do believe that there should be a __syncthreads() between the two code blocks regardless of this bug.

Is there anything sensitive / non-public in the data? One of the things we've done from the Python side at least is have tests that only run if the file is found and skip otherwise. We could take that approach for this file along with adding a comment with a link to download the data, so that if someone wants / needs to reproduce it down the line they can.
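
A minimal sketch of that skip-if-missing approach, reusing the repro from earlier in this thread. It uses the standard-library unittest module only to keep the sketch self-contained; the class name, the CUDF_TEST_CRITEO_DAY_0 environment variable, and the output file name are assumptions for illustration, not existing cuDF test conventions.

```python
import os
import tempfile
import unittest

# Hypothetical: the thread does not say where the data file would be stored.
DATA_PATH = os.environ.get("CUDF_TEST_CRITEO_DAY_0", "day_0")


class ParquetWriterLargeInputTest(unittest.TestCase):
    @unittest.skipUnless(
        os.path.exists(DATA_PATH),
        f"{DATA_PATH} not found; download the day_0 file to run this test",
    )
    def test_write_table_does_not_fault(self):
        # Imported lazily so the module still loads on machines without cudf.
        import cudf
        from cudf.io.parquet import ParquetWriter

        cont_names = ["I" + str(x) for x in range(1, 14)]
        cat_names = ["C" + str(x) for x in range(1, 27)]
        cols = ["label"] + cont_names + cat_names

        # Only the first ~800 MB of the file is needed to trigger the fault.
        df = cudf.read_csv(
            DATA_PATH, sep="\t", names=cols, byte_range=(0, 840000000)
        )
        df = df.drop(columns=cont_names)

        out_path = os.path.join(tempfile.gettempdir(), "pq_writer_repro.parquet")
        writer = ParquetWriter(out_path)
        writer.write_table(df)  # raised cudaErrorIllegalAddress before the fix
```

On machines without the file the test is reported as skipped rather than failed, and the skip message tells whoever needs to reproduce the bug which file to fetch.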
