Kokkos: cuda.view_64bit test hangs on Power8+Kepler37 system - develop and 2.9.00 branches

Created on 18 Feb 2020 · 4 Comments · Source: kokkos/kokkos

This was detected on the White testbed (Power8+Kepler37) when enabling the large_memory_tests in nightlies.

Reproducer instructions (UVM enabled):

module load cuda/9.2.88 cmake/3.12.3 gcc/7.2.0

export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1

<KOKKOS_PATH>/generate_makefile.bash --arch=Kepler37,Power8 --compiler=${KOKKOS_PATH}/bin/nvcc_wrapper --with-cuda-options="force_uvm,enable_lambda" --with-options=enable_large_mem_tests --with-cuda --kokkos-path=<KOKKOS_PATH>
Labels: InDevelop, bug

All 4 comments

Reproduced. Investigating.

Found it. This is a general bug, not specific to that system :-(. View initialization only ever used the default index type for its execution policy, which means 32-bit indices internally. So when you hand it a size_t holding a legitimate value larger than 2B, you are in trouble. A fix is coming.
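To make the failure mode concrete, here is a minimal sketch in plain C++ (no Kokkos required). The helpers `fits_in_index` and `low_32_bits` are hypothetical names made up for this illustration and are not part of Kokkos; they only show why an extent above 2^31 - 1 cannot survive a trip through a 32-bit index type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: can an extent of n elements be represented by IndexT?
template <class IndexT>
constexpr bool fits_in_index(std::uint64_t n) {
  return n <= static_cast<std::uint64_t>(std::numeric_limits<IndexT>::max());
}

// Hypothetical helper: what remains of n if only 32 bits are kept.
// Conversion to an unsigned type is modular, so this is well defined.
constexpr std::uint32_t low_32_bits(std::uint64_t n) {
  return static_cast<std::uint32_t>(n);
}

// An extent just above 2^31, as a large view_64bit-style test might request.
constexpr std::uint64_t kLargeExtent = (std::uint64_t{1} << 31) + 1000;

static_assert(!fits_in_index<std::int32_t>(kLargeExtent),
              "does not fit a signed 32-bit index");
static_assert(fits_in_index<std::int64_t>(kLargeExtent),
              "fits a 64-bit index");
static_assert(low_32_bits((std::uint64_t{1} << 32) + 5) == 5u,
              "only the low 32 bits survive");
```

In user code, Kokkos lets a policy request a wider index through the `Kokkos::IndexType<int64_t>` template argument of `Kokkos::RangePolicy`. Whether the fix mentioned above takes exactly that form is an assumption here, not something this thread states.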

Note that this is not about a memory size larger than 2GB, but about more than 2B elements (i.e. typically 8GB or 16GB allocations). The only realistic way to get in trouble on Summit or Sierra is probably by allocating a View of char. But on the 32GB Volta cards inside DGX boxes it is not hard to imagine a big graph with more than 2B elements being allocated, e.g. the list of all edges as an Nx2 array with N>1B.
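The arithmetic behind those numbers can be checked with a short sketch. All names below (`bytes_needed`, `exceeds_32bit_index`, `kEdges`) are hypothetical, and the edge count and 4-byte entry size are assumed values chosen only to illustrate the N>1B case.

```cpp
#include <cstdint>

constexpr std::uint64_t kInt32Max = 2147483647ULL;  // 2^31 - 1
constexpr std::uint64_t kGiB = 1ULL << 30;

// Bytes required for `elements` entries of `elem_size` bytes each.
constexpr std::uint64_t bytes_needed(std::uint64_t elements,
                                     std::uint64_t elem_size) {
  return elements * elem_size;
}

// True if the element count cannot be held in a signed 32-bit index.
constexpr bool exceeds_32bit_index(std::uint64_t elements) {
  return elements > kInt32Max;
}

// 2^31 elements: 2 GiB of char, 8 GiB of 4-byte values, 16 GiB of double.
static_assert(bytes_needed(1ULL << 31, 1) == 2 * kGiB, "char");
static_assert(bytes_needed(1ULL << 31, 4) == 8 * kGiB, "4-byte type");
static_assert(bytes_needed(1ULL << 31, 8) == 16 * kGiB, "double");

// Assumed example: an N x 2 edge list with N = 1.1e9 and 4-byte entries.
constexpr std::uint64_t kEdges = 1100000000ULL;
static_assert(exceeds_32bit_index(kEdges * 2), "more than 2^31 - 1 elements");
static_assert(bytes_needed(kEdges * 2, 4) < 32 * kGiB, "still fits in 32 GiB");
```

So a View can overflow a 32-bit index while still fitting comfortably in device memory on a 32GB card.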

@crtrott the fix from #2819 did not resolve the hang+timeout in view_64bit tests on White.
This still occurs with cuda/9.2.88+gcc/7.2.0 builds with UVM enabled.
