Kokkos: CUDA: Race condition in parallel_scan

Created on 27 Jan 2020  路  3Comments  路  Source: kokkos/kokkos

There was a bug in the parallel_scan implementation where the correct inter-block fence (__threadfence()) wasn't called. This would manifest as a wrong result of the scan.

InDevelop bug

Most helpful comment

@stanmoore1 found the reproducer which made it possible to find this thing ...

All 3 comments

PRs #2666 and #2668

@stanmoore1 found the reproducer which made it possible to find this thing ...

Closing this issue as this is now in master

Was this page helpful?
0 / 5 - 0 ratings