I'm seeing a significant performance regression in Kokkos::DualView with LAMMPS, caused by a lot of new device-to-host (DtoH) data movement. I bisected the issue to #3326, and @crtrott found that the label is being pulled down on every DualView sync and modify event. Can the label be pulled down from GPU to CPU only when profiling is enabled?
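To make the request concrete, here is a rough sketch of the guard I have in mind. It is not Kokkos code: `MockDeviceView`, `MockProfiler`, `sync_unguarded`, and `sync_guarded` are hypothetical stand-ins, and the copy counter only models the cost of reading a device-resident label. The actual fix may look quite different.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a device-resident view; NOT the Kokkos API.
struct MockDeviceView {
  std::string device_label;
  mutable std::size_t dtoh_copies = 0;

  // Models a label read that costs one device-to-host copy.
  std::string fetch_label() const {
    ++dtoh_copies;
    return device_label;
  }
};

// Hypothetical profiling hook; tool_loaded models "profiling is enabled".
struct MockProfiler {
  bool tool_loaded = false;
  std::size_t events = 0;

  void on_sync(const std::string& /*label*/) { ++events; }
};

// Regressed pattern: the label is fetched on every sync, even when no
// profiling tool would consume it.
void sync_unguarded(const MockDeviceView& view, MockProfiler& profiler) {
  const std::string label = view.fetch_label();
  if (profiler.tool_loaded) profiler.on_sync(label);
  // ... the actual data sync would happen here ...
}

// Requested pattern: the label is fetched only when a tool is loaded.
void sync_guarded(const MockDeviceView& view, MockProfiler& profiler) {
  if (profiler.tool_loaded) profiler.on_sync(view.fetch_label());
  // ... the actual data sync would happen here ...
}
```

With profiling off, calling `sync_unguarded` N times costs N modeled label copies, while `sync_guarded` costs none; with profiling on, both report the same events.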
We are trying to get a fix done today.
Great, thanks!
@stanmoore1, yikes, sorry about that. Just pushed #3695; we're pushing to merge it into everything, starting with the release.
I confirm that #3701 fixes the DtoH data movement issue, thanks!