Is your feature request related to a problem? Please describe.
A gather or scatter operation take a "map" of integers that map input elements to output elements between two columns (or tables). In Python, these integers can potentially be negative, where a negative value is interpreted as an offset from the _end_ of the range, e.g., -2 would map to n - 2.
However, libcudf's gather/scatter APIs require the map to contain values that are all within the range [0, n), i.e., no negative values. As such, this requires running a separate kernel to first "normalize" the map from Python into the correct map for libcudf. This is expensive as it requires a separate kernel invocation as well as an intermediate memory allocation.
Describe the solution you'd like
Doing this "normalization" step would be much more efficient if it was done during the execution of the gather kernel. This can be done fairly easily with a thrust::transform_iterator.
However, the caveat is that this will require adding an additional code path to the gather implementation. We currently rely on negative indices being ignored for the join implementation. We'll need the option to select between whether negative indices should be ignored or they should be normalized.
Additional Context
We should also update the gather_map and scatter_map to be a non-nullable gdf_column that can be a int32 or int64 to prevent having to cast an array of int64 elements to int32 (since int64 is most common in Python).
cc-ing @rjzamora (in case this is of interest to you as well 馃槈)
@jrhemstad If we're allowing both int32 and int64, would it be possible to allow int8 and int16 as well? Users can and will pass those as well.
@jrhemstad If we're allowing both int32 and int64, would it be possible to allow int8 and int16 as well? Users can and will pass those as well.
Yeah, I guess there's no reason we can't support any integral type, so long as compile time doesn't blow up.
I think it's okay for floats to require a typecast (yes Python users can / will pass floats into gather calls as well), but otherwise would be great to not require a typecast.
yes Python users can / will pass floats into gather calls as well
馃槧
FWIW the interest in this issue from my perspective would be speeding up .iloc, which is relevant for the cuML Grid Search use case. 馃槈
cc @JohnZed @mrocklin
@kkraus14 it looks like you've added the Needs Prioritizing label on this a few weeks ago.
Does anyone have a sense for when this might be done? It seems like this is blocking progress in GridSearch + cuML work. No pressure, (well, a little pressure) we just need to know in order to figure out resourcing.
@kkraus14 it looks like you've added the Needs Prioritizing label on this a few weeks ago.
Does anyone have a sense for when this might be done? It seems like this is blocking progress in GridSearch + cuML work. No pressure, (well, a little pressure) we just need to know in order to figure out resourcing.
The label is out of date 馃槄, this is actively being investigated and worked on. Will update accordingly.
Thanks Keith!
Most helpful comment
Yeah, I guess there's no reason we can't support any integral type, so long as compile time doesn't blow up.