Is your feature request related to a problem? Please describe.
It's useful for testing to be able to perform the equivalent of numpy.isclose on instances of cudf.DataFrame and cudf.Series.
Describe the solution you'd like
This would be a nice thing to have:
>>> import cudf
>>> s1 = cudf.Series([1.9876543, 2.9876654, 3.9876543])
>>> s2 = cudf.Series([1.987654321, 2.987654321, 3.987654321])
>>> rel_tol=1e-5
>>> abs_tol=0.0
>>> s2.isclose(s1, rel_tol, abs_tol)
0    True
1    True
2    True
dtype: bool
>>>
Describe alternatives you've considered
Here's my current hand-rolled solution:
>>> import cudf
>>> s1 = cudf.Series([1.9876543, 2.9876654, 3.9876543])
>>> s2 = cudf.Series([1.987654321, 2.987654321, 3.987654321])
>>> rel_tol=1e-5
>>> abs_tol=0.0
>>> s2.abs().mul(rel_tol).add(abs_tol).sub(s1.sub(s2).abs()).gt(0)
0    True
1    True
2    True
dtype: bool
>>>
There's nothing wrong with using this approach everywhere; I just figured it'd be more convenient to have isclose as a built-in method.
I'm happy to help in whatever way I can.
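For reference, here is a minimal sketch of how the hand-rolled expression above lines up with numpy.isclose. It uses plain NumPy arrays rather than cudf.Series so it runs without a GPU. The variable names hand_rolled and builtin are illustrative only. numpy.isclose tests |a - b| <= atol + rtol * |b|, so s2 has to be passed as the second argument to match the expression above, which scales by s2.abs().

```python
import numpy as np

s1 = np.array([1.9876543, 2.9876654, 3.9876543])
s2 = np.array([1.987654321, 2.987654321, 3.987654321])
rel_tol = 1e-5
abs_tol = 0.0

# The hand-rolled check from above, written with NumPy operators:
# |s2| * rel_tol + abs_tol - |s1 - s2| > 0
hand_rolled = (np.abs(s2) * rel_tol + abs_tol - np.abs(s1 - s2)) > 0

# numpy.isclose(a, b) tests |a - b| <= atol + rtol * |b|,
# so s2 goes in the second slot to act as the reference value.
builtin = np.isclose(s1, s2, rtol=rel_tol, atol=abs_tol)

print(hand_rolled)
print(builtin)
```

The two agree on this data. They differ only in the boundary case where the difference equals the tolerance exactly, because the hand-rolled version uses a strict comparison and numpy.isclose uses <=.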
Could you use cupy.isclose for this? I believe with pandas you would still use np.isclose.
@paul-tqh-nguyen going from cudf to cupy is zero-copy, so this wouldn't cause any performance degradation. Could you give that a shot?
Using cupy.isclose solves all of my problems!
My apologies; I should've looked there earlier.
Thanks for the quick and helpful responses!
One note: this won't handle null values, so I'm going to reopen this issue to provide some syntactic sugar in cuDF around the cupy function in the future.
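To make the null-handling concern concrete, here is a rough sketch of what a null-aware wrapper could look like. It is not the cuDF API: isclose_with_nulls is a hypothetical helper, it uses NumPy so it runs without a GPU, and it assumes missing entries are tracked in a separate boolean validity mask. Propagating nulls (the result is null wherever either input is null) is also an assumption; the thread doesn't say what the behavior should be.

```python
import numpy as np


def isclose_with_nulls(a, a_valid, b, b_valid, rtol=1e-5, atol=1e-8):
    """Hypothetical helper, not part of cuDF or cupy.

    a, b             : float arrays holding the data (null slots hold filler)
    a_valid, b_valid : boolean arrays, True where the entry is not null
    Returns (result, valid): result is meaningful only where valid is True.
    """
    # An output entry is valid only if both inputs are valid there.
    valid = a_valid & b_valid
    # Compare everything, then force null slots to False so filler
    # values can never show up as a spurious "close" result.
    result = np.isclose(a, b, rtol=rtol, atol=atol) & valid
    return result, valid


a = np.array([1.0, 2.0, 0.0])
a_valid = np.array([True, True, False])  # third entry of a is null
b = np.array([1.0, 2.5, 0.0])
b_valid = np.array([True, True, True])

result, valid = isclose_with_nulls(a, a_valid, b, b_valid)
print(result)  # comparison result, False in null slots
print(valid)   # which result entries are non-null
```

Since cupy.isclose was suggested above as the GPU counterpart of np.isclose, the same masking approach should carry over, but that has not been tried here.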