Cudf: [BUG] Comparison based binaryops return nulls instead of True/False in comparing with nulls

Created on 9 Jul 2019 · 6 Comments · Source: rapidsai/cudf

In [4]: print(cudf.Series([1,2,3,4,5]) == cudf.Series([None, None, None, None, None]))                                                                        
0     
1     
2     
3     
4     
dtype: bool

In [5]: pd.Series([1,2,3,4,5]) == pd.Series([None, None, None, None, None])                                                                                   
Out[5]: 
0    False
1    False
2    False
3    False
4    False
dtype: bool
Labels: bug, cuDF (Python)

All 6 comments

@harrism We can easily handle this at the Python side if you think it's appropriate to do so.

One more example.

>>> pd.Series([None, None, None, None, None]) == pd.Series([None, None, None, None, None])
0    False
1    False
2    False
3    False
4    False
>>> pd.Series([None, None, None, None, None]) != pd.Series([None, None, None, None, None])
0    True
1    True
2    True
3    True
4    True

Basically, the result should always be True for __ne__ and always False for __eq__ and every other comparison operator, regardless of what None is compared against.
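
To make the rule above concrete, here is a minimal pure-Python sketch that works on plain lists rather than on cuDF or pandas objects. The helper name pandas_style_compare is hypothetical and is not part of either library; it only models the behaviour described in this comment.

```python
import operator

def pandas_style_compare(op, left, right):
    """Elementwise comparison where a None operand yields a fixed boolean.

    Hypothetical helper: models the rule from the comment above,
    not the actual pandas or cuDF implementation.
    """
    # True only for !=, False for ==, <, <=, >, >=
    fill = op is operator.ne
    return [
        fill if (a is None or b is None) else op(a, b)
        for a, b in zip(left, right)
    ]

eq_result = pandas_style_compare(operator.eq, [1, 2, 3], [1, None, 4])
ne_result = pandas_style_compare(operator.ne, [None, 2, 3], [None, 2, 4])
lt_result = pandas_style_compare(operator.lt, [1, None, 3], [2, 5, None])

print(eq_result)  # [True, False, False]
print(ne_result)  # [True, False, True]
print(lt_result)  # [True, False, False]
```

Note that None compared against None follows the same rule: __eq__ gives False and __ne__ gives True, matching the pandas output shown above.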

@kkraus14 Is this urgent for 0.9? I'm assigning @devavret, but waiting to hear about urgency before putting it on the 0.9 board.

This is one of the places where SQL and pandas differ.

SELECT foo = NULL FROM bar

will return one NULL for each row in bar. I personally would prefer to see this fixed on the Python side. We could also have separate operations or separate options for NULL handling in binary ops.
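
As a rough illustration of the split suggested here, the sketch below keeps SQL-style null propagation in the comparison itself and applies the pandas-style fill as a separate step afterwards. It uses plain Python lists, and both function names (sql_style_compare, fill_nulls) are hypothetical stand-ins, not cuDF or libcudf APIs.

```python
import operator

def sql_style_compare(op, left, right):
    """Comparison that propagates nulls: any None operand gives None."""
    return [
        None if (a is None or b is None) else op(a, b)
        for a, b in zip(left, right)
    ]

def fill_nulls(result, fill_value):
    """Post-processing step: replace propagated nulls with a boolean."""
    return [fill_value if v is None else v for v in result]

# The low-level comparison keeps SQL semantics ...
propagated = sql_style_compare(operator.eq, [1, 2, 3], [1, None, 3])
print(propagated)  # [True, None, True]

# ... and a wrapper layer can opt in to pandas-style results.
pandas_like = fill_nulls(propagated, False)
print(pandas_like)  # [True, False, True]
```

Keeping the fill as a separate step means callers who want SQL semantics can use the propagated result as-is, while a pandas-compatible layer chooses the fill value per operator (True for !=, False otherwise).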

@revans2 Thanks for the feedback! @harrism sounds like it makes sense to handle this on the Python side instead.

Closing this because, as of pandas 1.0, we should be propagating nulls as expected.
