In [4]: print(cudf.Series([1,2,3,4,5]) == cudf.Series([None, None, None, None, None]))
0
1
2
3
4
dtype: bool
In [5]: pd.Series([1,2,3,4,5]) == pd.Series([None, None, None, None, None])
Out[5]:
0 False
1 False
2 False
3 False
4 False
dtype: bool
@harrism We can easily handle this at the Python side if you think it's appropriate to do so.
One more example.
>>> pd.Series([None, None, None, None, None]) == pd.Series([None, None, None, None, None])
0 False
1 False
2 False
3 False
4 False
>>> pd.Series([None, None, None, None, None]) != pd.Series([None, None, None, None, None])
0 True
1 True
2 True
3 True
4 True
Basically, the result should always be True for __ne__ and always False for __eq__ and every other comparison operator, regardless of what None is compared against.
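To make that rule concrete, here is a minimal pure-Python sketch. It uses neither pandas nor cudf, the helper name `pandas_style_compare` is hypothetical, and it assumes None stands in for a null element. It only illustrates the semantics described in this comment, not how either library implements them.

```python
import operator


def pandas_style_compare(op, left, right):
    """Element-wise comparison with the null rule described above.

    Hypothetical helper: if either operand is None, the result is True
    only for `operator.ne` and False for every other comparison operator.
    """
    result = []
    for a, b in zip(left, right):
        if a is None or b is None:
            # A null operand short-circuits: only "not equal" is True.
            result.append(op is operator.ne)
        else:
            result.append(op(a, b))
    return result


values = [1, 2, 3, 4, 5]
nulls = [None] * 5

print(pandas_style_compare(operator.eq, values, nulls))  # all False
print(pandas_style_compare(operator.ne, nulls, nulls))   # all True
print(pandas_style_compare(operator.lt, values, nulls))  # all False
```

Note that under this rule even None == None is False, which matches the second pandas example above.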
@kkraus14 is this urgent for 0.9? Assigning @devavret but waiting to hear about urgency before putting it on the 0.9 board.
This is one of the places where SQL and pandas differ.
SELECT foo = NULL from bar
will return one NULL for each row in bar. I personally would prefer to see this fixed on the Python side. We could also have separate operations or separate options for NULL handling in binary ops.
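For contrast, the SQL behavior can be sketched in plain Python as null propagation: any null operand makes the result null rather than True or False. The helper name `sql_style_compare` is hypothetical and None is assumed to represent NULL; this is an illustration of the semantics, not a cudf or libcudf API.

```python
import operator


def sql_style_compare(op, left, right):
    """Hypothetical helper: any null (None) operand yields a null result."""
    return [
        None if a is None or b is None else op(a, b)
        for a, b in zip(left, right)
    ]


foo = [1, 2, 3, 4, 5]

# Analogue of `SELECT foo = NULL from bar`: one null per row.
print(sql_style_compare(operator.eq, foo, [None] * len(foo)))

# Non-null operands compare as usual.
print(sql_style_compare(operator.eq, foo, [1, 0, 3, 0, 5]))
```

A separate option for NULL handling, as suggested above, would amount to choosing between this propagating behavior and the fixed True/False results that pandas returns.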
@revans2 Thanks for the feedback! @harrism sounds like it makes sense to handle this on the Python side instead.
Closing this because as of Pandas 1.0 we should be propagating nulls as expected.