When I cast with astype on an Index or a Series,
I noticed that their behavior for bool casting is slightly different.
>>> pd.Series([1, None]).astype(bool)
0 True
1 True
dtype: bool
>>> pd.Index([1, None]).astype(bool)
Index([True, False], dtype='object')
As shown above, None is cast to True by Series.astype, but to False by Index.astype.
Is this expected for some reason?
I used pandas 1.1.4.
Thanks :)
@itholic In the case of Series you end up with a float dtype where None is converted to np.nan. Since bool(np.nan) is True I'd say this is expected, if a bit strange:
[ins] In [12]: pd.Series([1, None])
Out[12]:
0 1.0
1 NaN
dtype: float64
[ins] In [13]: bool(np.nan)
Out[13]: True
The main reason is that there is no BoolIndex, so when you try to cast to bool it casts to object instead.
Oh, I got it.
Then I'll handle this case by setting the dtype to object for now.
>>> pd.Series([1, None], dtype=object).astype(bool)
0 True
1 False
dtype: bool
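A possible alternative that keeps the default float64 inference is to mask out the missing values explicitly before trusting the cast. This is only a sketch, not something proposed in this thread; the names `s` and `as_bool` are made up for illustration.

```python
import pandas as pd

# Default inference: [1, None] becomes float64, so None is stored as NaN.
s = pd.Series([1, None])

# astype(bool) alone would treat NaN as True, so combine it with notna():
# the result is True only where a value is present and nonzero.
as_bool = s.notna() & s.astype(bool)

print(as_bool.tolist())  # [True, False]
```

Unlike dtype=object, this does not change how the values are stored; it only changes how missing entries are treated at cast time.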
Thanks for the comment, @dsaxton @jbrockmendel !! :D
The problem here is not that we don't have a boolean index; it is that the Series and Index constructors behave differently, as @dsaxton notes.
For Index, we keep object dtype and don't coerce to float, so the missing value stays None instead of becoming NaN, and None and NaN cast to bool differently (False and True, respectively).
I am not sure there is anything we can do about this except align the constructors (if the nullable integer dtypes become the default at some point, that could also solve it, because we could then infer that dtype instead of object or float).
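The None vs NaN difference can be reproduced with plain NumPy arrays, without going through either constructor. In this sketch the two arrays are stand-ins (an assumption for illustration) for the values the thread describes: float64 with NaN for the Series, and object dtype with None for the Index.

```python
import numpy as np

# Stand-in for what the Series holds once [1, None] is coerced to float64.
float_values = np.array([1.0, np.nan])
# Stand-in for what an object-dtype Index keeps for [1, None].
object_values = np.array([1, None], dtype=object)

# NaN is a nonzero float, so it casts to True.
float_as_bool = float_values.astype(bool)
# The object cast applies Python truthiness, and bool(None) is False.
object_as_bool = object_values.astype(bool)

print(float_as_bool.tolist())   # [True, True]
print(object_as_bool.tolist())  # [True, False]
```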