When a DataArray is masked with .where(), its dtype is converted to float64.
However, if we want to use the DataArray output from .where() as an indexer in .isel(), the dtype must be an integer type.
(#3949)
import numpy as np
import xarray as xr
val_arr = xr.DataArray(np.arange(27).reshape(3, 3, 3),
                       dims=['z', 'y', 'x'])
z_indices = xr.DataArray(np.array([[1, 0, 2],
                                   [0, 0, 1],
                                   [-2222, 0, 1]]),
                         dims=['y', 'x'])
fill_value = -2222
sub = z_indices.where(z_indices != fill_value)
indexed_array = val_arr.isel(z=sub)
array([[ 1, 0, 2],
[ 0, 0, 1],
[nan, 0, 1]])
File "E:\miniconda3\envs\satpy\lib\site-packages\xarray\core\indexing.py", line 446, in __init__
f"invalid indexer array, does not have integer dtype: {k!r}"
TypeError: invalid indexer array, does not have integer dtype: array([[ 1., 0., 2.],
[ 0., 0., 1.],
[nan, 0., 1.]])
pandas currently supports missing values in integer data (its nullable integer dtypes). Is something similar possible in xarray, or is there a workaround?
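For reference, the pandas behavior mentioned here can be seen in a minimal sketch like the following. It assumes a pandas version that provides the nullable `Int64` extension dtype; the variable name is illustrative only.

```python
import pandas as pd

# Nullable integer array: the missing entry is stored as pd.NA,
# and the remaining values stay integers instead of becoming floats.
z = pd.array([1, 0, None], dtype="Int64")

print(z.dtype)   # the nullable integer dtype
print(z.isna())  # boolean mask marking the missing entry
```

Note that this is a pandas extension array, not a NumPy dtype, so it cannot simply be dropped into the xarray example above.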
There has been a lot of discussion about the int vs. NaN problem in the past; see, for example, issue #1194. I would also like to ask the xarray devs whether there are any plans to adopt the pandas scheme.
In the meantime, you might go the other way round (isel before where) using this little hack:
# overwrite fill_values with 0
sub = xr.where(z_indices == fill_value, 0, z_indices)
# isel with sub and mask with where
indexed_array = val_arr.isel(z=sub).where(z_indices != fill_value)
Update: Never mind, this will make indexed_array a float. You can use the same where machinery and overwrite the masked positions with a fill_value of your liking:
# overwrite fill_values with 0
sub = xr.where(z_indices == fill_value, 0, z_indices)
# isel with sub and mask with where
indexed_array = val_arr.isel(z=sub)
indexed_array = xr.where(z_indices == fill_value, fill_value, indexed_array)
I can't immediately see one, but there might be a cleaner way to achieve this.
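The workaround above can be wrapped into a small self-contained helper. This is only a sketch under the assumptions of this thread: the indexed dimension is named `z`, the sentinel is an integer such as -2222, and the function name `isel_z_with_fill` is made up for illustration and is not part of xarray.

```python
import numpy as np
import xarray as xr


def isel_z_with_fill(values, z_indices, fill_value):
    """Pick values along 'z' using z_indices while keeping an integer dtype.

    Hypothetical helper: positions where z_indices == fill_value are
    indexed with a dummy 0 and then overwritten with fill_value.
    """
    is_fill = z_indices == fill_value
    # Replace the sentinel with a valid dummy index so isel() accepts it.
    safe_idx = xr.where(is_fill, 0, z_indices)
    picked = values.isel(z=safe_idx)
    # Put the sentinel back instead of NaN, so nothing is cast to float.
    return xr.where(is_fill, fill_value, picked)


val_arr = xr.DataArray(np.arange(27).reshape(3, 3, 3),
                       dims=['z', 'y', 'x'])
z_indices = xr.DataArray(np.array([[1, 0, 2],
                                   [0, 0, 1],
                                   [-2222, 0, 1]]),
                         dims=['y', 'x'])

result = isel_z_with_fill(val_arr, z_indices, fill_value=-2222)
print(result.dtype)   # an integer dtype, not float64
print(result.values)  # the masked position holds -2222
```

The sentinel has to be a value that cannot occur as real data in `val_arr`, otherwise masked and valid entries become indistinguishable.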
@kmuehlbauer Thanks, nice trick! It works well for this situation.
I would love to have support for integer NA values in xarray, but I don't think we want to build it into xarray.
Ideally this would either be built into NumPy (i.e., with a custom dtype, which will require some work before it's possible) or someone could build an "integer with NA" duckarray, which could implement the various NumPy protocols such as __array_function__. The latter is a bit less elegant but could be done today with very few changes in xarray.
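To make the duckarray option a bit more concrete, here is a very rough sketch of how an "integer with NA" container could hook into `__array_function__`. The class `IntWithNA`, the `implements` decorator, and the `HANDLED` registry are hypothetical names for illustration. Only `np.concatenate` is covered, and a real implementation that xarray could wrap would need far more (indexing, arithmetic, reductions, and so on).

```python
import numpy as np

# Registry mapping NumPy functions to IntWithNA-aware implementations.
HANDLED = {}


def implements(np_func):
    """Register an implementation of np_func for IntWithNA (hypothetical)."""
    def decorator(func):
        HANDLED[np_func] = func
        return func
    return decorator


class IntWithNA:
    """Integer data plus a boolean mask marking missing entries (sketch)."""

    def __init__(self, data, mask):
        self.data = np.asarray(data, dtype=np.int64)
        self.mask = np.asarray(mask, dtype=bool)

    @property
    def dtype(self):
        return self.data.dtype

    @property
    def shape(self):
        return self.data.shape

    def __array_function__(self, func, types, args, kwargs):
        # NumPy calls this hook when an IntWithNA is passed to a NumPy function.
        if func not in HANDLED:
            return NotImplemented
        return HANDLED[func](*args, **kwargs)


@implements(np.concatenate)
def _concatenate(arrays, axis=0):
    # Concatenate the integer data and the masks separately.
    data = np.concatenate([a.data for a in arrays], axis=axis)
    mask = np.concatenate([a.mask for a in arrays], axis=axis)
    return IntWithNA(data, mask)


a = IntWithNA([1, 2], [False, True])
b = IntWithNA([3], [False])
c = np.concatenate([a, b])  # dispatched to _concatenate
print(type(c).__name__, c.data, c.mask)
```

Keeping the mask next to the data means the integer dtype is never promoted to float, at the cost of implementing every NumPy function the container should support.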