I have three arrays:
a: input array
sindex: the array containing the start index for summation
eindex: the array containing the end index for summation
import numpy as np
import xarray as xr
data = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12]])
# input array
a = xr.DataArray(data, dims=['x', 'y'])
# start_index array
sindex = xr.DataArray(np.array([0, 0, 1, 1]), dims=['x'])
# end_index array
eindex = xr.DataArray(np.array([0, 1, 2, 2]), dims=['x'])
# empty array for saving summation
sum_a = xr.DataArray(np.empty((a.shape[0], 1)), dims=['x', 'y'])
for x in a.x:
    # sum values from sindex to eindex at row x
    sum_a[x] = a[x, sindex[x].values:eindex[x].values+1].sum()
print(sum_a)
<xarray.DataArray (x: 4, y: 1)>
array([[ 1.],
[ 9.],
[17.],
[23.]])
Dimensions without coordinates: x, y
Is it necessary to use xr.apply_ufunc, or is there a better method?
Solution (Boolean)
# stack indexes
index_list = np.column_stack((sindex, eindex))
# all false array
boolean_array = np.zeros(a.shape, dtype=bool)
# iterate and assign true
for row in range(len(index_list)):
    boolean_array[row, np.arange(index_list[row][0], index_list[row][1]+1)] = True
sum_a = a.where(boolean_array).sum(dim='y')
Using where is usually the solution for this kind of problem. I added keepdims=True to keep the y dimension after the sum.
yindex = a.y.copy(data=np.arange(a.sizes["y"])) # generate DataArray of indexes
a.where((yindex >= sindex) & (yindex <= eindex)).sum("y", keepdims=True)
Please close if this answers your question.
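For anyone who wants to check the approach above, here is a minimal sketch that compares the where-based mask with an explicit per-row slice-and-sum on the data from the question. The names mask, masked_sum, and expected are illustrative and not from the thread. It assumes a recent xarray where sum accepts keepdims. Note that where fills masked-out positions with NaN, so the result is float, and sum skips NaN by default.

```python
import numpy as np
import xarray as xr

# Same data as in the question.
a = xr.DataArray(np.arange(1, 13).reshape(4, 3), dims=["x", "y"])
sindex = xr.DataArray(np.array([0, 0, 1, 1]), dims=["x"])
eindex = xr.DataArray(np.array([0, 1, 2, 2]), dims=["x"])

# Positional index along y. Comparing it with the x-only index arrays
# broadcasts to a boolean mask that has both the x and y dimensions.
yindex = xr.DataArray(np.arange(a.sizes["y"]), dims=["y"])
mask = (yindex >= sindex) & (yindex <= eindex)

# where() replaces masked-out values with NaN; sum() skips NaN by default.
masked_sum = a.where(mask).sum("y", keepdims=True)

# Reference result: explicit per-row slicing, as in the original loop.
expected = np.array([
    a.values[i, s:e + 1].sum()
    for i, (s, e) in enumerate(zip(sindex.values, eindex.values))
])

print(masked_sum.values.ravel())
print(expected)
```

Both printed arrays should contain the same row sums as the loop in the question.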
@dcherian Excellent solution!
If we extend this to a 3D array and sum along the z axis, it seems that this method isn't suitable:
import xarray as xr
import numpy as np
x = 2
y = 2
z = 3
data = np.arange(x*y*z).reshape(z, y, x)
# input array
a = xr.DataArray(data, dims=['z', 'y', 'x'])
# start_index array
sindex = xr.DataArray(np.full_like(a[0, ...], 0), dims=['y', 'x'])
# end_index array
eindex = xr.DataArray(np.full_like(a[0, ...], 1), dims=['y', 'x'])
Why not?
@dcherian Sorry for the misunderstanding. I tried again with the 3D array, and it works well ;)
import xarray as xr
import numpy as np
x = 4
y = 2
z = 3
data = np.arange(x*y*z).reshape(z, y, x)
# input array
a = xr.DataArray(data, dims=['z', 'y', 'x'])
# start_index array
sindex = xr.DataArray(np.full_like(a[0, ...], 0), dims=['y', 'x'])
# end_index array
eindex = xr.DataArray(np.full_like(a[0, ...], 1), dims=['y', 'x'])
zindex = a.z.copy(data=np.arange(a.sizes["z"]))
sub_z = (zindex >= sindex) & (zindex <= eindex)
sum_a = a.where(sub_z).sum('z', keepdims=True)
print(a)
print(sum_a)
<xarray.DataArray (z: 3, y: 2, x: 4)>
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
Dimensions without coordinates: z, y, x
<xarray.DataArray (z: 1, y: 2, x: 4)>
array([[[ 8., 10., 12., 14.],
[16., 18., 20., 22.]]])
Dimensions without coordinates: z, y, x
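The example above uses the same start and end index for every cell. As a hedged sketch, the same mask should also handle index ranges that differ per (y, x) cell. The index values below are made up for illustration, and the nested loop is only a reference check, not part of the method.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.arange(24).reshape(3, 2, 4), dims=["z", "y", "x"])

# Hypothetical per-cell index ranges along z (values chosen for illustration).
sindex = xr.DataArray(np.array([[0, 0, 1, 1], [0, 1, 2, 0]]), dims=["y", "x"])
eindex = xr.DataArray(np.array([[0, 1, 2, 1], [2, 1, 2, 1]]), dims=["y", "x"])

# Positional index along z, broadcast against the (y, x) index arrays.
zindex = xr.DataArray(np.arange(a.sizes["z"]), dims=["z"])
mask = (zindex >= sindex) & (zindex <= eindex)

# Sum only the selected z levels in each (y, x) cell.
sum_a = a.where(mask).sum("z")

# Reference result: slice each (y, x) column along z explicitly.
expected = np.empty((a.sizes["y"], a.sizes["x"]))
for j in range(a.sizes["y"]):
    for i in range(a.sizes["x"]):
        s, e = sindex.values[j, i], eindex.values[j, i]
        expected[j, i] = a.values[s:e + 1, j, i].sum()

print(np.allclose(sum_a.values, expected))
```

Because the mask is built by broadcasting, no change to the code is needed when the index arrays vary from cell to cell.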