Hi, all
Sorry to bother. Maybe it is a kind of stupid question for others, but I cannot figure it out at this moment.
I want to swap dims in xarray, like swapaxes in numpy. I found that both DataArray and Dataset have a swap_dims method, but I don't understand its argument: dims_dict : dict-like
Dictionary whose keys are current dimension names and whose values are new names. Each value must already be a coordinate on this array.
Here is my example:
import numpy as np
import xarray as xr
data = np.random.rand(4, 3)
lon = [1, 2, 3]
lat = [4, 3, 2, 1]
foo = xr.DataArray(data, coords=[lat, lon])
foo
foo = xr.DataArray(data, coords=[lat, lon], dims=['lat', 'lon'])
foo
foo.swap_dims({'lat': 'lon'})
The error message:
ValueError Traceback (most recent call last)
<ipython-input-47-c8aa4311b27e> in <module>()
----> 1 foo.swap_dims({'lat':'lon'})
/glade/u/home/che43/miniconda2/lib/python2.7/site-packages/xarray/core/dataarray.pyc in swap_dims(self, dims_dict)
794 Dataset.swap_dims
795 """
--> 796 ds = self._to_temp_dataset().swap_dims(dims_dict)
797 return self._from_temp_dataset(ds)
798
/glade/u/home/che43/miniconda2/lib/python2.7/site-packages/xarray/core/dataset.pyc in swap_dims(self, dims_dict, inplace)
1293 raise ValueError('replacement dimension %r is not a 1D '
1294 'variable along the old dimension %r'
-> 1295 % (v, k))
1296
1297 result_dims = set(dims_dict.get(dim, dim) for dim in self.dims)
ValueError: replacement dimension 'lon' is not a 1D variable along the old dimension 'lat'
Sorry to bother.
swap_dims does something very different from swapaxes in numpy (we should add an example to make this clear).
For what you want, I think transpose is a closer fit, e.g., foo.transpose('lon', 'lat')
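To make the difference concrete, here is a minimal sketch. The `row` coordinate is a hypothetical addition for illustration and is not part of the original example: `transpose` reorders the axes, while `swap_dims` re-labels a dimension using another coordinate that already lies along it.

```python
import numpy as np
import xarray as xr

data = np.arange(12).reshape(4, 3)
foo = xr.DataArray(
    data,
    coords={"lat": [4, 3, 2, 1], "lon": [1, 2, 3]},
    dims=["lat", "lon"],
)

# transpose reorders the axes, which is what numpy's swapaxes does for a 2D array.
swapped = foo.transpose("lon", "lat")
print(swapped.dims, swapped.shape)

# swap_dims keeps the axis order and shape; it only replaces the dimension
# name 'lat' with 'row', a (hypothetical) 1D coordinate defined along 'lat'.
bar = foo.assign_coords(row=("lat", ["a", "b", "c", "d"]))
relabeled = bar.swap_dims({"lat": "row"})
print(relabeled.dims, relabeled.shape)
```

This also suggests why `foo.swap_dims({'lat': 'lon'})` raises: `lon` is a coordinate along the `lon` dimension, not a 1D variable along `lat`.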
Thanks, @shoyer
I agree your method works for a 2D array, but for 3D or 4D arrays it fails.
You need to provide the full list of dimensions to transpose.
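As a small sketch of what "the full list" means for a 3D array (the dimension names `time`, `lat`, and `lon` are assumptions chosen for illustration), swapping the first and last dimensions still requires naming the middle one:

```python
import numpy as np
import xarray as xr

data = np.random.rand(2, 3, 4)
da = xr.DataArray(data, dims=["time", "lat", "lon"])

# Swap 'time' and 'lon'; 'lat' stays in place but must still be listed.
swapped = da.transpose("lon", "lat", "time")

# The result matches numpy's swapaxes on the underlying array.
same = np.array_equal(swapped.values, np.swapaxes(data, 0, 2))
print(swapped.dims, same)
```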
We could add a method like numpy's swapaxes, it's just not clear what to name it.
Thanks! I will check it.
I have hit this issue before too.
We could add a method like numpy's swapaxes, it's just not clear what to name it.
reorder_dims?
reorder_dims?
Would that be consistent with reorder_levels for MultiIndex (#1028)? I'm not sure if that handles partial order specifications or not.
I have also hit this issue; this method could be useful. I'm putting my workaround below in case it is helpful:
def reorder_dims(darray, dim1, dim2):
    """
    Interchange two dimensions of a DataArray, similar to numpy's swapaxes.
    """
    dims = list(darray.dims)
    assert {dim1, dim2}.issubset(dims), 'dim1 and dim2 must be existing dimensions in darray'
    # Swap the positions of the two names, then transpose to the new order.
    ind1, ind2 = dims.index(dim1), dims.index(dim2)
    dims[ind2], dims[ind1] = dims[ind1], dims[ind2]
    return darray.transpose(*dims)
What about allowing .transpose() to handle a subset of array/dataset dimensions? In NumPy, this may not be desirable because it's easy to mix up integer dimensions, but in xarray ds.transpose('lat', 'lon') seems pretty unambiguous.
The implementation would simply reorder all the listed dimensions, keeping other dimensions in their original order.
ds.transpose('lat', 'lon') seems pretty unambiguous.
Though I think that would have radically different behavior for a 2-dim or 3-dim case. For the 2-dim case, it would enforce that order regardless of original order. For the 3-dim case, are you proposing they're swapped from their current order?
(Maybe transpose naturally refers to the behavior I think you describe, we'd need something else to 'set this order')
transpose('x', 'y') already means "ensure this object has dimensions in the order (x, y)":
In [2]: a = xarray.DataArray([[0]], dims=['x', 'y'])
In [3]: a.T
Out[3]:
<xarray.DataArray (y: 1, x: 1)>
array([[0]])
Dimensions without coordinates: y, x
In [4]: a.T.transpose('x', 'y')
Out[4]:
<xarray.DataArray (x: 1, y: 1)>
array([[0]])
Dimensions without coordinates: x, y
In [5]: a.transpose('x', 'y')
Out[5]:
<xarray.DataArray (x: 1, y: 1)>
array([[0]])
Dimensions without coordinates: x, y
From personal experience I find that 99% of the time, I want to push some known dimensions either to the front or to the back of the array while I don't care about the order of the others.
I'd love to have this syntax:
transpose(..., "x", "y")
or
transpose("x", "y", ...)
where the ellipsis expands to all dimensions not explicitly listed, in their original order. There can be at most one ellipsis.
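The expansion rule described above can be sketched in plain Python. `expand_ellipsis` is a hypothetical helper written for illustration, not an xarray function; its result could be passed on as `transpose(*order)`.

```python
def expand_ellipsis(requested, existing):
    """Hypothetical helper: replace a single ... in `requested` with the
    dimensions of `existing` that are not named, in their original order."""
    requested = list(requested)
    if requested.count(...) > 1:
        raise ValueError("at most one ellipsis is allowed")

    named = [d for d in requested if d is not ...]
    unknown = [d for d in named if d not in existing]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")

    # Dimensions not listed explicitly, kept in their original order.
    rest = [d for d in existing if d not in named]
    if ... not in requested:
        if rest:
            raise ValueError("without an ellipsis, every dimension must be listed")
        return tuple(named)

    i = requested.index(...)
    return tuple(requested[:i] + rest + requested[i + 1:])


dims = ("x", "t", "y", "z")
to_back = expand_ellipsis((..., "x", "y"), dims)
to_front = expand_ellipsis(("x", "y", ...), dims)
print(to_back)
print(to_front)
```

With `dims = ("x", "t", "y", "z")`, the first call pushes `x` and `y` to the back and the second pushes them to the front, leaving `t` and `z` in their original relative order.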
Yes, this looks like a great use-case for Ellipsis!
There's one edge case that might be worth thinking carefully about here:
Consider a dataset with two variables with dimensions ('w', 'x', 'y', 'z') and ('x', 'w', 'y', 'z'). Now we write .transpose(..., 'z', 'y'). What should the dimensions of variables on the resulting dataset be?
1. ('w', 'x', 'z', 'y') for both variables, with ... filled in based on the order of dimensions in the overall dataset.
2. ('w', 'x', 'z', 'y') and ('x', 'w', 'z', 'y'), with ... filled in for each variable separately.
I would vote for (2), given it's fairly easy to replicate (1) by passing the full list, and I think (2) is arguably slightly more expected.
(NB this isn't how #3421 works now, but easy to change)
I agree, I think (2) is what most users would expect.
+1 for (2). Although user code that uses ... should not, by definition, care about the order of the variables that are not listed explicitly.
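To compare the two interpretations side by side, here is a plain-Python sketch. The `fill` helper and the variable names `a` and `b` are hypothetical; this models the two options discussed and does not call xarray itself.

```python
def fill(requested, existing):
    """Hypothetical helper: expand the single ... in `requested` using the
    unnamed dimensions of `existing`, in their original order."""
    named = [d for d in requested if d is not ...]
    rest = [d for d in existing if d not in named]
    i = requested.index(...)
    return tuple(list(requested[:i]) + rest + list(requested[i + 1:]))


variables = {
    "a": ("w", "x", "y", "z"),
    "b": ("x", "w", "y", "z"),
}
requested = (..., "z", "y")

# Dataset-level order: dimensions in the order they are first encountered.
dataset_dims = []
for var_dims in variables.values():
    for d in var_dims:
        if d not in dataset_dims:
            dataset_dims.append(d)

# Option (1): expand ... once against the dataset-level order.
option1 = {name: fill(requested, dataset_dims) for name in variables}

# Option (2): expand ... separately against each variable's own order.
option2 = {name: fill(requested, var_dims) for name, var_dims in variables.items()}

print(option1)
print(option2)
```

Under option (1) both variables end up with the same order; under option (2) each keeps the relative order of its own unlisted dimensions, and only `z` and `y` are forced to the end.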