The documentation states:
coords...
...
Additionally, it is possible to define a coord whose name does not match the dimension name, or a coord based on multiple dimensions, with one of the following notations:
- mapping {coord name: DataArray}
- mapping {coord name: Variable}
- mapping {coord name: (dimension name, array-like)}
- mapping {coord name: (tuple of dimension names, array-like)}
dims (hashable or sequence of hashable, optional) – Name(s) of the data dimension(s). Must be either a hashable (only for 1D data) or a sequence of hashables with length equal to the number of dimensions. If this argument is omitted, dimension names are taken from coords (if possible) and otherwise default to ['dim_0', ... 'dim_n'].
Which seems to be no longer the case.
da = xr.DataArray(np.zeros((2, 2)), coords={'x': [1, 2], 'y': [1, 2]})
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-22-0d63eed9a72f> in <module>
----> 1 da = xr.DataArray(np.zeros((2, 2)), coords={'x': [1, 2], 'y': [1, 2]})
/scratch/local/lib/python3.8/site-packages/xarray/core/dataarray.py in __init__(self, data, coords, dims, name, attrs, encoding, indexes, fastpath)
366 data = _check_data_shape(data, coords, dims)
367 data = as_compatible_data(data)
--> 368 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
369 variable = Variable(dims, data, attrs, encoding, fastpath=True)
370
/scratch/local/lib/python3.8/site-packages/xarray/core/dataarray.py in _infer_coords_and_dims(shape, coords, dims)
105 if utils.is_dict_like(coords):
106 # deprecated in GH993, removed in GH1539
--> 107 raise ValueError(
108 "inferring DataArray dimensions from "
109 "dictionary like ``coords`` is no longer "
ValueError: inferring DataArray dimensions from dictionary like ``coords`` is no longer supported. Use an explicit list of ``dims`` instead.
An update of the documentation to correctly specify the current behavior. (I'll propose a PR later today)
A mismatch between API specification and it's behavior seems like a problem to me :)
xr.show_versions()
xr.show_versions()
commit: None
python: 3.8.1 (default, Feb 11 2020, 12:54:25)
[GCC 5.4.0 20160609]
python-bits: 64
OS: Linux
OS-release: 4.15.0-88-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_IE.UTF-8
LOCALE: en_IE.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.3
xarray: 0.14.1
pandas: 0.25.3
numpy: 1.18.0
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.1.2
cartopy: None
seaborn: None
numbagg: None
setuptools: 45.2.0
pip: 20.0.2
conda: None
pytest: None
IPython: 7.11.1
sphinx: 2.3.1
Agree! Thanks for spotting @sjvrijn . PR is appreciated, thanks for the offer!
For the PR: in general we don't need to document deprecated behavior, so you can remove than description rather than caveat it
@max-sixty Thanks for the tip. In the end it meant just changing the last line on dims. The paragraph on the coords argument is still valid after all.
On a related note: according to #727 (PR #993), this was deprecated since key-order in dictionaries was arbitrary at the time of that issue. However, their order is fixed since Python3.7, as noted in the documentation:
_Changed in version 3.7_: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.
I guess it's still too soon to 'un-deprecate' this behavior again? 👼
I guess it's still too soon to 'un-deprecate' this behavior again?
Ha! We do support 3.6. I think that almost everyone is using xarray with CPython, so it's almost certainly safe. But maybe not worth the tradeoff given the benefit is marginal, what are your thoughts?
I think that inferring dimension-names from the coords-dict is the most intuitive way to define a DataArray.
Passing a dictionary for coords is in my opinion the clearest way to indicate which coordinates belong to which dimension, so then why do I have to specify the same dimension names again?
An example of how I create them from my current project:
values = xr.DataArray(
values,
coords={'n_high': n_highs,
'n_low': n_lows,
'rep': repetitions,
'model': models,
'idx': range(n_test_samples),},
dims=['n_high', 'n_low', 'rep', 'model', 'idx'], <-- repeated dim names
attrs=attributes,
)
If you expect almost everyone to use CPython or 3.7+ anyway, then I don't actually see any drawbacks, while it would regularly make code shorter and less repetitive.
I definitely agree it's better if possible. I think the tradeoff is between better for almost everyone vs the potential of a break for a small minority.
Currently we support python versions for 42 months, and Python 3.6 was released December 2016. So that means we support it until July 2020 -- given that timeframe I think we wait until then and add this back?
Waiting a few more months until it will definitely not be a problem for anyone seems fair to me :+1:
Note that inferring dimensions from coords when it is a list of tuples does still work (with no deprecation warning):
In [1]: import numpy as np, xarray as xr
In [2]: xr.DataArray(np.zeros((2, 2)), coords=[('x', [1, 2]), ('y', [1, 2])])
Out[2]:
<xarray.DataArray (x: 2, y: 2)>
array([[0., 0.],
[0., 0.]])
Coordinates:
* x (x) int64 1 2
* y (y) int64 1 2
Most helpful comment
Note that inferring dimensions from coords when it is a list of tuples does still work (with no deprecation warning):