#!/usr/bin/env python
import xarray
import numpy
ds = xarray.Dataset()
ds['x'] = ('x', numpy.linspace(0., 1., 100))
ds['y'] = ('y', numpy.linspace(0., 1., 50))
print(ds)
ds = ds.drop(set('x'))
print(ds)
Expected output:
<xarray.Dataset>
Dimensions: (x: 100, y: 50)
Coordinates:
* x (x) float64 0.0 0.0101 0.0202 0.0303 ... 0.9697 0.9798 0.9899 1.0
* y (y) float64 0.0 0.02041 0.04082 0.06122 ... 0.9592 0.9796 1.0
Data variables:
*empty*
<xarray.Dataset>
Dimensions: (y: 50)
Coordinates:
* y (y) float64 0.0 0.02041 0.04082 0.06122 ... 0.9592 0.9796 1.0
Data variables:
*empty*
In versions before xarray 0.14.1, the code above involving calls to Dataset.drop() with a set as an argument worked as expected. With the new release, the result is an error as shown below. This is breaking backwards compatibility with our software (MPAS-Analysis).
<xarray.Dataset>
Dimensions: (x: 100, y: 50)
Coordinates:
* x (x) float64 0.0 0.0101 0.0202 0.0303 ... 0.9697 0.9798 0.9899 1.0
* y (y) float64 0.0 0.02041 0.04082 0.06122 ... 0.9592 0.9796 1.0
Data variables:
*empty*
Traceback (most recent call last):
File "./drop_issue.py", line 10, in <module>
ds = ds.drop(set('x'))
File "/home/xylar/miniconda3/envs/test/lib/python3.7/site-packages/xarray/core/dataset.py", line 3643, in drop
return self.drop_sel(labels, errors=errors)
File "/home/xylar/miniconda3/envs/test/lib/python3.7/site-packages/xarray/core/dataset.py", line 3689, in drop_sel
labels = either_dict_or_kwargs(labels, labels_kwargs, "drop")
File "/home/xylar/miniconda3/envs/test/lib/python3.7/site-packages/xarray/core/utils.py", line 257, in either_dict_or_kwargs
"the first argument to .%s must be a dictionary" % func_name
ValueError: the first argument to .drop must be a dictionary
Output of xr.show_versions():
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-1063-oem
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1
xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.3
scipy: 1.3.2
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.1
dask: 2.8.0
distributed: 2.8.0
matplotlib: 3.1.2
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 41.6.0.post20191101
pip: 19.3.1
conda: None
pytest: 5.3.0
IPython: None
sphinx: None
I believe a set() object should be a valid labels argument according to the documentation of drop():
http://xarray.pydata.org/en/v0.14.0/generated/xarray.Dataset.drop.html#xarray.Dataset.drop
How about drop_vars instead?
@dcherian - are you suggesting that @xylar change his downstream code to use drop_vars instead? drop_vars has been deprecated. I don't think that's the right response here.
Instead we should fix this bug. The docs for drop suggest that you should be able to provide a set as an input. Somehow we made that not work in the latest release.
Yes yes I agree!
drop_vars would be a quick unblocking fix...
> @dcherian - are you suggesting that @xylar change his downstream code to use drop_vars instead? drop_vars has been deprecated. I don't think that's the right response here.
I'm confused. In 0.14.1, drop_vars seems to have just been introduced and drop seems to have been dropped.
> How about drop_vars instead?
Yes, I switched to drop_vars but first I had to make a new build of my conda package that constrains the dependency xarray <0.14.0 because it's broken with 0.14.1. So that's why I'm concerned. I have to do a new release before this fix propagates to my users.
So @rabernat and @dcherian, please clarify if I should be switching to drop_vars or if I am going to run into issues yet again if I do so?
I think drop was deprecated in favor of drop_sel and drop_vars (#3475). Using v0.14.1, this works:
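import xarray as xr

ds = xr.Dataset({"a": ("b", [0, 1, 2]), "c": [-1, 2, 4]}, coords={"b": list("abc")})
ds.drop_vars({"a", "b"})  # a set works with drop_vars
ds.drop_vars(["a", "b"])  # a list works with drop_vars
ds.drop(["a", "b"])  # a list still works with drop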
Additionally, I think we have a documentation bug: I can't reach the documentation on drop.
so I guess we need to figure out why sets don't work anymore?
> Additionally, I think we have a documentation bug: I can't reach the documentation on drop.
Yes, I noticed that, too. I assumed it wasn't being documented anymore because it was considered deprecated.
I apologize for this slight tangent, but it's related to why we use sets in drop(). Is there a good way in xarray to keep only a given list of variables? Our approach is to create a set of all variables, subtract from it the set of variables we want to keep, and drop the resulting set of variables we don't want. It's hard for me to imagine that others don't also need this functionality.
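The set-subtraction approach described above can be sketched in plain Python, using lists of names rather than a real Dataset. Note that names_to_drop and the variable names are hypothetical and not part of xarray. The helper returns a list because, as noted earlier in the thread, passing a list to drop still works in 0.14.1.

```python
def names_to_drop(all_names, keep):
    """Return the names in all_names that are not in keep, as a sorted list."""
    # Set difference gives the unwanted names; sorting makes the result
    # a deterministic list instead of a set.
    return sorted(set(all_names) - set(keep))


all_names = ["a", "c", "d"]  # hypothetical variable names in a dataset
keep = ["a", "d"]  # the variables we want to keep

to_drop = names_to_drop(all_names, keep)
print(to_drop)
```

The resulting list could then be passed to drop or drop_vars in place of the set.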
explicitly selecting should work like this:
ds = xr.Dataset(
{"a": ("b", [0, 1, 2]), "c": ("b", [-1, 2, 4]), "d": ("b", [5, 1, 3])},
coords={"b": list("abc")},
)
variables = ["a", "d"]
ds[variables]
I think Dataset.__getitem__ allows any kind of sequence here (e.g. sets)
Ah, thanks very much! I'll switch to using that syntax instead.
~drop_vars has been deprecated.~ Apparently that is wrong. Sorry for adding to the noise.
The reason set raises an error is that drop uses is_list_like(labels) or is_scalar(labels) instead of isinstance(labels, str) or not isinstance(labels, Iterable). Since is_list_like explicitly checks for tuple and list, passing a set won't call the right method.
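To illustrate the difference between the two checks named above, here is a minimal sketch. Both functions are simplified stand-ins written for this example and are not xarray's actual code: is_list_like mirrors the described behaviour of accepting only list and tuple, and normalize_names is a hypothetical helper built around the isinstance(labels, str) or not isinstance(labels, Iterable) test.

```python
from collections.abc import Iterable


def is_list_like(value):
    # Stand-in for the check described above: only list and tuple qualify.
    return isinstance(value, (list, tuple))


def normalize_names(labels):
    # Hypothetical alternative: a single string or a non-iterable scalar
    # is wrapped in a set; any other iterable is converted to a set.
    if isinstance(labels, str) or not isinstance(labels, Iterable):
        return {labels}
    return set(labels)


as_set = {"x"}
as_keys = {"a": 1, "b": 2}.keys()

print(is_list_like(as_set))  # a set is neither a list nor a tuple
print(normalize_names(as_set))
print(normalize_names(as_keys))
print(normalize_names("x"))
```

With the iterable-based test, sets and dict_keys are handled the same way as lists, while a bare string is still treated as a single name rather than being split into characters.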
drop_vars was just added! Let me know if you have any questions or issues. We should fix set, though for your case it's best to switch to drop_vars.
👍 re @keewis's point re ds[variables]
> We should fix set, though for your case it's best to switch to drop_vars
This also happens with other iterables like dict_keys.