Hi all,
I was wondering whether there is an easier way to keep the attributes when using the DataArray.astype method?
import xarray as xr
# DataArray with attributes
da = xr.DataArray(
[[0, 1, 2], [0, 1, 2]], attrs={"attr1": "value1"}
)
# the attributes are not passed over to new_da
new_da = da.astype(float)
# I have to set the attributes by myself
new_da.attrs = da.attrs.copy()
This is just one extra line, but I have to keep track of the old DataArray, and it becomes unwieldy if I make many astype calls.
Any hints?
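In the meantime, the copy step from the snippet above can be folded into a small helper so the original object does not have to be tracked by hand. This is only a minimal sketch: `astype_keep_attrs` is a hypothetical name, not part of xarray.

```python
import xarray as xr


def astype_keep_attrs(obj, dtype):
    """Cast ``obj`` to ``dtype`` and carry its attributes over.

    Hypothetical helper for illustration; not an xarray API.
    """
    result = obj.astype(dtype)
    # Copy the dict so later edits to one object do not affect the other.
    result.attrs = dict(obj.attrs)
    return result


da = xr.DataArray([[0, 1, 2], [0, 1, 2]], attrs={"attr1": "value1"})
new_da = astype_keep_attrs(da, float)
```

Because the helper re-attaches the attributes itself, it should behave the same whether or not the installed xarray version already preserves them in astype.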
xr.show_versions()
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.2
distributed: 1.21.4
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.0.1
pip: 9.0.1
conda: None
pytest: 3.4.2
IPython: 6.2.1
sphinx: 1.7.1
We should probably just update astype() to preserve attributes.
Note: currently astype is directly assigning a method onto Dataset/DataArray (ugh!).
https://github.com/pydata/xarray/blob/dc3eebf3a514cfdc1039b63f2a542121d1328ba9/xarray/core/ops.py#L39
It should really be switched to use a more transparent definition, e.g., a method defined in a base class, with the implementation defined by using apply_ufunc:
https://github.com/pydata/xarray/blob/6402391cf206fd04c12d44773fecd9b42ea0c246/xarray/core/common.py#L748
This would make it quite easy to add keep_attrs=True.
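A rough sketch of what an apply_ufunc-based cast could look like, assuming a numpy-backed DataArray. The function name `astype_via_apply_ufunc` is hypothetical and this is not the implementation that ended up in xarray; dask-backed inputs would additionally need the `dask` argument of apply_ufunc.

```python
import numpy as np
import xarray as xr


def astype_via_apply_ufunc(obj, dtype, keep_attrs=True):
    """Cast through apply_ufunc so that keep_attrs is handled in one place.

    Hypothetical helper for illustration; not an xarray API.
    """
    return xr.apply_ufunc(
        lambda arr: arr.astype(dtype),  # applied to the underlying numpy array
        obj,
        keep_attrs=keep_attrs,
    )


da = xr.DataArray([[0, 1, 2], [0, 1, 2]], attrs={"attr1": "value1"})
kept = astype_via_apply_ufunc(da, float)
dropped = astype_via_apply_ufunc(da, float, keep_attrs=False)
```

Here `kept` carries the attributes of `da`, while `dropped` comes back with empty attributes.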
I have a version of this working, but to get tests to pass I had to add the same behavior for Variable types (as the method was no longer being added from NUMPY_UNARY_METHODS). I don't think I have a very good picture of the proper use of Variables in the internal API, so I wasn't sure if it made sense to extend the behavior there. I should also say that on the first pass I had to do this outside of the apply_ufunc mechanism, as apply_ufunc doesn't keep attrs for Variables (thus inspiring my question above).
Just let me know if that makes sense or what alternative path seems best and I'll see if I can open a PR.
OK, adding this to Variable, too, sounds good. It's the internal data structure that we build most of xarray's higher-level labeled arrays on: http://xarray.pydata.org/en/stable/internals.html#variable-objects
Feel free to open an incomplete PR with your current progress and I am happy to advise.
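For readers less familiar with Variable, here is a small illustration of how a DataArray wraps one and how attributes can be passed along at that level. It only uses the public Variable constructor and is not meant to mirror the code in the PR.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    [[0, 1, 2], [0, 1, 2]], dims=("x", "y"), attrs={"attr1": "value1"}
)

# Every DataArray wraps a single Variable holding dims, data and attrs.
var = da.variable

# Rebuild the Variable with cast data, passing the attributes along explicitly.
new_var = xr.Variable(var.dims, var.values.astype(float), attrs=var.attrs)
```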
The PR above is a first draft. It seems that the kwargs for the dask array method are a subset of those for the numpy array method, so I based the docstring on these. Happy to do something else, though, if that makes more sense.
I just ran into this issue. It is still present in 0.15, and astype also does not respect the option keep_attrs=True.