When I combine two DataArrays with a standard arithmetic operation (+, -, *, /), the attributes vanish. I think that should not be the case. It happens even when using set_options as suggested:
import numpy as np
import xarray as xr
a = xr.DataArray(np.random.randn(3,3), dims=('x','y'), name='temp', attrs={'units':'K'})
b = xr.DataArray(np.random.randn(3,3), dims=('x','y'), name='temp', attrs={'units':'K'})
print(a)
<xarray.DataArray 'temp' (x: 3, y: 3)>
array([[ 1.207407, -1.9429 , 3.168454],
[-0.773912, -0.121835, -0.139538],
[ 1.823002, 0.185846, 0.53569 ]])
Dimensions without coordinates: x, y
Attributes:
units: K
print(a-b)
<xarray.DataArray 'temp' (x: 3, y: 3)>
array([[ 1.280892, -1.097781, 2.150318],
[-0.208202, -0.03856 , 0.805856],
[ 2.192506, 1.049181, 2.277078]])
Dimensions without coordinates: x, y
with xr.set_options(keep_attrs=True):
    print(a-b)
<xarray.DataArray 'temp' (x: 3, y: 3)>
array([[ 1.280892, -1.097781, 2.150318],
[-0.208202, -0.03856 , 0.805856],
[ 2.192506, 1.049181, 2.277078]])
Dimensions without coordinates: x, y
Attributes vanish when a normal operation is applied!
From docs of set_options:
keep_attrs: rule for whether to keep attributes on xarray
Datasets/dataarrays after operations. Either True to always keep
attrs, False to always discard them, or 'default' to use original
logic that attrs should only be kept in unambiguous circumstances.
Default: 'default'.
The attributes should remain. Maybe keep only the attributes from the left array?
Please adjust or advise me.
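Until arithmetic respects the option, one possible workaround is to copy the left operand's attributes onto the result by hand. This is only a minimal sketch using the existing `assign_attrs` method; the arrays below are filled with constants rather than random data, and the choice to take the attributes from the left operand is an assumption based on the suggestion above, not established xarray behavior.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(np.ones((3, 3)), dims=("x", "y"), name="temp",
                 attrs={"units": "K"})
b = xr.DataArray(np.zeros((3, 3)), dims=("x", "y"), name="temp",
                 attrs={"units": "K"})

# Explicitly attach the left operand's attributes to the result,
# regardless of what the arithmetic itself kept or dropped.
diff = (a - b).assign_attrs(a.attrs)
print(diff.attrs)
```

`assign_attrs` returns a new object, so `a` and `b` are left untouched.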
Output of xr.show_versions():
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-39-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.11.0
pandas: 0.23.4
numpy: 1.15.4
scipy: 1.1.0
netCDF4: 1.4.2
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.2.1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.20.2
distributed: 1.24.2
matplotlib: 3.0.1
cartopy: 0.16.0
seaborn: 0.9.0
setuptools: 40.6.2
pip: 18.1
conda: 4.5.11
pytest: 4.0.0
IPython: 7.1.1
sphinx: 1.8.2
Thanks for the report! It looks like we definitely overlooked this in arithmetic operations. I agree that keep_attrs=True should mean that attributes are maintained in arithmetic.
Any interest in putting together a PR?
Thanks for the quick reply.
I'm not sure what a PR is. (Sorry, I'm not that advanced in coding.)
Judging from code you are using in other places, I figure something like this
@staticmethod
def _binary_op(f, reflexive=False, **ignored_kwargs):
    @functools.wraps(f)
    def func(self, other):
        if isinstance(other, (xr.DataArray, xr.Dataset)):
            return NotImplemented
        self_data, other_data, dims = _broadcast_compat_data(self, other)
        # Add Attributes here ?
        keep_attrs = _get_keep_attrs(default=False)
        attrs = self._attrs if keep_attrs else None
        with np.errstate(all='ignore'):
            new_data = (f(self_data, other_data)
                        if not reflexive
                        else f(other_data, self_data))
        result = Variable(dims, new_data, attrs=attrs)
        return result
    return func
should do the trick, right?
I cloned the recent version and tried out the new code. It works! :)
xr.show_versions()
commit: 0d6056e8816e3d367a64f36c7f1a5c4e1ce4ed4e
python: 3.6.6 |Anaconda, Inc.| (default, Oct 9 2018, 12:34:16)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-39-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.1
xarray: 0.11.0+10.g0d6056e8.dirty
pandas: 0.23.4
numpy: 1.15.4
scipy: 1.1.0
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.2.1
PseudonetCDF: None
rasterio: None
cfgrib: installed
iris: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.20.2
distributed: 1.24.2
matplotlib: 3.0.1
cartopy: 0.16.0
seaborn: 0.9.0
setuptools: 40.6.2
pip: 18.1
conda: 4.5.11
pytest: 4.0.0
IPython: 7.1.1
sphinx: 1.8.2
When the option is not set, the behavior is the same as before:
print(a-b)
<xarray.DataArray 'temp' (x: 3, y: 3)>
array([[ 0.133102, -1.275794, 1.331784],
[ 0.995555, -0.509624, 0.188597],
[ 1.922048, -0.053253, -0.293245]])
Dimensions without coordinates: x, y
With the option set:
with xr.set_options(keep_attrs=True):
    print(a-b)
<xarray.DataArray 'temp' (x: 3, y: 3)>
array([[ 0.133102, -1.275794, 1.331784],
[ 0.995555, -0.509624, 0.188597],
[ 1.922048, -0.053253, -0.293245]])
Dimensions without coordinates: x, y
Attributes:
units: K
It works. Hope that helps you.
> Not sure what a PR is. (Sorry I'm not that advanced in coding)
A PR is a pull request! If you can open a PR with your code, we can merge it into the repo. It would be greatly appreciated by xarray, and you'd be an xarray contributor. Let us know if we can help guide you through the mechanics.
@MBlaschek This might help: https://help.github.com/articles/proposing-changes-to-your-work-with-pull-requests/ . You'd start by creating a fork, then a branch with your changes, push your changes to GitHub, and then initiate a pull request.
Hi @MBlaschek, almost there! You'll need to open your pull request in this repository :).
You'll also need to add some tests to make sure your changes keep working as the code is updated in the future. E.g. https://github.com/pydata/xarray/blob/0d6056e8816e3d367a64f36c7f1a5c4e1ce4ed4e/xarray/tests/test_variable.py#L1533
Hi.
OK, sorry. I had no idea what I was doing. I hope I have now fixed it the way you wanted. I added a test routine, test_binary_ops_keep_attrs.
I created a new pull request, as I could not reopen the old one.
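For readers wondering what such a test might look like, here is a rough sketch. It is not the test that was actually submitted; only the name test_binary_ops_keep_attrs comes from the comment above, and the attribute values and array shapes are assumptions chosen for illustration. It checks only the case where the option is set, since that is the behavior under discussion.

```python
import numpy as np
import xarray as xr


def test_binary_ops_keep_attrs():
    # Illustrative attributes; the values are arbitrary.
    attrs = {"units": "test", "long_name": "testing"}
    a = xr.Variable(["x", "y"], np.random.randn(3, 3), attrs)
    b = xr.Variable(["x", "y"], np.random.randn(3, 3), attrs)

    # With keep_attrs=True, the result should carry the attributes.
    with xr.set_options(keep_attrs=True):
        d = a - b
    assert d.attrs == attrs


test_binary_ops_keep_attrs()
```

Under pytest the final call is unnecessary; it is included only so the snippet also runs as a plain script.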