I don't know if this is intentional. I thought that arr.copy(deep=True) or deepcopy(arr) would give me completely independent copies of a DataArray, but this seems not to be the case?
>>> import xarray as xr
>>> xarr1 = xr.DataArray([1,2], coords=dict(x=[0,1]), dims=('x',))
>>> xarr1.x.data[0]
0
>>> xarr2 = xarr1.copy(deep=True) #xarr2 = deepcopy(xarr1) -> leads to same result
>>> xarr2.x.data[0] = -1
>>> xarr1.x.data[0]
-1
How can I create completely independent copies of a DataArray? I wrote a function for this, but I don't know whether it always does what I expect, or whether there is a more elegant way.
import numpy as np
from copy import deepcopy


def deepcopy_xarr(xarr):
    """
    Deepcopy for xarray that makes sure coords and attrs
    are properly deepcopied.
    With the normal copy method from xarray, when I mutated
    xarr.coords[coord].data it would also mutate in the copy
    and vice versa.

    Parameters
    ----------
    xarr: DataArray

    Returns
    -------
    xcopy: DataArray
        Deep copy of xarr
    """
    xcopy = xarr.copy(deep=True)
    # Replace each coordinate's data with a fresh NumPy copy.
    for dim in xcopy.coords:
        xcopy.coords[dim].data = np.copy(xcopy.coords[dim].data)
    # Copy the attrs dict, then each value (the second pass is redundant but harmless).
    xcopy.attrs = deepcopy(xcopy.attrs)
    for attr in xcopy.attrs:
        xcopy.attrs[attr] = deepcopy(xcopy.attrs[attr])
    return xcopy
This seems like a bug.
I suspect the problem is in Variable.copy on these lines (which should probably just be removed). Coordinates store their data in pandas.Index objects, which are supposed to be immutable. But apparently that's not necessarily the case.
We do not allow assigning values to an IndexVariable, since pandas.Index is immutable (assigning a value raises a TypeError),
but we can actually do this through the .data attribute (this line).
(So our IndexVariable is not actually immutable...)
I think we should also copy the IndexVariable's data rather than keep a reference to the original.
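To make the immutability point concrete, here is a minimal sketch using only pandas and NumPy. It shows that item assignment on a pandas.Index raises a TypeError, and that Index.copy(deep=True) allocates a new buffer. The variable names are illustrative. Whether a write through the index's underlying array succeeds depends on the pandas version, so that case is deliberately left out.

```python
import numpy as np
import pandas as pd

index = pd.Index([0, 1])

# Item assignment is rejected: pandas.Index does not support mutation.
try:
    index[0] = -1
    assignment_error = None
except TypeError as err:
    assignment_error = err

# A deep copy allocates its own buffer, so the two indexes share no memory.
deep = index.copy(deep=True)
independent = not np.shares_memory(index.values, deep.values)

print(type(assignment_error).__name__, independent)
```

A deep copy of an IndexVariable would need to copy the wrapped index in this way, rather than reuse the same object, for the two coordinates to be independent.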
I'd like to take a shot at fixing this bug, unless someone else is already working on it. Would that be alright?
Go for it, @pletchm!
Feel free to open a PR or ask questions if you need help.
Great, @pletchm! This is an example of a recent similar issue: https://github.com/pydata/xarray/pull/2839/files
This also doesn't work for ._replace: https://github.com/pydata/xarray/blob/master/xarray/core/dataarray.py#L296
So my comment here isn't really correct: https://github.com/pydata/xarray/pull/3086#discussion_r301211375
@pletchm should I have a go at a PR? Happy to take this and you can take another one; I have lots of time atm
I think this was fixed by https://github.com/pydata/xarray/pull/2936. Certainly I can't reproduce the example in the first comment here any more.
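One way to check this without mutating anything is to test whether the coordinate buffers of the original and the copy overlap. The sketch below assumes an xarray release that includes the fix. The helper name shares_coord_memory is hypothetical and not part of xarray.

```python
import numpy as np
import xarray as xr


def shares_coord_memory(a, b):
    """Return the names of coordinates whose buffers overlap between a and b."""
    return [
        name
        for name in a.coords
        if name in b.coords
        and np.shares_memory(a.coords[name].values, b.coords[name].values)
    ]


original = xr.DataArray([1, 2], coords=dict(x=[0, 1]), dims=("x",))
duplicate = original.copy(deep=True)

# On a fixed release, no coordinate should be shared with the deep copy.
shared = shares_coord_memory(original, duplicate)
print(shared)
```

An empty list means the deep copy owns its coordinate data. A non-empty list names the coordinates that still alias the original.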
This is fixed! It's not allowing attrs to be passed into _replace. ~I'll open a new issue~ I think that's OK, since attrs are on the variable rather than the DataArray