Xarray: copy / deepcopy not deepcopying coords?

Created on 22 Jun 2017 · 8 Comments · Source: pydata/xarray

I don't know if this is intentional. I thought that arr.copy(deep=True) or deepcopy(arr) would give me completely independent copies of a DataArray, but this seems not to be the case?

>>> import xarray as xr
>>> xarr1 = xr.DataArray([1,2], coords=dict(x=[0,1]), dims=('x',))
>>> xarr1.x.data[0]
0
>>> xarr2 = xarr1.copy(deep=True) #xarr2 = deepcopy(xarr1) -> leads to same result
>>> xarr2.x.data[0] = -1
>>> xarr1.x.data[0]
-1

How can I create completely independent copies of a DataArray? I wrote a function for this, but I don't know whether it always does what I expect, or whether there is a more elegant way.

from copy import deepcopy

import numpy as np


def deepcopy_xarr(xarr):
    """
    Deep copy for xarray that makes sure coords and attrs
    are properly deep-copied.

    With the normal copy method from xarray, when I mutated
    xarr.coords[coord].data it would also mutate in the copy
    and vice versa.

    Parameters
    ----------
    xarr : DataArray

    Returns
    -------
    xcopy : DataArray
        Deep copy of xarr
    """
    xcopy = xarr.copy(deep=True)

    # Give every coordinate its own freshly allocated NumPy array.
    for dim in xcopy.coords:
        xcopy.coords[dim].data = np.copy(xcopy.coords[dim].data)
    # deepcopy recurses into nested values, so a single call
    # already covers every individual attribute.
    xcopy.attrs = deepcopy(xcopy.attrs)
    return xcopy
API design bug

All 8 comments

This seems like a bug.

I suspect the problem is in Variable.copy on these lines (which should probably just be removed). Coordinates store their data in pandas.Index objects, which are supposed to be immutable. But apparently that's not necessarily the case.

We do not allow assigning values to an IndexVariable, since pandas.Index is immutable (assigning a value raises a TypeError),
but it can actually be done through the .data attribute (this line).
(Our IndexVariable is not immutable...)

I think we should also copy the IndexVariable instead of keeping the original reference.
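To illustrate the point about pandas.Index, here is a minimal sketch using pandas alone (no xarray involved). It shows that item assignment on the index is rejected with a TypeError, while a write through the raw `.values` array skips that check. The variable names are only illustrative, and whether the second write actually succeeds is an assumption that depends on the installed pandas version, so the sketch handles both outcomes.

```python
import pandas as pd

index = pd.Index([0, 1])

# Item assignment on a pandas.Index is rejected outright.
try:
    index[0] = -1
    item_assignment_error = None
except TypeError as err:
    item_assignment_error = err

print(type(item_assignment_error).__name__)

# Writing through the raw array bypasses that check. Depending on the
# pandas version, the array may be writable or read-only, so both
# outcomes are handled here rather than assumed.
raw = index.values
try:
    raw[0] = -1
    wrote_through = True
except ValueError:
    wrote_through = False

print(wrote_through, list(index))
```

If the write goes through, any object that still holds a reference to the same index sees the change, which is the behaviour reported at the top of this issue.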

I'd like to take a shot fixing this bug unless someone else already is working on it. Would that be alright?

Go for it, @pletchm!

Feel free to open a PR or ask questions if you need help.

Great, @pletchm! This is an example of a recent similar issue: https://github.com/pydata/xarray/pull/2839/files

This also doesn't work for ._replace: https://github.com/pydata/xarray/blob/master/xarray/core/dataarray.py#L296

So my comment here isn't really correct: https://github.com/pydata/xarray/pull/3086#discussion_r301211375

@pletchm should I have a go at a PR? Happy to take this and you can take another one; I have lots of time atm

I think this was fixed by https://github.com/pydata/xarray/pull/2936. Certainly I can't reproduce the example in the first comment here any more.
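One way to check this on whatever version is installed, without mutating anything, is to ask NumPy whether the original and the copy share memory. The sketch below is only an illustration: `shared_memory_report` is a hypothetical helper written for this thread, not part of xarray, and it relies on `np.shares_memory` plus the public `.values` and `.coords` accessors.

```python
import numpy as np
import xarray as xr


def shared_memory_report(a, b):
    """Map 'data' and each coordinate name to True if a and b share memory there."""
    report = {"data": bool(np.shares_memory(a.values, b.values))}
    for name in a.coords:
        report[name] = bool(
            np.shares_memory(a.coords[name].values, b.coords[name].values)
        )
    return report


original = xr.DataArray([1, 2], coords=dict(x=[0, 1]), dims=("x",))
shallow = original.copy(deep=False)
deep = original.copy(deep=True)

print("shallow:", shared_memory_report(original, shallow))
print("deep:   ", shared_memory_report(original, deep))
```

A shallow copy is expected to report shared data. For a deep copy, any entry that still reports True would point to the behaviour described in this issue; the entry for the `x` coordinate is the one that may differ between xarray and pandas versions.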

This is fixed! It's not allowing attrs to be passed into _replace. ~I'll open a new issue~ I think that's OK, since attrs are on the variable rather than the DataArray
