assign_coords correctly disallows creating a new dimension when assigning list-like coords with a name that does not match an existing dimension. However, it does allow the assignment if the value is a scalar.
Consider the following DataArray:
>>> coords = {"fruit": ("x", ["apple", "banana"])}
>>> arr = xr.DataArray([[1, 2, 3], [4, 5, 6]], dims=("x", "y"), coords=coords)
>>> arr
<xarray.DataArray (x: 2, y: 3)>
array([[1, 2, 3],
       [4, 5, 6]])
Coordinates:
    fruit    (x) <U6 'apple' 'banana'
Dimensions without coordinates: x, y
I can assign new coordinates to an existing dimension:
>>> arr.assign_coords(color=("x", ["red", "yellow"]))
<xarray.DataArray (x: 2, y: 3)>
array([[1, 2, 3],
       [4, 5, 6]])
Coordinates:
    fruit    (x) <U6 'apple' 'banana'
    color    (x) <U6 'red' 'yellow'
Dimensions without coordinates: x, y
And I cannot (correctly) assign coordinates to a new (nonexistent) dimension:
>>> arr.assign_coords(color=["red", "yellow"])
...
ValueError: cannot add coordinates with new dimensions to a DataArray
The above fails because Xarray, in the absence of an explicit dimension, tries to assign the new coordinates to a color dimension which does not exist. So far so good. But why does _this_ work?
>>> arr = arr.assign_coords(color="red")
>>> arr
<xarray.DataArray (x: 2, y: 3)>
array([[1, 2, 3],
       [4, 5, 6]])
Coordinates:
    fruit    (x) <U6 'apple' 'banana'
    color    <U3 'red'
Dimensions without coordinates: x, y
I would expect this to fail because color is not a dimension. Instead, the result appears to be a newly added coordinate without any dimension:
>>> arr.coords
Coordinates:
    fruit    (x) <U6 'apple' 'banana'
    color    <U3 'red'
Interesting case, thanks for posting.
There's some logic to allowing this: it's a coord along no dimensions (whereas in the previous case it had to be along some dimension, which was left unspecified).
But that's what attributes are, so I'm not sure whether this is something we should support.
Does anyone know whether this is a deliberate decision?
This was intentional. array.assign_coords(name=value) should be equivalent to array = array.copy(deep=False); array.coords[name] = value.
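The equivalence described above can be checked directly. The following is a minimal sketch, assuming a recent xarray release; the variable names via_method and via_setitem are made up for illustration.

```python
import xarray as xr

arr = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    dims=("x", "y"),
    coords={"fruit": ("x", ["apple", "banana"])},
)

# Route 1: assign_coords returns a new object and leaves arr untouched.
via_method = arr.assign_coords(color="red")

# Route 2: shallow-copy the array, then set the coordinate on the copy.
via_setitem = arr.copy(deep=False)
via_setitem.coords["color"] = "red"

# identical() compares values, dims, coords, name and attrs.
same = via_method.identical(via_setitem)
original_untouched = "color" not in arr.coords
print(same, original_untouched)
```

Both routes should give the same object, and neither should modify arr itself.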
You are allowed to add new coordinates to a DataArray if they share existing dimensions. You are not allowed to add coordinates with new dimensions, because it is enforced as an invariant of the DataArray data model that all coordinate dimensions are found on the DataArray variable as well.
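The invariant stated above amounts to a subset check on dimension names. The sketch below compares that check with what assign_coords actually accepts. The helper uses_only_existing_dims is hypothetical and not part of xarray; it only mirrors the rule as described in this thread.

```python
import xarray as xr

arr = xr.DataArray([0, 1, 2], dims=["x"], coords={"x": list("abc")})


def uses_only_existing_dims(array, coord_dims):
    """Hypothetical helper: True if every coordinate dim is already on the array."""
    return set(coord_dims) <= set(array.dims)


# Each candidate is written in xarray's (dims, data) tuple form.
candidates = {
    "scalar": ((), 1),                   # no dimensions at all
    "along_x": (("x",), [10, 20, 30]),   # reuses the existing dimension
    "along_z": (("z",), [10, 20, 30]),   # would introduce a new dimension
}

results = {}
for name, (dims, values) in candidates.items():
    predicted = uses_only_existing_dims(arr, dims)
    try:
        arr.assign_coords({name: (dims, values)})
        accepted = True
    except ValueError:
        accepted = False
    results[name] = (predicted, accepted)

print(results)
```

The scalar case passes the check trivially, because the empty set of dimensions is a subset of any array's dimensions.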
@shoyer, maybe I am missing a nuance in your answer, but I think I already understand this bit about Xarray. My question was why it is allowed to add a scalar:
>>> arr = xr.DataArray(range(3), dims=['x'], coords={'x': list('abc')})
>>> arr.coords
Coordinates:
  * x        (x) <U1 'a' 'b' 'c'
>>> arr.coords['y'] = 1 # <----------- Why does this work?
>>> arr.coords
Coordinates:
  * x        (x) <U1 'a' 'b' 'c'
    y        int64 1
In this case, y is a coordinate that does not share an existing dimension.
In that example, y has no dimensions that aren't on the array. So while it is a bit unusual, the resulting object doesn't break any rules / invariants. Does that make sense?
I see. This is because coordinates are just DataArray objects. So
>>> arr.coords['y'] = range(3)
is equivalent to
>>> new_coords = xr.DataArray(data=range(3))
>>> arr.coords['y'] = new_coords
And the reason _this_ is a ValueError is that new_coords has a default dimension dim_0 that is not on arr. However, this
>>> arr.coords['y'] = 1
is equivalent to...
>>> new_coords = xr.DataArray(data=1, dims=[])
>>> arr.coords['y'] = new_coords
And new_coords has no dimensions that are not on arr.
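The dimension bookkeeping in this explanation can be inspected directly. This is a small sketch, assuming a recent xarray release; from_list, from_scalar and the coordinate name z are illustrative names only.

```python
import xarray as xr

arr = xr.DataArray([0, 1, 2], dims=["x"], coords={"x": list("abc")})

# A standalone DataArray built from a list gets a default dimension name,
# while one built from a scalar has no dimensions at all.
from_list = xr.DataArray(data=[0, 1, 2])
from_scalar = xr.DataArray(data=1)
print(from_list.dims)    # ('dim_0',)
print(from_scalar.dims)  # ()

# The zero-dimensional object introduces nothing new, so it is accepted.
arr.coords["y"] = from_scalar
print(arr.coords["y"].dims)  # ()

# List-like values are accepted once they name a dimension that exists on arr.
arr.coords["z"] = ("x", [0, 1, 2])
print(arr.coords["z"].dims)  # ('x',)
```

Naming the dimension explicitly with the (dims, data) tuple form is the usual way to attach list-like values as a non-index coordinate.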