Dask: Comparing two Dask arrays for equality

Created on 17 Jan 2019 · 5 Comments · Source: dask/dask

Hello everyone!

I am trying to use the equality operator (==) between Dask arrays, but I get an error. For now I am using a NumPy array implementation instead.
Below I post (1) the code with NumPy arrays, (2) the code with Dask arrays, which fails, (3) the variable descriptions from the debugger and (4) the error output.

1. This is the NumPy array-based implementation

masked_parts = np.copy(parts)
for sw_idx in range(0, n_sw):
    # extract only the voxels in the same cluster as the seed
    masked_parts[sw_idx, :] = masked_parts[sw_idx, :] == masked_parts[sw_idx, :][mask_medoid]

2. This is the Dask array-based implementation

masked_parts = parts
for sw_idx in range(0, n_sw):
    # extract only the voxels in the same cluster as the seed
    masked_parts[sw_idx, :] = masked_parts[sw_idx, :] == masked_parts[sw_idx, :][mask_medoid]

3. The variables

masked_parts: {Array} Dask Array, shape(9, 57k), dtype: int
mask_medoid: {Ndarray}[False, False, True, ... True], shape(57k), dtype: bool

4. The output error

../dynpar/dynamic_parcellation.py:74: in __mask_region_for_medoid
masked_parts[sw_idx, :] = masked_parts[sw_idx, :] == masked_parts[sw_idx, :][mask_medoid]


self = dask.array key = (0, slice(None, None, None))
value = dask.array

def __setitem__(self, key, value):
    from .routines import where
    if isinstance(key, Array):
        if isinstance(value, Array) and value.ndim > 1:
            raise ValueError('boolean index array should have 1 dimension')
        y = where(key, value, self)
        self.dtype = y.dtype
        self.dask = y.dask
        self.name = y.name
        self._chunks = y.chunks
        return self
    else:
        raise NotImplementedError("Item assignment with %s not supported"
                                  % type(key))

E NotImplementedError: Item assignment with <class 'tuple'> not supported

../../../.local/lib/python3.6/site-packages/dask/array/core.py:1221: NotImplementedError

So it seems that this is not yet supported by Dask arrays.
If anyone has thoughts on how to implement my function differently using Dask arrays, I would appreciate any comments!

Thanks!
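
One way to express the loop above without item assignment is to build each row separately and stack the rows into a new array. The following is only a sketch under stated assumptions: the small `parts` array is a made-up stand-in for the real (9, 57k) data, and it assumes that `mask_medoid` selects exactly one seed voxel, which the post does not state.

```python
import numpy as np
import dask.array as da

# Hypothetical stand-ins for the variables in the post (much smaller here).
n_sw, n_voxels = 3, 8
parts = da.from_array(
    np.array([[0, 1, 1, 2, 0, 1, 2, 2],
              [1, 1, 0, 0, 1, 2, 2, 0],
              [2, 0, 2, 1, 1, 2, 0, 2]]),
    chunks=(1, 4),
)

# Assumption: exactly one voxel (the seed) is selected by the mask.
mask_medoid = np.zeros(n_voxels, dtype=bool)
mask_medoid[2] = True
seed_idx = int(np.flatnonzero(mask_medoid)[0])

rows = []
for sw_idx in range(n_sw):
    row = parts[sw_idx, :]
    # True where a voxel carries the same cluster label as the seed.
    rows.append(row == row[seed_idx])

# Build a new array from the rows instead of assigning into `parts`.
masked_parts = da.stack(rows, axis=0)
result = masked_parts.compute()
```

The result here is boolean; calling `masked_parts.astype(parts.dtype)` would give 0/1 integers, closer to what the NumPy version stores back into its integer array.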

All 5 comments

Thanks for the bug report, @bamal. Can I ask you to provide a minimal reproducible example? http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports

This will help everyone be able to reproduce and help you with your problem more quickly.

FWICT the existing example is pretty minimal, but this may highlight the issue more clearly. In other words, this doesn't appear to be an issue of equality, but one of assignment. @bamal, would you be ok with having this issue retitled to reflect this?

In [1]: import dask.array as da                                                 

In [2]: a = da.ones((10, 11), chunks=5)                                         

In [3]: a[5, :] = 0                                                             
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-3-0bcebd5d81b6> in <module>
----> 1 a[5, :] = 0

/zopt/conda3/envs/nanshenv3/lib/python3.6/site-packages/dask/array/core.py in __setitem__(self, key, value)
   1219         else:
   1220             raise NotImplementedError("Item assignment with %s not supported"
-> 1221                                       % type(key))
   1222 
   1223     def __getitem__(self, index):

NotImplementedError: Item assignment with <class 'tuple'> not supported
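
For the minimal example above, the same result can be obtained without `__setitem__` by selecting rows with a boolean mask and `da.where`. This is a sketch of one possible workaround, not a statement of how Dask itself intends assignment to work; the names `row_is_target` and `b` are illustrative.

```python
import dask.array as da

a = da.ones((10, 11), chunks=5)

# Boolean column vector that is True only for row 5; its shape (10, 1)
# broadcasts across all 11 columns.
row_is_target = (da.arange(a.shape[0], chunks=5) == 5)[:, None]

# Same values as `a[5, :] = 0` would give, but as a new array; `a` is unchanged.
b = da.where(row_is_target, 0, a)

result = b.compute()
```

Because `da.where` returns a new lazy array, the original `a` stays untouched, so code that relied on in-place mutation has to rebind the name (for example `a = da.where(...)`).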

> would you be ok with having this issue retitled to reflect this?

@jakirkham I encourage you to just change things as you see fit. You probably have more understanding and context here than most people who raise issues. I encourage you to feel comfortable taking charge.

Is there something to be done for this issue? If so, can I ask you to briefly summarize it?

I just started looking into Dask as a way to scale some code beyond the limits of available RAM and slammed headfirst into this problem. Neither tuple-based nor slice-based item assignment is supported yet (I assume the focus is more on reading data than writing it?).

Is there a fundamental issue blocking the implementation of mutation?

Same here. The only workaround we found is to use dask.array.concatenate, which is what we had to do for fluidfft (unfinished API for Dask FFT). I do hope there is a better way, as this approach requires rewriting algorithms. If this is sorted out, dask.array can truly be a drop-in replacement for NumPy.
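
As a rough illustration of the dask.array.concatenate workaround mentioned above, the row "assignment" from the minimal example can be written by stitching slices together. This is a sketch only; `new_row` and `c` are illustrative names, and it is not taken from the fluidfft code.

```python
import dask.array as da

a = da.ones((10, 11), chunks=5)

# Replacement content for row 5, kept 2-D so it lines up with the slices around it.
new_row = da.zeros((1, a.shape[1]), chunks=(1, 5))

# Rows before 5, the replacement row, then rows after 5.
c = da.concatenate([a[:5], new_row, a[6:]], axis=0)

result = c.compute()
```

The pieces keep their own chunking along the first axis, so `c` ends up with uneven row chunks; `c.rechunk(a.chunks)` can restore the original layout if later steps depend on it.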
