Numpy: BUG: Setting the mask on a view with mask=nomask does not propagate to the owner

Created on 27 Jun 2016 · 4 Comments · Source: numpy/numpy

Here is an issue I ran into during my work.
Assigning np.ma.masked through chained indexing has no effect while the mask has not been set yet:

>>> import numpy as np
>>> a = np.arange(6, dtype=np.float64).reshape((2, 3))
>>> a
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.]])
>>> am = np.ma.masked_array(a)
>>> am
masked_array(data =
 [[ 0.  1.  2.]
 [ 3.  4.  5.]],
             mask =
 False,
       fill_value = 1e+20)

>>> am[0][1]=np.ma.masked
>>> am
masked_array(data =
 [[ 0.  1.  2.]
 [ 3.  4.  5.]],
             mask =
 False,
       fill_value = 1e+20)

###--------- doesn't work----------------

>>> am[0,1]=np.ma.masked
>>> am
masked_array(data =
 [[0.0 -- 2.0]
 [3.0 4.0 5.0]],
             mask =
 [[False  True False]
 [False False False]],
       fill_value = 1e+20)

###-------this way it works---------

>>> am[1][1]=np.ma.masked
>>> am
masked_array(data =
 [[0.0 -- 2.0]
 [3.0 -- 5.0]],
             mask =
 [[False  True False]
 [False  True False]],
       fill_value = 1e+20)

###--------now it surprisingly works again--

Linux  3.19.8-100.fc20.x86_64 
Python 2.7.5
>>> np.__version__
'1.8.2'

I know my system is not up to date, but I asked a friend whose system is,
and he confirms that the issue exists there too.

00 - Bug numpy.ma

MorBilly

👍1

All 4 comments

Confirmed in master (1.12). This is similar to what #5580 hoped to fix, but, as discussed there, this particular case cannot be fixed without an overhaul of MaskedArray to remove np.nomask.

Here's the problem: MaskedArrays sometimes store the mask as an array of booleans, and sometimes (if there are no masked values) store the mask simply as the value False (and np.nomask == False).
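The two storage forms can be inspected directly. The following is a minimal sketch using np.ma.getmask and np.ma.getmaskarray; the variable names plain and explicit are only illustrative.

```python
import numpy as np

a = np.arange(6, dtype=np.float64).reshape((2, 3))

# No mask given: the mask is stored as the scalar sentinel np.ma.nomask.
plain = np.ma.masked_array(a)
print(np.ma.getmask(plain) is np.ma.nomask)

# Explicit mask given: the mask is stored as a boolean array of the data's shape.
explicit = np.ma.masked_array(a, mask=[[0, 1, 0], [0, 0, 0]])
print(np.ma.getmask(explicit).shape)

# getmaskarray always returns a full boolean array, even when the
# stored mask is the nomask sentinel.
print(np.ma.getmaskarray(plain).shape)
```

Code that needs to index into the mask is usually safer with getmaskarray, since it does not have to special-case the sentinel.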

The problem is that when slicing a MaskedArray (and getting a view), the mask can only be "viewed" if it is currently an array of booleans, but not if it is the constant "False". So the first time you try am[0][1] = ... the mask is the constant "False" and can't be viewed, so doesn't get updated. The second time you try, the mask is being stored as an array of booleans so it can be viewed, and so gets updated.
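To make the difference between the two indexing styles concrete, here is a rough sketch that spells out what am[0][1] = ... does as two separate calls, next to the single-step am[0, 1] = ... form from the report. The names row and am2 are illustrative, and no output is asserted for the owner's mask in the chained case, since that is the behaviour under discussion.

```python
import numpy as np

a = np.arange(6, dtype=np.float64).reshape((2, 3))

# Chained indexing: am[0][1] = masked is really two separate calls.
am = np.ma.masked_array(a)
row = am[0]              # __getitem__: a new MaskedArray viewing the first row
row[1] = np.ma.masked    # __setitem__ on that temporary object
print(np.ma.getmask(row))  # the temporary row now carries a mask of its own
print(np.ma.getmask(am))   # the owner's mask, which the report says is left untouched

# Single-step indexing: one __setitem__ call on the owner itself,
# so the owner allocates and updates its own mask.
am2 = np.ma.masked_array(a)
am2[0, 1] = np.ma.masked
print(np.ma.getmask(am2))
```

Where possible, the single-step form avoids the problem entirely because no intermediate view is involved.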

Add this to the long list of bugs caused by this nomask design, e.g. #7588.

ahaldane on 27 Jun 2016

👍1

Every time I think of nomask, I also think of that Donald Knuth quote, "Premature optimization is the root of all evil."

It seems to me that nomask is no exception. The number of unusual cases one encounters from this behavior is quite large and they are made more difficult to fix because of it.

If it was really such a productive thing to know whether the mask was trivial or not, we could just as easily have a method like has_mask, which caches its result.
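As a rough illustration of that idea, a has_mask check can be written today as a free function. This is only a sketch of one possible reading: has_mask is a hypothetical name, not an existing MaskedArray method, and this version does no caching.

```python
import numpy as np

def has_mask(arr):
    """Hypothetical helper: True if any element of `arr` is masked.

    Uses getmaskarray so the answer does not depend on whether the
    mask is stored as nomask or as an all-False array.
    """
    return bool(np.ma.getmaskarray(arr).any())

print(has_mask(np.ma.masked_array([1.0, 2.0, 3.0])))
print(has_mask(np.ma.masked_array([1.0, 2.0, 3.0], mask=[0, 0, 0])))
print(has_mask(np.ma.masked_array([1.0, 2.0, 3.0], mask=[0, 1, 0])))
```

The existing np.ma.is_masked function answers a similar question; the point of the sketch is only that such a query does not require a separate sentinel value.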

Is there any interest in moving towards removing nomask outright?

jakirkham on 29 Aug 2016

👍2

I would like to chime in that I've also run into this issue. It might be worthwhile to add a note in the documentation about it. The following passage from the documentation suggests to me that modifying the mask of a view will modify the mask of the original.

"When accessing a slice, the output is a masked array whose data attribute is a view of the original data, and whose mask is either nomask (if there was no invalid entries in the original array) or a view of the corresponding slice of the original mask. The view is required to ensure propagation of any modification of the mask to the original."
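One way to get the propagation described in that passage is to make sure the owner holds a full mask array before any view is taken. The sketch below assumes an unstructured dtype; ensure_full_mask is a hypothetical helper, not part of numpy.ma.

```python
import numpy as np

def ensure_full_mask(arr):
    """Hypothetical helper: replace a nomask sentinel with an all-False mask array."""
    if np.ma.getmask(arr) is np.ma.nomask:
        arr.mask = np.ma.make_mask_none(arr.shape)
    return arr

a = np.arange(6, dtype=np.float64).reshape((2, 3))
am = ensure_full_mask(np.ma.masked_array(a))

row = am[0]
# The row's mask is now a view into the owner's mask array.
print(np.shares_memory(np.ma.getmask(row), np.ma.getmask(am)))

# Chained indexing writes through that shared mask.
am[0][1] = np.ma.masked
print(np.ma.getmask(am))
```

The cost is one boolean array per masked array, which is the allocation that nomask was designed to avoid.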

ben-e-whitney on 28 Feb 2018

👍1

"Is there any interest in moving towards removing nomask outright?"

Maybe, and masked as well. This probably comes down to either making some big changes to masked arrays or implementing a new class altogether. It might be worth putting together an NEP. I seldom use masked arrays, so this is something best done by people who need the functionality.