In Ray v0.8.2, NumPy masked arrays are not serialized/deserialized correctly: the mask is dropped, and a plain ndarray is returned on deserialization.
In [4]: a = np.ma.masked_array([1,2],[True,False])
In [5]: a
Out[5]:
masked_array(data=[--, 2],
             mask=[ True, False],
       fill_value=999999)
In [6]: b = ray.get(ray.put(a))
In [7]: b
Out[7]: array([1, 2])
As far as I can tell, this issue does not exist on versions < 0.8.1.
Running
import numpy as np
import ray
ray.init()
arr = np.ma.masked_array([1, 2], [True, False])
put_arr = ray.get(ray.put(arr))
np.testing.assert_equal(type(arr), type(put_arr))
should give
Traceback (most recent call last):
  File "ray_masked_array_bug_repro.py", line 7, in <module>
    np.testing.assert_equal(type(arr), type(put_arr))
  File "/home/clark/.local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 428, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
ACTUAL: <class 'numpy.ma.core.MaskedArray'>
DESIRED: <class 'numpy.ndarray'>
Given that np.core.numeric._frombuffer is mask-agnostic, I'm guessing that masked array support was accidentally dropped in this PR when the NumPy array reduce override was added.
I've just confirmed that this PR, which fixes a related serialization bug by adding an ndarray subclass check, also fixes this issue.
Thanks for checking, https://github.com/ray-project/ray/pull/7392 is merged now!
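For anyone reading this later, here is a minimal sketch of the mechanism discussed above, using NumPy and pickle only. It is not Ray's actual serialization code. The helper names buffer_roundtrip and guarded_roundtrip are hypothetical, and the public np.frombuffer stands in for the private np.core.numeric._frombuffer. The idea is that rebuilding an array from its raw data buffer cannot carry a mask, while an exact-type check that routes ndarray subclasses to regular pickling preserves it.

```python
import pickle

import numpy as np


def buffer_roundtrip(arr):
    """Rebuild an array from its raw data buffer only (hypothetical helper).

    The buffer carries no mask, so any ndarray subclass state is lost.
    """
    data = np.asarray(arr)  # plain ndarray holding the underlying data
    return np.frombuffer(data.tobytes(), dtype=data.dtype).reshape(data.shape)


def guarded_roundtrip(arr):
    """Use the buffer path only for exact ndarrays (hypothetical helper).

    Subclasses such as MaskedArray fall back to regular pickling, which
    goes through their own __reduce__ and so keeps the mask.
    """
    if type(arr) is np.ndarray:
        return buffer_roundtrip(arr)
    return pickle.loads(pickle.dumps(arr))


arr = np.ma.masked_array([1, 2], [True, False])

lossy = buffer_roundtrip(arr)   # mask dropped, as in the report above
kept = guarded_roundtrip(arr)   # mask preserved via the subclass check

print(type(lossy).__name__)  # ndarray
print(type(kept).__name__)   # MaskedArray
print(kept.mask.tolist())    # [True, False]
```

The exact-type comparison (type(arr) is np.ndarray rather than isinstance) is the important part of the sketch, since isinstance would also be true for MaskedArray and send it down the lossy path.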