Ray: NumPy masked arrays are not properly serialized/deserialized.

Created on 2 Mar 2020  ·  2 Comments  ·  Source: ray-project/ray

What is the problem?

In Ray v0.8.2, NumPy masked arrays are not properly serialized/deserialized: the masks are dropped, and plain ndarrays are returned on deserialization.

In [4]: a = np.ma.masked_array([1,2],[True,False])

In [5]: a
Out[5]:
masked_array(data=[--, 2],
             mask=[ True, False],
       fill_value=999999)

In [6]: b = ray.get(ray.put(a))

In [7]: b
Out[7]: array([1, 2])

As far as I can tell, this issue does not exist on versions < 0.8.1.

Version information:

  • Python 3.7.5
  • Ubuntu 19.10
  • ray 0.8.2 (confirmed on nightly wheel and source build off current [2d97650] master)
  • pyarrow 0.16.0

Reproduction

Running

import numpy as np
import ray

ray.init()
arr = np.ma.masked_array([1, 2], [True, False])
put_arr = ray.get(ray.put(arr))
np.testing.assert_equal(type(arr), type(put_arr))

should give

Traceback (most recent call last):
  File "ray_masked_array_bug_repro.py", line 7, in <module>
    np.testing.assert_equal(type(arr), type(put_arr))
  File "/home/clark/.local/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 428, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
 ACTUAL: <class 'numpy.ma.core.MaskedArray'>
 DESIRED: <class 'numpy.ndarray'>
  • [x] I have verified my script runs in a clean environment and reproduces the issue.
  • [x] I have verified the issue also occurs with the latest wheels.
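The reproduction above only compares types. A slightly fuller check, sketched below, also compares the mask and the underlying data. The helper name `assert_masked_roundtrip` is hypothetical, and plain `pickle` stands in for the serializer so that the sketch runs without Ray.

```python
import pickle

import numpy as np


def assert_masked_roundtrip(roundtrip):
    """Check that roundtrip(arr) preserves type, mask, and data of a masked array.

    `roundtrip` is any callable that serializes and then deserializes its argument.
    """
    arr = np.ma.masked_array([1, 2], mask=[True, False])
    out = roundtrip(arr)
    # The type check mirrors the assertion in the reproduction script.
    assert isinstance(out, np.ma.MaskedArray), f"mask lost: got {type(out).__name__}"
    # getmaskarray always returns a full boolean array, even for an empty mask.
    np.testing.assert_array_equal(np.ma.getmaskarray(out), np.ma.getmaskarray(arr))
    np.testing.assert_array_equal(np.ma.getdata(out), np.ma.getdata(arr))


# Plain pickle is used here only as a stand-in serializer.
assert_masked_roundtrip(lambda a: pickle.loads(pickle.dumps(a)))
```

With Ray initialized, passing `lambda a: ray.get(ray.put(a))` instead would exercise the path from this report.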
bug

All 2 comments

Given that np.core.numeric._frombuffer is mask-agnostic, I'm guessing that masked array support was accidentally dropped in this PR, due to the added NumPy array reduce override.
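To illustrate why a buffer-based reconstructor would lose the mask, here is a minimal sketch using the public `np.frombuffer` as a stand-in for the private `_frombuffer` helper. This is an assumption made for illustration; it is not Ray's actual serialization code.

```python
import pickle

import numpy as np

arr = np.ma.masked_array([1, 2], mask=[True, False])

# Regular pickling goes through MaskedArray's own reduce logic,
# which carries the mask along with the data.
via_pickle = pickle.loads(pickle.dumps(arr))

# Rebuilding from the raw data bytes alone knows nothing about the mask:
# only dtype and shape are available, so a plain ndarray comes back.
raw = np.ma.getdata(arr).tobytes()
via_buffer = np.frombuffer(raw, dtype=arr.dtype).reshape(arr.shape)

print(type(via_pickle).__name__)
print(type(via_buffer).__name__)
```

The first round trip keeps the `MaskedArray` type and its mask; the second yields a plain `ndarray`, matching the behavior reported above.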

I've just confirmed that this PR, fixing a related serialization bug via adding an ndarray subclass check, fixes the issue.
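As a rough sketch of what an ndarray subclass check can look like, the hypothetical serializer below takes a raw-buffer fast path only for exact `np.ndarray` instances and falls back to `pickle` for subclasses such as `MaskedArray`. The names `serialize`, `deserialize`, and `check_subclass` are invented for this example and are not taken from the Ray codebase or the PR.

```python
import pickle

import numpy as np


def serialize(obj, check_subclass=True):
    """Hypothetical serializer with a raw-buffer fast path for arrays."""
    if check_subclass:
        # Exact type match: ndarray subclasses are excluded from the fast path.
        fast = type(obj) is np.ndarray
    else:
        # isinstance also matches subclasses, which is what loses the mask.
        fast = isinstance(obj, np.ndarray)
    if fast and not obj.dtype.hasobject:
        # np.asarray strips any subclass and exposes the underlying data.
        return ("raw", np.asarray(obj).tobytes(), obj.dtype.str, obj.shape)
    return ("pickle", pickle.dumps(obj))


def deserialize(payload):
    """Inverse of serialize for both payload kinds."""
    if payload[0] == "raw":
        _, raw, dtype, shape = payload
        return np.frombuffer(raw, dtype=dtype).reshape(shape)
    return pickle.loads(payload[1])


arr = np.ma.masked_array([1, 2], mask=[True, False])
lossy = deserialize(serialize(arr, check_subclass=False))  # plain ndarray
fixed = deserialize(serialize(arr))                         # MaskedArray
```

The trade-off in this sketch is that subclasses give up the fast path in exchange for a faithful round trip.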

Thanks for checking, https://github.com/ray-project/ray/pull/7392 is merged now!
