I am trying to parallelise a sampling process, so I created a Sampler object. The Sampler depends on two large datasets (stored as numpy arrays), which are passed as arguments to the constructor. To avoid having duplicates in the object store, my idea is to first add each dataset to the object store with ray.put and then initialise the Sampler objects with the corresponding object ids.
Moreover, I don't want to add the decorators to the Sampler class itself. Instead, I created a subclass of Sampler, RemoteSampler, which decorates the methods of the superclass and modifies them by adding the .remote() call. However, I seem to be unable to initialise the superclass from the ActorClass. I get a type error:
TypeError: super() argument 1 must be type, not ActorClass(RemoteSampler).
Please see the skeleton code below:
import numpy as np
import ray

class Sampler(object):
    def __init__(self, train_data, d_train_data, *others):
        # these can be big, so we want to have only one copy that
        # multiple actors share
        if isinstance(train_data, np.ndarray):
            self.train_data = train_data
        else:
            self.train_data = ray.get(train_data)
        if isinstance(d_train_data, np.ndarray):
            self.d_train_data = d_train_data
        else:
            self.d_train_data = ray.get(d_train_data)
        # Initialise the rest of the sampler state
        self.d1 = {}
        self.d2 = {}

    def __call__(self, features, n_samples):
        a, b, c = self._sampling_loop(features, n_samples)
        # process a, b, c and return them
        return a, b, c

    def build_lookups(self, X):
        # Use X to modify state of d1 and d2
        pass

    def _sampling_loop(self, features, n_samples):
        # Use train_data, d_train_data and other attributes to
        # return some data to __call__
        pass

@ray.remote
class RemoteSampler(Sampler):
    def __init__(self, *args):
        super(RemoteSampler, self).__init__(*args)

    # TODO: don't hardcode return vals
    @ray.method(num_return_vals=4)
    def __call__(self, anchor, num_samples):
        return self(anchor, num_samples)

    @ray.method(num_return_vals=3)
    def build_lookups(self, X):
        a, b, c = self.build_lookups(X)
        return a, b, c

def _fit_parallel(*args, **kwargs):
    # method of a class where the RemoteSampler objects are initialised
    train_data, d_train_data, *others = args
    train_data_id = ray.put(train_data)
    d_train_data_id = ray.put(d_train_data)
    n_args = (train_data_id, d_train_data_id, *others)
    return [RemoteSampler.remote(*n_args) for _ in range(kwargs['ncpu'])]
cc @pcmoritz
I experienced something similar.
I created a base class, BaseReader, and then extended it in a new class, CoolReader(BaseReader). The extended class was then decorated with @ray.remote and instantiated as cool_reader = CoolReader.remote().
When I try to run results = cool_reader.func.remote(0), I get:
2020-02-01 02:39:16,989 ERROR worker.py:1003 -- Possible unhandled error from worker: ray::CoolReader.__init__() (pid=6095, ip=10.123.134.28)
  File "python/ray/_raylet.pyx", line 643, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 623, in function_executor
  File "/opt/experiment/reader.py", line 70, in __init__
    super(CoolReader, self).__init__()
TypeError: super() argument 1 must be type, not ActorClass(CoolReader)
If I remove the super(CoolReader, self).__init__() call from CoolReader.__init__, the error goes away, but that call to the base class constructor is needed.
I am using ray 0.8.1.
Probably the same issue as https://github.com/ray-project/ray/issues/449.
Btw, a quick workaround is to call super() with no arguments.
@alexcoca More concisely, use super().__init__([your_args]). (I believe Python 2 does not support this, but Ray's Python 2 support has ended anyway.)
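For anyone wondering why the zero-argument form helps, here is a minimal sketch that reproduces the same kind of TypeError without Ray. It assumes the decorator rebinds the class name to a wrapper object that is not a type, which is what the error message suggests. FakeActorClass and fake_remote are hypothetical stand-ins written for this illustration, not Ray's actual implementation.

```python
class FakeActorClass:
    """Hypothetical stand-in for the wrapper object a class decorator returns."""

    def __init__(self, cls):
        self._cls = cls

    def remote(self, *args, **kwargs):
        # Instantiate the original, undecorated class.
        return self._cls(*args, **kwargs)


def fake_remote(cls):
    # After decoration the class name is bound to a FakeActorClass
    # instance, which is not a type.
    return FakeActorClass(cls)


class Base:
    def __init__(self, value):
        self.value = value


@fake_remote
class ExplicitSuper(Base):
    def __init__(self, value):
        # The global name ExplicitSuper now refers to the wrapper,
        # so this call raises TypeError.
        super(ExplicitSuper, self).__init__(value)


@fake_remote
class ZeroArgSuper(Base):
    def __init__(self, value):
        # Zero-argument super() relies on the implicit __class__ cell,
        # which still refers to the original class.
        super().__init__(value)


try:
    ExplicitSuper.remote(1)
    explicit_error = None
except TypeError as exc:
    explicit_error = str(exc)

obj = ZeroArgSuper.remote(2)
```

In this sketch explicit_error holds the TypeError message, while ZeroArgSuper.remote(2) constructs the object normally, because super() never looks up the rebound class name.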
Closing because I think this is a duplicate of https://github.com/ray-project/ray/issues/449. Please reopen if that's a mistake.