Ray: Cannot initialise superclass from actor process

Created on 13 Nov 2019  路  7Comments  路  Source: ray-project/ray

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Manjaro Linux 18.1.2
  • Ray installed from (source or binary): binary, pip install
  • Ray version: 0.7.6
  • Python version: 3.7.4

Describe the problem


I am trying to parallelise a sampling process, so I created a Sampler object. The Sampler depends on two datasets, which are large (stored as numpy arrays), which are arguments to the constructor. To avoid having duplicates in the object store, my idea has been to first use ray.put to add the object to the object store and then initialise the Sampler objects with the corresponding ids.

Moreover, I don't want to add the decorators to the Sampler class. Instead, I created a subclass of Sampler, RemoteSampler which decorates the methods of the superclass and modifies them by adding the .remote() call. However, I seem to be unable to initialise the superclass from the ActorClass. I get a type error:

TypeError: super() argument 1 must be type, not ActorClass(RemoteSampler).

Source code / logs

Please see the skeleton code below:

class Sampler(object):

    def __init__(self, train_data, d_train_data, *others):

        # these can be big, so we want to have only one copy that
        # mutliple actors share
        if isinstance(train_data, np.ndarray):
            self.train_data = train_data
        else:
            self.train_data = ray.get(train_data)

        if isinstance(d_train_data, np.ndarray):
            self.d_train_data = d_train_data
        else:
            self.d_train_data = ray.get(d_train_data)

        # Initialise the rest of the sampler state
        self.d1 = {}
        self.d2 = {}

    def __call__(self, features, n_samples):

        a, b, c = self._sampling_loop(features, n_samples)

        # process a, b, c and return them

        return a, b, c

    def build_lookups(self, X):

    # Use X to modify state of d1 and d2

    def _sampling_loop(self, features, n_samples):
    # Use train_data, d_train data and other attributes to
    # return some data to call`

@ray.remote
class RemoteSampler(Sampler):

    def __init__(self, *args):
        super(RemoteSampler, self).__init__(*args)
        # TODO: don't hardcode return vals

    @ray.method(num_return_vals=4)
    def __call__(self, anchor, num_samples):
        return self(anchor, num_samples)

    @ray.method(num_return_vals=3)
    def build_lookups(self, X):
        a, b, c = self.build_lookups(X)
        return a, b, c

def _fit_parallel(*args, **kwargs):
    # method of a class where the RemoteSampler objects are initialised
   train_data, d_train_data, *others = args
   train_data_id = ray.put(train_data)
   d_train_data_id = ray.put(d_train_data)
   n_args = (train_data_id, d_train_data_id, *others)

   return [RemoteSampler.remote(*n_args) for _ in range(kwargs['ncpu'])]
P2 bug core

Most helpful comment

@alexcoca More concisely, use super().__init__([your_args]). (I believe Python2 does not support this, but Ray's python2 support reached to the end anyway)

All 7 comments

cc @pcmoritz

Experienced something similar.

I created a base class BaseReader, which was then extended on a new class CoolReader(BaseReader). The extended class was then decorated with @ray.remote and instantiated as cool_reader = CoolReader.remote()

when I try to run results = cool_reader.func.remote(0), I get:

2020-02-01 02:39:16,989 ERROR worker.py:1003 -- Possible unhandled error from worker: ray::CoolReader.__init__() (pid=6095, ip=10.123.134.28)
File "python/ray/_raylet.pyx", line 643, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 623, in function_executor
File "/opt/experiment/reader.py", line 70, in __init__
super(CoolReader, self).__init__()
TypeError: super() argument 1 must be type, not ActorClass(CoolReader)

If I suppress the call super(BaseReader).__init__() at CoolReader.__init__, this error is gone, but that call to the base class constructor is needed

I am using ray 0.8.1

Btw, a quick workaround is to call super() with no arguments.

@alexcoca More concisely, use super().__init__([your_args]). (I believe Python2 does not support this, but Ray's python2 support reached to the end anyway)

Hi, I'm a bot from the Ray team :)

To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public slack channel.

Closing because I think this is a duplicate of https://github.com/ray-project/ray/issues/449. Please reopen if that's a mistake.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

alex-petrenko picture alex-petrenko  路  34Comments

janblumenkamp picture janblumenkamp  路  49Comments

mattearllongshot picture mattearllongshot  路  33Comments

robertnishihara picture robertnishihara  路  36Comments

raulchen picture raulchen  路  41Comments