Ray: Lost reference to actor exception for seemingly valid code.

Created on 25 Nov 2019 · 3 Comments · Source: ray-project/ray

The following seems like valid code to me, but it raises a "Lost reference to actor" exception. Is this pattern actually invalid?

import ray
ray.init()

@ray.remote 
class Foo: 
    def method(self): 
        pass 

ray.get(Foo.remote().method.remote())

The last line raises the following exception:

RuntimeError                              Traceback (most recent call last)
<ipython-input-2-322859fd3f86> in <module>
----> 1 ray.get(Foo.remote().method.remote())

~/Workspace/ray/python/ray/actor.py in remote(self, *args, **kwargs)
    107 
    108     def remote(self, *args, **kwargs):
--> 109         return self._remote(args, kwargs)
    110 
    111     def _remote(self, args=None, kwargs=None, num_return_vals=None):

~/Workspace/ray/python/ray/actor.py in _remote(self, args, kwargs, num_return_vals)
    127             invocation = self._decorator(invocation)
    128 
--> 129         return invocation(args, kwargs)
    130 
    131     def __getstate__(self):

~/Workspace/ray/python/ray/actor.py in invocation(args, kwargs)
    116             actor = self._actor_hard_ref or self._actor_ref()
    117             if actor is None:
--> 118                 raise RuntimeError("Lost reference to actor")
    119             return actor._actor_method_call(
    120                 self._method_name,

RuntimeError: Lost reference to actor

cc @edoakes @stephanie-wang @ericl

Labels: P2, good first issue, question

All 3 comments

This pattern used to be supported in older versions of Ray, but it had a large performance impact because we couldn't cache method handles in the actor class.

Now, we cache method handles, but have to keep a weak reference back to the actor. So in some cases like this, the actor is GC'ed before the method call completes.
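
To make the weak-reference explanation concrete, here is a minimal pure-Python sketch of the mechanism. It is not Ray's actual code: MethodHandle and ActorHandle are hypothetical stand-ins, and only the weak-reference check and the error message are modeled on the traceback above. It also assumes CPython, where reference counting frees a temporary as soon as nothing refers to it.

```python
import weakref


class MethodHandle:
    """Hypothetical stand-in for a cached actor method handle (not Ray's class)."""

    def __init__(self, actor, method_name):
        # Hold only a weak reference back to the owning handle.
        self._actor_ref = weakref.ref(actor)
        self._method_name = method_name

    def remote(self):
        actor = self._actor_ref()
        if actor is None:
            # Same check and message as in the traceback above.
            raise RuntimeError("Lost reference to actor")
        return f"submitted {self._method_name}"


class ActorHandle:
    """Hypothetical stand-in for the handle returned by Foo.remote()."""

    def __init__(self):
        # The method handle is created once and cached on the actor handle.
        self.method = MethodHandle(self, "method")


def call_on_temporary():
    # Nothing refers to the ActorHandle once `.method` has been looked up,
    # so CPython frees it before `.remote()` runs.
    return ActorHandle().method.remote()


def call_on_named_handle():
    handle = ActorHandle()  # the local name keeps the handle alive
    return handle.method.remote()
```

On CPython, call_on_temporary() raises RuntimeError("Lost reference to actor"), while call_on_named_handle() returns normally. The only difference is that a name still refers to the handle when .remote() runs.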

Any suggestions on how to fix or work around it?

I've been seeing the same, I think.

  ray.init(num_cpus=2)
  owner = task_dependency_graph.graph.create_graph_remote_handle(grid, placeToSaveFinal)
  owner.fillInNodeValue.remote(startNode, 0)
2020-01-16 11:34:30,614 INFO resource_spec.py:216 -- Starting Ray with 0.68 GiB memory available for workers and up to 0.34 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2020-01-16 11:34:38,793 ERROR worker.py:994 -- Possible unhandled error from worker: ray::GraphOwner.fillInNodeValue() (pid=32465, ip=192.168.1.183)
  File "python/ray/_raylet.pyx", line 636, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 619, in ray._raylet.execute_task.function_executor
  File "/home/task-dependency-graph/src/task_dependency_graph/graph.py", line 80, in fillInNodeValue
    self.checkForJobsAndStart()
  File "/home/task-dependency-graph/src/task_dependency_graph/graph.py", line 117, in checkForJobsAndStart
    self.graph.nodes[edgeToCompute.source]['value'],
  File "/home/task-dependency-graph/.tox/docs/lib/python3.6/site-packages/ray/actor.py", line 107, in remote
    return self._remote(args, kwargs)
  File "/home/task-dependency-graph/.tox/docs/lib/python3.6/site-packages/ray/actor.py", line 127, in _remote
    return invocation(args, kwargs)
  File "/home/task-dependency-graph/.tox/docs/lib/python3.6/site-packages/ray/actor.py", line 116, in invocation
    raise RuntimeError("Lost reference to actor")
RuntimeError: Lost reference to actor

For my part, I'm calling EdgeCalculator.remote().callMeBackWithAnswer.remote() in checkForJobsAndStart.

@Maverobot For whatever it's worth, for me the problem was perturbed out of existence when I changed to:

        edgeCalculator = EdgeCalculator.remote(self.myRemoteHandle)
        edgeCalculator.callMeBackWithAnswer.remote()
        # end of checkForJobsAndStart

I have no idea why that would help, since the function checkForJobsAndStart ended immediately after that line, so in theory the edgeCalculator would have been immediately garbage-collected anyway. (What I was planning to do was to add the edgeCalculator to GLOBAL_ANTI_GARBAGE_COLLECTION_SET as a workaround hack.)
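
One possible reading, assuming the weak-reference explanation earlier in the thread is the cause: the traceback shows the check firing inside invocation, that is, at the moment .remote() is called. A local variable lives until the function returns, which is after that call, so it is enough to get past the check. The sketch below uses hypothetical pure-Python stand-ins (not Ray's classes, CPython assumed) to compare the inline call, the local variable, and the keep-alive set mentioned above. It says nothing about what Ray does with the actor itself once the handle is gone.

```python
import weakref


class MethodHandle:
    """Hypothetical stand-in for a cached actor method handle."""

    def __init__(self, owner, name):
        self._owner_ref = weakref.ref(owner)  # weak reference only
        self._name = name

    def remote(self):
        # The liveness check runs at call time, as in the traceback.
        if self._owner_ref() is None:
            raise RuntimeError("Lost reference to actor")
        return f"submitted {self._name}"


class ActorHandle:
    """Hypothetical stand-in for an actor handle."""

    def __init__(self):
        self.call_me_back = MethodHandle(self, "call_me_back")


# Module-level set that keeps handles alive past the end of a function,
# in the spirit of GLOBAL_ANTI_GARBAGE_COLLECTION_SET.
KEEP_ALIVE = set()


def start_job_inline():
    # The temporary handle is freed before .remote() runs.
    return ActorHandle().call_me_back.remote()


def start_job_with_local():
    handle = ActorHandle()
    # `handle` is still referenced here, so the check passes; it is only
    # released when the function returns.
    return handle.call_me_back.remote()


def start_job_with_registry():
    handle = ActorHandle()
    KEEP_ALIVE.add(handle)  # the handle now outlives this function
    return handle.call_me_back.remote()
```

In this model start_job_inline() raises, while the other two return normally. The registry variant only matters if the handle has to stay alive after the function has returned.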

Unless I'm missing something, ray.get(Foo.remote().method.remote()) is still what https://ray.readthedocs.io/en/latest/ recommends:

import ray
ray.init()

@ray.remote
class Counter(object):
    def __init__(self):
        self.n = 0
    def increment(self):
        self.n += 1
    def read(self):
        return self.n

counters = [Counter.remote() for i in range(4)]
[c.increment.remote() for c in counters]
futures = [c.read.remote() for c in counters]
print(ray.get(futures))

It's not clear what this should be replaced with.
