Tip 3 in doc/examples/tips-for-first-time.rst gives this example:
import time
import numpy as np
import ray

ray.init(num_cpus = 4)

@ray.remote
def no_work(a):
    return

start = time.time()
a_id = ray.put(np.zeros((5000, 5000)))
result_ids = [no_work.remote(a_id) for x in range(10)]
results = ray.get(result_ids)
print("duration =", time.time() - start)
which suggests (to me...) that the following would work:
import time
import numpy as np
import ray

ray.init(num_cpus = 4)

@ray.remote
def no_work(a):
    ray.get(a)
    return

start = time.time()
a_id = ray.put(np.zeros((5000, 5000)))
result_ids = [no_work.remote(a_id) for x in range(10)]
results = ray.get(result_ids)
print("duration =", time.time() - start)
Which it doesn't. Per #9578, this is because out of band passing of object IDs is not supported. Should the documentation be updated to reflect this by pickling the passed object (also not formally supported per #9578) or should it be rewritten?
When object IDs are passed to remote functions / actors, they are automatically resolved to their values, so calling ray.get on the argument inside the task will just raise a type error. I am not sure if that out-of-band issue is related to this.
Yeah you're right, I'm clearly not awake 👍
I'll put in a PR to just add pickling to that example then, because otherwise that example is a bit misleading.
Hmm, actually I think this is a feature, not a bug. Perhaps instead we should document this really clearly:
```
import time
import numpy as np
import ray

ray.init(num_cpus = 4)

@ray.remote
def no_work(a):
    # Ray automatically resolves object_ids inside remote functions
    # NOTE: this is a numpy array
    print(a)
    return

start = time.time()
a_id = ray.put(np.zeros((5000, 5000)))
result_ids = [no_work.remote(a_id) for x in range(10)]
results = ray.get(result_ids)
print("duration =", time.time() - start)
```
@richardliaw I think his point is that the explanation for that example says it demonstrates how to "not copy the object" to the workers when passing an object ref created by ray.put, but the example as written still has to copy it (unless the read is zero-copy).
Sorry, I don't understand.
Tip 3 is "Avoid passing same object repeatedly to remote tasks". The given example for this tip as-is demonstrates that you can do 1 copy-into-object-store step for N tasks, instead of N copy-into-object-store steps for N tasks.
Ah, my bad. You are definitely right.
Yep, now I've come back to this it was clearly a misunderstanding on my part. A comment to note that when you pass object_ids to remote functions they are resolved back to the object would be helpful, b/c it feels a bit unexpected (to me at least) that type changes can occur at that point.
Great, good to reach consensus! @dfaligertwood maybe you can update your PR to include the above comments and also become a ray contributor :D ?