This is the master issue to track the progress of formally releasing the Java API.
The Java worker code was contributed by Ant Financial in 2018 as an experimental feature. It has been used internally at Ant for the past 2 years. We now think the code is stable enough for a formal release.
--include-java and --java-worker-options from ray start.
We had a discussion last week, revisiting the current Java API. One problem we found is that the current design sometimes doesn't follow Java conventions very well (it is not object-oriented).
The following is an example of using actors (see here and here for other APIs).
// Define an actor class.
public class Adder {
  private int sum;
  public Adder(int initValue) {
    sum = initValue;
  }
  public int add(int n) {
    return sum += n;
  }
}
// Create an actor.
RayActor<Adder> adder = Ray.createActor(Adder::new, 0);
// Call the `add` method.
RayObject<Integer> result = Ray.call(Adder::add, adder, 1);
System.out.println(result.get()); // 1
The problem is that the Ray.call(Adder::add, adder, 1) part doesn't look intuitive to Java users. It's also easy to forget the right order of these arguments.
We plan to change this API to be
adder.call(Adder::add, 1);
We use the method reference Adder::add because we want to leverage Java's static type checking. Because of this, Java can't do adder.add.remote(1) the way Python does.
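To make the static type checking point concrete, here is a minimal local sketch. It is not Ray's implementation: LocalHandle is a hypothetical stand-in for an actor handle that runs the method in-process, and it uses java.util.function.BiFunction where Ray has its own function interfaces. It only shows that the compiler infers the argument and return types from the method reference.

```java
import java.util.function.BiFunction;

// Hypothetical local stand-in for an actor handle; not Ray's real RayActor.
final class MethodRefCallSketch {

    static final class Adder {
        private int sum;
        Adder(int initValue) { sum = initValue; }
        int add(int n) { return sum += n; }
    }

    // Wraps an actor instance and invokes its methods through method references.
    static final class LocalHandle<A> {
        private final A actor;
        LocalHandle(A actor) { this.actor = actor; }

        // T0 and R are inferred from the method reference, so both the
        // argument type and the result type are checked at compile time.
        <T0, R> R call(BiFunction<A, T0, R> method, T0 arg0) {
            return method.apply(actor, arg0);
        }
    }

    public static void main(String[] args) {
        LocalHandle<Adder> adder = new LocalHandle<>(new Adder(0));
        Integer result = adder.call(Adder::add, 1);
        System.out.println(result); // 1
        // adder.call(Adder::add, "one"); // rejected by the compiler: String is not an int
    }
}
```

A string-based call such as call("add", 1) would compile even with a misspelled method name or a wrong argument type, which is the trade-off the method reference avoids.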
I haven't tried this to see if it can work, but what about this:
// Define an actor class.
// Does it need to implement Serializable?
public class Adder implements Serializable {
  private int sum;
  public Adder(int initValue) {
    sum = initValue;
  }
  public int add(int n) {
    return sum += n;
  }
}
// Create an actor. It seems to me that if you implement RayActor
// as a "wrapper" class, then you should be able to use normal
// object semantics with lambdas as the serializable function. I
// haven't tried this, so I don't really know if it works with
// Java:
// Construct the Adder instance, then serialize the state and
// reconstruct elsewhere (clunky, but easier to understand...)
RayActor<Adder> adder = Ray.createActor(new Adder(0));
// Call the `add` method.
// The RayActor takes the lambda and passes the nested Adder
// instance to the lambda as argument "a" in this example:
RayObject<Integer> result = adder.call((a) -> a.add(1));
System.out.println(result.get()); // 1
@raulchen thanks for bringing this up.
1. adder.call(Adder::add, 1);
2. Ray.call(Adder::add, adder, 1) (the status quo)
3. adder.add(1)
I agree 1 looks better than 2. Is 3 feasible or no?
Hi @deanwampler, there is one problem with the approach you suggested. An actor method should accept not only arguments of the original type, but also references to objects that are already in the object store. For example,
RayObject<Integer> arg = Ray.put(1);
// Both should work.
adder.call(Adder::add, 1);
adder.call(Adder::add, arg);
This works because the call method is defined with multiple overloaded versions:
// R is the return type of the method, and T0 is the type of the first argument.
public <R, T0> RayObject<R> call(RayFunction<R, T0> func, T0 arg0);
public <R, T0> RayObject<R> call(RayFunction<R, T0> func, RayObject<T0> arg0);
Note that call is defined in Ray's built-in class RayActor (equivalent to Python's ActorHandle).
However, with your approach we can't do the following, because the add method is defined in the user's class (Adder) and only accepts an int:
RayObject<Integer> arg = Ray.put(1);
adder.call((a) -> a.add(arg));
BTW, because of the overloaded versions of the call method, another small problem is that we have to define 2^N versions, where N is the max supported number of arguments (currently set to 6). This isn't too big of a problem. The only drawback is that it sometimes slows down the IDE because of the large index.
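As a rough illustration of the overloading described above, here is a local sketch with one argument. Ref and LocalHandle are hypothetical stand-ins for RayObject and RayActor, everything runs in-process, and BiFunction replaces Ray's own function interfaces. With one argument there are two overloads (value or reference); each extra argument doubles the count, which is where 2^N comes from.

```java
import java.util.function.BiFunction;

// Hypothetical in-process stand-ins; not Ray's real classes.
final class OverloadSketch {

    // Stand-in for RayObject<T>: a reference to a value in the object store.
    static final class Ref<T> {
        private final T value;
        Ref(T value) { this.value = value; }
        T get() { return value; }
    }

    static final class Adder {
        private int sum;
        Adder(int initValue) { sum = initValue; }
        int add(int n) { return sum += n; }
    }

    static final class LocalHandle<A> {
        private final A actor;
        LocalHandle(A actor) { this.actor = actor; }

        // Overload 1: the argument is passed by value.
        <T0, R> Ref<R> call(BiFunction<A, T0, R> method, T0 arg0) {
            return new Ref<>(method.apply(actor, arg0));
        }

        // Overload 2: the argument is a reference. This sketch resolves it
        // locally; a real runtime would ship the reference to the worker.
        <T0, R> Ref<R> call(BiFunction<A, T0, R> method, Ref<T0> arg0) {
            return new Ref<>(method.apply(actor, arg0.get()));
        }
    }

    public static void main(String[] args) {
        LocalHandle<Adder> adder = new LocalHandle<>(new Adder(0));
        Ref<Integer> arg = new Ref<>(2);
        Ref<Integer> byValue = adder.call(Adder::add, 1);   // picks overload 1
        Ref<Integer> byRef = adder.call(Adder::add, arg);   // picks overload 2
        System.out.println(byValue.get()); // 1
        System.out.println(byRef.get());   // 3
    }
}
```

The user's Adder class never has to mention Ref; the handle's overloads absorb the difference, which is what a lambda like (a) -> a.add(arg) cannot do.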
@robertnishihara 3 isn't feasible in Java. Because the type of adder is RayActor (equivalent to Python's ActorHandle), we can't dynamically add a new method add to this class like Python.
@raulchen Ah, good point on passing other objects. Here's another attempt to think about. This actually compiles and runs (Java 11) ;)
import java.util.function.Function;
public class Ray {
  public static class RayObject<T> {
    private T value;
    public RayObject(T t) {
      // Do something remote...
      value = t;
    }
    public T get() {
      // Fetch the value from the object store
      return value;
    }
    public <R> R remote(Function<? super T, ? extends R> func) {
      return func.apply(value);
    }
  }
  public static class Person {
    private String name;
    private int age;
    public Person(String name, int age) {
      this.name = name;
      this.age = age;
    }
    public String getName() { return name; }
    public int getAge() { return age; }
  }
  public static void main(String[] args) {
    RayObject<Integer> roInt = new RayObject<Integer>(20);
    RayObject<String> roString = new RayObject<String>("Joe");
    assert(roInt.get() == 20);
    assert(roString.get().equals("Joe"));
    RayObject<Person> roPerson = new RayObject<Person>(
        new Person(roString.get(), roInt.get()));
    Person person = roPerson.get();
    assert(person.getName().equals("Joe"));
    assert(person.getAge() == 20);
    assert(roPerson.remote((p) -> p.getName()).equals("Joe"));
    assert(roPerson.remote((p) -> p.getAge()) == 20);
    String name = roPerson.remote((p) -> p.getName());
    int age = roPerson.remote((p) -> p.getAge());
    System.out.printf("name: %s\n", name);
    System.out.printf("age: %d\n", age);
  }
}
Anyway, the main routine looks reasonable. Obviously the implementation isn't real, so I may be missing some fundamental issues.
@deanwampler, thanks for the suggestion.
I think we can't do the remote invocation in the constructor of RayObject, where you put // Do something remote..., because the remote function to run isn't specified in the constructor (it's specified in the remote method in your example).
Another issue is RayObject<Person> roPerson = new RayObject<Person>(new Person(roString.get(), roInt.get()));. When we call roString.get(), we have already fetched this object to the local worker. Instead, I think we want to pass the reference to the remote worker.
I see the issue with how I constructed RayObject(Person).
Can we also fix RayPyActor extends RayActor? RayActor originally represented a Java actor handle, but it has been inherited by RayPyActor since the cross-language features were added. We need a base interface for an actor handle of any language.
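One possible shape for such a base interface, as a rough sketch only. BaseHandle, JavaHandle, and PyHandle are hypothetical names chosen for illustration and are not Ray classes; the point is that the Python handle no longer inherits from the Java-typed one, and code that only needs the common parts can accept the base type.

```java
// Hypothetical type names for illustration; not Ray's real classes.
final class ActorHandleHierarchySketch {

    // What every actor handle has in common, regardless of language.
    interface BaseHandle {
        String actorId();
        String language();
    }

    // Handle to a Java actor; the type parameter carries the actor class
    // so that method references can be type-checked.
    static final class JavaHandle<A> implements BaseHandle {
        private final String id;
        JavaHandle(String id) { this.id = id; }
        @Override public String actorId() { return id; }
        @Override public String language() { return "JAVA"; }
    }

    // Handle to a Python actor; there is no Java class to parameterize on,
    // so it records the Python module and class names instead.
    static final class PyHandle implements BaseHandle {
        private final String id;
        private final String moduleName;
        private final String className;
        PyHandle(String id, String moduleName, String className) {
            this.id = id;
            this.moduleName = moduleName;
            this.className = className;
        }
        @Override public String actorId() { return id; }
        @Override public String language() { return "PYTHON"; }
        String qualifiedName() { return moduleName + "." + className; }
    }

    // Code that works for any actor depends only on the base interface.
    static String describe(BaseHandle handle) {
        return handle.language() + ":" + handle.actorId();
    }

    public static void main(String[] args) {
        BaseHandle java = new JavaHandle<String>("actor-1");
        BaseHandle py = new PyHandle("actor-2", "my_module", "Counter");
        System.out.println(describe(java)); // JAVA:actor-1
        System.out.println(describe(py));   // PYTHON:actor-2
    }
}
```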
@kfstorm sounds reasonable. Can you submit a PR?
@raulchen regarding the ordering of these tasks, should we make sure binary packages are available before updating the docs? It seems the packaging PRs are still a work in progress. Otherwise it is quite hard for users to install Java.
@ericl sure, I'll prioritize binary package and API refactor before updating the docs.
I'm also bumping the priority of this task to P1.
After some discussions, we decided to upgrade the Java API to make it easier to use, more readable, and more consistent with Python.
Remote tasks
Current API
CallOptions options = new CallOptions.Builder().setResources(...).createCallOptions();
RayObject<Integer> res = Ray.call(Foo::foo, 1, options); // `options` can be omitted.
New API
ObjectRef<Integer> res = Ray.task(Foo::foo, 1).setResource(...).remote(); // `setResource` can be omitted.
Actors
Current API
ActorCreationOptions options = new ActorCreationOptions.Builder().setResources(...).setMaxRestarts(...).createActorCreationOptions();
RayActor<MyActor> actor = Ray.createActor(MyActor::new, 1, options);
RayObject<Integer> result = actor.call(MyActor::foo, 1);
New API
ActorHandle<MyActor> actor = Ray.actor(MyActor::new, 1).setResources(...).setMaxRestarts(...).remote();
ObjectRef<Integer> result = actor.task(MyActor::foo, 1).remote();
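For anyone curious how the fluent style can hang together, here is a minimal in-process sketch. It is not the Ray implementation: TaskCaller, Ref, and the static task method are hypothetical stand-ins, the resource map is only recorded and never used for scheduling, and remote() simply runs the function locally. The idea is that task(...) returns a caller object, optional setters return the same object, and nothing executes until remote().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-process stand-ins; not Ray's real classes.
final class FluentTaskSketch {

    // Stand-in for ObjectRef<R>.
    static final class Ref<R> {
        private final R value;
        Ref(R value) { this.value = value; }
        R get() { return value; }
    }

    // Collects the function, its argument, and optional settings.
    static final class TaskCaller<T0, R> {
        private final Function<T0, R> func;
        private final T0 arg0;
        private final Map<String, Double> resources = new HashMap<>();

        TaskCaller(Function<T0, R> func, T0 arg0) {
            this.func = func;
            this.arg0 = arg0;
        }

        // Optional setting; returns this so calls can be chained.
        TaskCaller<T0, R> setResource(String name, double amount) {
            resources.put(name, amount);
            return this;
        }

        Map<String, Double> resources() { return resources; }

        // Submits the task. This sketch just runs it locally.
        Ref<R> remote() {
            return new Ref<>(func.apply(arg0));
        }
    }

    static <T0, R> TaskCaller<T0, R> task(Function<T0, R> func, T0 arg0) {
        return new TaskCaller<>(func, arg0);
    }

    static int square(int x) { return x * x; }

    public static void main(String[] args) {
        Ref<Integer> withOptions =
            task(FluentTaskSketch::square, 3).setResource("CPU", 1.0).remote();
        Ref<Integer> plain = task(FluentTaskSketch::square, 4).remote();
        System.out.println(withOptions.get()); // 9
        System.out.println(plain.get());       // 16
    }
}
```

Compared with a separate options builder, the settings sit next to the call they apply to, and omitting them requires no extra overloads.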