Ray: First Release of Java API

Created on 24 Feb 2020 · 13 Comments · Source: ray-project/ray

This is the master issue for tracking progress toward the formal release of the Java API.

Background

The Java worker code was contributed by Ant Financial in 2018 as an experimental feature and has been used internally at Ant for the past two years. We now think the code is stable enough for a formal release.

Action items

[ ] Simplify usage for beginners: deploy jars (with all platform binaries included) to Maven's central repo, so that users can use it out of the box without having to compile from source (#6876).
[ ] Align the Java version with the Python version.
[ ] Enhance integration tests for new version releases.
[x] Finalize API: revisit API design and refine it if needed; avoid making breaking changes after formal release.
[ ] Improve documentation.
[ ] Remove --include-java and --java-worker-options from ray start.

P2 java

raulchen

All 13 comments

We had a discussion last week to revisit the current Java API. One problem we found is that the current design sometimes doesn't follow Java conventions very well (it isn't object-oriented).

The following is an example of using actors (see here and here for other APIs).

// Define an actor class.
public class Adder {

  private int sum;

  public Adder(int initValue) {
    sum = initValue;
  }

  public int add(int n) {
    return sum += n;
  }
}

// Create an actor.
RayActor<Adder> adder = Ray.createActor(Adder::new, 0);

// Call the `add` method.
RayObject<Integer> result = Ray.call(Adder::add, adder, 1);
System.out.println(result.get()); // 1

The problem is that the Ray.call(Adder::add, adder, 1) part doesn't look intuitive to Java users. It's also easy to forget the right order of these arguments.

We plan to change this API to:

adder.call(Adder::add, 1);

We use the method reference Adder::add because we want to leverage Java's static type checking. For the same reason, Java can't offer adder.add.remote(1) as Python does.
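
To make the static type checking point concrete, here is a minimal local sketch. `ActorFunc1`, `LocalActorHandle`, and `MethodRefDemo` are hypothetical stand-ins invented for this illustration, not Ray classes, and the call runs in-process rather than remotely. It only shows how the compiler infers the argument and result types from a method reference.

```java
// Hypothetical stand-ins for illustration only; these are not Ray classes.

/** A one-argument actor method: takes the actor instance and one argument. */
@FunctionalInterface
interface ActorFunc1<A, T0, R> {
  R apply(A actor, T0 arg0);
}

/** A local, in-process imitation of an actor handle. */
final class LocalActorHandle<A> {
  private final A instance;

  LocalActorHandle(A instance) {
    this.instance = instance;
  }

  // T0 and R are inferred from the method reference, so a wrong argument
  // type or a wrong result type is rejected at compile time.
  <T0, R> R call(ActorFunc1<A, T0, R> func, T0 arg0) {
    return func.apply(instance, arg0);
  }
}

class Adder {
  private int sum;

  Adder(int initValue) {
    sum = initValue;
  }

  int add(int n) {
    return sum += n;
  }
}

class MethodRefDemo {
  static int run() {
    LocalActorHandle<Adder> adder = new LocalActorHandle<>(new Adder(0));
    adder.call(Adder::add, 1);
    // adder.call(Adder::add, "one");  // would not compile: String is not an Integer
    return adder.call(Adder::add, 2);
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

A string-named call such as `adder.call("add", 1)` could only fail at run time, which is the trade-off the method reference avoids.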

raulchen on 24 Feb 2020

I haven't tried this to see if it can work, but what about this:

// Define an actor class.
// Does it need to implement Serializable?
public class Adder implements Serializable {

  private int sum;

  public Adder(int initValue) {
    sum = initValue;
  }

  public int add(int n) {
    return sum += n;
  }
}

// Create an actor. It seems to me that if you implement RayActor
// as a "wrapper" class, then you should be able to use normal
// object semantics with lambdas as the serializable function. I
// haven't tried this, so I don't really know if it works with
// Java:
// Construct the Adder instance, then serialize the state and
// reconstruct elsewhere (clunky, but easier to understand...)
RayActor<Adder> adder = Ray.createActor(new Adder(0));

// Call the `add` method.
// The RayActor takes the lambda and passes the nested Adder
// instance to the lambda as argument "a" in this example:
RayObject<Integer> result = adder.call((a) -> a.add(1));
System.out.println(result.get()); // 1

deanwampler on 24 Feb 2020

@raulchen thanks for bringing this up.

1. adder.call(Adder::add, 1);
2. Ray.call(Adder::add, adder, 1) (the status quo)
3. adder.add(1)

I agree 1 looks better than 2. Is 3 feasible or no?

robertnishihara on 24 Feb 2020

Hi @deanwampler, there is one problem with the approach you suggested: an actor method must accept not only arguments of the original type but also references to objects that are already in the object store. For example,

RayObject<Integer> arg = Ray.put(1);
// Both should work.
adder.call(Adder::add, 1); 
adder.call(Adder::add, arg);

This works because the call method is defined with multiple overloaded versions:

// R is the return type, and T0 is the type of the first argument.
public <R, T0> RayObject<R> call(RayFunction<R, T0> func, T0 arg0);
public <R, T0> RayObject<R> call(RayFunction<R, T0> func, RayObject<T0> arg0);

Note that call is defined in Ray's built-in class RayActor (equivalent to Python's ActorHandle).

However, with your approach we can't do the following, because the add method is defined in the user's class (Adder) and only accepts an int.

RayObject<Integer> arg = Ray.put(1);
adder.call((a) -> a.add(arg));

BTW, because of the overloaded versions of the call method, another small problem is that we have to define 2^N versions, where N is the maximum supported number of arguments (currently 6). This isn't a big problem; the only drawback is that the large index sometimes slows down the IDE.
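
The overload growth described above can be sketched for a two-argument function. This is a minimal local illustration, not Ray's code: `StoredRef`, `Func2`, and `OverloadDemo` are hypothetical names, and `StoredRef.get()` simply returns a local value instead of fetching from an object store. Each argument can be either a plain value or a reference, which gives 2^2 = 4 overloads.

```java
// Hypothetical stand-ins for illustration only; these are not Ray classes.

/** A local imitation of a reference to a value held in an object store. */
final class StoredRef<T> {
  private final T value;

  StoredRef(T value) {
    this.value = value;
  }

  T get() {
    return value;
  }
}

/** A two-argument function. */
@FunctionalInterface
interface Func2<T0, T1, R> {
  R apply(T0 arg0, T1 arg1);
}

final class OverloadDemo {
  // Each argument is either a plain value or a reference: 2^2 = 4 overloads.
  static <T0, T1, R> R call(Func2<T0, T1, R> f, T0 a, T1 b) {
    return f.apply(a, b);
  }

  static <T0, T1, R> R call(Func2<T0, T1, R> f, StoredRef<T0> a, T1 b) {
    return f.apply(a.get(), b);
  }

  static <T0, T1, R> R call(Func2<T0, T1, R> f, T0 a, StoredRef<T1> b) {
    return f.apply(a, b.get());
  }

  static <T0, T1, R> R call(Func2<T0, T1, R> f, StoredRef<T0> a, StoredRef<T1> b) {
    return f.apply(a.get(), b.get());
  }

  /** Number of overloads needed for a function with the given number of arguments. */
  static int overloadsForArity(int arity) {
    return 1 << arity;
  }

  static int add(int x, int y) {
    return x + y;
  }

  static int run() {
    StoredRef<Integer> one = new StoredRef<>(1);
    StoredRef<Integer> two = new StoredRef<>(2);
    // All four value/reference combinations resolve to a different overload.
    return call(OverloadDemo::add, 1, 2)
        + call(OverloadDemo::add, one, 2)
        + call(OverloadDemo::add, 1, two)
        + call(OverloadDemo::add, one, two);
  }

  public static void main(String[] args) {
    System.out.println(run());
    System.out.println(overloadsForArity(6));
  }
}
```

With six arguments, `overloadsForArity(6)` gives 64 variants for that arity alone, which is why such overloads are usually generated rather than written by hand.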

raulchen on 25 Feb 2020

@robertnishihara 3 isn't feasible in Java. Because the type of adder is RayActor (equivalent to Python's ActorHandle), we can't dynamically add a new method add to this class as Python can.

raulchen on 25 Feb 2020

@raulchen Ah, good point on passing other objects. Here's another attempt to think about. This actually compiles and runs (Java 11) ;)

import java.util.function.Function;

public class Ray {

    public static class RayObject<T> {
        private T value;
        public RayObject(T t) {
            // Do something remote...
            value = t;
        }
        public T get() {
            // Fetch the value from the object store
            return value;
        }

        public <R> R remote(Function<? super T, ? extends R> func) {
            return func.apply(value);
        }
    }

    public static class Person {
        private String name;
        private int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
        public String getName() { return name; }
        public int    getAge()  { return age;  }
    }

    public static void main(String[] args) {

        RayObject<Integer> roInt = new RayObject<Integer>(20);
        RayObject<String> roString = new RayObject<String>("Joe");
        assert(roInt.get() == 20);
        assert(roString.get().equals("Joe"));

        RayObject<Person> roPerson = new RayObject<Person>(
            new Person(roString.get(), roInt.get()));

        Person person = roPerson.get();
        assert(person.getName().equals("Joe"));
        assert(person.getAge() == 20);

        assert(roPerson.remote((p) -> p.getName()).equals("Joe"));
        assert(roPerson.remote((p) -> p.getAge()) == 20);

        String name = roPerson.remote((p) -> p.getName());
        int    age  = roPerson.remote((p) -> p.getAge());
        System.out.printf("name: %s\n", name);
        System.out.printf("age:  %d\n", age);
    }
}

Anyway, the main routine looks reasonable. Obviously the implementation isn't real, so I may be missing some fundamental issues.

deanwampler on 26 Feb 2020

@deanwampler, thanks for the suggestion.

I think we can't do the remote invocation in the constructor of RayObject, where you put // Do something remote..., because the remote function to run isn't specified in the constructor (it's specified in the remote method in your example).

Another issue is RayObject<Person> roPerson = new RayObject<Person>(new Person(roString.get(), roInt.get()));. When we call roString.get(), we have already fetched this object to the local worker. Instead, I think we want to pass the reference to the remote worker.

raulchen on 27 Feb 2020

I see the issue with how I constructed RayObject<Person>.

deanwampler on 27 Feb 2020

Can we also fix RayPyActor extends RayActor? RayActor originally represented a Java actor handle, but RayPyActor came to inherit from it after cross-language features were added. We need a base interface for actor handles of any language.
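
One possible shape for such a base interface is sketched below. All names here (`AnyActorHandle`, `JavaActorHandleSketch`, `PyActorHandleSketch`, `HandleHierarchyDemo`) are hypothetical and chosen only for this illustration; they are not the actual Ray class hierarchy. The point is that the Python handle implements a language-neutral interface instead of extending the Java-specific handle.

```java
// Hypothetical names for illustration only; not the actual Ray class hierarchy.

/** Language-neutral base: what every actor handle has in common. */
interface AnyActorHandle {
  String actorId();

  String language();
}

/** Handle to a Java actor; the type parameter allows typed method references. */
final class JavaActorHandleSketch<A> implements AnyActorHandle {
  private final String id;

  JavaActorHandleSketch(String id) {
    this.id = id;
  }

  @Override
  public String actorId() {
    return id;
  }

  @Override
  public String language() {
    return "java";
  }
}

/** Handle to a Python actor; it is a sibling of the Java handle, not a subclass. */
final class PyActorHandleSketch implements AnyActorHandle {
  private final String id;

  PyActorHandleSketch(String id) {
    this.id = id;
  }

  @Override
  public String actorId() {
    return id;
  }

  @Override
  public String language() {
    return "python";
  }
}

final class HandleHierarchyDemo {
  /** Code that only needs the common parts can accept the base interface. */
  static String describe(AnyActorHandle handle) {
    return handle.language() + ":" + handle.actorId();
  }

  public static void main(String[] args) {
    System.out.println(describe(new JavaActorHandleSketch<String>("a1")));
    System.out.println(describe(new PyActorHandleSketch("p1")));
  }
}
```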

kfstorm on 4 Mar 2020

@kfstorm sounds reasonable. Can you submit a PR?

raulchen on 5 Mar 2020

@raulchen regarding the ordering of these tasks, should we make sure binary packages are available before updating the docs? It seems the packaging PRs are still a work in progress. Otherwise it is quite hard for users to install Java.

ericl on 1 Jun 2020

@ericl sure, I'll prioritize the binary packages and the API refactor before updating the docs.

I'm also bumping the priority of this task to P1.

raulchen on 2 Jun 2020

After some discussions, we decided to upgrade the Java API to make it easier to use, more readable, and more consistent with Python.

Remote tasks

Current API

CallOptions options = new CallOptions.Builder().setResources(...).createCallOptions();
RayObject<Integer> res = Ray.call(Foo::foo, 1, options);  // `options` can be omitted.

New API

ObjectRef<Integer> res = Ray.task(Foo::foo, 1).setResource(...).remote(); // `setResource` can be omitted.

Actors

Current API

ActorCreationOptions options = new ActorCreationOptions.Builder().setResources(...).setMaxRestarts(...).createActorCreationOptions();
RayActor<MyActor> actor = Ray.createActor(MyActor::new, 1, options);
RayObject<Integer> result = actor.call(MyActor::foo, 1);

New API

ActorHandle<MyActor> actor = Ray.actor(MyActor::new, 1).setResources(...).setMaxRestarts(...).remote();
ObjectRef<Integer> result = actor.task(MyActor::foo, 1).remote();
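
The fluent style of the new API, where options are chained between the call and `remote()`, can be imitated with a small local sketch. `TaskFunc1`, `ResultRef`, `TaskCallerSketch`, and `FluentApiDemo` are hypothetical stand-ins invented for this illustration, not Ray classes; the default resource values are assumptions, and `remote()` here just runs the function locally.

```java
// Hypothetical stand-ins for illustration only; these are not Ray classes.

/** A one-argument task function. */
@FunctionalInterface
interface TaskFunc1<T0, R> {
  R apply(T0 arg0);
}

/** Local imitation of a reference to a task result. */
final class ResultRef<T> {
  private final T value;

  ResultRef(T value) {
    this.value = value;
  }

  T get() {
    return value;
  }
}

/** Collects the function, its argument, and optional settings before submission. */
final class TaskCallerSketch<T0, R> {
  private final TaskFunc1<T0, R> func;
  private final T0 arg0;
  private String resourceName = "CPU";  // assumed default, for illustration only
  private double resourceAmount = 1.0;  // assumed default, for illustration only

  TaskCallerSketch(TaskFunc1<T0, R> func, T0 arg0) {
    this.func = func;
    this.arg0 = arg0;
  }

  /** Optional setting; returns this so that calls can be chained. */
  TaskCallerSketch<T0, R> setResource(String name, double amount) {
    this.resourceName = name;
    this.resourceAmount = amount;
    return this;
  }

  double resourceAmount() {
    return resourceAmount;
  }

  /** A real system would submit the task here; this sketch runs it locally. */
  ResultRef<R> remote() {
    return new ResultRef<>(func.apply(arg0));
  }
}

final class FluentApiDemo {
  static <T0, R> TaskCallerSketch<T0, R> task(TaskFunc1<T0, R> func, T0 arg0) {
    return new TaskCallerSketch<>(func, arg0);
  }

  static int square(int x) {
    return x * x;
  }

  static int run() {
    // The options step can be omitted entirely...
    ResultRef<Integer> plain = task(FluentApiDemo::square, 3).remote();
    // ...or chained in before remote().
    ResultRef<Integer> tuned = task(FluentApiDemo::square, 4).setResource("CPU", 2.0).remote();
    return plain.get() + tuned.get();
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Compared with a separate options object passed as the last argument, this style keeps the function and its arguments first and makes each optional setting a named step.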

raulchen on 4 Jun 2020
