Let's say I have the following classes for storing bytes or ints:
```java
public abstract class A {
    public abstract Object getValueAt(int index);
    public abstract int getSize();
}

public class B extends A {
    private final byte[] bytes;

    public B(byte[] bytes) {
        this.bytes = bytes;
    }

    @Override
    public Object getValueAt(int index) {
        return bytes[index];
    }

    @Override
    public int getSize() {
        return bytes.length;
    }
}

public class C extends A {
    private final int[] ints;

    public C(int[] ints) {
        this.ints = ints;
    }

    @Override
    public Object getValueAt(int index) {
        return ints[index];
    }

    @Override
    public int getSize() {
        return ints.length;
    }
}
```
To get all values, one could implement the following node:
```java
public abstract class GetAllValuesNode1 extends Node {
    public abstract Object[] executeGeneric(A obj);

    @Specialization
    protected final Object[] doA(A obj) {
        Object[] values = new Object[obj.getSize()];
        for (int i = 0; i < values.length; i++) {
            values[i] = obj.getValueAt(i);
        }
        return values;
    }
}
```
This, however, will cause virtual method invocations which Truffle can't optimize well.
First, I thought I could rewrite the node like this:
```java
public abstract class GetAllValuesNode2 extends Node {
    public abstract Object executeGeneric(A obj);

    @Specialization
    protected final Object[] doB(B obj) {
        return doA(obj);
    }

    @Specialization
    protected final Object[] doC(C obj) {
        return doA(obj);
    }

    private Object[] doA(A obj) {
        Object[] values = new Object[obj.getSize()];
        for (int i = 0; i < values.length; i++) {
            values[i] = obj.getValueAt(i);
        }
        return values;
    }
}
```
Unfortunately, virtual calls are still a problem.
To avoid virtual calls completely, one could inline the helper method, so that the node looks like this:
```java
public abstract class GetAllValuesNode3 extends Node {
    public abstract Object executeGeneric(A obj);

    @Specialization
    protected final Object[] doB(B obj) {
        Object[] values = new Object[obj.getSize()];
        for (int i = 0; i < values.length; i++) {
            values[i] = obj.getValueAt(i);
        }
        return values;
    }

    @Specialization
    protected final Object[] doC(C obj) {
        Object[] values = new Object[obj.getSize()];
        for (int i = 0; i < values.length; i++) {
            values[i] = obj.getValueAt(i);
        }
        return values;
    }
}
```
Now that all virtual calls are gone, Truffle can optimize things. However, the code had to be duplicated for each subclass of A, which gets worse as the number of subclasses grows.
Maybe an annotation processor could help hide this problem:
```java
public abstract class GetAllValuesNode4 extends Node {
    public abstract Object[] executeGeneric(A obj);

    @Specialization
    @DuplicateForAllSubClasses
    protected final Object[] doA(A obj) {
        Object[] values = new Object[obj.getSize()];
        for (int i = 0; i < values.length; i++) {
            values[i] = obj.getValueAt(i);
        }
        return values;
    }
}
```
Because of @DuplicateForAllSubClasses, the above would generate something similar to GetAllValuesNode3.
I understand one could also use a ValueProfile to work around this problem, but I'm not sure this results in exactly the same behavior.
/cc @thomaswue
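On the ValueProfile question: as far as I understand, a class profile caches a single receiver class and gives up once a second class shows up, so it would probably not behave exactly like one specialization per subclass. Below is a minimal plain-Java sketch of that behavior, not Truffle code. `ExactClassProfile` and `GetAllValues` are hypothetical names, the simplified `A`/`B`/`C` are repeated only to keep the snippet self-contained, values are assumed to be non-null, and a real profile would additionally mark its state as compilation-final and invalidate compiled code when it changes.

```java
// Plain-Java sketch of a monomorphic class profile (hypothetical names, no Truffle dependency).
abstract class A {
    abstract Object getValueAt(int index);
    abstract int getSize();
}

final class B extends A {
    private final byte[] bytes;
    B(byte[] bytes) { this.bytes = bytes; }
    @Override Object getValueAt(int index) { return bytes[index]; }
    @Override int getSize() { return bytes.length; }
}

final class C extends A {
    private final int[] ints;
    C(int[] ints) { this.ints = ints; }
    @Override Object getValueAt(int index) { return ints[index]; }
    @Override int getSize() { return ints.length; }
}

final class ExactClassProfile {
    private Class<?> cachedClass; // assumption: would be compilation-final in real Truffle code
    private boolean generic;      // set once a second receiver class has been seen

    @SuppressWarnings("unchecked")
    <T> T profile(T value) {
        if (generic) {
            return value; // no type information left, calls on the result stay virtual
        }
        if (cachedClass == null) {
            cachedClass = value.getClass(); // remember the first class seen
        }
        if (value.getClass() == cachedClass) {
            // The exact-class check plus cast is what lets a compiler devirtualize later calls.
            return (T) cachedClass.cast(value);
        }
        generic = true; // second class observed: give up profiling
        return value;
    }

    boolean isGeneric() {
        return generic;
    }
}

final class GetAllValues {
    private final ExactClassProfile profile = new ExactClassProfile();

    Object[] execute(A obj) {
        A profiled = profile.profile(obj);
        Object[] values = new Object[profiled.getSize()];
        for (int i = 0; i < values.length; i++) {
            values[i] = profiled.getValueAt(i);
        }
        return values;
    }

    boolean isGeneric() {
        return profile.isGeneric();
    }
}
```

In this sketch, a call site that only ever sees `B` keeps an exact type for the loop. After the first `C`, the profile stays generic, whereas the duplicated specializations in GetAllValuesNode3 would keep one exact type per subclass.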
Treat the following as pseudo-code. Would something like this achieve what you want? Only downside currently would be that the partial evaluator will not work across the #apply call (i.e., the call will be inlined only later), but this might be fixable.
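```java
import java.util.Arrays;
import java.util.function.Function;

import com.oracle.truffle.api.CompilerDirectives;
import com.oracle.truffle.api.CompilerDirectives.CompilationFinal;
import com.oracle.truffle.api.nodes.ExplodeLoop;

public class VirtualMethodProfile<T, R> {
    private final Function<T, R> function;
    // Receiver classes seen so far; treated as constant by the partial evaluator.
    private @CompilationFinal(dimensions = 1) Class<?>[] receiverTypes;

    public VirtualMethodProfile(Function<T, R> function) {
        this.function = function;
        receiverTypes = new Class<?>[0];
    }

    @ExplodeLoop
    public R apply(T receiver) {
        // Fast path: one exact-class check per receiver type recorded so far.
        for (int i = 0; i < receiverTypes.length; ++i) {
            if (receiver.getClass() == receiverTypes[i]) {
                return function.apply(receiver);
            }
        }
        // Slow path: record the new receiver class and invalidate the compiled code.
        CompilerDirectives.transferToInterpreterAndInvalidate();
        Class<?>[] newTypes = Arrays.copyOf(receiverTypes, receiverTypes.length + 1);
        newTypes[newTypes.length - 1] = receiver.getClass();
        receiverTypes = newTypes;
        return function.apply(receiver);
    }
}
```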
I haven't fully understood how Truffle profiles work in detail to be honest, but this one records receiver classes and adds new ones that it hasn't seen for a given Function. So I'd be able to use such a profile similar to the below (pseudo-code, haven't used Java Function that often)? Also, your profile draft needs to return something after the receiverTypes = newTypes line :)
```java
public abstract class GetAllValuesNode5 extends Node {
    private VirtualMethodProfile sizeProfile = new VirtualMethodProfile(function((A x)->x.getSize()));
    private VirtualMethodProfile valueAtProfile = new VirtualMethodProfile(function((A x, int i)->x.getValueAt(i)));

    public abstract Object[] executeGeneric(A obj);

    @Specialization
    protected final Object[] doA(A obj) {
        Object[] values = new Object[sizeProfile.apply(obj)];
        for (int i = 0; i < values.length; i++) {
            values[i] = valueAtProfile.apply(obj, i);
        }
        return values;
    }
}
```
@fniephaus why is there still a problem with virtual calls in the version using doA(A)? At this point, the compiler should know during PE that the A is actually a B or C, no?
I guess this could be an issue that doA(A) was not actually inlined during PE, because some limit was hit?
Fixed the missing return value by editing the comment. Who cares about the slow path case 😉.
Yes, in theory you can use the profile the way you showed. The second version would need to be a BiFunction, but capturing the i value on the fly in a lambda should also work. The Function value could also be passed as an argument and does not need to be a field in the profile class.
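To make the last two points concrete, here is a rough plain-Java sketch of the variant where the function is passed as an argument and the loop index is captured in a lambda. It deliberately leaves out the Truffle parts (`@CompilationFinal`, `@ExplodeLoop`, `transferToInterpreterAndInvalidate()`), so it only illustrates the shape of the API and says nothing about how well PE handles it. The `GetAllValues` class, the `seenTypes()` accessor and the simplified `A`/`B`/`C` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.function.Function;

abstract class A {
    abstract Object getValueAt(int index);
    abstract int getSize();
}

final class B extends A {
    private final byte[] bytes;
    B(byte[] bytes) { this.bytes = bytes; }
    @Override Object getValueAt(int index) { return bytes[index]; }
    @Override int getSize() { return bytes.length; }
}

final class C extends A {
    private final int[] ints;
    C(int[] ints) { this.ints = ints; }
    @Override Object getValueAt(int index) { return ints[index]; }
    @Override int getSize() { return ints.length; }
}

// Sketch only: the function is an argument of apply() instead of a field.
final class VirtualMethodProfile<T> {
    private Class<?>[] receiverTypes = new Class<?>[0]; // would be @CompilationFinal in Truffle

    <R> R apply(T receiver, Function<? super T, ? extends R> function) {
        for (Class<?> type : receiverTypes) {
            if (receiver.getClass() == type) {
                return function.apply(receiver); // receiver class already recorded
            }
        }
        // Slow path: remember the new receiver class, then call the function.
        Class<?>[] newTypes = Arrays.copyOf(receiverTypes, receiverTypes.length + 1);
        newTypes[newTypes.length - 1] = receiver.getClass();
        receiverTypes = newTypes;
        return function.apply(receiver);
    }

    int seenTypes() {
        return receiverTypes.length;
    }
}

final class GetAllValues {
    private final VirtualMethodProfile<A> sizeProfile = new VirtualMethodProfile<>();
    private final VirtualMethodProfile<A> valueAtProfile = new VirtualMethodProfile<>();

    Object[] execute(A obj) {
        int size = sizeProfile.apply(obj, A::getSize);
        Object[] values = new Object[size];
        for (int i = 0; i < values.length; i++) {
            final int index = i; // captured by the lambda, so no BiFunction is needed
            values[i] = valueAtProfile.apply(obj, x -> x.getValueAt(index));
        }
        return values;
    }

    int seenTypes() {
        return valueAtProfile.seenTypes();
    }
}
```

Capturing the index allocates a new lambda per iteration in this plain-Java form; whether that cost disappears after compilation is exactly the kind of thing that would need to be checked in a real Truffle node.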
@smarr I don't know and I was unable to debug this on my own...I only saw a significant change in the AST (virtual calls no longer existed) when using the inlined version for some reason.
Maybe @thomaswue can explain this?
For the specific case of arrays, in TruffleRuby we use array mirror and array strategy objects.
Those can then be used like this:
https://github.com/oracle/truffleruby/blob/b2d83aa61f9430f99c00f46951f9fa37d43b5188/src/main/java/org/truffleruby/core/array/ArrayReadNormalizedNode.java#L31-L35
```java
@Specialization(guards = { "strategy.matches(array)", "isInBounds(array, index, strategy)" }, limit = "STORAGE_STRATEGIES")
public Object readInBounds(DynamicObject array, int index,
        @Cached("of(array)") ArrayStrategy strategy) {
    return strategy.newMirror(array).get(index);
    // Could also be the more direct call below. The mirrors provide more flexibility and are escape analyzed.
    // return strategy.get(array, index);
}
```
This is how you would apply @Cached in your use-case:
```java
public abstract class GetAllValuesNode1 extends Node {
    public static final int NO_SLOW_PATH = Integer.MAX_VALUE;

    public abstract Object[] executeGeneric(A obj);

    @Specialization(guards = "cachedClass == obj.getClass()", limit = "NO_SLOW_PATH")
    protected final Object[] doA(A obj,
            @Cached("obj.getClass()") Class<? extends A> cachedClass) {
        Object[] values = new Object[obj.getSize()];
        A castObj = cachedClass.cast(obj);
        for (int i = 0; i < values.length; i++) {
            values[i] = castObj.getValueAt(i);
        }
        return values;
    }
}
```
Now that we have TruffleLibraries, I think this issue can be closed...