OpenJ9: AOT Support for Method Handles

Created on 23 Feb 2019 · 29 Comments · Source: eclipse/openj9

Currently there is no way to generate AOT code for Method Handles. However, in Java 11 (technically from Java 9), String concatenation is done via invokedynamic. Lambdas are also implemented using Method Handles. As the use of Method Handles increases, it becomes more important to enable AOT support. ~While it may have been extremely difficult to impossible in the past, with the SVM, it is now a more manageable problem to tackle. That said, this is still going to be somewhat of a mini-epic.~ See comment below.

Technical Discussion regarding AOT Validations for Method Handles

Tasks

1. Ensure creation/validation of class chain of the class of the generated method
2. Do class chain validation of the class of the patched object (if a ldc bytecode exists)
3. Do a getClassFromSignature query using the name of the class in 2 and compare the returned J9Class for equality with the J9Class of the patched object from 2.
4. Add new relocation record for J9Method acquired from the MemberName object stored in a field of MH, or from the side table of a RAM class with invokedynamic/invokehandle
   a. Update relocation of guards / metadata inlining table to know how to materialize associated J9Method/J9Class.
5. Add AOT support for the call to invokeBasic, linkTo* (see https://github.com/eclipse/openj9/issues/4850#issuecomment-730439646) - may need a new relocation record for the native address.
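
Tasks 2 and 3 can be pictured with a small toy model. This is only a sketch and not OpenJ9 code: `ToyClass`, `ToyLoader`, `classChainOf`, and `validatePatchedObjectClass` are hypothetical names, the `getClassFromSignature` member is a stand-in with an invented signature, and a class chain is modeled as a list of class names, whereas the real infrastructure works in terms of ROMClasses stored in the SCC.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a J9Class: a name plus a superclass link (hypothetical).
struct ToyClass
   {
   std::string name;
   const ToyClass *super;
   };

// Toy stand-in for a class lookup by name (hypothetical signature).
struct ToyLoader
   {
   std::map<std::string, const ToyClass *> classes;

   const ToyClass *getClassFromSignature(const std::string &name) const
      {
      auto it = classes.find(name);
      return it == classes.end() ? nullptr : it->second;
      }
   };

// Record the class and all of its superclasses as a simplified "class chain".
std::vector<std::string> classChainOf(const ToyClass *clazz)
   {
   assert(clazz != nullptr);
   std::vector<std::string> chain;
   for (const ToyClass *c = clazz; c != nullptr; c = c->super)
      chain.push_back(c->name);
   return chain;
   }

// Load-run check combining task 2 (chain comparison) and task 3 (lookup by name).
bool validatePatchedObjectClass(const ToyLoader &loader,
                                const ToyClass *patchedObjectClass,
                                const std::vector<std::string> &storedChain)
   {
   if (patchedObjectClass == nullptr || storedChain.empty())
      return false;
   // Task 2: the class of the patched object must have the chain recorded at compile time.
   if (classChainOf(patchedObjectClass) != storedChain)
      return false;
   // Task 3: looking the class up by name must return the very same class.
   return loader.getClassFromSignature(storedChain.front()) == patchedObjectClass;
   }
```

In this model the name lookup rejects a class that has an identical chain but is not the class the lookup resolves to, which is the situation task 3 is meant to catch.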

dsouzai

All 29 comments

I had a discussion with @andrewcraik regarding the current state of the Method Handles work in order to see what would be needed to enable AOT support. What came out of that discussion is the following:

Method Handles will be bytecode generated
They will be implemented via lambda forms
Because calling a handle involves running a bytecode generated class, it can be JIT'd and inlined, and also OSR can be done as the JIT and VM agree on the representation
JVM already supports storing lambda forms into the SCC
The mechanism allows string typed constant pool entries to hold pointers to arb. objects

Because these handles will be essentially anonymous classes, and because, I believe, we can already AOT the methods in these anonymous classes, I believe there shouldn't be any issues regarding AOT and JSR292. In fact, I don't think anything needs to be done to support AOT.

Caveats

Because constant pool entries have the potential to hold pointers to arb. objects, in order to be able to AOT code that refers to these classes, there will need to be a mechanism to store the ROMClass of these classes into the SCC.
On a load run, the JVM still needs to be able to determine whether a stored generated class in the SCC corresponds to a method handle in the current instance; I'm not really sure if there's a way to determine that without actually running the generator.

As the AOT infrastructure heavily depends on ROMClasses, these two caveats will likely be important when it comes to using AOT to alleviate the anticipated startup hit that switching to the new infrastructure will introduce.

fyi @hangshao0 @DanHeidinga @tajila @vijaysun-omr @mpirvu

dsouzai on 21 Oct 2020

@hangshao0 please share your thoughts on whether there is work needed on the VM side for this support, i.e. in the "caveat" areas Irwin mentions or any other areas.

vijaysun-omr on 3 Nov 2020

Because constant pool entries have the potential to hold pointers to arb. objects, in order to be able to AOT code that refers to these classes, there will need to be a mechanism to store the ROMClass of these classes into the SCC.

Currently not all ROMClasses are in the SCC (as there are class loaders that won't work with the SCC). The last time I talked to @fengxue-IS, I got the impression that only the RAM constant pool is patched to hold pointers to arb. objects. The ROMClass constant pool is not patched, so the ROMClass side knows nothing about these arb. objects. This might be a problem for AOT. I will pass this to @fengxue-IS in case he has anything to add.

On a load run, the JVM still needs to be able to determine whether a stored generated class in the SCC corresponds to a method handle in the current instance; I'm not really sure if there's a way to determine that without actually running the generator.

The lambda classes are not found through something like findClass() or Class.forName(). The JVM will generate lambda class and build its ROMClass as normal. The one in the SCC will be returned by the ROMClass builder only when it is exactly the same as the one currently being built. I think we don't have much issue here.

hangshao0 on 3 Nov 2020

The JVM will generate lambda class and build its ROMClass as normal. The one in the SCC will be returned by the ROMClass builder only when it is exactly the same as the one currently being built. I think we don't have much issue here.

I guess the point I was making here is that there is going to be a potential hit to startup because of the fact that the act of the JVM generating the lambda class and building its ROMClass will require running the bytecode generator; there isn't a way right now to know if a lambda form exists in the SCC without generating the bytecodes to compare, and there's nothing that AOT can do to help with that.
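
To make the cost concrete, here is a toy model of an "exact match" cache lookup. It is a sketch under stated assumptions, not the real ROMClass builder: `ToySharedCache`, `generateLambdaForm`, and `findOrStore` are hypothetical names, and a ROMClass is reduced to a byte vector. The point it illustrates is that the freshly generated image is itself the lookup key, so the generator has to run even when the cache already holds an identical copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using RomBytes = std::vector<std::uint8_t>;

// Toy stand-in for the SCC: it only stores ROM class images (hypothetical).
struct ToySharedCache
   {
   std::vector<RomBytes> stored;
   };

// Toy stand-in for the lambda form bytecode generator; counts how often it runs.
RomBytes generateLambdaForm(int &generatorRuns, std::uint8_t shape)
   {
   ++generatorRuns;
   return RomBytes{0xCA, 0xFE, shape};
   }

// Return the index of the cached image equal to the fresh one, storing it if absent.
std::size_t findOrStore(ToySharedCache &cache, const RomBytes &fresh)
   {
   assert(!fresh.empty()); // the lookup key is the generated image itself
   for (std::size_t i = 0; i < cache.stored.size(); ++i)
      if (cache.stored[i] == fresh)
         return i;
   cache.stored.push_back(fresh);
   return cache.stored.size() - 1;
   }
```

Looking up the same lambda form twice reuses the stored image in this model, but the generator still runs on both lookups.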

The ROMClass constant pool is not patched, so the ROMClass side knows nothing about these arb. objects. This might be a problem for AOT.

Hm, I suppose if the compiler either reads or generates code to read one of these arb objects, it would need to create a validation record to validate the class of the arb object, so that we can assert the same is true in the load run.

dsouzai on 4 Nov 2020

Note that with JDK 11+, the jlink tool is able to pregenerate some of the LambdaForm classes, which allows them to be loaded rather than generated. So there are some potential savings there once we can re-enable that jlink plugin.

DanHeidinga on 4 Nov 2020

I guess the point I was making here is that there is going to be a potential hit to startup because of the fact that......

Yes, I agree.

Hm, I suppose if the compiler either reads or generates code to read one of these arb objects, it would need to create a validation......

These arb objects can be of any class. Currently we cannot guarantee that their ROMClasses are in the SCC: with default class sharing, for example, classes from the app and URL loaders are not shared. A class could also be loaded by a custom loader that does not find/store classes in the SCC.

The JVM has all the ROMClasses of these objects at runtime. From the VM side, one possibility is to let AOT tell us all the non-shared ROMClasses it wants in the SCC, so that the VM can copy them into the SCC as orphans. But I suspect there will be a performance impact, and the SCC will grow much larger if there are a lot to copy, which may offset the benefit of AOT.

hangshao0 on 4 Nov 2020

These arb objects can be of any class. Currently we cannot guarantee that their ROMClasses are in the SCC: with default class sharing, for example, classes from the app and URL loaders are not shared. A class could also be loaded by a custom loader that does not find/store classes in the SCC.

Hm; well, what happens right now during an AOT compile is that if the bytecodes name some class that isn't in the SCC, the compiler will take the slow path (i.e., either assume it's unresolved or call a VM helper or something). So in the case of this arb object pointer, if we can't store the class chain of the class of that object, then the compiler will just have to treat it as an object pointer and not make any assumptions about it. The code will be less optimized, but I believe it should be ok.
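
The fallback described above amounts to a simple decision at compile time. The sketch below is a hypothetical model only: `ConstantTreatment`, `ToyAotCompile`, and `treatPatchedConstant` are invented names, and "is in the SCC" is reduced to membership in a set of class names.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class ConstantTreatment
   {
   KnownClassWithValidation, // class chain can be stored, so the class can be validated on load
   OpaqueObject              // no assumptions about the object; code is less optimized
   };

// Toy stand-in for the state of an AOT compilation (hypothetical).
struct ToyAotCompile
   {
   std::set<std::string> classesInSCC; // classes whose class chain could be stored
   };

// Decide how the compiler may treat a patched constant of the given class.
ConstantTreatment treatPatchedConstant(const ToyAotCompile &comp, const std::string &className)
   {
   assert(!className.empty());
   return comp.classesInSCC.count(className) != 0
      ? ConstantTreatment::KnownClassWithValidation
      : ConstantTreatment::OpaqueObject;
   }
```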

dsouzai on 4 Nov 2020

Anything more on this @fengxue-IS or @hangshao0 ?

vijaysun-omr on 16 Nov 2020

Anything more on this

I don't see anything more at this moment.

hangshao0 on 17 Nov 2020

AOT support is needed for the call to invokeBasic, linkTo*. These are VM INL, but we don't support VM INL in AOT.

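bool
J9::CodeGenerator::supportVMInternalNatives()
   {
   // VM internal natives (INL) are only supported when the compile is not relocatable, i.e. not AOT
   return !self()->comp()->compileRelocatableCode();
   }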

On x86, calling these methods is the same as calling an interpreted method: we put the j9method pointer in a register, then call the i2jtransition helper. On other platforms, it's done via a snippet.

Adding the support should be simple, at least for invokeBasic and linkTo*

liqunl on 19 Nov 2020

@harryyu1994 if you comment here I can assign this to you.

dsouzai on 5 Dec 2020

@dsouzai Okay thanks!

harryyu1994 on 5 Dec 2020

I will tackle this task by task:

Talked to Irwin offline:

Ensure creation/validation of class chain of the class of the generated method

should be already handled by https://github.com/eclipse/openj9/pull/10159

Do class chain validation of the class of the patched object (if a ldc bytecode exists)

The key to tackling this one is finding the location to add our AOT validation record.
@liqunl could you point me to where the "ldc bytecode patching of object" takes place?

harryyu1994 on 9 Dec 2020

@liqunl could you point me to where the "ldc bytecode patching of object" takes place?

Specifically, where does the compiler get the patched object that it puts into the KOT?

dsouzai on 9 Dec 2020

talked to liqun offline, the answer can be found in this PR: https://github.com/eclipse/openj9/pull/11092

harryyu1994 on 9 Dec 2020

Specifically, where does the compiler get the patched object that it puts into the KOT?

The object is from the RAM constant pool. The ROM CP entry refers to a String, which is just a placeholder that gets patched to a different object. The ROM class remains the same; we patch the object in the RAM class CP entry. The bytecode itself looks no different, other than that the object it loads might not be the original object. Note that the patched object can be anything, including a String object.
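
The ROM/RAM split described above can be sketched with a toy model. None of these types exist in OpenJ9: `ToyObject`, `ToyRomClass`, `ToyRamClass`, `patchConstant`, and `loadConstant` are hypothetical names chosen for illustration. The ROM side only records that the entry is a String, while the RAM side holds the object reference that `ldc` would actually load.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy heap object that only knows the name of its class (hypothetical).
struct ToyObject
   {
   std::string className;
   };

// ROM side: shared and never patched; the entry only describes a String placeholder.
struct ToyRomClass
   {
   std::vector<std::string> cpEntryTypes;
   };

// RAM side: per-run slots holding the actual object references.
struct ToyRamClass
   {
   const ToyRomClass *rom;
   std::vector<const ToyObject *> cpSlots;
   };

// Patch a RAM constant pool slot; the ROM class is left untouched.
void patchConstant(ToyRamClass &ram, std::size_t cpIndex, const ToyObject *replacement)
   {
   assert(cpIndex < ram.cpSlots.size());
   ram.cpSlots[cpIndex] = replacement;
   }

// What "ldc cpIndex" would load: whatever the RAM slot currently holds.
const ToyObject *loadConstant(const ToyRamClass &ram, std::size_t cpIndex)
   {
   assert(cpIndex < ram.cpSlots.size());
   return ram.cpSlots[cpIndex];
   }
```

After patching, the ROM entry still says String while the loaded object can be of any class, which is why the ROM class alone cannot tell an AOT load run what the constant really is.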

liqunl on 9 Dec 2020

Ensure creation/validation of class chain of the class of the generated method
should be already handled by #10159

I don't think #10159 handles the case where the generated method is called or inlined from a JIT method. The generated method or user method denoted by a MethodHandle is obtained from a MemberName object. The object encodes information about the target method of invokehandle/invokedynamic bytecodes, as well as of MethodHandle.invokeBasic and linker methods like linkToStatic, linkToSpecial, linkToVirtual, and linkToInterface (all are native methods to be transformed to the target method with known object info on its argument). So the target method is not named in the caller's constant pool, and we need AOT relocation/validation for them.

liqunl on 9 Dec 2020

The generated method or user method denoted by a MethodHandle is obtained from MemberName object.

Task 4 in the description is to address this issue.

dsouzai on 9 Dec 2020

We get the j9method and/or its vtable/itable index, then create a resolved method with TR_J9VMBase::createResolvedMethod, which calls TR_J9VMBase::createResolvedMethodWithSignature. I don't see an AOT implementation of them. But I found TR_ResolvedRelocatableJ9Method::createResolvedMethodFromJ9Method, and I wonder if it can be used. However, we may not have a valid cpIndex.

liqunl on 9 Dec 2020

TR_J9VMBase::createResolvedMethodWithSignature checks for AOT within it. However, for inlined methods we would need to create a new inlined method relocation record; we will materialize and validate the J9Method there (see https://github.com/eclipse/openj9/pull/11396 if you're curious as to how inlined method validation/relocation works).

We get the j9method and/or its vtable/itable index,

We will need an SVM validation record when we call the VM API or whatever API we call to get the actual J9Method pointer. However, we don't have to worry about that API when the SVM is disabled because direct calls are considered unresolved anyway (so we don't emit the J9Method in the code anywhere).

However, this is not something that needs to be addressed right away; for the very first step we can simply disable inlining of method handles in AOT compilations to get Tasks 1-3 implemented. Then we can implement task 4/5 to support inlining.

dsouzai on 9 Dec 2020

I'm slowly understanding this now. I may still need to look into it more, but task 2 is clear to me now:

Introduce a new validation record. We need 3 pieces of information:
    - cpIndex
        - we need the cpIndex of the arbitrary object
    - beholder class
        - from the beholder class we can locate the constant pool
    - class chain
        - we need the arbitrary object pointer to get its class
        - I'm now looking at how these class chains work

I'm going to come back tomorrow to refine this comment, but I think I'm on the right track now.
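
The three pieces of information listed above could be grouped roughly as follows. This is a hypothetical layout for illustration, not the record that was eventually implemented: `ToyPatchedConstantRecord`, `CurrentChainQuery`, and `validateRecord` are invented names, field widths are arbitrary, and the class chain is again modeled as a list of class names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

using ClassChain = std::vector<std::string>;

// Hypothetical validation record holding the three pieces of information.
struct ToyPatchedConstantRecord
   {
   std::uint32_t cpIndex;         // which constant pool entry holds the arbitrary object
   std::uint16_t beholderClassID; // identifies the class whose constant pool is consulted
   ClassChain classChain;         // class of the object as seen at compile time
   };

// Hypothetical load-run query: the class chain of the object that currently
// sits in the given constant pool entry of the given beholder class.
using CurrentChainQuery = std::function<ClassChain(std::uint16_t, std::uint32_t)>;

// The record validates only if the load run sees an object of the same class.
bool validateRecord(const ToyPatchedConstantRecord &record, const CurrentChainQuery &currentChainOf)
   {
   assert(!record.classChain.empty());
   return currentChainOf(record.beholderClassID, record.cpIndex) == record.classChain;
   }
```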

harryyu1994 on 10 Dec 2020

Regarding the beholder class, when the SVM is enabled you would just use the beholder class ID; when the SVM is disabled, you will have to use the inlined site index (inlined site index --> j9method --> j9class/j9constantpool).
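
The two lookup paths can be sketched side by side. Everything here is a toy model with hypothetical names (`ToyLoadRunState`, `beholderCPWithSVM`, `beholderCPWithoutSVM`); it only mirrors the chain "inlined site index --> j9method --> j9class/j9constantpool" and the direct class ID lookup described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyConstantPool { int id; };
struct ToyClass { ToyConstantPool constantPool; };
struct ToyMethod { const ToyClass *owner; };

// Toy stand-in for what a load run can consult (hypothetical).
struct ToyLoadRunState
   {
   std::vector<const ToyClass *> svmClassesByID;      // SVM enabled: class ID -> class
   std::vector<const ToyMethod *> inlinedSiteMethods; // SVM disabled: inlined site index -> method
   };

// SVM enabled: the beholder class ID leads directly to the class and its constant pool.
const ToyConstantPool *beholderCPWithSVM(const ToyLoadRunState &state, std::size_t beholderClassID)
   {
   assert(beholderClassID < state.svmClassesByID.size());
   return &state.svmClassesByID[beholderClassID]->constantPool;
   }

// SVM disabled: go through the inlined site to the method, then to its class.
const ToyConstantPool *beholderCPWithoutSVM(const ToyLoadRunState &state, std::size_t inlinedSiteIndex)
   {
   assert(inlinedSiteIndex < state.inlinedSiteMethods.size());
   const ToyMethod *method = state.inlinedSiteMethods[inlinedSiteIndex]; // inlined site index --> method
   return &method->owner->constantPool;                                  // method --> class/constant pool
   }
```

Both paths end at the same constant pool; they differ only in what the relocation record has to carry.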

dsouzai on 10 Dec 2020

The object is from the RAM constant pool. The ROM CP entry refers to a String, which is just a placeholder that gets patched to a different object. The ROM class remains the same; we patch the object in the RAM class CP entry. The bytecode itself looks no different, other than that the object it loads might not be the original object. Note that the patched object can be anything, including a String object.

@liqunl I think I still need the answer to this question

Specifically, where does the compiler get the patched object that it puts into the KOT?

Where in the codebase do we do this?
I think for the AOT validation part we need to get hold of the J9Class pointer of that patched object; how/where can we get it?
In #11092, this clazz pointer won't give me the information I need, right? Like you mentioned, it's just a placeholder.

      TR_OpaqueClassBlock *clazz = comp()->fej9()->getObjectClassAt((uintptr_t)stringConst);
      isString = comp()->fej9()->isString(clazz);

harryyu1994 on 10 Dec 2020

where in the codebase do we do this?

We get the object from the RAM constant pool. In ldc cpIndex, cpIndex points to the CP entry that holds the object. stringConst in the following code is the static address storing the object reference; dereferencing stringConst gives you the object reference.

void * stringConst = owningMethod->stringConstant(cpIndex);

i think for the AOT validation part we need to get hold of the j9class pointer of that patched object, how/where can we get it?

With the object at hand, you can get its J9Class with the frontend API getObjectClassAt.

in #11092, this clazz pointer won't give me the information i need right? like you mentioned it's just a placeholder.

This class pointer is the J9Class of the patched object (if it is patched). The symbolic link to a string literal is a placeholder only in the sense of the ROM class. The patching happens together with the creation of the RAM class (in Unsafe.defineAnonymousClass), so the RAM class always contains the patched object.

liqunl on 10 Dec 2020

This class pointer is the J9Class of the patched object (if it is patched).

If it is ConstString then it's not patched and if it is NonSpecificConstObject then it is patched. Is this the case?

harryyu1994 on 10 Dec 2020

If it is ConstString then it's not patched and if it is NonSpecificConstObject then it is patched. Is this the case?

The patched object can be any type, it can be a ConstString. The rom constant pool entry refers to a ConstString, so you probably don't need extra validation if the object is a string.

liqunl on 10 Dec 2020

so you probably don't need extra validation if the object is a string.

I believe we still need a validation. For example, in the first run it's a ConstString, but in the second run it gets patched to something else.

Also right now it looks like we don't allow patched constants, so that'll have to be changed.

   if (isString)
      sym->setConstString();
   else
      {
      if (comp()->compileRelocatableCode())
         comp()->failCompilation<J9::AOTHasPatchedCPConstant>("Patched Constant not supported in AOT.");

      sym->setNonSpecificConstObject();
      }

dsouzai on 10 Dec 2020

How do we distinguish if it's for a method handle ldc? I was trying to use NonSpecificConstObject to tell whether we have a method handle ldc.

harryyu1994 on 10 Dec 2020

How do we distinguish if it's for a method handle ldc? I was trying to use NonSpecificConstObject to tell whether we have a method handle ldc

ldc is a bytecode that loads a constant from the constant pool; the type of the constant is encoded in the ROM class. But when the RAM class CP entry is patched, the type from the ROM class might not match the actual constant type. NonSpecificConstObject is used for any patched object that is not a String. The patched object can be any type, not necessarily a MethodHandle. Since you have the object, you can get its J9Class at compile time.

liqunl on 11 Dec 2020
