Currently there is no way to generate AOT code for Method Handles. However, in Java 11 (technically from Java 9), String concatenation is done via invokedynamic. Lambdas are also implemented using Method Handles. As the use of Method Handles increases, it becomes more important to enable AOT support. ~While it may have been extremely difficult to impossible in the past, with the SVM, it is now a more manageable problem to tackle. That said, this is still going to be somewhat of a mini-epic.~ See comment below.
ldc bytecode exists) getClassFromSignature query using the name of the class in 2 and compare the returned J9Class for equality with the J9Class of the patched object from 2.
I had a discussion with @andrewcraik regarding the current state of the Method Handles work in order to see what would be needed to enable AOT support. What came out of that discussion is the following:
Because these handles will essentially be anonymous classes, and because, I believe, we can already AOT the methods in these anonymous classes, there shouldn't be any issues regarding AOT and JSR292. In fact, I don't think anything needs to be done to support AOT.
As the AOT infrastructure heavily depends on ROMClasses, these two caveats will likely be important when it comes to using AOT to alleviate the anticipated startup hit that switching to the new infrastructure will introduce.
fyi @hangshao0 @DanHeidinga @tajila @vijaysun-omr @mpirvu
@hangshao0 please share your thoughts on whether there is work needed on the VM side for this support, i.e. in the "caveat" areas Irwin mentions or any other areas.
- Because constant pool entries have the potential to hold pointers to arb. objects, in order to be able to AOT code that refers to these classes, there will need to be a mechanism to store the ROMClass of these classes into the SCC.
Currently not all ROMClasses are in the SCC (as there are class loaders that won't work with the SCC). Last time I talked to @fengxue-IS, I had the impression that only the RAM constant pool is patched to hold pointers to arb. objects. The ROMClass constant pool is not patched, so the ROMClass side knows nothing about these arb. objects. This might be a problem for AOT. I will pass this to @fengxue-IS in case he has anything to add.
- On a load run, the JVM still needs to be able to determine whether a stored generated class in the SCC corresponds to a method handle in the current instance; I'm not really sure if there's a way to determine that without actually running the generator.
The lambda classes are not found through something like findClass() or Class.forName(). The JVM will generate the lambda class and build its ROMClass as normal. The one in the SCC will be returned by the ROMClass builder only when it is exactly the same as the one currently being built. I think we don't have much issue here.
> The JVM will generate the lambda class and build its ROMClass as normal. The one in the SCC will be returned by the ROMClass builder only when it is exactly the same as the one currently being built. I think we don't have much issue here.
I guess the point I was making here is that there is going to be a potential hit to startup because of the fact that generating the lambda class and building its ROMClass requires running the bytecode generator. Right now there is no way to know whether a lambda form exists in the SCC without generating the bytecodes to compare, and there's nothing that AOT can do to help with that.
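To make the startup cost concrete, here is a minimal toy model, assuming the cache lookup works purely by comparing freshly built bytes against the stored copy, as described above. `SharedCacheModel`, `generate`, and `defineClass` are hypothetical names for illustration and are not OpenJ9 APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ROMClass bytes; not an OpenJ9 type.
using RomBytes = std::vector<std::uint8_t>;

struct SharedCacheModel {
    // Simplification: at most one stored ROMClass per class name.
    std::unordered_map<std::string, RomBytes> stored;
    int generatorRuns = 0;  // how many times bytecode had to be generated

    // Stand-in for the lambda form bytecode generator.
    RomBytes generate(const std::string& name, std::uint8_t shape) {
        assert(!name.empty());
        ++generatorRuns;
        RomBytes bytes(name.begin(), name.end());
        bytes.push_back(shape);
        return bytes;
    }

    // Returns true when the cached copy is reused. The generator runs either
    // way, because the comparison needs the freshly built bytes.
    bool defineClass(const std::string& name, std::uint8_t shape) {
        RomBytes fresh = generate(name, shape);
        auto it = stored.find(name);
        if (it != stored.end() && it->second == fresh)
            return true;       // identical: reuse the stored copy
        stored[name] = fresh;  // new or different: store the fresh one
        return false;
    }
};
```

In this model a cache hit saves the storage of a second copy but never saves a generator run, which is the startup hit being discussed.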
> The ROMClass constant pool is not patched, so the ROMClass side knows nothing about these arb. objects. This might be a problem for AOT.
Hm, I suppose if the compiler either reads or generates code to read one of these arb objects, it would need to create a validation record to validate the class of the arb object, so that we can assert the same is true in the load run.
Note that with jdk11+, the jlink tool is able to pregenerate some of the LambdaForm classes, which allows them to be loaded rather than generated. So there are some potential savings there once we can re-enable that jlink plugin.
> I guess the point I was making here is that there is going to be a potential hit to startup because of the fact that......
Yes, I agree.
> Hm, I suppose if the compiler either reads or generates code to read one of these arb objects, it would need to create a validation......
These arb objects can be of any class, and currently we cannot guarantee that their ROMClasses are in the SCC. For example, with default class sharing, classes from the app and URL loaders are not shared. A class could also be loaded by a custom loader that does not find/store classes in the SCC.
The JVM has all the ROMClasses of these objects at runtime. From the VM side, one possibility is to let AOT tell us all the non-shared ROMClasses it wants in the SCC, and the VM can copy them into the SCC as orphans. But I suspect there will be a performance impact, and the SCC will grow much larger if there are a lot to copy, which may offset the benefit of AOT.
> These arb objects can be of any class, and currently we cannot guarantee that their ROMClasses are in the SCC. For example, with default class sharing, classes from the app and URL loaders are not shared. A class could also be loaded by a custom loader that does not find/store classes in the SCC.
Hm; well what happens right now during an AOT compile is if the bytecodes name some class that isn't in the SCC, the compiler will take the slow path (ie, either assume it's unresolved or call VM helper or something). So in the case of this arb object pointer, if we can't store the class chain of the class of that object, then the compiler will just have to treat it as an object pointer and not make any assumptions about it. The code will be less optimized, but I believe it should be ok.
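As a rough sketch of that fallback, assuming the only question at compile time is whether a class chain for the object's class can be found in the SCC: `SccModel`, `AotDecision`, and `decideForPatchedObject` are hypothetical names, and the offset is just a placeholder for whatever the real validation record would store.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical model; none of these names are OpenJ9 APIs.
enum class Treatment { ValidatedClass, OpaqueObject };

struct SccModel {
    // Class name -> offset of its stored class chain (illustrative only).
    std::unordered_map<std::string, std::size_t> classChainOffsets;
};

struct AotDecision {
    Treatment treatment;
    bool hasValidation;
    std::size_t classChainOffset;  // meaningful only when hasValidation is true
};

// Decide how an AOT compile could treat a patched object of the given class.
AotDecision decideForPatchedObject(const SccModel& scc, const std::string& className) {
    assert(!className.empty());
    auto it = scc.classChainOffsets.find(className);
    if (it == scc.classChainOffsets.end()) {
        // No class chain available: keep the code correct by assuming
        // nothing about the object beyond "it is an object pointer".
        return AotDecision{Treatment::OpaqueObject, false, 0};
    }
    // Class chain available: optimize on the class and validate on load.
    return AotDecision{Treatment::ValidatedClass, true, it->second};
}
```

The opaque path trades optimization for safety, which matches the "less optimized, but ok" expectation above.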
Anything more on this @fengxue-IS or @hangshao0 ?
> Anything more on this
I don't see anything more at this moment.
AOT support is needed for the call to invokeBasic, linkTo*. These are VM INL, but we don't support VM INL in AOT.
    J9::CodeGenerator::supportVMInternalNatives()
       {
       return !self()->comp()->compileRelocatableCode();
       }
On x86, calling these methods is the same as calling an interpreted method: we put the j9method pointer in a register, then call the i2jtransition helper. On other platforms, it's done via a snippet.
Adding the support should be simple, at least for invokeBasic and linkTo*.
@harryyu1994 if you comment here I can assign this to you.
@dsouzai Okay thanks!
I will tackle this task by task:
Talked to Irwin offline:
Ensure creation/validation of class chain of the class of the generated method
should be already handled by https://github.com/eclipse/openj9/pull/10159
Do class chain validation of the class of the patched object (if a ldc bytecode exists)
the key to tackle this one is to find the location to add our AOT validation record.
@liqunl could you point to me where the "ldc bytecode patching of object" takes place?
> @liqunl could you point to me where the "ldc bytecode patching of object" takes place?
Specifically, where does the compiler get the patched object that it puts into the KOT?
talked to liqun offline, the answer can be found in this PR: https://github.com/eclipse/openj9/pull/11092
> Specifically, where does the compiler get the patched object that it puts into the KOT?
The object is from the RAM constant pool. The ROM cp entry refers to a String, which is just a placeholder that gets patched to a different object. The ROM class remains the same; we patch the object in the RAM class cp entry. The bytecode itself looks no different, other than that the object it loads might not be the original object. Note that the patched object can be anything, including a String object.
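The ROM/RAM split described here can be pictured with a small toy model. Everything in it (`RomClassModel`, `RamClassModel`, `patch`, `ldc`) is a hypothetical simplification for illustration and does not mirror the actual J9 data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the two constant pools; not OpenJ9 data structures.
struct ObjectModel {
    std::string className;
};

struct RomClassModel {
    // The ROM entry describes a String placeholder and never changes.
    std::vector<std::string> cpEntryTypes;
};

struct RamClassModel {
    const RomClassModel* rom;
    std::vector<const ObjectModel*> cpObjects;

    // Every RAM entry starts out pointing at the placeholder object.
    RamClassModel(const RomClassModel& r, const ObjectModel& placeholder)
        : rom(&r), cpObjects(r.cpEntryTypes.size(), &placeholder) {}

    // Patching touches only the RAM entry; the ROM class is left alone.
    void patch(std::size_t cpIndex, const ObjectModel& replacement) {
        assert(cpIndex < cpObjects.size());
        cpObjects[cpIndex] = &replacement;
    }

    // What an ldc at cpIndex would load.
    const ObjectModel& ldc(std::size_t cpIndex) const {
        assert(cpIndex < cpObjects.size());
        return *cpObjects[cpIndex];
    }
};
```

After a patch, the ROM side still says "String" while `ldc` yields an object of another class, which is why the ROM class alone cannot tell AOT what will be loaded.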
> Ensure creation/validation of class chain of the class of the generated method
> should be already handled by #10159
I don't think #10159 handles the case where the generated method is called or inlined from a JIT method. The generated method or user method denoted by a MethodHandle is obtained from a MemberName object. The object encodes information about the target method of invokehandle/invokedynamic bytecodes, as well as of MethodHandle.invokeBasic and linker methods like linkToStatic, linkToSpecial, linkToVirtual, and linkToInterface (all are native methods to be transformed into the target method, with known object info on their arguments). So the target method is not named in the caller's constant pool, and we need AOT relocation/validation for it.
> The generated method or user method denoted by a MethodHandle is obtained from a MemberName object.
Task 4 in the description is to address this issue.
We get the j9method and/or its vtable/itable index, then create a resolved method with TR_J9VMBase::createResolvedMethod, which calls TR_J9VMBase::createResolvedMethodWithSignature. I don't see an AOT implementation of them. I did find TR_ResolvedRelocatableJ9Method::createResolvedMethodFromJ9Method, and I wonder if it can be used, but we may not have a valid cpIndex.
TR_J9VMBase::createResolvedMethodWithSignature checks for AOT within it. However, for inlined methods we would need to create a new inlined method relocation record; we will materialize and validate the J9Method there (see https://github.com/eclipse/openj9/pull/11396 if you're curious as to how inlined method validation/relocation works).
> We get the j9method and/or its vtable/itable index,
We will need an SVM validation record when we call the VM API or whatever API we call to get the actual J9Method pointer. However, we don't have to worry about that API when the SVM is disabled because direct calls are considered unresolved anyway (so we don't emit the J9Method in the code anywhere).
However, this is not something that needs to be addressed right away; for the very first step we can simply disable inlining of method handles in AOT compilations to get Tasks 1-3 implemented. Then we can implement task 4/5 to support inlining.
I'm slowly understanding this now, may still need to look into it more but task 2 is clear to me now:
introduce a new validation record:
we need 3 pieces of information
- cpIndex
  - we need the cpIndex of the arbitrary object
- beholder class
  - from the beholder class we can locate the constant pool
- class chain
  - we need the arbitrary object pointer to get its class
  - now I'm looking at how these class chains work
Regarding the beholder class, when the SVM is enabled you would just use the beholder class ID; when the SVM is disabled, you will have to use the inlined site index (inlined site index --> j9method --> j9class/j9constantpool).
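A minimal sketch of what such a record and its load-run check might look like, assuming the SVM-enabled case where the beholder is identified by an ID. The record layout is not decided in this thread; `PatchedObjectValidationRecord`, `makeRecord`, `validate`, and the model types are all hypothetical, and a class chain is reduced here to a list of class names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shapes for illustration only; not OpenJ9 types.
struct ClassModel {
    std::vector<std::string> classChain;  // the class, then its superclasses
};

struct ObjectModel {
    const ClassModel* clazz;
};

struct BeholderModel {
    std::vector<const ObjectModel*> ramConstantPool;
};

// The three pieces of information listed above.
struct PatchedObjectValidationRecord {
    std::size_t cpIndex;                          // which cp entry holds the object
    std::size_t beholderId;                       // locates the constant pool
    std::vector<std::string> expectedClassChain;  // class chain seen at compile time
};

// Compile run: capture the class chain of the object found at cpIndex.
PatchedObjectValidationRecord makeRecord(const BeholderModel& beholder,
                                         std::size_t beholderId,
                                         std::size_t cpIndex) {
    assert(cpIndex < beholder.ramConstantPool.size());
    const ObjectModel* obj = beholder.ramConstantPool[cpIndex];
    return PatchedObjectValidationRecord{cpIndex, beholderId, obj->clazz->classChain};
}

// Load run: the object now in that cp entry must have the same class chain.
bool validate(const PatchedObjectValidationRecord& record,
              const std::vector<const BeholderModel*>& beholdersById) {
    if (record.beholderId >= beholdersById.size()) return false;
    const BeholderModel* beholder = beholdersById[record.beholderId];
    if (record.cpIndex >= beholder->ramConstantPool.size()) return false;
    const ObjectModel* obj = beholder->ramConstantPool[record.cpIndex];
    return obj->clazz->classChain == record.expectedClassChain;
}
```

With the SVM disabled, the `beholderId` field would presumably be replaced by the inlined site index, per the comment above.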
> The object is from the RAM constant pool. The ROM cp entry refers to a String, which is just a placeholder that gets patched to a different object. The ROM class remains the same; we patch the object in the RAM class cp entry. The bytecode itself looks no different, other than that the object it loads might not be the original object. Note that the patched object can be anything, including a String object.
@liqunl I think I still need the answer to this question
> Specifically, where does the compiler get the patched object that it puts into the KOT?
    TR_OpaqueClassBlock *clazz = comp()->fej9()->getObjectClassAt((uintptr_t)stringConst);
    isString = comp()->fej9()->isString(clazz);
where in the codebase do we do this?
We get the object from the RAM constant pool. In ldc cpIndex, cpIndex points to the cp entry the object comes from. In the following code, stringConst is a static address that stores the object reference; dereferencing stringConst gives you the object reference.
    void * stringConst = owningMethod->stringConstant(cpIndex);
i think for the AOT validation part we need to get hold of the j9class pointer of that patched object, how/where can we get it?
With the object at hand, you can get its J9Class with frontend API getObjectClassAt
in #11092, this clazz pointer won't give me the information i need right? like you mentioned it's just a placeholder.
This class pointer is the J9Class of the patched object (if it is patched). The symbolic link to a string literal is a placeholder, that is, in the sense of the ROM class. The patching happens together with the creation of the RAM class (in Unsafe.defineAnonymousClass), so the RAM class always contains the patched object.
> This class pointer is the J9Class of the patched object (if it is patched).
If it is ConstString then it's not patched and if it is NonSpecificConstObject then it is patched. Is this the case?
> If it is ConstString then it's not patched and if it is NonSpecificConstObject then it is patched. Is this the case?
The patched object can be any type; it can be a ConstString. The ROM constant pool entry refers to a ConstString, so you probably don't need extra validation if the object is a string.
> so you probably don't need extra validation if the object is a string.
I believe we still need a validation. For example, in the first run it's a ConstString but in the second it gets patched to something else
Also right now it looks like we don't allow patched constants, so that'll have to be changed.
    if (isString)
       sym->setConstString();
    else
       {
       if (comp()->compileRelocatableCode())
          comp()->failCompilation<J9::AOTHasPatchedCPConstant>("Patched Constant not supported in AOT.");
       sym->setNonSpecificConstObject();
       }
How do we distinguish if it's for a method handle ldc? I was trying to use NonSpecificConstObject to tell whether we have a method handle ldc.
> How do we distinguish if it's for a method handle ldc? I was trying to use NonSpecificConstObject to tell whether we have a method handle ldc
ldc is a bytecode that loads a constant from the constant pool; the type of the constant is encoded in the ROM class. But when the RAM class cp entry is patched, the type from the ROM class might not match the actual constant type. NonSpecificConstObject is used for any patched object that is not a String. The patched object can be any type, not necessarily a MethodHandle. Since you have the object, you can get its J9Class at compile time.
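Pulling the last few comments together, here is one possible shape for the classification, assuming AOT would request a class validation instead of failing the compilation (that change is only proposed above, not implemented). `ConstKind`, `LdcClassification`, and `classifyLdcObject` are hypothetical names, and the class is reduced to its name for the sake of a self-contained example.

```cpp
#include <cassert>
#include <string>

// Hypothetical names; this is not the OpenJ9 symbol or compilation API.
enum class ConstKind { ConstString, NonSpecificConstObject };

struct LdcClassification {
    ConstKind kind;
    bool needsClassValidation;  // true when a load run must re-check the class
};

// objectClassName: class of the object actually found in the RAM constant pool.
// relocatable: true for an AOT (relocatable) compile.
LdcClassification classifyLdcObject(const std::string& objectClassName, bool relocatable) {
    assert(!objectClassName.empty());
    // The kind follows the actual object, not the ROM class entry type.
    ConstKind kind = (objectClassName == "java/lang/String")
        ? ConstKind::ConstString
        : ConstKind::NonSpecificConstObject;
    // Even a String is validated under AOT in this sketch: a later run may
    // patch the same entry to an object of a different class.
    return LdcClassification{kind, relocatable};
}
```

Note that the kind only says whether the object is a String; it cannot tell a MethodHandle apart from any other patched object, so a method-handle-specific check would need the J9Class itself.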