When running AcmeAir on lara with these options:
-Xjit:limit={java/lang/StringBuilder.lengthInternal*},verbose,initialOptLevel=hot,dontDowngradeToCold,scorchingSampleThreshold=0 -Xshareclasses:none -Xms512m -Xmx512m
I get the following crash:
Unexpected character (o) at position 0.
at org.json.simple.parser.Yylex.yylex(Yylex.java:610)
at org.json.simple.parser.JSONParser.nextToken(JSONParser.java:269)
at org.json.simple.parser.JSONParser.parse(JSONParser.java:118)
at org.json.simple.parser.JSONParser.parse(JSONParser.java:81)
at org.json.simple.parser.JSONParser.parse(JSONParser.java:75)
at com.acmeair.jmeter.functions.UpdateCustomerFunction.processJSonString(UpdateCustomerFunction.java:81)
at com.acmeair.jmeter.functions.UpdateCustomerFunction.execute(UpdateCustomerFunction.java:52)
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:142)
at org.apache.jmeter.engine.util.CompoundVariable.execute(CompoundVariable.java:118)
at org.apache.jmeter.testelement.property.FunctionProperty.getStringValue(FunctionProperty.java:101)
at org.apache.jmeter.testelement.AbstractTestElement.getPropertyAsString(AbstractTestElement.java:274)
at org.apache.jmeter.config.Argument.getValue(Argument.java:146)
at org.apache.jmeter.protocol.http.util.HTTPArgument.getEncodedValue(HTTPArgument.java:236)
at org.apache.jmeter.protocol.http.util.HTTPArgument.getEncodedValue(HTTPArgument.java:217)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sendPostData(HTTPHC4Impl.java:1113)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.handleMethod(HTTPHC4Impl.java:453)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:329)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:74)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1146)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1135)
at org.apache.jmeter.threads.JMeterThread.process_sampler(JMeterThread.java:434)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:261)
at java.lang.Thread.run(Thread.java:812)
However, when I change initialOptLevel=hot to optLevel=hot, the crash goes away, so this is probably caused by a recompilation issue.
Although the verbose log only shows one compilation, enabling traceFull produces two logs, which further supports this idea.
Can you please specify which architecture you're running on? This is a very tiny method [1] so it will be very obvious if we screwed something up in there.
It's on x86-64
@andrewcraik FYI. @dmitry-ten can you please provide a trace log of the method?
Which log is which?
The only difference between these two is that lengthInternal.1.txt is a hot compile with GCR while lengthInternal.2.txt is a plain hot compile. lengthInternal.2.txt is a trivial indirect load from the parameter register and a return. lengthInternal.1.txt is the same, but with the added complexity of GCR trees to trip a recompile. If lengthInternal.1.txt is causing the problem, try running with disableGuardedCountingRecompilation and see if it starts passing - if so, there may be some weird interaction between the mainline and the recomp helper.
Running with disableGuardedCountingRecompilation made the crash disappear.
So, I asked @mpirvu, and he said that we shouldn't even have GCR trees in a hot compilation.
Not sure if this is related to the problem.
The GCR trees are inserted because of this code
else if (jitConfig->javaVM->phase != J9VM_PHASE_NOT_STARTUP &&
details.isOrdinaryMethod() &&
!options->getOption(TR_DisableNoServerDuringStartup))
{
options->setOption(TR_NoOptServer);
reducedWarm = true;
// These guys should be compiled with GCR hooks so that we get the throughput back
options->setInsertGCRTrees();
}
that attempts to do cheaper compilations during start-up. This is not a common scenario because the JIT refrains as much as possible from doing hot compilations during start-up; however, it is possible. GCR is not well tested in the context of hot compilations, but there is no reason why it shouldn't work.
Still haven't figured out the problem, but, apparently, all GCR compilations will always be recompiled at warm, according to this code in control/CompilationController.cpp:
if (recomp->_bodyInfo->getUsesGCR())
{
level = warm; // GCR recompilations should be performed at warm
// If a GCR count was specified, used that
if (options->getGCRCount() > 0)
{
countValue = options->getGCRCount();
}
else // find the count corresponding to the warm level (or next available)
{
countValue = options->getCountValue(mayHaveLoops, (TR_Hotness) level);
if (countValue < 0)
{
// Last resort: use some sensible values
countValue = mayHaveLoops ? options->getInitialBCount() : options->getInitialCount();
}
}
}
So it doesn't make sense to have GCR enabled at levels above cold.
I found the reason for this crash. This is the original IL:

n47n bstore <gcr-patch-point>[#565 Static] [flags 0x400301 0x40 ] [0x7fc849947a40] bci=[-1,0,1170] rc=0 vc=0 vn=- li=- udi=- nc=1
n46n   bconst 2 [0x7fc8499479f0] bci=[-1,0,1170] rc=1 vc=0 vn=- li=- udi=- nc=0

After PRE:

n47n istore <gcr-patch-point>[#565 Static] [flags 0x400301 0x40 ] [0x7fc849947a40] bci=[-1,0,1170] rc=0 vc=156 vn=- li=6 udi=- nc=1
n46n   bconst 2 (X!=0 X>=0 ) [0x7fc8499479f0] bci=[-1,0,1170] rc=1 vc=156 vn=- li=-1 udi=- nc=0 flg=0x104

PRE rewrites the bstore as an istore of the same byte constant, so the constant 2 ends up written to the wrong address.
I verified that with disablePRE the crash goes away.
Maybe someone who knows more about this optimization can help figure out why it does this transformation.
So I wasn't following this, but we need a traceFull,tracePRE log. The most worrying thing here is that the O^O transformations do not mention the bstore, and yet the bstore is converted to an istore without checking the child constant types. Likely a nasty issue. @cathyzhyi could you take a look - it shouldn't be a hard fix once you find the right spot in PRE.
If 9470 is merged it should be undone once this is fixed.
@dmitry-ten could you get a log with tracePRE and traceFull?
I see that the problem appears after PRE node numbering.
I think the problem is here:
https://github.com/eclipse/omr/blob/b8634d51d53c1018a8abe934a50fd35fa7df514f/compiler/optimizer/LocalAnalysis.cpp#L541-L545
The API opCodeForDirectLoad is meant for field shadow only. We should be calling the version in the OMR layer for this particular case.
It is using the version in the OMR layer. That code is in OMR. The fact that OpenJ9 overrides that API and returns something different is another story.
Actually, because IL is an extensible class, we are calling the OpenJ9 implementation here.
Right, I understand that is what is happening in OpenJ9; however, looking at it from the OMR perspective (which is where the code lives), the code is correct. It asks the IL class: given this data type, what is the opcode to do a direct load? OpenJ9 overrides this API, so when the same code is executed within a Java method compilation it returns something different.
How do you propose the OMR code should be altered? I don't see anywhere in the codebase where we call OMR::IL::opCodeForDirectLoad explicitly.
So I think this could go one of two ways. One is that the OpenJ9 override is not warranted, and whatever cases it was papering over will need to be 'fixed' in OpenJ9 so the override can be removed and this opt works correctly. The other is that the override is warranted, but the API is not sufficiently rich to let us determine the correct conversion (e.g. in this case we want the matching type, not a widened type). This latter case would be a problem in OMR's API design that would need to be addressed.
Now both of these are going to take some time, and this is causing a real functional issue for a major consumer of OMR downstream, so if these are the real solutions under consideration, we also need to consider the right tactical fix. I don't know if that fix belongs in OMR or OpenJ9, but I think we need to be open to something not too tasteful in the short term, pending a proper resolution which may be a fair bit more invasive.
@cathyzhyi is going to look at the override and see if it is warranted etc. Once she reports back we can decide on a plan of action.
So here is the current state of the opCodeFor APIs:

- The opCodeFor(In)directArrayLoad/store and opCodeForCorresponding(In)directStore/Load APIs are only used in PRE and FieldPrivatizer, and were created because of the array bload/bstorei opcodes. However, I feel this group of APIs is redundant; we could have just used the non-array APIs, since array shadows are easy to tell apart.
- The opCodeFor*store/load APIs currently all do "widening to int" magic.

I think the main point of the problem is that, putting array access aside, there is currently no distinction between callers that need the exact-type opcode and callers that want the widened one: optimizations are calling the int-widening version, which is functionally correct most of the time, until we hit the current issue.
My proposal:

- Keep the opCodeFor* APIs but don't do the widening magic in them, because the APIs don't have enough information.
- Add a new API, dataTypeToNodeType(TR::DataType), that does the widening.
- Remove the opCodeFor(In)directArrayLoad/store APIs.
- Callers that want the widened type call dataTypeToNodeType before passing to opCodeFor*store/load; callers that want the exact type call opCodeFor*store/load directly.

@andrewcraik Hey Andrew, could you share your thoughts on this?
@cathyzhyi do you see the new dataTypeToNodeType living in OMR? If so we need to discuss this with the OMR community before we can settle on a design. I guess the uses of dataTypeToNodeType will be primarily confined to ilgen - do you see a need for it in any optimizations?
The use of the other APIs seems reasonable. I guess my main concern with the added complexity of having to call the right API for the right origin is whether we can design the API to minimize misuse - e.g. dataTypeToNodeType should take a symref or something like that rather than just a type, so that it is hard to pass in the type from a node.
Overall I think it seems reasonable once we flush out the above details.
do you see the new dataTypeToNodeType living in OMR? If so we need to discuss this with the OMR community before we can settle on a design. I guess the uses of dataTypeToNodeType will be primarily confined to ilgen - do you see a need for it in any optimizations?
Actually, there are some cases where opts cook up nodes from symbols. EA and LoopVersioner do that. Now I realize it's more common than I thought. Given this, I think we need to discuss with the OMR community as well.
I guess my main concern with the added complexity of having to call the right API for the right origin is if we can design the API to minimize misuse
Yes, I agree. However, some use cases might use the descriptor string to determine the type, as in this case: https://github.com/eclipse/openj9/blob/4cbf03a3e89509b42424a4e3fa768f9c896ffe5f/runtime/compiler/ilgen/Walker.cpp#L5695. Currently, I've only found one such use.