Openj9: NullPointerException not thrown due to a codegen bug on Z and Power

Created on 24 May 2019  Â·  2Comments  Â·  Source: eclipse/openj9

While running an internal test a JUnit exception is thrown with an assertion that a NullPointerException was not thrown when it was expected to be. I've reduced the problem down to a simple unit test as follows which exposes the bug on Z and Power:

import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.spi.AbstractSelectionKey;

class Test
{
    private static class FakeSelectionKey extends AbstractSelectionKey
    {
        FakeSelectionKey()
        {
            super();
        }

        public SelectableChannel channel()
        {
            return null;
        }

        public Selector selector()
        {
            return null;
        }

        public int interestOps()
        {
            return 0;
        }

        public SelectionKey interestOps(int arg0)
        {
            return null;
        }

        public int readyOps()
        {
            return 0;
        }
    }

    public static void main(String[] args)
    {
        for (int i = 0; i < Integer.parseInt(args[0]); ++i)
        {
            foo();
        }
    }

    public static void foo()
    {
        FakeSelectionKey fakeKey = new FakeSelectionKey();

        try
        {
            fakeKey.cancel();
            System.out.println("Error: Did not throw NullPointerException");
            System.exit(1);
        }
        catch (NullPointerException e)
        {
            // Passed!
        }
    }
}

You can run the unit test via:

java -Xcompressedrefs -Xjit:'count=1,disableAsyncCompilation,limit={java/nio/channels/spi/AbstractSelectionKey.cancel()V}' -Xgcpolicy:gencon Test 1000000000

However it will not reproduce with only the command above. The reason is because we need nextGenHCR [1] to kick in but it is currently disabled at startup because of performance issues. Because this is a unit test we end up defaulting back to the old HCR which has an extra call in the cold path which prevents the bug from being exposed. To fix this do either of the following two:

  1. export TR_DisableHCR=1 then re-run the test
  2. Rebuild the JIT with the line in [2] commented out then re-run the test

[1] https://github.com/eclipse/openj9/blob/master/doc/compiler/hcr/NextGenHCR.md
[2] https://github.com/eclipse/openj9/blob/28dd8748bf66a51c030949ea93286e3794478662/runtime/compiler/control/CompilationThread.cpp#L7376

power z bug jit

Most helpful comment

@fjeremic @r30shah We checked the log files and found out that the first child of checkcastAndNullCHK would be optimized to aconst NULL in some cases. And then we traced the checkcastAndNullCHK evaluator to the function calculateInstanceOfOrCheckCastSequences which decides if the NullTest should be performed. We found that for the aconst NULL case, the code doesn’t decide to do the NullTest. In that case, we added two lines of codes to add the NullTest if the current op code is checkcastAndNullCHK and the first child is “Null” (the first child is “Null” when the isNull() returns true). After the code change, it will throw the NullPointerException in the NullTest when the first child is Null.

Following is the trace log for checkcastAndNULLCHK before the code change, where the first child is aconst NULL and the instructions below don't do the NULL check, therefore not throwing the NullPointerException.

============================================================
; Live regs: GPR=0 FPR=0 VRF=0 {}
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ff71637220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  1)    aconst NULL (X==0 X>=0 sharedMemory )                                              [     0x3ff716b43f0] bci=[-1,20,89] rc=1 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  1)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ff716371d0] bci=[-1,23,89] rc=1 vc=903 vn=- li=- udi=- nc=0
------------------------------
checkcastAndNULLCHK:Maximum Profiled Classes = 3
Outline Super Class Test: 0
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ff71637220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  0)    aconst NULL (in GPR_0242) (X==0 X>=0 sharedMemory )                                [     0x3ff716b43f0] bci=[-1,20,89] rc=0 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  0)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ff716371d0] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=0
------------------------------

 [     0x3ff71887b80]   LGHI    GPR_0242,0x0
 [     0x3ff71888400]   ASSOCREGS
 [     0x3ff71887e30]   Label L0064:
 POST:
 {AssignAny:GPR_0242:R}

After the code change, we add the code (930adf0627b3b38697b029047ec69c25510280d1) to take the NullTest in this case. Following is the trace log after code change:

============================================================
; Live regs: GPR=0 FPR=0 VRF=0 {}
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ffb4337220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  1)    aconst NULL (X==0 X>=0 sharedMemory )                                              [     0x3ffb43b43f0] bci=[-1,20,89] rc=1 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  1)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ffb43371d0] bci=[-1,23,89] rc=1 vc=903 vn=- li=- udi=- nc=0
------------------------------
checkcastAndNULLCHK:Maximum Profiled Classes = 3
Outline Super Class Test: 0
checkcastAndNULLCHK: Emitting NullTest
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ffb4337220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  0)    aconst NULL (in GPR_0242) (X==0 X>=0 sharedMemory )                                [     0x3ffb43b43f0] bci=[-1,20,89] rc=0 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  0)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ffb43371d0] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=0
------------------------------

 [     0x3ffb4587b80]   LGHI    GPR_0242,0x0
 [     0x3ffb4587c80]   CGIT    GPR_0242,0,BH(mask=0x8),
 [     0x3ffb4588510]   ASSOCREGS
 [     0x3ffb4587f40]   Label L0064:
 POST:
 {AssignAny:GPR_0242:R}

All 2 comments

@fjeremic @r30shah We checked the log files and found out that the first child of checkcastAndNullCHK would be optimized to aconst NULL in some cases. And then we traced the checkcastAndNullCHK evaluator to the function calculateInstanceOfOrCheckCastSequences which decides if the NullTest should be performed. We found that for the aconst NULL case, the code doesn’t decide to do the NullTest. In that case, we added two lines of codes to add the NullTest if the current op code is checkcastAndNullCHK and the first child is “Null” (the first child is “Null” when the isNull() returns true). After the code change, it will throw the NullPointerException in the NullTest when the first child is Null.

Following is the trace log for checkcastAndNULLCHK before the code change, where the first child is aconst NULL and the instructions below don't do the NULL check, therefore not throwing the NullPointerException.

============================================================
; Live regs: GPR=0 FPR=0 VRF=0 {}
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ff71637220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  1)    aconst NULL (X==0 X>=0 sharedMemory )                                              [     0x3ff716b43f0] bci=[-1,20,89] rc=1 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  1)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ff716371d0] bci=[-1,23,89] rc=1 vc=903 vn=- li=- udi=- nc=0
------------------------------
checkcastAndNULLCHK:Maximum Profiled Classes = 3
Outline Super Class Test: 0
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ff71637220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  0)    aconst NULL (in GPR_0242) (X==0 X>=0 sharedMemory )                                [     0x3ff716b43f0] bci=[-1,20,89] rc=0 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  0)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ff716371d0] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=0
------------------------------

 [     0x3ff71887b80]   LGHI    GPR_0242,0x0
 [     0x3ff71888400]   ASSOCREGS
 [     0x3ff71887e30]   Label L0064:
 POST:
 {AssignAny:GPR_0242:R}

After the code change, we add the code (930adf0627b3b38697b029047ec69c25510280d1) to take the NullTest in this case. Following is the trace log after code change:

============================================================
; Live regs: GPR=0 FPR=0 VRF=0 {}
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ffb4337220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  1)    aconst NULL (X==0 X>=0 sharedMemory )                                              [     0x3ffb43b43f0] bci=[-1,20,89] rc=1 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  1)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ffb43371d0] bci=[-1,23,89] rc=1 vc=903 vn=- li=- udi=- nc=0
------------------------------
checkcastAndNULLCHK:Maximum Profiled Classes = 3
Outline Super Class Test: 0
checkcastAndNULLCHK: Emitting NullTest
------------------------------
 n43n     (  0)  checkcastAndNULLCHK [#84]                                                            [     0x3ffb4337220] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=2
 n167n    (  0)    aconst NULL (in GPR_0242) (X==0 X>=0 sharedMemory )                                [     0x3ffb43b43f0] bci=[-1,20,89] rc=0 vc=903 vn=- li=- udi=- nc=0 flg=0x102
 n42n     (  0)    loadaddr  java/nio/channels/spi/AbstractSelector[#366  Static] [flags 0x18307 0x0 ]  [     0x3ffb43371d0] bci=[-1,23,89] rc=0 vc=903 vn=- li=- udi=- nc=0
------------------------------

 [     0x3ffb4587b80]   LGHI    GPR_0242,0x0
 [     0x3ffb4587c80]   CGIT    GPR_0242,0,BH(mask=0x8),
 [     0x3ffb4588510]   ASSOCREGS
 [     0x3ffb4587f40]   Label L0064:
 POST:
 {AssignAny:GPR_0242:R}

@0xdaryl @Leonardo2718 FYI another use case for Tril on OpenJ9. This would have definitely been caught in primitive IL tests where we pass constants to evaluators. In this case an aconst NULL child of a checkcastAndNULLCHK

Was this page helpful?
0 / 5 - 0 ratings

Related issues

JamesKingdon picture JamesKingdon  Â·  5Comments

0xdaryl picture 0xdaryl  Â·  3Comments

DanHeidinga picture DanHeidinga  Â·  4Comments

xliang6 picture xliang6  Â·  3Comments

VermaSh picture VermaSh  Â·  3Comments