Openj9: StringBuilder.toString() retains null characters of initial size when accessing String "value" via Unsafe

Created on 28 May 2018  路  13Comments  路  Source: eclipse/openj9

Hi OpenJ9 team.

I encountered a problem when trying to use the Akka toolkit together with OpenJ9.
Akka uses Unsafe in order to efficiently (I guess) access/copy String values and when used in OpenJ9 throws an ArrayIndexOutOfBoundsException here.

I debugged a little deeper and found out that the cause of this is that the OpenJ9's StringBuilder behaves differently when instantiating it with an initial size: In the String's value the SB builds it retains the not written chars instead of removing them (what seems to be the behavior of the default OpenJDK behavior).

I wrote this little reproducer to show it:

public class Reproducer {

    public static void main(String... args) throws NoSuchFieldException, IllegalAccessException {

        final StringBuilder sb = new StringBuilder(4);
        final String strToAppend = "foo";
        sb.append(strToAppend);
        final String s = sb.toString();

        System.out.println("Expected string length: " + strToAppend.length());
        System.out.println("Expected char[] length: " + strToAppend.toCharArray().length);

        System.out.println("Actual length: " + s.length());
        System.out.println("Actual char[] length: " + s.toCharArray().length);

        sun.misc.Unsafe unsafe = null;
        for (Field field : sun.misc.Unsafe.class.getDeclaredFields()) {
            if (field.getType() == sun.misc.Unsafe.class) {
                field.setAccessible(true);
                unsafe = (sun.misc.Unsafe) field.get(null);
                break;
            }
        }
        if (unsafe == null) throw new IllegalStateException("Can't find instance of sun.misc.Unsafe");
        final long stringValueFieldOffset = unsafe.objectFieldOffset(String.class.getDeclaredField("value"));
        final char[] chars = (char[]) unsafe.getObject(s, stringValueFieldOffset);

        System.out.println("String.value char[] via Unsafe length: " + chars.length);
    }
}

This is the output of a OpenJDK 8 Hotspot JVM (what I would expect):

C:\apps\Java\jdk1.8.0_152\bin\java.exe

Expected string length: 3
Expected char[] length: 3
Actual length: 3
Actual char[] length: 3
String.value char[] via Unsafe length: 3

This is the output of a OpenJDK 8 OpenJ9 JVM:

C:\apps\Java\openj9-jdk8u172-b11\bin\java.exe

Expected string length: 3
Expected char[] length: 3
Actual length: 3
Actual char[] length: 3
String.value char[] via Unsafe length: 4

I see the same behavior also on Linux.

I saw that in 0.9.0 there were already some changes to StringBuilder - so maybe this is already addressed there.

I used the latest Windows build from adoptopenjdk.net which was from: 2 May 2018

vm userRaised

All 13 comments

The changes for 0.9.0 won't give the same behavior as OpenJDK for Java 8, but they will for Java 9 and later.

Though I would add that, in general, using Unsafe to access the internal implementation of any class which is not part of your application is a fragile design - the internal implementation of any class in the Java Class Library can change at any time and there are some classes (such as java/lang/Class or java/lang/Thread) where different Java implementations will have different internal implementations of the public API. As @pshipton stated recent changes to OpenJ9 will mean the internal representation of StringBuilders should be consistent between the two Java implementations for Java 9 and later for the time being. I would be interested in understating what performance problem lead you to do this kind of internal representation inspection?

I really don't know - I am just using the Akka toolkit which does that stuff in its Unsafe.copyUSAsciiStrToBytes(String,byte[]) method.

So OpenJ9 will have no focus in being also consistent for Java 8 in that point?

Changing it for Java 8 may cause a change in performance for existing users of OpenJ9 Java 8. I don't think we should change the default behavior. There is a system property -Djava.lang.string.create.unique=true which almost gives want you want, but not quite. We could add another system property or add more states to the existing one.

If the Akka implementation is changed from:

https://github.com/akka/akka/blob/1919f222fa849ea155e2246e814351517eac07e2/akka-actor/src/main/scala/akka/util/Unsafe.java#L46-L57

to something like this I believe things should work with OpenJ9 on Java 8:

    public static void copyUSAsciiStrToBytes(String str, byte[] bytes) {
        if (isJavaVersion9Plus) {
            final byte[] chars = (byte[]) instance.getObject(str, stringValueFieldOffset);
            System.arraycopy(chars, 0, bytes, 0, chars.length);
        } else {
            final char[] chars = (char[]) instance.getObject(str, stringValueFieldOffset);
            int i = 0;
            while (i < str.length()) {
                bytes[i] = (byte) chars[i++];
            }
        }
    }

Note the change in the while loop. We can propose the fix to akka to make it more JVM agnostic.

Sure, we could try that - would be good coming from another JVM "implementer" I guess 馃憤

Sure, we could try that - would be good coming from another JVM "implementer" I guess 馃憤

I suppose before we do that are you able to make this change and test that it fixes the problem you're observing?

I tried for the last 20 minutes to get Akka compiling on my windows machine behind http proxy (corporate network).
Without luck so far - I'll keep on trying a little further, but doesn't look so good ;)

I tried for the last 20 minutes to get Akka compiling on my windows machine behind http proxy (corporate network).

On Java 8 you should be able to just simply compile that akka/util/Unsafe.java file by itself and bootclasspath it to your JVM when you invoke your akka workload.

I'm afraid that I'm too stupid - I tried several hours adding the modified Unsafe.class to the classpath or bootclasspath but it never is used.
I however assume that the modification would be working from what I observed in the reproducer.

I've proposed a fix in https://github.com/akka/akka/pull/25165 which should clean things up for OpenJ9. I'll leave this open until https://github.com/akka/akka/pull/25165 is merged at which point we can close this issue.

https://github.com/akka/akka/issues/25161 has been merged. This is fixed in the upstream akka repo. Please close the issue if the fix is satisfactory. Thanks!

Great, thanks so much for the help and the PR at the correct place.

Was this page helpful?
0 / 5 - 0 ratings