Nixpkgs: buildFHSUserEnv chroot, openjdk, statically-linked libraries in .so's, clang, and the procedure linkage table

Created on 3 Jan 2017 · 16 Comments · Source: NixOS/nixpkgs

After looking around a java+JNI library with gdb while trying to debug the build of Android Open Source Project, I got to this point:

 >│10749       Unique_SSL_CTX sslCtx(SSL_CTX_new(SSLv23_method()));
  │10750       Unique_SSL ssl(SSL_new(sslCtx.get()));

which in assembly is

 >│0x3ffcd67fd50 <NativeCrypto_get_cipher_names(_JNIEnv*, _jclass*, _jstring*)+64>         callq  0x3ffcd669060 <SSLv23_method@plt>

Now, SSLv23_method is statically linked, but the code somehow ends up ignoring the statically linked copy and goes straight to the dynamic version:

(gdb) info address SSLv23_method
Symbol "SSLv23_method" is a function at address 0x3ffb9ad7eb0.
(gdb) info address SSLv23_method@plt
Symbol "SSLv23_method@plt" is at 0x3ffb99f9060 in a file compiled without debugging.

The compile/link command is the following:

[ 83% 5/6] /bin/bash -c "prebuilts/clang/host/linux-x86/clang-copperhead/bin/clang++ -Wl,-rpath-link=out/host/linux-x86/obj/lib -Wl,-rpath,\\\$ORIGIN/../lib64 -Wl,-rpath,\\\$ORIGIN/lib64 -shared -Wl,-soname,libconscrypt_openjdk_jni.so  -Lout/host/linux-x86/obj/lib    -m64 -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--no-undefined-version    --gcc-toolchain=prebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.15-4.8 --sysroot prebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.15-4.8/sysroot -Bprebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.15-4.8/x86_64-linux/bin -Bprebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.15-4.8/lib/gcc/x86_64-linux/4.8 -Lprebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.15-4.8/lib/gcc/x86_64-linux/4.8 -Lprebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.15-4.8/x86_64-linux/lib64/ -target x86_64-linux-gnu    -nodefaultlibs      out/host/linux-x86/obj/SHARED_LIBRARIES/libconscrypt_openjdk_jni_intermediates/src/main/native/org_conscrypt_NativeCrypto.o out/host/linux-x86/obj/SHARED_LIBRARIES/libconscrypt_openjdk_jni_intermediates/src/openjdk/native/JNIHelp.o               -Wl,--whole-archive  out/host/linux-x86/obj/STATIC_LIBRARIES/libcrypto_static_intermediates/libcrypto_static.a out/host/linux-x86/obj/STATIC_LIBRARIES/libssl_static-host_intermediates/libssl_static-host.a -Wl,--no-whole-archive   out/host/linux-x86/obj/STATIC_LIBRARIES/libc++_static_intermediates/libc++_static.a out/host/linux-x86/obj/STATIC_LIBRARIES/libcompiler_rt-extras_intermediates/libcompiler_rt-extras.a      -o out/host/linux-x86/obj/lib/libconscrypt_openjdk_jni.so  -lpthread  -lpthread -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc"

Hypothesis: ld-linux and bundled ld mismatch

Using strace, it appears that the prebuilt clang calls its own bundled copy of gcc's ld directly.

Could the lack of using ld-wrapper be causing this? Could patchelf? Could this be an insufficiently specified chroot environment in build-fhs-userenv/env.nix?

Result of ldd on the problem library:

android-env-chrootenv:root@spaceserv:/home/spacekitteh/code/copperhead/nougat-mr1-release# ldd out/host/linux-x86/lib64/libconscrypt_openjdk_jni.so
        linux-vdso.so.1 (0x000003ff8adca000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x0000038f24d91000)
        libm.so.6 => /usr/lib/libm.so.6 (0x0000038f24a8c000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x0000038f24876000)
        libc.so.6 => /usr/lib/libc.so.6 (0x0000038f244d8000)
        /nix/store/aym0jj00gal73w59gp2ifkwqlbf7s98h-glibc-multi-2.24/lib/ld-linux-x86-64.so.2 (0x000002b925384000)
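To make the mismatch in that output easier to see, here is a minimal sketch that groups the `ldd` lines by where each library was resolved. The helper names (`parse_ldd`, `origin`, `group_by_origin`) and the three category labels are illustrative assumptions, not part of any Nix or glibc tooling; the sample text is the `ldd` output pasted above.

```python
# Sketch: group ldd output by where each library was resolved.
# All function names and category labels here are hypothetical.

SAMPLE = """\
        linux-vdso.so.1 (0x000003ff8adca000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x0000038f24d91000)
        libm.so.6 => /usr/lib/libm.so.6 (0x0000038f24a8c000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x0000038f24876000)
        libc.so.6 => /usr/lib/libc.so.6 (0x0000038f244d8000)
        /nix/store/aym0jj00gal73w59gp2ifkwqlbf7s98h-glibc-multi-2.24/lib/ld-linux-x86-64.so.2 (0x000002b925384000)
"""


def parse_ldd(output):
    """Return (soname, resolved_path) pairs from ldd output."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing "(0x...)" load address, if present.
        body = line.rsplit(" (", 1)[0]
        if " => " in body:
            soname, path = body.split(" => ", 1)
        else:
            # Lines without "=>" (vdso, the interpreter) carry a single name.
            soname = path = body
        entries.append((soname.strip(), path.strip()))
    return entries


def origin(path):
    """Classify a resolved path by the tree it came from."""
    if path == "not found":
        return "missing"
    if path.startswith("/nix/store/"):
        return "nix-store"
    if path.startswith("/"):
        return "fhs"
    return "virtual"  # e.g. linux-vdso.so.1 has no file on disk


def group_by_origin(output):
    """Map each origin label to the list of paths resolved from it."""
    groups = {}
    for _soname, path in parse_ldd(output):
        groups.setdefault(origin(path), []).append(path)
    return groups


if __name__ == "__main__":
    for label, paths in sorted(group_by_origin(SAMPLE).items()):
        print(label)
        for p in paths:
            print("   ", p)
```

On the sample, the four ordinary libraries come from the chroot's `/usr/lib` while the interpreter comes from a `/nix/store` path, which is the split the hypothesis above is asking about.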

Hypothesis: Java's JNI gets confused by the chroot

Or could it have to do with how Java's JNI handles static vs. shared libs?

For reference, here is part of the result of running ldd `which java`:

   libcrypto.so.1.0.0 => /nix/store/f7vp024rwfnmzm17p3c6ji3zc79a5vyr-openssl-1.0.2j/lib/libcrypto.so.1.0.0 (0x000003be33b05000)

Related StackOverflow question

From the JNI spec

Once a native library is loaded, it is visible from all class loaders. Therefore two classes in different class loaders may link with the same native method. This leads to two problems:

  • A class may mistakenly link with native libraries loaded by a class with the same name in a different class loader.
  • Native methods can easily mix classes from different class loaders. This breaks the name space separation offered by class loaders, and leads to type safety problems.

background:
https://github.com/NixOS/nixpkgs/issues/21222#issuecomment-269198363

https://github.com/google/conscrypt/issues/20

All 16 comments

Here's where the BoringSSL library is (or at least, should be) statically linked into the libconscrypt_openjdk_jni.so file, which should then be used instead of openjdk loading openssl from the host.

Gonna tag peeps who know about the core infrastructure and buildFHSUserEnv.

@edolstra @fpletz @vcunat @joachifm @abbradar

Can this be reproduced without building all of AOSP? I've successfully built and run the unit tests for Conscrypt (standalone) in a chrootenv. I don't remember anything specific that we do which could cause this (to my knowledge, symbols can't be overridden with anything short of LD_PRELOAD -- and I hadn't even imagined that _statically linked_ PIC code could be affected by this!).

Yep. Try make BasicDreams

Hm, that's better, but I still don't have the space needed to download all of AOSP right now (I could meddle with the manifests, but I don't have that much time).

Interesting fact: Arch Linux's Java doesn't reference libcrypto:

[root@e7e065246fe8 /]# ldd `which java`
    linux-vdso.so.1 (0x00007ffc5e96b000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fc623d26000)
    libz.so.1 => /usr/lib/libz.so.1 (0x00007fc623b10000)
    libjli.so => /usr/lib/jvm/java-8-openjdk/jre/bin/../lib/amd64/jli/libjli.so (0x00007fc623902000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc6236fe000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fc623360000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc623f43000)

I think they link more things statically than we do.

Oh, I think that's because we link to GTK+ in non-headless version.

Can you try to use jre8_headless in the chrootenv?

Oh hey! Using jre8_headless works! <3

Now to figure out the underlying problem.

I think that's because the libcrypto already loaded into Java's address space takes priority over the symbols in Conscrypt. I thought this was impossible; my theory is that the GOT is filled from the same global namespace even for statically linked symbols when they are compiled as PIC. Strange!
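The theory can be sketched as a toy model of symbol lookup. This is not how ld.so is implemented; it only assumes that a call through the PLT from a dlopen'ed library is bound to the first definition found in the already-loaded global scope, and falls back to the library's own copy only if nothing there defines the symbol. The `SharedObject` class, the `bind_locally` flag (standing in for something like linking with `-Bsymbolic` or hiding the symbols), and the "host openssl" label are all hypothetical.

```python
# Toy model of the theory in this thread, not of the real dynamic linker.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SharedObject:
    name: str
    defines: frozenset
    # Hypothetical flag: the library binds its own references internally.
    bind_locally: bool = False


def resolve(symbol, referencing, global_scope):
    """Return the name of the object a PLT call from `referencing` binds to."""
    if referencing.bind_locally and symbol in referencing.defines:
        return referencing.name
    # Assumed rule: objects already in the global scope are searched first.
    for obj in global_scope:
        if symbol in obj.defines:
            return obj.name
    # Only then is the library's own (statically linked) copy considered.
    if symbol in referencing.defines:
        return referencing.name
    return None


java = SharedObject("java", frozenset())
host_openssl = SharedObject("host openssl", frozenset({"SSLv23_method"}))
conscrypt = SharedObject(
    "libconscrypt_openjdk_jni.so", frozenset({"SSLv23_method"})
)

if __name__ == "__main__":
    # Non-headless JRE: host OpenSSL is already in the process.
    print(resolve("SSLv23_method", conscrypt, [java, host_openssl]))
    # Headless JRE: nothing else defines the symbol.
    print(resolve("SSLv23_method", conscrypt, [java]))
    # Library that binds its own symbols internally.
    local = replace(conscrypt, bind_locally=True)
    print(resolve("SSLv23_method", local, [java, host_openssl]))
```

Under these assumptions the model reproduces both observations from the thread: with host OpenSSL in the process the call lands in the host copy, and with jre8_headless it lands in the copy linked into the JNI library.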

Nikolay.

Yeah, that's what I was gonna explore as my third hypothesis, as it's a combination of the first two!

What do you think causes this? As in, which upstream?

Depends on what you perceive as the problem. I think the best course of action is to make our dynamic linking more granular, so that Java as a whole doesn't link to e.g. fontconfig or GTK, only the library that actually uses it. This will be a bit tedious, but that's what Java library authors seem to expect.

The root of the problem, if the theory is correct, needs design fixes for ld.so so that statically linked symbols can't be overridden like this. This is a bit more difficult to explore, implement and upstream ~_^.

Nikolay.

Weird to think we discovered a bug in such a commonly used component.

Seems like it could be a pretty easy thing to exploit!

I don't think it's a bug; rather, it's a design choice (treat static and dynamic libraries as evenly as possible). That can be very evil, however, as we can see, especially in the presence of inlining. That is, again, if the theory is correct.

Nikolay.

Hm, I don't see ways for this to be exploited any more than dynamic libraries - what do you mean?

Nikolay.

Hmm, I guess you're right; you'd need LD_PRELOAD or something similar.
