Nixpkgs: cycle detected in Darwin->Linux cross GCC

Created on 20 May 2020 · 30 Comments · Source: NixOS/nixpkgs

Describe the bug
When building a GCC for cross compiling from Darwin to Linux, the build fails with a "cycle detected" error.

To Reproduce

  1. Check out Nixpkgs staging on a Darwin box.
  2. nix-build . --arg crossSystem '(import ./. {}).lib.systems.examples.gnu64' -A stdenv.cc

Expected behavior
GCC compiles successfully, and can be used to produce Linux binaries.

Screenshots

cycle detected in the references of '/nix/store/51hmxfsm1s1089vcl6wxsm1vb6r0kar8-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.3.0' from '/nix/store/73vjzvxk0430rn4njhv09jnmqrz99h26-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.3.0-lib'

Additional context
The cycle:

  • 51hmxfsm1s1089vcl6wxsm1vb6r0kar8-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.3.0/nix-support/propagated-build-inputs includes 73vjzvxk0430rn4njhv09jnmqrz99h26-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.3.0-lib
  • 73vjzvxk0430rn4njhv09jnmqrz99h26-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.3.0-lib/x86_64-unknown-linux-gnu/lib/lib{asan,tsan,lsan,ubsan}.so* refer to 51hmxfsm1s1089vcl6wxsm1vb6r0kar8-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.3.0
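
One way to see which files produce such a back-reference is to search an output for the hash part of the other store path, since Nix detects references by scanning for those hashes. The following is only a rough sketch: `files_referring_to` is a hypothetical helper name, and a fixed-string grep merely approximates Nix's scanner.

```shell
#!/bin/sh
# Sketch: list the files below a directory that mention a given store hash.
# files_referring_to is a hypothetical helper, not part of Nix or Nixpkgs.
#   $1: directory to search (for example the -lib output)
#   $2: hash part of the store path being looked for
files_referring_to() {
    # -r: recurse, -l: print file names only, -F: treat the hash as a fixed string
    grep -rlF -- "$2" "$1"
}

# Example invocation (store path shortened here, use the full path on your machine):
#   files_referring_to /nix/store/73vjzvxk...-lib 51hmxfsm1s1089vcl6wxsm1vb6r0kar8
```

Run against the -lib output with the hash of the main output, this should list the sanitizer libraries named above if they are indeed the only offenders.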

Notify maintainers
@Synthetica9

Metadata
Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

  • system: "x86_64-darwin"
  • host os: Darwin 19.4.0, macOS 10.15.4
  • multi-user?: yes
  • sandbox: no
  • version: nix-env (Nix) 2.3.3
  • channels(root): "darwin, home-manager, nixpkgs-20.09pre221814.10100a97c89"
  • channels(gaelan): "darwin, home-manager, nixpkgs-20.09pre221814.10100a97c89"
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixpkgs

Channels are irrelevant, I'm using a checkout of staging.

Maintainer information:

# a list of nixpkgs attributes affected by the problem
attribute:
  - gcc-unwrapped
# a list of nixos modules affected by the problem
module:
bug cross-compilation darwin

All 30 comments

Might want to try reverting d9feea58aeded06aded50eee7e9d69ebe1086e7e

@matthewbauer I don't know if that helps, but I ran into the same issue and saw that there were errors about patchelf not being found. So I'm not sure if the commit you mentioned would do anything, but I could be wrong.

Also, using x86_64-embedded instead of gnu64 seems to work fine, so maybe that helps narrow down the issue.

That revert didn't help. I'm working on bisecting now; it should be done in a few hours.

Alright, bisect done. It was the commit right before that one (from the same PR), e1831ebea3bcb415800a74fe458f802e63549e1c.

Looping in @lopsided98, author of that commit

I haven't looked too closely into this, but I think you might have to revert d9feea5 but also change the paths slightly to account for the changed lib location in e1831eb. I'd also make sure to add a comment explaining what platforms require the RPATH patching.

Here's libasan.so.5.0.0's RPATH:

/nix/store/9m3clinl5kksvx5vl0y1hsqhjf6wga4v-glibc-2.30-x86_64-unknown-linux-gnu/lib
/nix/store/gfdvnjl2phb0dl2nqrfm33wxcxb2qkrr-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.2.0/x86_64-unknown-linux-gnu/lib/../lib64
/nix/store/gfdvnjl2phb0dl2nqrfm33wxcxb2qkrr-x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.2.0/x86_64-unknown-linux-gnu/lib
/nix/store/c5jcqksk0gybrxr759bxpsrd1xy8vh7x-swift-corefoundation/Library/Frameworks

Other than glibc, I don't think any of those need to be there: x86_64-unknown-linux-gnu-stage-final-gcc-debug-9.2.0/x86_64-unknown-linux-gnu/lib (and lib64, which is just a symlink to lib) only contain static libraries, and CoreFoundation is a complete WTF.

Interesting that you mention CoreFoundation. I had a problem cross-compiling something because of this error:

x86_64-elf-gcc: error: unrecognized command line option '-iframework'

Using nix-shell I was able to find that the flag -iframework /nix/store/zpf03j84cdp2vki1irc1zny03zf2x7lh-swift-corefoundation/Library/Frameworks ended up in the variable NIX_CFLAGS_COMPILE_FOR_TARGET. Overriding stdenv with pkgs.darwin.stdenvNoCF did the trick, but I suspect something fishy is going on if those flags that are only valid for the host are ending up in the target flags.

I'm not sure if that's related. If not, I would pursue this issue separately, but I thought I'd mention it here in case it is.

I tried reverting d9feea5 and then applying some adjustments to that portion, but I couldn't get it to work. I wanted to check if patchelf is actually called, and all I see is /nix/store/9kha971vs7n8wdasbh84gr0z7j898l5v-builder.sh: line 244: type: patchelf: not found.

I would love to help, because I also need this for cross-compiling, but I'm a bit lost. If there's anything I should try or test, let me know.

Reverting https://github.com/NixOS/nixpkgs/commit/e1831ebea3bcb415800a74fe458f802e63549e1c works after resolving the conflicts, though, even with the patchelf not found messages.

What I think is happening is that without e1831eb, the offending libraries stay in $out, so that doesn't create a dependency cycle. When they get moved to $lib they should get their RPATHs patched by the modified revert of d9feea5, but this doesn't happen because for some reason patchelf is not in PATH.

The manual sounds like it's only implicitly available on Linux:

On Linux, stdenv also includes the patchelf utility.

Maybe this dependency needs to be added when the target platform is Linux as well? I tried to get patchelf to work, but when I manually put it in the build inputs, I got error messages about files not being ELF executables. I think it tried to patch native Mach-O binaries as well. So the script probably needs to check if the targeted files are ELFs before trying to patch their rpath.
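
A minimal sketch of such a check, assuming the fixup script can read the first four bytes of each file and compare them with the ELF magic number. The names `is_elf` and `for_each_elf` are hypothetical and do not exist in Nixpkgs; the patchelf call in the usage comment is only an example.

```shell
#!/bin/sh
# Sketch: hand only ELF files to a command, skipping Mach-O and everything else.

# Succeeds if the file starts with the 4-byte ELF magic (0x7f 'E' 'L' 'F').
is_elf() {
    [ "$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')" = "7f454c46" ]
}

# Runs the given command once per ELF file below a directory.
#   $1: directory to walk; remaining arguments: command to run on each ELF file
# Assumes file names contain no newlines, which holds for store paths.
for_each_elf() {
    dir=$1
    shift
    find "$dir" -type f | while IFS= read -r f; do
        if is_elf "$f"; then
            "$@" "$f"
        fi
    done
}

# Example (requires patchelf on PATH):
#   for_each_elf "$lib" patchelf --shrink-rpath
```

With a guard like this, native Mach-O binaries in the same output would be skipped instead of causing "not an ELF executable" errors.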

Progress: I got gcc to build, but building hello and copying it to my Linux box with nix-copy-closure results in this:

Inconsistency detected by ld.so: get-dynamic-info.h: 146: elf_get_dynamic_info: Assertion `info[DT_RUNPATH] == NULL' failed!

Does /nix/store/...-glibc-2.30/lib/ld-linux-x86-64.so.2 give the same failed assertion?
Might want to dump the contents of readelf -d /nix/store/...-glibc-2.30/lib/ld-linux-x86-64.so.2. It shouldn't have any RUNPATHs/RPATHs.

@matthewbauer yes, directly invoking ld-linux-x86-64.so.2 yields the same error. readelf tells me that it has a RUNPATH of [].

readelf on a working ld-linux-x86-64.so.2 reports no RUNPATH at all, so perhaps there's a distinction between an empty RUNPATH and no RUNPATH.

I wonder if the RUNPATH should not be there in the first place or if it should be stripped to empty and then removed. As far as I can see, setting an empty RUNPATH using patchelf or shrinking unnecessary entries does not remove the entry itself, only --remove-rpath does.
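
To tell the two cases apart quickly, the output of readelf -d can be classified as in the sketch below. `classify_runpath` is a hypothetical helper name, and the sketch assumes the usual readelf line format, in which an empty entry is printed as `Library runpath: []`.

```shell
#!/bin/sh
# Sketch: classify the RUNPATH/RPATH state from `readelf -d` output on stdin.
# Prints "absent" (no entry), "empty" (entry present but []), or "set".
classify_runpath() {
    awk '
        index($0, "(RUNPATH)") || index($0, "(RPATH)") {
            found = 1
            # Any entry whose value is not the empty list [] counts as set.
            if (index($0, "[]") == 0) nonempty = 1
        }
        END {
            if (!found)        print "absent"
            else if (nonempty) print "set"
            else               print "empty"
        }
    '
}

# Example (requires readelf, and patchelf for the fix):
#   readelf -d ld-linux-x86-64.so.2 | classify_runpath
#   # if this prints "empty", drop the entry entirely:
#   patchelf --remove-rpath ld-linux-x86-64.so.2
```

An "empty" result corresponds to the broken dynamic linker described above, and "absent" to the working one.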

What I don't know yet is when this worked the last time (if ever?) and what changed since then.

As @parthy just found out, I have been working on the same issue in PR #90526. Feel free to try it and discuss the best way to fix this.

Did you manage to get around this DT_RUNPATH issue? Or have you not seen this one in your setup?

I was only compiling OS-level things, which are linked in a way where this apparently does not surface. But I can reproduce the issue.

Could this be a linker issue? When I cross-compile down to .o, then copy the object files to a Linux box and link there, everything works. The resulting binary has no runpath set. Linking on macOS using cross-bintools results in the broken runpath and the binary does not work.

After some further tests, this appears to be less a compiler problem and more a problem of the dynamic linker built as part of glibc. The same hello binary that hits the assertion when executed as is (./hello) works fine when executed with the dynamic linker that comes with my Linux Nix installation:

michael@kalyke:~ > /nix/store/x3z8hk8jwy06kkvp83jyf4adp6anq4fa-hello-2.10-x86_64-linux/bin/hello
Inconsistency detected by ld.so: get-dynamic-info.h: 146: elf_get_dynamic_info: Assertion `info[DT_RUNPATH] == NULL' failed!
michael@kalyke:~ > /nix/store/bzisrjgcvn6mfcl4wcyky9y3rjaw3smr-glibc-2.30-x86_64-linux/lib/ld-linux-x86-64.so.2 /nix/store/x3z8hk8jwy06kkvp83jyf4adp6anq4fa-hello-2.10-x86_64-linux/bin/hello
Inconsistency detected by ld.so: get-dynamic-info.h: 146: elf_get_dynamic_info: Assertion `info[DT_RUNPATH] == NULL' failed!
michael@kalyke:~ > /nix/store/qvf11lymvw6n8g66xgj1wsm28z1viqdv-glibc-2.30/lib64/ld-linux-x86-64.so.2 /nix/store/x3z8hk8jwy06kkvp83jyf4adp6anq4fa-hello-2.10-x86_64-linux/bin/hello
Hallo, Welt!

Got it working. I think we have two separate issues:

  • the cyclic dependency, which @Gaelan's 459c60dda2406d2207d2afd90f210ffc87efde54 and my PR #90526 try to address (I guess we just have to pick one solution)
  • the CoreFoundation rpath being always set (see here and here), even when cross-compiling

This rpath creeps into the dynamic linker as part of the glibc build and never gets patchelf'd away entirely, leaving an empty rpath, which then triggers the assertion. I worked around the issue with this overlay:

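(self: super: {
    # Only override when building on Darwin; other build platforms are left untouched.
    glibcCross = if super.stdenv.buildPlatform.isDarwin then (
        super.glibcCross.overrideAttrs (attrs: {
            # Keep the Darwin-only CoreFoundation rpath out of the Linux dynamic linker.
            # The leading newline keeps the command separate from the existing script.
            preConfigure = (attrs.preConfigure or "") + "\nunset NIX_COREFOUNDATION_RPATH\n";
        })
    ) else super.glibcCross;
})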

A proper fix might be to set NIX_COREFOUNDATION_RPATH in the Darwin stdenv's preHook only when the host platform is Darwin?

@mroi, you are my hero ;) I temporarily added this unset statement like so and it makes my compilation results useful:

diff --git a/pkgs/development/libraries/glibc/common.nix b/pkgs/development/libraries/glibc/common.nix
index 36b6bea61cd..4d4d5842f5a 100644
--- a/pkgs/development/libraries/glibc/common.nix
+++ b/pkgs/development/libraries/glibc/common.nix
@@ -208,6 +208,8 @@ stdenv.mkDerivation ({


   '' + lib.optionalString (stdenv.hostPlatform != stdenv.buildPlatform) ''
+    unset NIX_COREFOUNDATION_RPATH
+
     sed -i s/-lgcc_eh//g "../$sourceRoot/Makeconfig"

     cat > config.cache << "EOF"

Good to have confirmation. The next step is probably to discuss the best way to fix this: as a general condition for setting this variable (where?) or as a special case within glibc.

@Gaelan is that in a PR? It looks good to me!

Just hit this. Is there a PR up for this?

Is there any update on this?

The diff in 459c60d makes sense to me.
