Nixpkgs: wrapProgram & friends depend on linux kernel shebang feature

Created on 18 Nov 2015  ·  13 Comments  ·  Source: NixOS/nixpkgs

While trying to package JRuby as a more Ruby-like package, I discovered that OS X does not support shebangs at the kernel level when the interpreter is itself a script.

Running jruby (which is a bash script that starts java) works fine on any Unix.
Trying to run gem, which has jruby in its shebang, fails:

➭ /nix/store/5cxwjdd1izi7gk0i3csl1ifb2y6m69p1-jruby-9.0.4.0/bin/gem
zsh: exec format error: /nix/store/5cxwjdd1izi7gk0i3csl1ifb2y6m69p1-jruby-9.0.4.0/bin/gem

And apparently bash has a fallback mode where it interprets the script as a bash script:

building path(s) ‘/nix/store/d64p6aq2hm1f8xz5smd4d7blzvygi2jc-tzinfo-1.2.2.gem’
unpacking sources
/nix/store/5cxwjdd1izi7gk0i3csl1ifb2y6m69p1-jruby-9.0.4.0/bin/gem: line 4: syntax error near unexpected token `('
/nix/store/5cxwjdd1izi7gk0i3csl1ifb2y6m69p1-jruby-9.0.4.0/bin/gem: line 4: `load File.join(File.dirname(__FILE__), 'jgem')'
builder for ‘/nix/store/nw9fwklp9jvp4bilvkhwcn45zgg51v1s-tzinfo-1.2.2.gem.drv’ failed with exit code 2
cannot build derivation ‘/nix/store/x99njf4xb10y21wbchj99667vfvwlyjh-fluentd-0.12.6.drv’: 1 dependencies couldn't be built
error: build of ‘/nix/store/x99njf4xb10y21wbchj99667vfvwlyjh-fluentd-0.12.6.drv’ failed

More details:

If jruby were a binary, it would probably work fine. Maybe wrapProgram can generate and compile a tiny C program that runs the wrapped script.
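
A minimal sketch of that idea, written as a shell function because makeWrapper itself is a shell function. The name gen_c_wrapper is hypothetical and not part of nixpkgs. It prints C source for a wrapper that sets one environment variable and then execs the real program, and it assumes none of its arguments contain double quotes or backslashes.

```shell
# Hypothetical helper (not part of nixpkgs): print C source for a tiny
# wrapper binary. Assumes the arguments contain no '"' or '\' characters.
#   $1: absolute path of the real program to run
#   $2: name of an environment variable to set
#   $3: value for that variable
gen_c_wrapper() {
    target=$1
    var=$2
    value=$3
    cat <<EOF
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
    (void)argc;
    /* Environment change that the bash wrapper would otherwise make. */
    setenv("$var", "$value", 1);
    /* Replace this process with the real program, passing all arguments on. */
    execv("$target", argv);
    /* Only reached if execv failed. */
    perror("execv");
    return 127;
}
EOF
}
```

Something like gen_c_wrapper /path/to/.prog-wrapped SOME_VAR some-value > wrapper.c, followed by a C compiler run, would then produce a real executable that a shebang can point to.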

enhancement mass-rebuild stale

All 13 comments

:+1: I can work on this if we agree... @edolstra ? I know you don't like wrappers, but... :)

patchShebangs is related, because it removes /usr/bin/env, which is a binary and is what makes the original scripts work.

In other words, originally gem starts with

#!/usr/bin/env jruby

After patchShebangs:

#!/nix/store/5cxwjdd1izi7gk0i3csl1ifb2y6m69p1-jruby-9.0.4.0/bin/jruby

Related: https://github.com/NixOS/nixpkgs/issues/2146. I attempted to make patchShebangs change "/usr/bin/env prog" into "/nix/store/.../env /nix/store/.../prog", but that hit the default kernel shebang size limit. (See discussion in the mentioned issue.)
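
To see whether a given script is near that limit, a rough check could look like the sketch below. The function names are hypothetical, and the default of 128 is simply the figure quoted elsewhere in this thread; the limit is a parameter because the exact value depends on the kernel.

```shell
# Print the length in bytes of the first line of a file, newline excluded.
first_line_length() {
    head -n 1 "$1" | tr -d '\n' | wc -c | tr -d ' '
}

# Succeed if the first line of $1 is longer than the limit in $2.
# The default of 128 is the number mentioned in this thread (an assumption).
shebang_too_long() {
    limit=${2:-128}
    [ "$(first_line_length "$1")" -gt "$limit" ]
}
```

Running shebang_too_long on each patched script during fixup would at least turn a silent truncation into a visible failure.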

Yeah, it would solve multiple issues at once. Still, we probably want a simple way of inspecting what the wrapper does – options off the top of my head:

  • also create a text file next to the wrapper containing the arguments to makeWrapper,
  • or even make the binary wrapper generic, i.e. make it just interpret that text file without hard-coding anything into the binary (perhaps do hardcode the location of the text file, as argv0 stuff is tricky).
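
The first option could be as small as the following sketch. record_wrapper_args and the file name it uses are hypothetical, and it assumes no argument contains a newline.

```shell
# Hypothetical helper: save the arguments that were given to makeWrapper,
# one per line, in a hidden file next to the wrapper, and print that
# file's path. Assumes no argument contains a newline.
#   $1: path of the wrapper; remaining arguments: the makeWrapper flags
record_wrapper_args() {
    wrapper=$1
    shift
    sidecar="$(dirname "$wrapper")/.$(basename "$wrapper")-wrapper-args"
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$@" > "$sidecar"
    else
        : > "$sidecar"
    fi
    printf '%s\n' "$sidecar"
}
```

Inspecting a wrapper would then just mean reading that file, whatever form the wrapper itself takes.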

@vcunat which is basically exec bash and source that file :)

Yes, that wouldn't really change much from the current state. EDIT: ...except for fixing all those problems, of course.

Thinking out loud:

makeWrapper could generate and build a simple C program, injecting the flags as an array of strings. To recover the flags, we'd have another really small program that dlopens the binary, calls dlsym(handle, "<name-of-magic-symbol>"), and then prints the array of strings (could be formatted with a delimiter - null perhaps - or maybe as bash array syntax).

Now, we could just use getopt(3) on the array and handle the flag processing in the program, or we could instead generate code that does just the right thing. I don't have any strong feelings here.

How does that sound?

Whatever we do, I agree with @vcunat that we ought to have a way to inspect the flags used for a given wrapper. I've had cases where I would extend an existing package with an additional fixupPhase, use wrapProgram to set an important env-var, only to clobber the existing .<prog>-wrapped file (because of mv $prog $hidden, where $prog is now the existing wrapper). Having a way to inspect the flags could allow us to make wrapProgram only regenerate <prog>, merging the two flags - which is, IMO, the correct behavior.
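
Merging flags needs the inspection mechanism discussed above. A simpler way to avoid the clobbering alone, shown here as a sketch, is to pick a hidden name that is not taken yet before moving the program. This is a different technique from the merge proposed above, and free_hidden_name is a hypothetical name.

```shell
# Print a hidden path for the program $1 that does not exist yet,
# appending "_" to the usual .<prog>-wrapped name until it is free.
free_hidden_name() {
    prog=$1
    hidden="$(dirname "$prog")/.$(basename "$prog")-wrapped"
    while [ -e "$hidden" ]; do
        hidden="${hidden}_"
    done
    printf '%s\n' "$hidden"
}
```

With this, wrapping an already wrapped program stacks a second wrapper on top of the first instead of overwriting the original executable.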

I just hit this issue as well.

OS X seems to not mind the longer /nix/store/...-coreutils/bin/env /nix/store/...-prog/bin/prog shebang interpreter line. Could we set patchShebangs up to conditionally include the nix-coreutils-env prefixed shebang only on OS X (or on BSDs) and only for interpreters detected to be shebang scripts themselves, and continue to use the current behavior (no env in shebang) on Linux? That seems like it would also fix this problem.
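
The decision described above could be sketched roughly like this. Both function names are hypothetical and this is not how patchShebangs is implemented; the platform string is assumed to be what uname -s prints.

```shell
# Succeed if the file $1 starts with the two bytes "#!".
is_shebang_script() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "#!" ]
}

# Print the shebang line to use for an interpreter.
#   $1: platform name, e.g. the output of `uname -s`
#   $2: absolute path of the interpreter
#   $3: absolute path of an env binary
choose_shebang() {
    platform=$1
    interp=$2
    env_bin=$3
    if [ "$platform" != "Linux" ] && is_shebang_script "$interp"; then
        # Interpreter is itself a script: go through the env binary.
        printf '#!%s %s\n' "$env_bin" "$interp"
    else
        # Current behavior: point straight at the interpreter.
        printf '#!%s\n' "$interp"
    fi
}
```

On Linux the output is unchanged from today, so only derivations built for other platforms would pick up the longer line.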

The one downside I can imagine for that approach is any derivation which uses patchShebangs on a script which itself uses a script interpreter (so, if patchShebangs sees a wrapProgram'd executable script, etc.) would now become a platform-dependent derivation if it weren't already. That seems acceptable for my purposes, but not sure if it would be acceptable in general.

(commenting here rather than #2146 as this discussion is more recent, but they seem like duplicate issues to me)

Here is a (crappy) way to work around this issue:

{ stdenv, bundlerEnv, ruby }:
# `rec` lets neato-script refer to osxWrap, which is defined in the same set.
rec {
  osxWrap = { name, script, bundlerEnvResult }: stdenv.mkDerivation {
    inherit name;
    buildCommand = ''
      mkdir bin

      sed \
        -e "s#/bin/ruby #/bin/ruby ${script} #" \
        ${bundlerEnvResult.wrappedRuby}/bin/ruby > ./bin/${name}

      mkdir -p $out/bin
      install -D -m755 ./bin/${name} $out/bin/${name}
    '';
  };

  neato-script = osxWrap {
    name = "my-neato-script";
    script = ./my-neato-script.rb;
    bundlerEnvResult = bundlerEnv rec {
      name = "my-neato-script-env";

      inherit ruby;
      gemfile = ./Gemfile;
      lockfile = ./Gemfile.lock;
      gemset = ./gemset.nix;
    };
  };
}

Here is another crappy workaround, for non-Ruby scripts:

{ stdenv, expect }:

let
rewrap = { name, script, interpreter }:
stdenv.mkDerivation {
  inherit name;

  # Normally the script's shebang would point at the interpreter's
  # wrapper, which is itself a script. This works fine on Linux, but
  # the macOS kernel doesn't permit it.
  #
  # This derivation copies the interpreter's wrapper script, which ends in:
  #
  #   exec "/nix/store/.../bin/.foo-wrapped" "${extraFlagsArray[@]}" "$@"
  #
  # and adds the script between `bin/.foo-wrapped"` and `"${extraFlagsArray`:
  #
  #   exec "/nix/store/.../bin/.foo-wrapped" /nix/store/...-my-script "${extraFlagsArray[@]}" "$@"
  #
  # Since the copied wrapper uses bash directly in its shebang (and not
  # another script), the result works on both macOS and Linux.
  buildCommand = ''
    sed \
      -e "s#-wrapped\" #-wrapped\" ${script} #" \
      ${interpreter} > ./working

    if ! grep -q "${script}" ./working; then
      echo " ABORTING:"
      echo "    sed appears to not have correctly added the script on"
      echo "    the exec line, which breaks this tool."
      exit 1
    fi

    mkdir -p $out/bin
    install -D -m755 ./working $out/bin/${name}
  '';
};
in
rewrap {
  name = "timeout";
  script = ./timeout.tcl;
  interpreter = "${expect}/bin/expect";
}

Just ran into the same issue: a Python script got patchShebang'd to point to the python2.7 wrapper. On Darwin and Solaris, a shebang cannot point to a script.

On top of that, using custom shebangs adds changes to otherwise completely unchanged scripts, which makes store deduplication work less well.

How about making a __nixExec binary that knows about the Nix store? A shebang like #!/usr/bin/env __nixExec configName [...args] would make it look up /nix/store/package-where-real-scriptfile-is/.nixExec-configName to set the path and so on. It would also read the first line of the script itself to get all the arguments, so we're not limited to Linux's 128 characters.

Then

  • shebang points to an executable so it works on Darwin
  • we have unlimited arguments
  • it will properly fail if __nixExec doesn't exist
  • __nixExec can do extra checks at runtime if needed, complaining about missing packages with the nix-store command to run etc
  • script content will stay mostly unchanged between versions and give slightly better deduplication rates
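
To make the lookup rule concrete, here is a sketch in shell. The proposal is for __nixExec to be a compiled binary, so this only illustrates the logic; the function names are hypothetical, and the package directory is assumed to be the first path component below the store directory.

```shell
# Print the config name from a first line of the form
# "#!/usr/bin/env __nixExec NAME [args]"; fail for any other first line.
nixexec_config_name() {
    line=
    IFS= read -r line < "$1" || [ -n "$line" ] || return 1
    case $line in
        '#!/usr/bin/env __nixExec '*)
            rest=${line#'#!/usr/bin/env __nixExec '}
            printf '%s\n' "${rest%% *}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Print the path of the config file for the script $1.
#   $2: store directory (defaults to /nix/store)
nixexec_config_path() {
    script=$1
    store=${2:-/nix/store}
    name=$(nixexec_config_name "$script") || return 1
    rel=${script#"$store"/}
    # Fail if the script does not live inside the store directory.
    [ "$rel" != "$script" ] || return 1
    printf '%s/%s/.nixExec-%s\n' "$store" "${rel%%/*}" "$name"
}
```

The script's own shebang line stays the same across package versions here, which is where the deduplication benefit in the last bullet comes from.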
