nix-shell completely broken

Created on 18 Jul 2016 · 33 Comments · Source: NixOS/nix

nix-shell is broken, as reported in this mailing list thread and the follow-up messages.

  • System: 16.09pre85931.125ffff (Flounder)
  • Nix version: nix-env (Nix) 1.11.2
  • Nixpkgs version: "16.09pre85931.125ffff"
bug

All 33 comments

Hm. I'm not sure. I just tried nix 1.10 and the issue persists. I tried with --run and it worked. I guess it is some issue with the/my setup, but I cannot track it down.

On the ML there were suggestions that this might be because of some PATH variable modifications, but there are none in my bashrc file.

Works for me:

$ nix-shell -p calcurse

[nix-shell:~]$ type -p calcurse
/nix/store/08lhvzm941h6w4f2srd5yrrf6z64z3rp-calcurse-4.0.0/bin/calcurse
$ nix-shell -p calcurse
$ # yes, I'm getting my normal prompt here, not the nix-shell prompt, but $SHLVL increased
$ # as posted in the mailinglist thread, my $PATH isn't modified within this new shell instance
$ type -p calcurse
$ #no output

So it must be with my setup.

Issue persists after reboot.

I'm having the same issue on my system but it works when using --pure. Removing the .bashrc and .profile in my home doesn't fix the problem. Creating a new user (with wheel access) and executing there does work. It might be something to do with the existing environment.

@zimbatm Yes, I also guess it is the environment. I do not have a .profile or .bashrc file in my $HOME, though. Just the /etc/bashrc, where no modifications to $PATH are made besides:

if [[ -d $HOME/archive/bin ]]
then
  export PATH=$PATH:$HOME/archive/bin
fi

edit: And that snippet is in my /etc/bashrc for ... years now... so I don't think that's the issue, actually.

@zimbatm Maybe a stupid question, but have you tried comparing the environment variables in the case where it works and in the case where it doesn't?

@samuelrivas For me they are equal (by shasum).
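A checksum only says whether the two environments differ; a line-by-line diff also shows which variables differ. A minimal sketch of such a comparison follows. The helper name `dump_env`, the file names, and `EXTRA_VAR_FOR_DEMO` are made up for illustration, and the subshell merely stands in for the shell where nix-shell misbehaves.

```shell
# Hypothetical helper: write the current environment, sorted, to the given file.
dump_env() {
    env | sort > "$1"
}

workdir=$(mktemp -d)

# Stand-in for the shell where nix-shell works.
dump_env "$workdir/env-working.txt"

# Stand-in for the broken shell: a subshell with one extra variable
# (EXTRA_VAR_FOR_DEMO is made up for this sketch).
(
    EXTRA_VAR_FOR_DEMO=1
    export EXTRA_VAR_FOR_DEMO
    dump_env "$workdir/env-broken.txt"
)

# diff exits non-zero when the files differ, so capture its output instead.
differences=$(diff "$workdir/env-working.txt" "$workdir/env-broken.txt" || true)
echo "$differences"
```

In practice the two dumps would come from the two real sessions, for example one taken inside `nix-shell --pure` and one inside the plain `nix-shell`.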

@matthiasbeyer you'll have to figure out yourself why this happens, it's hard to guess from the description.

@domenkozar ... hmmh. nix-shell in the store is a symlink to nix-build. If I call nix-build -p calcurse --run bash it gives me

these derivations will be built:
  /nix/store/zxqnrqp4jwml0lnyj2wixb4dy4q7lhw0-shell.drv
building path(s) ‘/nix/store/hpnr8zfa6qr1rpi6g8mz9wlpbyf3vpgp-shell’
builder for ‘/nix/store/zxqnrqp4jwml0lnyj2wixb4dy4q7lhw0-shell.drv’ failed to produce output path ‘/nix/store/hpnr8zfa6qr1rpi6g8mz9wlpbyf3vpgp-shell’
error: build of ‘/nix/store/zxqnrqp4jwml0lnyj2wixb4dy4q7lhw0-shell.drv’ failed

I have had the same issue for quite some time and have worked around it with nix-shell ... -p bashInteractive --run bash until now. Interestingly, as mentioned here, --pure also works. Otherwise, I have no clue why this happens.

I cannot reproduce this either; I fail to see how this issue can be resolved without further information, nor do I think it's fair to say that nix-shell is "completely broken" in its current state ...

nor do I think it's fair to say that nix-shell is "completely broken" in its current state ...

Well, for me it is. I don't know either how to reproduce, but as you can read, others have.

nix-shell works just fine for me on NixOS 16.03, on unstable, and on an openSUSE Tumbleweed host system running Nix from master. It strikes me as highly unlikely that nix-shell is "broken", really. It seems far more likely that your shell's setup scripts are doing something that breaks the interactive environment. A simple way to verify or contradict that hypothesis is to run

nix-shell -p haskell.compiler.ghc7103 --command "bash --noprofile"

or

nix-shell -p haskell.compiler.ghc7103 --command "bash --norc"

and then run ghc --version to check whether the compiler is in scope like it should be. If it is, then apparently not sourcing your start-up scripts makes a difference.

With nix-shell -p haskell.compiler.ghc7103 --command "bash --noprofile" I get no prompt until I press CTRL-C. Same for nix-shell -p haskell.compiler.ghc7103 --command "bash --norc"

Anyways, the bash I get does not contain ghc.

Okay, so it seems it is my fault somehow... but I cannot see how and why. Should I share my bashrc file?

I get no prompt until I press CTRL-C.

Well, this sounds bad. I'm not quite sure what nix-shell does internally -- and since it's written in Perl it's utterly impossible to find out --, but I _think_ that it executes the given command with whatever interpreter $SHELL points to, falling back to /bin/sh if $SHELL is unset.

If your system cannot execute bash --noprofile properly, then this cannot be the fault of your start-up scripts because the --noprofile flag means that these are never sourced. Rather, it sounds like your basic system shell doesn't work like it should. You should probably check:

  1. What does the environment variable $SHELL point to? Does it make a difference if you run nix-shell with $SHELL unset or set to /bin/sh?
  2. What is your user's login shell according to /etc/passwd? Does it make a difference if you set that value to /bin/sh?
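
Both values can be read out with a few lines of shell. This is only a sketch: `login_shell_of` is a hypothetical helper that extracts the seventh field of a passwd-style line, and the fallback to grepping /etc/passwd assumes a system where `getent` is unavailable.

```shell
# Hypothetical helper: print the login shell (7th field) of a passwd-style line.
login_shell_of() {
    printf '%s\n' "$1" | cut -d: -f7
}

# 1. What $SHELL points to in the current session.
echo "SHELL=${SHELL:-<unset>}"

# 2. The login shell recorded for the current user.
user=$(id -un)
entry=$(getent passwd "$user" 2>/dev/null || grep "^$user:" /etc/passwd || true)
echo "login shell: $(login_shell_of "$entry")"
```

To test the first question without touching the session, `env -u SHELL nix-shell -p calcurse` runs nix-shell with $SHELL removed from its environment.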

and since it's written in Perl it's utterly impossible to find out

:smile:

If your system cannot execute bash --noprofile properly

If I execute bash --noprofile directly in my terminal, it works beautifully, though.

$SHELL and /etc/passwd point to /var/run/current-system/sw/bin/bash - so this should be fine, shouldn't it?

I have the exact same issues as https://github.com/NixOS/nix/issues/976#issuecomment-235508054: executing nix-shell -p haskell.compiler.ghc7103 --command "bash --noprofile" gives a prompt only after pressing Ctrl+C, and the resulting shell doesn't have the ghc command, while adding --pure (thus nix-shell -p haskell.compiler.ghc7103 --command "bash --noprofile" --pure) works perfectly.

@matthiasbeyer Maybe you can apply the following patch to nix-build:

diff --git a/scripts/nix-build.in b/scripts/nix-build.in
index 2d45e37..c136156 100755
--- a/scripts/nix-build.in
+++ b/scripts/nix-build.in
@@ -294,6 +294,7 @@ foreach my $expr (@exprs) {
         my $rcfile = "$tmpDir/rc";
         writeFile(
             $rcfile,
+            "set -x; " .
             "rm -rf '$tmpDir'; " .
             'unset BASH_ENV; ' .
             '[ -n "$PS1" ] && [ -e ~/.bashrc ] && source ~/.bashrc; ' .

This should show what bash is doing.

Adding set -x to nix-build, I get no output in the non-working cases and a lot of output in the working cases (with --pure or --run).

Also interestingly, nix-shell --command some-nonexisting-command gives me a new bash prompt directly, while nix-shell --command existing-command gives me no output until I press Ctrl+C, at which point I also get a bash prompt.

So, the rcfile doesn't seem to be executed at all in the non-working cases.

Also, this whole thing works fine inside a pure nix-shell environment, so this has to be something with my system...

$ nix-shell --pure -p bashInteractive nix
[nix-shell]$ type -p perl

[nix-shell]$ NIX_DAEMON=daemon NIX_PATH=/nix/var/nix/profiles/per-user/root/channels/nixos nix-shell -p perl
[nix-shell]$ type -p perl
/nix/store/79mdd0cybnxl64dzfkyaixkp3viw7z06-perl-5.20.3/bin/perl

When the process is waiting for Ctrl-c, which program is running?

Looks like the set -x is already too late. Can you try this instead:

diff --git a/scripts/nix-build.in b/scripts/nix-build.in
index 2d45e37..1a2a9b5 100755
--- a/scripts/nix-build.in
+++ b/scripts/nix-build.in
@@ -311,6 +311,7 @@ foreach my $expr (@exprs) {
             $envCommand);
         $ENV{BASH_ENV} = $rcfile;
         my @args = ($ENV{NIX_BUILD_SHELL} // "bash");
+        push @args, "-x";
         push @args, "--rcfile" if $interactive;
         push @args, $rcfile;
         exec @args;

Your patch has to be changed to put the -x after the --rcfile because, apparently, bash only accepts short options _after_ long options.
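
The ordering rule can be checked in isolation. The sketch below uses --norc as an arbitrary long option (it is not what the patch passes) and only records whether bash accepts each ordering.

```shell
# Long option first, then the short ones: bash accepts this.
if bash --norc -x -c ':' 2>/dev/null; then
    long_first=accepted
else
    long_first=rejected
fi

# Short option first: bash then tries to read "--norc" as single-character
# options and refuses to start.
if bash -x --norc -c ':' 2>/dev/null; then
    short_first=accepted
else
    short_first=rejected
fi

echo "long_first=$long_first short_first=$short_first"
```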

Anyway, with this adjustment, I'm getting https://gist.github.com/fkz/fd2a33b224aff92436d29f6762e465b5 for nix-shell -p python.

Seems to be lots of stuff for bash_completion, maybe I should turn that off somehow and test again...

@matthiasbeyer would you test turning bash completion off and confirm?

I just disabled bash completion in my configuration.nix and then:

nixos-rebuild build
./target/sw/bin/xterm

# in the new xterm:
nix-shell -p calcurse

I got the same results in the new xterm, ... prompt didn't change and calcurse was downloaded (curl output) but not in scope.

I had the same issue, and disabling bash completion fixed it. Bash completion for c++ was apparently the first culprit triggering the bug.

After some debugging with strace etc. this seems to be the problem:

nix-build generates a temporary rc file for bash. The first thing this rc file does is to remove itself, meaning it can only be run once. Since BASH_ENV is also set to this file, all non-interactive calls to bash will source this file. This means that any such calls, before the main shell has a chance to source the rc file, will break nix-shell. (link to relevant code)

All autocompletion files are sourced from /etc/bashrc which is run before the rc file given by --rcfile. So there's a big risk of this happening. (edit: .bashrc -> /etc/bashrc)

In my case the completion file for c++ triggered the bug by calling c++ --version ..., a bash script which will source BASH_ENV. Using strace you can see that all later attempts to read the rc file fail.

Removing $ENV{BASH_ENV} = $rcfile; fixes the issue (--run still works). The alternative would be to handle the removal of the temporary file in a different manner.
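
The mechanism can be reproduced without Nix. The following is a simplified sketch, not the actual nix-build code: `MARKER` is a made-up variable standing in for the environment the rc file would set up, and the second `bash -c` stands in for the main shell, which in nix-shell would be started interactively with --rcfile.

```shell
tmpdir=$(mktemp -d)
rc="$tmpdir/rc"

# Like the generated rc file: remove the temporary directory first, then
# set up the environment. MARKER is hypothetical.
cat > "$rc" <<EOF
rm -rf '$tmpdir'
unset BASH_ENV
MARKER=from-rc
EOF

export BASH_ENV="$rc"

# A non-interactive bash runs first (think: a completion script calling a
# bash wrapper such as c++). It sources the rc file via BASH_ENV, which
# deletes the file.
first=$(bash -c 'echo "${MARKER:-unset}"')

# The "main" shell starts afterwards, finds no rc file, and comes up
# without the environment.
second=$(bash -c 'echo "${MARKER:-unset}"')

unset BASH_ENV
echo "first=$first second=$second"
```

The first shell sees the marker and the second does not, which matches the symptom in this thread: the unintended early caller consumes the rc file, and the shell the user actually lands in never gets the modified PATH.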

Another userspace bug diagnosed by the transcendental powers of strace!

1034 appears to work for me, too, and finally fixes this annoying bug. Thanks a lot, @hedning!

Somehow, this issue still exists for me. Can someone tell me step-by-step how to debug this?

@matthiasbeyer Are you sure your nix has the fix? I don't think nix has had a release since the fix: the commit is dated 8 Sept., while the last release is dated 6 Sept. To make sure you actually have the fix, open nix-build and search for $ENV{BASH_ENV} = $rcfile; If it's present, you don't have the fix.
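
That check can be scripted roughly as follows. `still_sets_bash_env` is a hypothetical helper name, and the sketch assumes nix-build is on PATH and is still the Perl script discussed in this thread.

```shell
# Hypothetical helper: succeeds if the given script still contains the
# assignment that the fix removed.
still_sets_bash_env() {
    grep -qF '$ENV{BASH_ENV} = $rcfile;' "$1"
}

# Look up nix-build on PATH; nix-shell is a symlink to it.
script=$(command -v nix-build || true)

if [ -z "$script" ]; then
    echo "nix-build not found on PATH"
elif still_sets_bash_env "$script"; then
    echo "fix NOT present in $script"
else
    echo "fix appears to be present in $script"
fi
```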

@edolstra Can we get a new 1.11 release?
