I upgraded an imperative container from 17.03 to 18.03 with nixos-unstable on the host: 18.09pre139319.1d9330d63a5 (Jellyfish), and now I can no longer run any nix commands:
$ nixos-rebuild -v switch
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
building Nix...
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
warning: don't know how to get latest Nix
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
building the system configuration...
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
$ nix-env -iA nixos.hello
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
I also tried creating a fresh container and it breaks in the same way.
nix-info on the container fails with:
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
system: 0, multi-user?: no, error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
version: 0, error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
nix-info on the host system:
system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 2.0.2, channels(goibhniu): "nixos-18.03pre120540.b8f7027360", channels(root): "nixos-18.09pre139319.1d9330d63a5", nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos/nixpkgs
I can confirm this issue on a clean 18.03 install.
To reproduce:
nixos-container create test
nixos-container start test
nixos-container root-login test
# nixos-rebuild switch
This issue seems related to https://github.com/NixOS/nix/issues/2134
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
That's the error message you get when you try using nix 1.11 together with 2.0.
I suspect inside the imperative containers there might be a different version running than outside?
This doesn't seem to be the issue. The container appears to be running the same nix version:
[arian@nixos:~]$ sudo nixos-container root-login lol3
[sudo] password for arian:
[root@lol3:~]# nix-build --version
nix-build (Nix) 2.0.2
[root@lol3:~]# nix --version
nix (Nix) 2.0.2
[root@lol3:~]# nix-env --version
error: opening lock file '/nix/var/nix/db/big-lock': Read-only file system
[root@lol3:~]# readlink $(which nix-env)
/nix/store/j4di8j9awar03dfz2c91hd0yrdw427v1-nix-2.0.2/bin/nix-env
Interestingly enough, even the command nix-env --version already crashes.
Aha, I think I found the culprit: /nix/var/nix/db is mounted read-only into the container, whereas in 17.09 I think it would have been read-write. (How else would an imperative container modify the nix store?)
However, I do not have a 17.09 install at hand on which I could try this.
[root@lol3:/nix/var/nix/db]# cat /proc/mounts
/dev/sda5 / btrfs rw,relatime,ssd,space_cache,subvolid=5,subvol=/var/lib/containers/lol3 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
tmpfs /sys tmpfs ro,nosuid,nodev,noexec,relatime,mode=755 0 0
tmpfs /dev tmpfs rw,nosuid,size=822424k,mode=755 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,size=8224204k 0 0
devtmpfs /dev/net/tun devtmpfs rw,nosuid,size=822424k,nr_inodes=2046098,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=3,mode=620,ptmxmode=666 0 0
devpts /dev/console devpts rw,nosuid,noexec,relatime,gid=3,mode=620,ptmxmode=666 0 0
tmpfs /run tmpfs rw,nosuid,nodev,size=4112104k,mode=755 0 0
tmpfs /run/systemd/nspawn/incoming tmpfs ro,size=4112104k,mode=755 0 0
/dev/sda5 /nix/store btrfs ro,relatime,ssd,space_cache,subvolid=5,subvol=/nix/store 0 0
/dev/sda5 /nix/var/nix/daemon-socket btrfs ro,relatime,ssd,space_cache,subvolid=5,subvol=/nix/var/nix/daemon-socket 0 0
/dev/sda5 /nix/var/nix/db btrfs ro,relatime,ssd,space_cache,subvolid=5,subvol=/nix/var/nix/db 0 0
/dev/sda5 /nix/var/nix/gcroots btrfs rw,relatime,ssd,space_cache,subvolid=5,subvol=/nix/var/nix/gcroots/per-container/lol3 0 0
/dev/sda5 /nix/var/nix/profiles btrfs rw,relatime,ssd,space_cache,subvolid=5,subvol=/nix/var/nix/profiles/per-container/lol3 0 0
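To check how a single path is mounted without scanning the whole table by eye, the ro/rw flag can be pulled out of /proc/mounts directly. This is only a minimal sketch: mount_mode is a hypothetical helper name, not something NixOS or Nix ships, and it assumes the usual whitespace-separated /proc/mounts layout shown above.

```shell
# mount_mode is a hypothetical helper, not part of NixOS.
# Prints "ro" or "rw" for a mount point, "unknown" if neither flag is
# listed, or "not-mounted" if the mount point has no entry.
# The second argument defaults to /proc/mounts.
mount_mode() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            mode = "unknown"
            for (i = 1; i <= n; i++) {
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
            }
        }
        # If several entries match, the last one wins.
        END { if (mode != "") print mode; else print "not-mounted" }
    ' "$mounts_file"
}
```

Inside the container above, `mount_mode /nix/var/nix/db` would be expected to report ro, matching the entry in the listing.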
In nix 1.12, this lock was never acquired, so it was not a problem that /nix/var/nix/db was mounted ro:
https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/virtualisation/containers.nix#L129
Acquiring the lock on a read-only mount fails with EROFS (Read-only file system). In Nix 2.0, the following patch was introduced:
https://github.com/NixOS/nix/commit/9bdd949cfdc9e49f1e01460a2a73215cac3ec904
Previously, the function would return on EROFS, but now the exception is actually raised. So this seems to be an issue with nix 2.0 not working with a read-only db folder.
I'm not very familiar with the nix source code, or with why not throwing on EROFS was not an issue in the past. But I have a feeling this is a nix bug and should be filed in that repository as well.
You can work around this for now by remounting /nix/var/nix/db as read/write within the container:
mount -o remount,rw /nix/var/nix/db
Oh great! Thanks. Perhaps we should just add that line in the container code.
Making the host nix database writable in the container is a _very_ bad idea, e.g.:
<goibhniu> LnL: in case you missed it ... I ran `nix-collect-garbage -d` in a container, but it garbage collected everything from the host system :/
<LnL> euh...
<goibhniu> I get this error when I try to run nix-channel, or nix-env commands that download stuff. Both in the container and on the host.
<LnL> like gcroots on the host?
<goibhniu> yep, I only have access to the programs that are still running, and what's in the store for the container
<LnL> ok #1 please create an issue for that, this really shouldn't be possible
<goibhniu> at least I can install stuff from the store that the container uses
<ben> oh i guess the nix daemon isnt seeing your local env vars
<goibhniu> I can't look up the issue now, but it's probably because I changed /nix/var/nix/db (or something like that) to rw, to work around an issue.
<ben> is that how it works
<goibhniu> oh!
<LnL> hrm, that might have enabled this yes
<LnL> containers should only talk to the daemon, not access the db directly AFAIK
goibhniu thinks there's some env var that can get nix to not use the daemon
<LnL> well if you make the db writable it'll try to use that directly unless you explicitly set NIX_REMOTE=daemon
<LnL> container -> nix-cli -> host -> nix-daemon -> db would know about your host's roots, container -> nix-cli -> db won't
So how should the container connect to the host's nix-daemon? The docs read like it's only possible to connect over ssh, or could you expose it as a file-based socket too?
Yes. It's a unix domain socket; nix commands run as an unprivileged user also use it to communicate with the daemon. By default it's located at /nix/var/nix/daemon-socket/socket, but it's possible to customize this with the --store flag or the NIX_REMOTE environment variable if you want another path inside the container.
$ nix-store -r /nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10 --store unix://foo/socket
these paths will be fetched (0.04 MiB download, 0.19 MiB unpacked):
/nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10
copying path '/nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10' from 'https://cache.nixos.org'...
warning: you did not specify '--add-root'; the result might be removed by the garbage collector
/nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10
$ /nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10/bin/hello
Hello, world!
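Before pointing a client at the daemon, it can help to confirm that the socket is actually visible from inside the container. The following is a minimal sketch, assuming the default socket path mentioned above; have_daemon_socket is a hypothetical helper name, not a Nix command.

```shell
# have_daemon_socket is a hypothetical helper, not part of Nix.
# Succeeds only if the given path exists and is a unix domain socket.
# Defaults to the standard daemon socket location.
have_daemon_socket() {
    socket_path=${1:-/nix/var/nix/daemon-socket/socket}
    [ -S "$socket_path" ]
}
```

For example, `have_daemon_socket || echo "daemon socket not found"` gives a quick hint when the bind mount is missing. It only checks that the socket exists, not that a daemon is answering on it.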
Is there anyone who knows how this stuff works and thinks they can help me get it fixed before 18.09?
Otherwise I'd suggest removing containers from the documentation for the 18.09 release, as they currently just do not work at all anymore, and bringing them back in the 19.03 release.
cc @vcunat @samueldr
I'm afraid I don't know about the internals, but containers do work for me as long as I stick with e.g. nixos-container update mycontainer instead of logging in and trying nixos-rebuild switch.
@reinhardt I've updated the docs to reflect this
Okay, after some digging, I have found out what is going wrong, and how we can fix it.
The fact that this worked before seems to have been pure coincidence.
Because we are user root within the container namespace, the nix commands assume single-user mode and try to modify the store directly. However, the root inside the user namespace is different from the root that owns /nix/store. Hence we can't modify the store at all, and the nix command crashes.
The solution is to force the root user to talk to the host nix daemon, which we can do as follows:
$ sudo nixos-container root-login <container-name>
# export NIX_REMOTE=daemon
# nixos-rebuild switch
When we are not the root user in the container, nix commands already work as expected:
[arian@t430s:~]$ sudo nixos-container create test --config 'users.users.arian = { isNormalUser = true; createHome = true; };'
[arian@t430s:~]$ sudo nixos-container start test
[arian@t430s:~]$ sudo nixos-container root-login test
# nix-channel --add https://nixos.org/channels/nixos-18.03 nixpkgs
# nix-channel --update
# su arian
$ <all nix commands now work>
We should add the NIX_REMOTE=daemon environment variable to the root-login command, and then everything should work as expected.
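Until root-login sets the variable itself, the same effect could be approximated from root's shell profile inside the container. This is a minimal sketch, not the eventual nixos-container change: force_daemon_for_root is a hypothetical helper name, and it deliberately leaves an already-set NIX_REMOTE untouched.

```shell
# force_daemon_for_root is a hypothetical helper, not part of NixOS.
# If the given uid (default: the current user) is 0 and NIX_REMOTE is
# unset or empty, export NIX_REMOTE=daemon so nix commands go through
# the host daemon instead of opening the read-only db directly.
force_daemon_for_root() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ] && [ -z "${NIX_REMOTE:-}" ]; then
        NIX_REMOTE=daemon
        export NIX_REMOTE
    fi
}
```

Calling `force_daemon_for_root` with no arguments from root's profile would cover the interactive case. Non-root users are left alone, since nix commands already work for them.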
Nice catch!