Describe the bug
Multiple errors when using SteamLinuxRuntime_soldier/run-in-soldier
This prevents the latest Proton (5.13) from running
To Reproduce
The following commands are to run in a steam-run shell: steam-run /bin/bash
x86_64-linux-gnu-capsule-capture-libs: code 2: open "/etc/ld.so.cache": No such file or directory
/run/pressure-vessel/pv-from-host/bin/pressure-vessel-adverb: error while loading shared libraries: libdl.so.2: cannot open shared object file: No such file or directory
It's finally printing
Expected behavior
Should run natively from within steam and without manual tweaks
Additional context
There are two problems to solve:
* How to generate the `/etc/ld.so.cache` file
* How to permanently add the `--filesystem=/nix` option to steam bwrap

Notify maintainers
Metadata
Please run nix-shell -p nix-info --run "nix-info -m"
and paste the result.
- system: `"x86_64-linux"`
- host os: `Linux 5.4.69, NixOS, 21.03.git.0da2a1a113e (Okapi)`
- multi-user?: `yes`
- sandbox: `yes`
- version: `nix-env (Nix) 2.3.7`
- channels(root): `"nixos-21.03pre246543.24c9b05ac53"`
- nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
Maintainer information:
# a list of nixpkgs attributes affected by the problem
attribute:
- steam
# a list of nixos modules affected by the problem
module:
Probably related, if I run Steam from a terminal window and try to run a game with Proton 5.13 I get this message before it (the game) crashes:
ERROR: ld.so: object '/home/user/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
/home/user/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel/bin/pressure-vessel-adverb: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
ERROR: ld.so: object '/home/user/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
/home/user/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel/bin/pressure-vessel-adverb: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
@nyanloutre, to clarify, if you apply these two workarounds (both of which are entirely unsupported and are done at your own risk, but they provide useful information):
* generate `/etc/ld.so.cache`
* hack the `_start-container-in-background` script to add `--filesystem=/nix` next to the other `--filesystem` options

is that sufficient to make Proton 5.13 work? (I suspect the answer is no, but I hope I'm wrong.)
How to permanently add the `--filesystem=/nix` option to steam bwrap
If it's that simple, then we can do the equivalent of that in the pressure-vessel source code. Making pressure-vessel work on non-FHS host distributions is not really on the radar right now (we have higher-priority issues to deal with, and fixes that apply to FHS distributions like Ubuntu benefit a lot more people than fixes for individual non-FHS distributions); but making a one-line change to share an extra filesystem has such a good effort/result and risk/result ratio that we can probably sneak it in to a future release anyway.
I'm not familiar with NixOS, except knowing that it's non-FHS. What's in `/nix`? Does it contain executables, libraries and static data belonging to OS-level packages, like `/usr` on a FHS system? Or does it contain system-specific variable data, like `/etc`, `/var` and `/run` on a FHS system? Or does it contain user data, like `/home` and `/srv` on a FHS system? Or some mixture of those?
If `/nix` only contains things that are reasonably similar to a FHS system's `/usr`, then we can certainly share it in the same situations where we'd share `/usr`.
If there are symbolic links into `/nix` from locations like `/usr` and `/lib`, are they absolute (`/lib/foo -> /nix/path/to/foo`), or relative (`/lib/foo -> ../nix/path/to/foo`), or a mixture? (This affects where we have to put it in the container to make it work: `/nix`, or `/run/host/nix`, or both.)
How to generate the `/etc/ld.so.cache` file
Where does a NixOS system normally keep its `ld.so` cache, and where will NixOS' glibc look for it?
Handling this "nicely" is likely to need C source code changes in libcapsule and pressure-vessel. See https://github.com/ValveSoftware/steam-runtime/issues/230#issuecomment-712133364 and https://gitlab.collabora.com/vivek/libcapsule/-/merge_requests/40 - the exact paths involved on Exherbo and NixOS are almost certainly different, but the general idea will be the same.
We do not have the resources to support non-FHS distros, but I'm hoping to get infrastructure in place in the C code so that distro developers can contribute a patch that adds their filesystems/ld.so.cache/etc. to a reasonably obvious list (probably with Exherbo as the initial example, just because someone has already told me how it works).
Hello, and thanks for the help !
@nyanloutre, to clarify, if you apply these two workarounds (both of which are entirely unsupported and are done at your own risk, but they provide useful information):
* generate `/etc/ld.so.cache`
* hack the `_start-container-in-background` script to add `--filesystem=/nix` next to the other `--filesystem` options

is that sufficient to make Proton 5.13 work? (I suspect the answer is no, but I hope I'm wrong.)
It still doesn't launch any games, I have this error:
ERROR: ld.so: object '/home/paul/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
/mnt/hdd/Steam/steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel/bin/pressure-vessel-adverb: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
ERROR: ld.so: object '/home/paul/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
/mnt/hdd/Steam/steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel/bin/pressure-vessel-adverb: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
I'm not familiar with NixOS, except knowing that it's non-FHS. What's in `/nix`? Does it contain executables, libraries and static data belonging to OS-level packages, like `/usr` on a FHS system? Or does it contain system-specific variable data, like `/etc`, `/var` and `/run` on a FHS system? Or does it contain user data, like `/home` and `/srv` on a FHS system? Or some mixture of those?
In NixOS, every binary and library is located in `/nix`; everything else is only symlinks.
To make Steam work, NixOS is using bwrap to give it an FHS environment. Unfortunately, that is not enough to make the Steam runtime work.
If there are symbolic links into `/nix` from locations like `/usr` and `/lib`, are they absolute (`/lib/foo -> /nix/path/to/foo`), or relative (`/lib/foo -> ../nix/path/to/foo`), or a mixture? (This affects where we have to put it in the container to make it work: `/nix`, or `/run/host/nix`, or both.)
The links are always absolute and point to /nix/store/xxxxxxxxxxx-libfoo/...
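The absolute-versus-relative question can be checked on any given FHS environment by looking at each link's stored target. This is a minimal sketch, assuming `readlink` is available; `link_kind` is a hypothetical helper name and not part of pressure-vessel or nixpkgs.

```sh
# Classify a symlink as "absolute" or "relative" from its stored target.
# link_kind is a hypothetical helper used only for illustration.
link_kind() {
    # readlink prints the target exactly as stored, without resolving it
    target=$(readlink "$1") || return 1
    case "$target" in
        /*) echo absolute ;;
        *)  echo relative ;;
    esac
}
```

A link such as `/lib/foo -> /nix/path/to/foo` would be reported as `absolute`, which is the case described for NixOS here.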
Where does a NixOS system normally keep its `ld.so` cache, and where will NixOS' glibc look for it?
NixOS doesn't have an ld.so cache, but it may be possible to generate it in the wrapper script that already launches steam in a bwrap.
I also managed to get this error by adding `unset LD_PRELOAD` in the `_v2-entry-point` script:
/usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so: undefined symbol: g_type_ensure
Failed to load module: /usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so
ERROR: ld.so: object '/home/paul/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored.
This is noise, you can safely ignore it. The way Steam sets up `LD_PRELOAD` is a bit weird, and will always cause these warnings.
/mnt/hdd/Steam/steamapps/common/SteamLinuxRuntime_soldier/pressure-vessel/bin/pressure-vessel-adverb: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
This is the real problem. The container doesn't have the required `ld.so`, or `ld.so.cache`, or search paths, or something, to find the dependencies of your `libstdc++.so.6`.
(I'm not sure why `pressure-vessel-adverb` is loading C++ code - it should only depend on GLib and libelf, which are C.)
To make Steam work, NixOS is using bwrap to give it an FHS environment
That might not work particularly well with the container runtime, which also relies on bwrap.
What does that FHS environment look like? Is `/usr` populated with symlinks into `/nix`?
NixOS doesn't have an ld.so cache
That's unfortunate, because libcapsule relies on parsing the glibc ld.so cache to be able to find host system libraries. The alternative would be to hard-code a list of library directories and hope it matches the locations your glibc will look in.
it may be possible to generate it in the wrapper script that already launches steam in a bwrap
I think that would be a good idea.
/usr/lib/x86_64-linux-gnu/gio/modules/libdconfsettings.so: undefined symbol: g_type_ensure
You can safely ignore this. GLib automatically loads GIO modules, and finds the version on the host system, for which the copy of GLib provided with pressure-vessel (which is from Steam Runtime v1 for maximum portability) is too old; but it doesn't actually need any GIO modules, so it's harmless that they fail to load. We actually try to turn them off, but until recently the included copy of GLib was too old to have a good mechanism to do this. A future release will probably stop this happening, but it isn't a high priority.
adding `unset LD_PRELOAD`
This will break the Steam overlay, which used to be just the overlay used for achievement popups, the Shift+Escape screen, etc. but is gradually accumulating more important functionality (game controller remapping, streaming, probably more).
I made an FHS like so:
with import <nixpkgs> {};
buildEnv {
name = "steam-env";
paths = [ gcc-unwrapped.lib glib.out json-glib.out ];
}
and added the `lib` of this FHS to the rpath of `pressure-vessel-adverb`, which seems to have gotten it further, but adding all of the required libraries causes it to fail on libstdc++ again! I checked with strace, and it fails with this:
1226 openat(AT_FDCWD, "/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
It opens libstdc++ from the rpath several times, but once tries to open it from here, but this output has no libstdc++, so it obviously fails. I am honestly not sure why it only tries this path, and I did try adding the FHS to LD_LIBRARY_PATH too to no effect.
@nyanloutre I can't reproduce your results with run-in-soldier.
For me, the following command: `steam-run sh -c 'ldconfig -C /etc/ld.so.cache -f /dev/null /lib; $HOME/.local/share/Steam/steamapps/common/SteamLinuxRuntime_soldier/run-in-soldier --filesystem=/nix -- echo test'`
gives the following output:
/nix/store/x5jj29qcgyz4gn2ml076m7i305f4j31s-dconf-0.36.0-lib/lib/gio/modules/libdconfsettings.so: undefined symbol: g_type_ensure
Failed to load module: /nix/store/x5jj29qcgyz4gn2ml076m7i305f4j31s-dconf-0.36.0-lib/lib/gio/modules/libdconfsettings.so
pressure-vessel-wrap[4]: Using glibc from provider system for some but not all architectures! Arbitrarily using provider locales.
pressure-vessel-wrap[4]: Using libdrm.so.2 from provider system for some but not all architectures! Will take /usr/share/libdrm from provider.
/run/pressure-vessel/pv-from-host/bin/pressure-vessel-adverb: /../lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.30' not found (required by /nix/store/8nn8q8p6frw05q6d87kdig8x3c4pxly6-libselinux-2.9/lib/libselinux.so.1)
With regard to my halfway workaround above, even adding an ld.so.cache doesn't make it load libstdc++ from another place. It always tries to load it from the same folder where the interpreter is located.
@smcv wrt. the FHS environment, it's actually /lib, /bin, etc. that are filled with absolute symlinks to /nix/..., and then /usr/x is just a symlink to /x for any x.
/nix/store/x5jj29qcgyz4gn2ml076m7i305f4j31s-dconf-0.36.0-lib/lib/gio/modules/libdconfsettings.so: undefined symbol: g_type_ensure
You can ignore that, it's harmless.
pressure-vessel-wrap[4]: Using glibc from provider system for some but not all architectures! Arbitrarily using provider locales.
pressure-vessel-wrap[4]: Using libdrm.so.2 from provider system for some but not all architectures! Will take /usr/share/libdrm from provider.
Do you have both 32- and 64-bit graphics drivers on the host system? To use Steam, you really should (I'm surprised it works without involving pressure-vessel). The container has some basic Mesa graphics drivers (more as an accident of the dependency stack than anything deliberate), but they're from Debian 10, so they won't support very recent hardware.
/run/pressure-vessel/pv-from-host/bin/pressure-vessel-adverb: /../lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.30' not found (required by /nix/store/8nn8q8p6frw05q6d87kdig8x3c4pxly6-libselinux-2.9/lib/libselinux.so.1)
That's a new error, so we're making progress!
Please could you get a log with `PRESSURE_VESSEL_VERBOSE=1` in the environment? That will make pressure-vessel give a lot more info about what it's thinking.
wrt. the FHS environment, it's actually /lib, /bin, etc. that are filled with absolute symlinks to /nix/..., and then /usr/x is just a symlink to /x for any x
That's the reverse of the /usr merge. pressure-vessel should cope equally well either way round, but if `/usr` was populated with a directory hierarchy (with symlinks to the libraries in `/nix`) and the compat symlinks were `lib -> usr/lib`, etc., that might get you onto more-commonly-tested code paths.
For reference, here is a log with `PRESSURE_VESSEL_VERBOSE=1`:
https://gist.github.com/moben/26dfe84ea4208bee42d98f40dd180c40
Full list of changes to get to this point:
* Added `/bin/ldconfig -C /etc/ld.so.cache` to the top of `SteamLinuxRuntime_soldier/_start-container-in-background`.
* Added `--filesystem=/nix` to the other `--filesystem` parameters in `SteamLinuxRuntime_soldier/_start-container-in-background`.
* Set `PRESSURE_VESSEL_VERBOSE=1` in `SteamLinuxRuntime_soldier/_v2-entry-point`, as it wasn't propagated when set as a launch option in steam.

python3: error while loading shared libraries: libexpat.so.1: cannot open shared object file: No such file or directory
pressure-vessel-launcher[669]: Child 715 died: wait status 32512
pressure-vessel-launch[702]: child 715 exited: wait status 32512
pressure-vessel-launch[702]: child exit code 715: 127
pressure-vessel-launch[702]: Exiting with status 127
/data/src/clientdll/installscript_posix.cpp (419) : Assertion Failed: Standalone evaluator returned error code for app 418530
/data/src/clientdll/installscript_posix.cpp (419) : Assertion Failed: Standalone evaluator returned error code for app 418530
seems relevant. Looks like either libexpat is missing from the container that steam runs in on NixOS to simulate something close to an FHS environment or a python3 from the runtime is called but can't find the lib from the runtime?
Edit:
Tried adding libexpat to the environment the steam client runs in, but the error remains. (might have made a mistake when adding the lib, https://github.com/moben/nixpkgs/commit/eada9fac16b09c7c45ea6994c206fd3f3203db33)
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/steam-games-unable-to-load-libstdc-so-6/9589/6
Progress:
The error above is from running the actual `proton` script. As a test, I replaced it with a bash script, which results in bash failing to find `libtinfo.so.6`. The interesting part here is that the NixOS bash doesn't link against that but the one in soldier does.
So binaries are run from the runtime but the libraries aren't present.
Edit: `/lib64/ld-linux-x86-64.so.2` is the one from the host. I think it's likely that this is due to the "reverse usrmerge", as @smcv suspected, as pressure-vessel mounts `/usr` but the binaries still hardcode `/lib64`. I think it makes sense to see if that can be switched in the NixOS wrapper for steam because it's unlikely that the binaries will change as the runtime is debian-based if I understand correctly.
Same error when `/lib`, `/bin`, ... are symlinks to the directories in `/usr`.
Also tried with the `client_beta` branch for the soldier runtime as I'm running the beta of the steam client as well. (Same setup/workarounds otherwise). There we get these errors:
GameAction [AppID 418530, ActionID 2] : LaunchApp changed task to Completed with ""
Game removed: AppID 418530 "", ProcID 857
Uploaded AppInterfaceStats to Steam
>>> Adding process 858 for game ID 418530
Exiting app 418530
No cached sticky mapping in ActivateActionSet.>>> Adding process 859 for game ID 418530
>>> Adding process 861 for game ID 418530
bwrap: Can't bind mount /oldroot/ on /newroot/: No such device
pressure-vessel-wrap[838]: Child process exited with code 1
pressure-vessel-wrap[838]: Exiting with status 1
ln: failed to create symbolic link '/tmp/SteamPVSockets.iS6Uav/SteamLinuxRuntime.4944e167c07d3386/socket' -> '': No such file or directory
pressure-vessel-adverb[829]: Child 831 exited with wait status 256
pressure-vessel-adverb[829]: Command exited with status 1
pressure-vessel-adverb[829]: No more child processes
pressure-vessel-adverb[829]: Exiting with status 1
Full log: https://gist.github.com/moben/aef8d7d553a25f352f77c27f2da4efba
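The raw "wait status" numbers in these logs (32512 earlier, 256 here) can be decoded by hand. This is a minimal sketch, assuming the usual Linux encoding where the exit code sits in bits 8 to 15 and a terminating signal in the low 7 bits; `decode_wait_status` is a hypothetical helper name, and stopped or continued statuses are not handled.

```sh
# Decode a raw wait status as printed by pressure-vessel.
# Assumption: usual Linux encoding (exit code in bits 8-15, signal in the low 7 bits).
decode_wait_status() {
    status=$1
    signal=$(( status & 127 ))
    if [ "$signal" -eq 0 ]; then
        echo "exited with code $(( (status >> 8) & 255 ))"
    else
        echo "killed by signal $signal"
    fi
}
```

With this decoding, 32512 corresponds to exit code 127 and 256 to exit code 1, which agrees with the "child exit code 715: 127" and "Command exited with status 1" lines in the logs.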
Another new error, therefore more progress! :-)
bwrap: Can't bind mount /oldroot/ on /newroot/: No such device
I think this is probably pressure-vessel's bwrap getting confused by being run inside another bwrap. It uses `pivot_root()`, which has confusing and poorly-documented requirements.
For your reference, the way the runtime is meant to work is:
* [...] `/run/host` (or sometimes `/run/gfx`), and creating symbolic links in `/overrides` (or sometimes `/usr/lib/pressure-vessel/overrides`, see the source code) that point into that directory, and setting environment variables that make graphics driver loaders like GLVND and vulkan-loader look there in preference to `/usr`.
* [...] `/overrides` in the same way we do for graphics drivers. We have to do this because libraries in the runtime are entitled to depend on `libfoo >= the runtime version`, and graphics drivers from the provider are entitled to depend on `libfoo >= the provider version`, so the only way we can guarantee to satisfy both of those dependencies is to choose whichever `libfoo` is newest.

This is all quite complicated, even on close-to-FHS systems like Fedora/SUSE (FHS, with 32-bit `/usr/lib` and 64-bit `/usr/lib64`), Debian/Ubuntu (FHS but with special multiarch library directories like `/usr/lib/x86_64-linux-gnu`), and Arch Linux (FHS but reversing the libQUAL directories so `/usr/lib` is 64-bit and `/usr/lib32` is 32-bit). As you can imagine, when you run it on an entirely non-FHS layout (NixOS) and trick it into looking like FHS by using bwrap, everything gets even more confusing.
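The "choose whichever `libfoo` is newest" rule can be illustrated with a plain version-string comparison. This is only a sketch under stated assumptions: it relies on `sort -V` (a GNU coreutils extension), `newest_version` is a hypothetical helper name, and the real comparison in pressure-vessel and libcapsule is not shown in this thread and is presumably more involved than comparing version strings.

```sh
# Print the newest of the version strings given as arguments.
# Assumption: GNU sort with -V (version sort) is available.
newest_version() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}
```

For example, `newest_version 2.28 2.31` picks `2.31`, so with those illustrative numbers the copy that satisfies both `>= 2.28` and `>= 2.31` dependencies is the one that ends up in `/overrides`.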
If the libraries we have stolen from your host system (graphics drivers and glibc) have any hard-coded paths in them, for example pointing into `/nix`, then we need to somehow make those paths exist inside the final container. This is part of why you have found that you need `--filesystem=/nix`.
We also need to make sure the glibc inside the final container can find standard Steam Runtime libraries like `libtinfo.so.6` in their Debian-style locations, otherwise none of our executables will work. On close-to-FHS systems like Fedora, Debian and Arch, the host system's glibc will automatically search `/etc/ld.so.cache`, and our container ships with a `/etc/ld.so.cache` that knows about all the libraries we provide in the Steam Runtime, so it works. However, on more unusual systems like Exherbo where the cache filename hard-coded into `ld.so` is different, this won't work, and on entirely non-FHS systems like NixOS, all bets are off.
We have some work-in-progress to add a mode where instead of relying on `LD_LIBRARY_PATH` to load `/overrides`, we run `ldconfig` during container setup to regenerate `ld.so.cache`. This is primarily for the non-Proton use-case (native Linux games), to fix games that incorrectly clear the `LD_LIBRARY_PATH`, instead of correctly prepending or appending their bundled library directories to it. However, as a side-effect this will definitely help Exherbo (because we can make the new cache available at the filenames their glibc expects, in addition to the filename our glibc expected), and might also help you.
So, a question for you: is there a location where we could put a `ld.so.cache`, such that the `ld.so` from NixOS' glibc will automatically load it, and will find all the libraries listed there?
The other way we could potentially make this work is that at the moment, pressure-vessel cannot run successfully inside a Flatpak container, unless you disable all the sandboxing/security (which is early prototype code rather than something we genuinely expect gamers to do). However, we want it to be able to run from the Flatpak version of Steam, either optionally or always using the freedesktop.org runtime (which is basically the same close-to-FHS layout as Debian) as the provider of graphics drivers. If NixOS has a working version of Flatpak, then this would provide a way to run Steam and pressure-vessel, bypassing all the issues that come with trying to reconcile NixOS' non-FHS layout with the close-to-FHS layout of the Steam Runtime.
As can be seen here, there is no fixed path from which ld.so.cache is read, it is instead read from `/nix/store/<...>/etc/ld.so.cache`. Even `/etc/ld.so.preload` is not used, and instead `/etc/ld-nix.so.preload` is read, as evident from here.
I am not sure why ld.so.cache is explicitly disallowed in this way in nixpkgs, though I assume that even if you removed that patch, nothing would break, since `/etc/ld.so.cache` doesn't exist anyway. I wonder if this is the only big thing that is preventing pressure-vessel from working.
Actually there is no `ld.so.cache` as far as I can tell. All nix-built binaries have an `RPATH` that includes all their dependencies. This allows installing multiple versions of the dependencies on the system (but requires a lot of rebuilds).
This also means that normally there are no libraries in the FHS system libdirs, so it really is a rather peculiar setup. The wrapper that is used to launch the steam client adds all paths from the "old" runtime to `LD_LIBRARY_PATH`, which makes library lookup work there.
I did some more testing by replacing proton with a wrapper that adds the runtime directories to `LD_LIBRARY_PATH`, which gets me a lot further, until:
No cached sticky mapping in ActivateActionSet.
pressure-vessel-adverb[636]: Failed to open file '/proc/636/task/636/children': No such file or directory
which I assume is caused in some way by the double-bwrap setup.
I did also figure out why we get `error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory` for all pressure-vessel binaries: `LD_PRELOAD` contains the steam overlay `gameoverlayrenderer.so`. If I clear `LD_PRELOAD` for each command in `_v2-entry-point`, that goes away.
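Clearing `LD_PRELOAD` for a single command, rather than for the whole script, can be sketched as a small wrapper function. This is an illustration only: `without_preload` is a hypothetical name, and the actual edit made to `_v2-entry-point` is in the linked gist rather than reproduced here.

```sh
# Run one command with LD_PRELOAD removed from its environment.
# The subshell keeps the variable intact for the calling script.
without_preload() {
    ( unset LD_PRELOAD; exec "$@" )
}
```

As noted earlier in the thread, removing `LD_PRELOAD` also disables the Steam overlay for that command, so this is a diagnostic aid rather than a fix.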
Full logs and all modifications here:
https://gist.github.com/moben/12ba6e8c4735e347f1cde92c17f5183d
Most of this would of course have to be worked around on the NixOS side. I think a workable way would be to add a tmpfs at `/nix/store/glibc<...>/etc/` inside the outer bwrap and write an `ld.so.conf` and `ld.so.cache` there to include the fake FHS and all other relevant directories.
@smcv Maybe it makes sense for pressure vessel to launch proton with an `LD_LIBRARY_PATH` containing all the runtime libdirs, as the host `ld-linux.so` in general can't be expected to know the debian multiarch layout? Or is this already supposed to happen? I did dump `env` from my proton wrapper at some point and at least in some invocations, those were missing from `LD_LIBRARY_PATH`.
@moben mounting a tmpfs at that path with bwrap would mean that /nix/store can't be mounted to /nix/store, and instead each path inside /nix/store has to be mounted individually, and then the glibc one has to be special cased, since it would need a bind for each file/directory inside that path, so that we can make a tmpfs at the path you propose. It is possible, but it isn't trivial, and bwrap can't even handle that many binds at the moment unfortunately.
Actually, never mind, it is possible since `/nix/store/...-glibc-.../etc` already exists: it only has a file named `rpc`. You wouldn't need what I described.
Maybe it makes sense for pressure vessel to launch proton with an `LD_LIBRARY_PATH` containing all the runtime libdirs, as the host `ld-linux.so` in general can't be expected to know the debian multiarch layout?
The host `ld-linux.so` doesn't need to know the Debian multiarch layout, as long as it knows the `/etc/ld.so.cache` path that is hard-coded in upstream glibc, and looks there for a `{ SONAME => path }` map. We had assumed that things that are hard-coded in upstream glibc are more or less portable between glibc systems.
Unfortunately for this approach, your glibc doesn't read `/etc/ld.so.cache`, because it has been patched to read `/nix/<a path we can't predict>/etc/ld.so.cache` instead...
In the longer term we want to become less reliant on `LD_LIBRARY_PATH`, and more reliant on generating our own `/etc/ld.so.cache` (from the list of paths that we know we want to search), because games have a habit of messing with the `LD_LIBRARY_PATH` in ways that happen to work on typical systems but fail horribly if the `LD_LIBRARY_PATH` is the only way you can find libraries. That might also be a bit faster.
I think what we would need to do for that to work on NixOS is to find your glibc in `/nix`, traverse upwards until we get to its `${prefix}`, mount a tmpfs on `${prefix}/etc`, and make `${prefix}/etc/ld.so.cache` a symlink to `/etc/ld.so.cache`. Or, you could maybe try doing the same in your outer `bwrap`?
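The "traverse upwards to `${prefix}`" step can be sketched with plain string handling. This assumes the `/nix/store/<hash>-<name>/...` layout seen in the paths quoted in this thread; `store_prefix` and `cache_redirect_args` are hypothetical helper names. The second function only prints candidate bwrap arguments (`--tmpfs` and `--symlink` are existing bwrap options) and does not run anything.

```sh
# Print the store prefix (/nix/store/<hash>-<name>) of a path inside it.
# Assumption: paths follow the /nix/store/<hash>-<name>/... layout.
store_prefix() {
    case "$1" in
        /nix/store/*/*)
            rest=${1#/nix/store/}
            printf '/nix/store/%s\n' "${rest%%/*}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Print (not run) bwrap arguments that put a tmpfs on ${prefix}/etc and
# symlink ${prefix}/etc/ld.so.cache to /etc/ld.so.cache.
cache_redirect_args() {
    prefix=$(store_prefix "$1") || return 1
    printf '%s\n' --tmpfs "$prefix/etc" \
        --symlink /etc/ld.so.cache "$prefix/etc/ld.so.cache"
}
```

Given the path of the host's `libc.so.6` or `ld-linux-x86-64.so.2`, the printed arguments could be appended to an outer bwrap command line. Whether a tmpfs over `${prefix}/etc` hides anything that glibc still needs there is not settled by this sketch.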
fyi, with the changes in the linked PR I only need the `--filesystem=/nix` workaround.
Now `/etc/ld.so.conf` and `/etc/ld.so.cache` both exist and I use the outer bwrap to create a tmpfs in the unpredictable NixOS-specific location where glibc looks for these and symlink them to `/etc`. Additionally, `ldconfig` is wrapped to write/read the files in `/etc`.
Not really sure if that gets me further, as the proton script seems to not launch anymore but it greatly reduces the amount of errors: https://gist.github.com/moben/e95d828095cf7adf96811c35006995ed
bwrap: execvp /usr/lib/pressure-vessel/from-host/bin/pressure-vessel-adverb: No such file or directory
everything else seems to launch okay though, so I assume there is some issue propagating the necessary mounts to the inner bwrap, so libc ends up broken in there. Or the adverb binary actually doesn't exist.
Is it supposed to create 7(?) temporary `SteamLinuxRuntime_soldier/var/tmp-*` chroots during a single run?
Any breakthroughs on this? Thanks!
Please be patient, and in the meantime, use Proton 5.0 or older on NixOS. Solving this is very likely to need changes in both the NixOS packaging and pressure-vessel.
I have a half-finished pressure-vessel branch that will probably help this (and Exherbo, and other non-FHS OS layouts), but it isn't the highest-priority topic that I have to work on.
Is it supposed to create 7(?) temporary SteamLinuxRuntime_soldier/var/tmp-* chroots during a single run?
No. If it's creating any temporary chroots in `var/tmp-*`, then it's got into a mode that might become the default later, but wasn't really meant to be the default yet. The idea is that instead of relying on `LD_LIBRARY_PATH` to force games to find the right libraries, we make a "cheap" copy of the runtime (using hardlinks or btrfs reflinks to avoid actually duplicating file content), then go through the copy and delete all the libraries that are older versions of the ones we got from the host system, so that the game has no choice but to use the ones we wanted it to use.
In that mode, there is meant to be exactly one copy of the runtime in `var/` at any given time: before creating a new copy, pressure-vessel-wrap deletes old copies that are no longer in use. However, if the game is failing to start up (as it does on NixOS, because something breaks our assumptions), then it might be leaving a process running that holds a lock on the old copy, meaning that we detect the old copy as still in use, and don't delete it.
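The "cheap copy, then delete" idea can be shown in miniature with hardlinks. This is a sketch under stated assumptions: it uses GNU `cp -al`, needs source and destination on the same filesystem, and `cheap_copy` and `drop_from_copy` are hypothetical helper names. pressure-vessel's own implementation is C code that is not shown in this thread.

```sh
# Make a hardlink-based copy of a runtime tree.
# Assumption: GNU cp (-a preserves attributes, -l hardlinks instead of copying data).
cheap_copy() {
    cp -al "$1" "$2"
}

# Remove one library from the copy only. The original tree keeps its own
# directory entry, so the file content stays available there.
drop_from_copy() {
    rm -f "$1/$2"
}
```

After `cheap_copy runtime copy` and `drop_from_copy copy lib/libfoo.so.1`, a program that sees only `copy` cannot load the runtime's `libfoo.so.1`, while `runtime` itself is unchanged.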
pressure-vessel was originally designed to wrap a single command, like bwrap - we had to bolt on support for using a single container for multiple commands in order to support Proton games, and the interface between Steam and pressure-vessel isn't ideal.
Each attempt at running a game is meant to create one container session, which in the case of Proton/Wine games is shared between zero or more setup commands (installing DLLs into the Wine prefix) and the actual game. I suspect that what you're seeing is:
(because if a setup command fails, Steam carries on to the next one anyway - it doesn't give us a way to tell it that we failed to set up the container and it should give up now).
One thing that might be helpful would be to try to run something simpler in the container: instead of running a complete Proton game, try to just run an xterm. When I'm back at work next week I'll put together a suitable command to try.
Please could someone try this, in either the `SteamLinuxRuntime` or `SteamLinuxRuntime_soldier` directory?
LD_DEBUG=files PRESSURE_VESSEL_VERBOSE=1 ./run xterm
It will produce a lot of output, but that output should help us to see what is going wrong, and will hopefully point us towards a solution or workaround, either in pressure-vessel or in NixOS.
Or it might be easier if someone can explain step-by-step how to get from https://nixos.org/download.html#nixos-iso or https://nixos.org/download.html#nixos-virtualbox to being able to see the failure occur. I am not familiar with how NixOS works, and I can't dedicate enough time and space to this to be able to install all the operating systems people might want to use pressure-vessel on, but if I can use a virtual machine as a shortcut then that might be enough.
I did, with the `SteamLinuxRuntime_soldier` folder.
/bin/bash : mauvais interpréteur: No such file or directory (French locale; "mauvais interpréteur" means "bad interpreter")
Yeah, `/bin/bash` doesn't exist in NixOS. Use the portable version `/usr/bin/env bash` instead.
If you are making a FHS environment exist for Steam's benefit anyway, I would recommend making sure `/bin/bash` (and `/bin/sh`, and other commonly-hard-coded paths) exist there. Even if pressure-vessel doesn't need that path, games that run in the `LD_LIBRARY_PATH` runtime will very frequently hard-code it.
If you change the `#!/bin/bash` scripts in `SteamLinuxRuntime_soldier` and `SteamLinuxRuntime_soldier/pressure-vessel/bin` to `#!/usr/bin/env bash`, does that get you further?
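The shebang change suggested here can be expressed as a small `sed` filter. This is a sketch only: `fix_shebang` is a hypothetical helper name, and it deliberately rewrites only an exact `#!/bin/bash` first line, because a shebang with extra arguments would reach `env` as a single argument on Linux.

```sh
# Rewrite an exact "#!/bin/bash" first line to "#!/usr/bin/env bash".
# Reads a script on stdin and writes the result to stdout; other lines pass through.
fix_shebang() {
    sed '1s|^#!/bin/bash$|#!/usr/bin/env bash|'
}
```

It could be applied to a copy of a script, for example `fix_shebang < run > run.new`, so that the original stays available for comparison.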
I'm mostly interested in having @moben run the command I posted, because they seem to have a suitably large tower of workarounds to be closer to being able to find the root cause. If other people are not using the same configuration as @moben then there's a risk that we will be spending time debugging things that @moben already knows about, which is time that I cannot really afford to spend.
@smcv, I ran this which gave me this before replacing shebangs and this after replacing the shebangs.
@smcv, here are also some step-by-step instructions to set up a NixOS VM with Proton 5.13: https://gist.github.com/kimat/cf0f1a36302e5d992ef681f04cf33c5e
@davidak `/bin/bash`, `/bin/sh` and a lot of other things exist inside the FHS environment that steam runs in. The `No such file` errors are just what happens if `ld.so.cache` doesn't exist/contain the necessary libraries and you try to run the bash from the soldier runtime. See the linked draft PR.
@smcv Thanks a lot for your time and explanations. Here is a log of running `LD_DEBUG=files PRESSURE_VESSEL_VERBOSE=1 ./run xterm`:
https://gist.github.com/moben/54283ad25fe94fa111f535ce92b5709a
Ends again with bwrap: execvp /usr/lib/pressure-vessel/from-host/bin/pressure-vessel-adverb: No such file or directory
Makes sense as that does not exist on the FHS host, but I think it's again some dynamic loader issue. (I'm running this in an FHS environment that is identical to the one steam itself is running in).
Interestingly, I managed to get two other errors in my own testing recently:
When running a game with soldier and Proton 5.13 from within steam I got the same error as above a while ago. I added bind mounts for /* to /run/host/*, as it seemed to try and use the dynamic loader from there for capture-libs?
This gets me a lot further: https://gist.github.com/moben/8f54dccc7e02658b756e0fc90b0abd67
capture-libs actually runs but complains that all the absolute paths inside the fake FHS point outside /run/host:
x86_64-linux-gnu-capsule-capture-libs: warning: "/nix/store/jq7ikqkazzazydfgghia6n1zww1nnbiq-libglvnd-1.3.2/lib/libEGL.so.1.1.0" is not within prefix "/run/host"
(and so on)
I actually almost get to run Spelunky 2 this way but of course all the drivers are missing...
I also tried to do some testing with the debugging tools mentioned on https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/tree/master/pressure-vessel#debugging but I always run into
(pressure-vessel-wrap:590): pressure-vessel-CRITICAL **: export_contents_of_run: assertion `g_file_test ("/.flatpak-info", G_FILE_TEST_EXISTS)' failed
btw, that check in https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/master/pressure-vessel/wrap.c#L312 looks backwards, or the comment above is wrong? It bails out if that does not exist but the comment says that it won't work in flatpak.
No new workarounds except mounting the FHS env also at /run/host. I think most of the current issues stem from the non-default mode that I seem to be getting into. Or is it supposed to use/expect the dynamic loader in /run/host?
I suspect pressure-vessel correctly detects that it's running in bwrap, but this FHS environment isn't Flatpak-based, and somewhere bwrap and Flatpak are conflated.
I'll try investigating why the test xterm command fails while running pressure-vessel from within steam somewhat works.
Thanks, I'll try to have a look at the logs next week.
Because pressure-vessel-wrap is entering a container that rearranges the filesystem, the error messages can get a bit confusing: it's not always obvious whether the absolute path you see is meant to exist inside the container, outside the container or both. In your case there's an extra layer of container, adding to the confusion.
I added bind mounts for /* to /run/host/*, as it seemed to try and use the dynamic loader from there for capture-libs?
/run/host is not normally meant to exist in the environment from which you ran pressure-vessel-wrap. It is meant to exist in the environment created by pressure-vessel-wrap, and contains your /usr, /bin, /sbin and /lib*.
I think part of the problem might be that pressure-vessel is assuming that because it sees a library path /nix/... on the outside, it should be using /run/host/nix/... inside, which is wrong, because /run/host only contains /usr, /bin, /sbin and /lib*. In the next release this part should be fixed, because we merged https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/merge_requests/173 (for the benefit of Vulkan layers in $HOME or graphics drivers in /opt, but it should help you too).
/usr/lib/pressure-vessel/from-host/bin/pressure-vessel-adverb: No such file or directory
This is not meant to exist outside pressure-vessel's container. It is meant to exist inside. You can inspect it by considering what would happen if the temporary copy in var/tmp-*/ was the root filesystem, with your /usr and so on mounted under /run/host.
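One way to do that inspection is to prefix a container-absolute path with the location of the temporary copy and see whether anything is there. The helper below is a hypothetical sketch (the name `exists_in_copy` is not part of pressure-vessel), and it does not account for /run/host, which only exists once the container is running:

```shell
#!/usr/bin/env bash
# Hypothetical helper: treat copy_root (e.g. one var/tmp-* directory) as if it
# were "/" and report whether container_path exists below it.
exists_in_copy() {
    copy_root=$1
    container_path=$2
    # -L also accepts symlinks whose absolute targets only resolve
    # correctly once the copy really is the root filesystem.
    [ -e "$copy_root$container_path" ] || [ -L "$copy_root$container_path" ]
}
```

For example, `exists_in_copy "$copy" /usr/lib/pressure-vessel/from-host/bin/pressure-vessel-adverb` would check for the binary named in the error message, where `$copy` is whichever var/tmp-* directory was created for that run.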
In the mode that creates a temporary copy in var/tmp-*, the pressure-vessel directory ends up hard-linked into the container's /usr/lib/pressure-vessel/from-host (it's set up like that to make it more likely that it can work when we do https://github.com/flatpak/flatpak/issues/3797); in the mode that doesn't, it ends up mounted at /run/pressure-vessel/from-host (which Flatpak probably won't let us do).
I'm still not entirely sure why that mode is active for you, but for cross-referencing with the source code and wrapper scripts, it's activated by --copy-runtime-into or environment variable PRESSURE_VESSEL_COPY_RUNTIME_INTO. It does have the benefit that you can look at the copy in var/tmp-*.
I also tried to do some testing with the debugging tools mentioned on https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/tree/master/pressure-vessel#debugging
Sorry, that document might well be out of date - most of the text is from early development before the public release of the scout container runtime, which was itself quite a while before soldier.
You will probably need to run those command-lines from your FHS environment, and add at least --filesystem=/nix to all of them, for the same reason you needed it before.
Starting a container with a copy of the host system instead of a runtime is meant to still work, but Steam never does that and we don't regularly test it, so it isn't surprising if there have been regressions. The logic for whether we're in Flatpak or not does seem wrong there. Using --runtime=/path/to/SteamLinuxRuntime_soldier/soldier/files is more likely to work.
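Putting those two suggestions together, a test command might look roughly like the one printed by this sketch. It assumes pressure-vessel-wrap sits at pressure-vessel/bin/pressure-vessel-wrap inside the SteamLinuxRuntime_soldier folder. The function name `print_wrap_command` is hypothetical, and it only prints the command for review rather than running it:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not execute) a pressure-vessel-wrap command that
# uses the soldier runtime and shares /nix, as suggested above.
# $1 is the path to SteamLinuxRuntime_soldier; the rest is the command to run.
print_wrap_command() {
    soldier_dir=$1
    shift
    # Assumed location of the wrapper inside the soldier folder.
    printf '%s\n' "$soldier_dir/pressure-vessel/bin/pressure-vessel-wrap --runtime=$soldier_dir/soldier/files --filesystem=/nix -- $*"
}
```

For instance, `print_wrap_command "$PWD/SteamLinuxRuntime_soldier" xterm` prints a command line that can be checked and then pasted into a shell inside the FHS environment.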