Describe the bug
A clear and concise description of what the bug is.
To Reproduce
Steps to reproduce the behavior:
dmesg
output:
[drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
show_signal_msg: 24 callbacks suppressed
GpuWatchdog[27365]: segfault at 0 ip 00007f271e83623d sp 00007f27038c5760 error 6 in libcef.so[7f271aab0000+69a4000]
Code: 00 79 09 48 8b 7d a0 e8 01 80 c1 02 41 8b 85 00 01 00 00 85 c0 0f 84 ab 00 00 00 49 8b 45 00 4c 89 ef be 01 00 00 00 ff 50 58 <c7> 04 25 00 00 00 00 37 13 00 00 c6 05 a1 a5 37 03 01 80 bd 7f ff
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
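For anyone comparing logs, a rough way to count how often the ring timed out is to filter the kernel log for the amdgpu_job_timedout lines shown above. This is only a sketch: the function name count_ring_timeouts is made up for illustration, and the pattern assumes your messages are worded like the ones in this report.

```shell
# Count "ring ... timeout" messages from amdgpu in a kernel log read on stdin.
# Hypothetical helper; the pattern is based only on the log lines quoted above.
count_ring_timeouts() {
  # grep -c prints the number of matching lines (0 if none match).
  grep -c 'amdgpu_job_timedout.*ring .* timeout'
}
```

Typical use would be `dmesg | count_ring_timeouts` (reading dmesg may require root on some systems).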
Additional context
amdgpu is able to recover without a hard reboot on kernel 5.7.7, but not on 5.4.50. The kernel version doesn't affect whether the crash happens in the first place, though.
Overriding hardware.opengl.package{,32} with mesa-20.1.3 does fix the issue on my machine.
Notify maintainers
@primeos @vcunat
Metadata
"x86_64-linux"
Linux 5.7.7, NixOS, 20.09pre-git (Nightingale)
yes
yes
nix-env (Nix) 2.3.6
""
"nixos-19.09-19.09.809.5000b1478a1"
/nix/store/3v5m83bfhwjy0k2y4yblh01cvqv00igr-nixpkgs
Maintainer information:
# a list of nixpkgs attributes affected by the problem
attribute:
- mesa
# a list of nixos modules affected by the problem
module:
I also have this problem. Could you share how you have overridden mesa?
I don't seem to have this problem for some reason:
"x86_64-linux"
Linux 5.7.7, NixOS, 20.09pre233323.dc80d7bc4a2 (Nightingale)
yes
yes
nix-env (Nix) 2.3.6
"nixos-20.09pre233323.dc80d7bc4a2"
"home-manager, nixos-20.03-20.03.2491.6a00eba02a3, nixos-unstable-20.09pre233323.dc80d7bc4a2, nixpkgs-unstable-20.09pre233849.1d801806827, nixpkgs-20.03-20.03.1812.14dd961b8d5"
/nix/var/nix/profiles/per-user/root/channels/nixos
I'm using a Radeon RX 590 with OpenGL version string: 4.6 (Compatibility Profile) Mesa 20.0.8
Thanks @ashkitten! With your fix I can start steam again :100: Dirt Rally 2 crashes however, but Half Life 2 and Portal do work. Did not have the time to test more.
PS. I have a Radeon RX 5700XT
Hmm, not sure if related, but while steam starts, I also get a similar error when trying to launch anything that uses vulkan (so basically anything that runs on proton). Steam itself works fine, as do any games that use OpenGL for rendering (forcing the OpenGL-based WineD3D makes other games run too, but is not a feasible workaround for performance reasons). With mesa 20.0.8 it corrupts the screen and messes up the X session (cursor moves, no interaction possible), but switching to a different console works and I can kill the X session from there to recover. With mesa 20.1.3 vulkan applications simply crash. Trying to run a vulkan triangle tutorial I get this in stdout:
ac_rtld error: !part->elf
ELF error: (null)
Segmentation fault (core dumped)
and dmesg has these lines:
triangle[4080]: segfault at d4 ip 00007fd5ad3a8d36 sp 00007fff9c4a2f40 error 4 in libvulkan_radeon.so[7fd5ad311000+3e6000]
Code: 4c 8b a3 68 04 00 00 31 f6 31 ff 4d 8b 95 50 1e 00 00 48 8d 8b 40 04 00 00 c7 83 c0 05 00 00 00 b9 00 00 4c 8d 83 70 04 00 00 <41> 0f b6 84 24 d4 00 00 00 41 b9 00 01 00 00 08 83 38 04 00 00 58
EDIT: Radeon RX5700XT here as well
I reverted mesa back to 20.0.2; with that version at least Dirt Rally 2 works again (not sure if that uses Vulkan).
According to steam, Dirt Rally 2.0 requires DirectX 11, which would get translated to Vulkan by DXVK unless you put PROTON_USE_WINED3D=1 %command% in your launch options (with that env variable set it uses the OpenGL-based WineD3D backend). If it works that way (performance aside), it is likely a RADV issue separate from the steam crashes.
Should be fixed with the next channel update as https://github.com/NixOS/nixpkgs/commit/0e93ae3f67c84385937bda0bb61db89c847fba20 is now in master. Thanks for the bug report.
Upgrading to 20.1.3/4 didn't fix this for me. I had to remove ~/.cache/radv_builtin_shaders*. Satisfactory was crashing 100% of the time with ac_rtld error: !part->elf, and now it works...
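A minimal sketch of that cleanup, for anyone who wants to see what gets deleted first. The function name clear_radv_builtin_cache is hypothetical, and it assumes the cache files sit directly in ~/.cache as in the comment above; pass another directory as the first argument if yours differs.

```shell
# Remove RADV's builtin shader cache entries so they are regenerated on the
# next launch. Hypothetical helper; defaults to ~/.cache as reported above.
clear_radv_builtin_cache() {
  cache_dir="${1:-$HOME/.cache}"
  for f in "$cache_dir"/radv_builtin_shaders*; do
    # When the glob matches nothing, $f is the literal pattern; skip it.
    [ -e "$f" ] || continue
    echo "removing $f"
    rm -rf -- "$f"
  done
}
```

Running `clear_radv_builtin_cache` with no arguments targets ~/.cache; only entries whose names start with radv_builtin_shaders are touched.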
I have a feeling caching is broken across the board in mesa right now due to disk_cache_get_function_identifier falling back to timestamps, which are 0 in the nix store.
I did some investigating and fortunately it does seem to be limited to radv, because radv isn't currently built with --build-id=sha1.
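To check this on your own system, something like the following can show the two inputs discussed here for a given driver library: its modification time and whether it carries a build ID note. This is a sketch that assumes GNU stat and binutils readelf are available; the function names are made up, and you need to supply the path to your own libvulkan_radeon.so.

```shell
# Print the modification time (seconds since the epoch) of a file.
# Assumes GNU coreutils stat.
mtime_of() {
  stat -c %Y -- "$1"
}

# Succeed if the file is an ELF object with a build ID note.
# Assumes binutils readelf; returns non-zero for non-ELF files or when
# no build ID note is present.
has_build_id() {
  readelf -n "$1" 2>/dev/null | grep -q 'Build ID'
}
```

For example, `mtime_of /path/to/libvulkan_radeon.so` followed by `has_build_id /path/to/libvulkan_radeon.so && echo "has build id"` shows whether a cache key would have to fall back to the timestamp.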
My proposed fix is in #93946.