Nixpkgs: `iris` mesa driver segfaults

Created on 1 Aug 2020 · 24 Comments · Source: NixOS/nixpkgs

Describe the bug

Core was generated by `glxgears'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f47d6ccb639 in _mesa_sha1_format () from /run/opengl-driver/lib/dri/iris_dri.so
(gdb) bt
#0  0x00007f47d6ccb639 in _mesa_sha1_format () from /run/opengl-driver/lib/dri/iris_dri.so
#1  0x00007f47d70e8465 in iris_disk_cache_init () from /run/opengl-driver/lib/dri/iris_dri.so
#2  0x00007f47d70e812e in iris_screen_create () from /run/opengl-driver/lib/dri/iris_dri.so
#3  0x00007f47d69310b7 in pipe_iris_create_screen () from /run/opengl-driver/lib/dri/iris_dri.so
#4  0x00007f47d6f5e358 in pipe_loader_create_screen () from /run/opengl-driver/lib/dri/iris_dri.so
#5  0x00007f47d69335e8 in dri2_init_screen () from /run/opengl-driver/lib/dri/iris_dri.so
#6  0x00007f47d6e81b87 in driCreateNewScreen2 () from /run/opengl-driver/lib/dri/iris_dri.so
#7  0x00007f47d7d7b4d7 in dri3_create_screen () from /run/opengl-driver/lib/libGLX_mesa.so.0
#8  0x00007f47d7d6a399 in __glXInitialize () from /run/opengl-driver/lib/libGLX_mesa.so.0
#9  0x00007f47d7d65f14 in GetGLXPrivScreenConfig () from /run/opengl-driver/lib/libGLX_mesa.so.0
#10 0x00007f47d7d66935 in glXChooseVisual () from /run/opengl-driver/lib/libGLX_mesa.so.0
#11 0x000000000040404d in make_window.constprop ()
#12 0x000000000040256f in main ()

To Reproduce
Steps to reproduce the behavior:

  1. Add the following to the system configuration
{
  hardware.opengl.package = (pkgs.mesa.override {
    galliumDrivers = [ "radeonsi" "virgl" "swrast" "iris" ];
  }).drivers;
}
  2. Run glxgears or the X server

Metadata

 - system: `"x86_64-linux"`
 - host os: `Linux 5.4.54, NixOS, 20.09pre236419.a45f68ccac4 (Nightingale)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3.7`
 - channels(root): `"nixos-20.09pre236419.a45f68ccac4"`
 - channels(suhr): `"home-manager"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

May be related to #91145.

bug

All 24 comments

I believe we're building iris in mesa by default now?

It seems that without this option the i965 driver is used.

Would this be solved with patchelf from master? https://github.com/NixOS/patchelf/pull/218

I hope so. Can you test this diff? #94532

Yes. How do I test it?

@suhr: easiest way I know is rebuilding your system against that commit by adding -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/0c34580.tar.gz to your nixos-rebuild command. IIRC you don't even need to reboot as the /run/opengl-driver* symlinks should get switched on activation.

With this change:

  1. Mesa seems to actually be built with the iris driver, even without extra configuration
  2. There's no segfault

@suhr did you test #94532 with your Mesa override?:

{
  hardware.opengl.package = (pkgs.mesa.override {
    galliumDrivers = [ "radeonsi" "virgl" "swrast" "iris" ];
  }).drivers;
}

I just tried to update Mesa on my system and got the segmentation fault with iris as well. When I tried to test #94532 it didn't cause any rebuilds and I had to drop my mesa.override { ... } so that the workaround in pkgs/top-level/all-packages.nix was applied:

  mesa = callPackage ../development/libraries/mesa {
    llvmPackages = llvmPackages_9;
    inherit (darwin.apple_sdk.frameworks) OpenGL;
    inherit (darwin.apple_sdk.libs) Xplugin;
  }
    # Temporary fix for .drivers that avoids causing lots of rebuilds; see #91145
     // { drivers = (mesa.overrideAttrs (a: {
            nativeBuildInputs = [ patchelf_0_9 ] ++ a.nativeBuildInputs or [];
          })).drivers;
        }
    ;

I guess the problem is that using override on mesa doesn't re-apply the temporary fix for the drivers output (and drivers will be updated due to the override).
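
A minimal, untested sketch of how the workaround could be re-applied on top of an override. It assumes `patchelf_0_9` is exposed as `pkgs.patchelf_0_9` in the Nixpkgs revision in use (it is the attribute referenced in the all-packages.nix snippet above) and simply mirrors that snippet's `overrideAttrs` call. This is not the resolution the thread settles on; later comments recommend dropping the override instead.

```nix
{ pkgs, ... }:
{
  hardware.opengl.package =
    # 1. Override the Gallium driver list, as in the original report.
    #    The result no longer carries the `// { drivers = ...; }` fix.
    ((pkgs.mesa.override {
      galliumDrivers = [ "radeonsi" "virgl" "swrast" "iris" ];
    # 2. Re-apply the temporary patchelf workaround by hand
    #    (assumption: same fix as in all-packages.nix, see #91145).
    }).overrideAttrs (a: {
      nativeBuildInputs = [ pkgs.patchelf_0_9 ] ++ a.nativeBuildInputs or [];
    })).drivers;
}
```

Because the override changes the derivation, this rebuilds Mesa locally rather than using the binary cache.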

So on my setup glxgears already works with iris with only the previous fix (#91145, without #94532), as long as I don't override Mesa.

@vcunat maybe we should use another approach here (I don't mind, but I'm not sure how many users override Mesa)?

As a long-term improvement, I'd favor doing https://github.com/NixOS/nixpkgs/issues/44831, and then stuff becomes easier.

BTW, I originally wanted to fold the hack into the main mesa build on our staging branch, but apparently I forgot.

> When I tried to test #94532 it didn't cause any rebuilds

Yes, I also had this problem with override. Fortunately, it seems like I don't need it anyway.

So we merge #94532 to master and immediately do the rebuild in staging? Or some other suggestion?

By the rebuild I mean using the newest patchelf for the whole mesa attribute, so the .drivers override isn't there to complicate the situation.

@vcunat sounds good to me, thanks :)

@vcunat

Following https://nixos.wiki/wiki/Intel_Graphics and having a config that looks like this:

  config = {
    environment.variables = {
      MESA_LOADER_DRIVER_OVERRIDE = "iris";
    };
    hardware.opengl.package = (pkgs.mesa.override {
      galliumDrivers = [ "nouveau" "virgl" "swrast" "iris" ];
    }).drivers;
  };

Then nixos-rebuild switch -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/0c34580.tar.gz

This does not seem to fix the issue for me. Any ideas?

Additionally, building against master does not change matters.

For what it's worth, I thought I'd post some segfault issues with Iris that I found on the net.

https://bugs.launchpad.net/ubuntu/+source/mesa/+bug/1877879
https://bugs.archlinux.org/task/65580

I'd try getting a stack trace as in the initial post. (So far I only see you mention it's _some_ segfault.)

Actually, you might be running into https://github.com/NixOS/nixpkgs/issues/94444#issuecomment-667655384

Current mesa in auto mode prints

Gallium drivers: r300 r600 radeonsi nouveau virgl svga swrast iris

so in your place I'd remove the hardware.opengl.package definition to avoid possible interaction with override in the commit that you're testing.

@MatthewCroughan as @vcunat already pointed out this is most certainly due to your mesa override. You can drop it (including environment.variables.MESA_LOADER_DRIVER_OVERRIDE) on nixos-unstable as it's already the default. I've updated the Wiki page (https://nixos.wiki/wiki/Intel_Graphics - thanks for linking it in your comment).
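
For reference, a sketch of what the relevant part of the configuration might look like once both settings are dropped. The `hardware.opengl.enable` line is an assumption about the rest of the setup and is not shown anywhere in this thread; the commented-out lines are the ones being removed.

```nix
{ ... }:
{
  # Assumption: OpenGL support is enabled the usual way; the thread itself
  # does not show this line.
  hardware.opengl.enable = true;

  # Removed, per the advice above, so the default Mesa build (which already
  # includes iris on nixos-unstable) is used:
  #   hardware.opengl.package = (pkgs.mesa.override { ... }).drivers;
  #   environment.variables.MESA_LOADER_DRIVER_OVERRIDE = "iris";
}
```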

@primeos Hmm, why is it then that launching chromium as provided by cachix from https://github.com/colemickens/nixpkgs-chromium starts with failed to load driver: iris if it is provided by default on unstable?

@vcunat it is the same error as noted in https://bugs.archlinux.org/task/65580

segfault at 7f00ba6ca298 ip 00007f00b992a526 sp 00007ffe24f628c0 error 4 cpu 1 in iris_dri.so[7f00b8eaf000+db7000]

The segfault only occurs if I specify

  config = {
    environment.variables = {
      MESA_LOADER_DRIVER_OVERRIDE = "iris";
    };
    hardware.opengl.package = (pkgs.mesa.override {
      galliumDrivers = [ "nouveau" "virgl" "swrast" "iris" ];
    }).drivers;
  };

On unstable. Removing this just results in all of the applications that want to use Iris not finding it, such as Chromium.

> @primeos Hmm, why is it then that launching chromium as provided by cachix from https://github.com/colemickens/nixpkgs-chromium starts with failed to load driver: iris if it is provided by default on unstable?

Try launching it with export LIBGL_DEBUG=verbose. Could be due to incompatible versions like in https://github.com/NixOS/nixpkgs/issues/94315#issuecomment-667942575 (I assume that occasionally happens after glibc updates but I didn't have a closer look).

> The segfault only occurs if I specify

Yes, just drop that part.

> Removing this just results in all of the applications that want to use Iris not finding it, such as Chromium.

Most likely not all, only the ones that are from a newer Nixpkgs revision than Mesa. And technically they most likely find iris but fail to load it.

@primeos

[matthew@t480:~]$ chromium
Fontconfig warning: "/etc/fonts/fonts.conf", line 86: unknown element "blank"
MESA-LOADER: failed to open iris (search paths /run/opengl-driver/lib/dri)
failed to load driver: iris
MESA-LOADER: failed to open kms_swrast (search paths /run/opengl-driver/lib/dri)
failed to load driver: kms_swrast
MESA-LOADER: failed to open swrast (search paths /run/opengl-driver/lib/dri)
failed to load swrast driver
[3916:3996:0806/230454.191393:ERROR:object_proxy.cc(632)] Failed to call method: org.freedesktop.DBus.Properties.Get: object_path= /org/freedesktop/UPower: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.UPower was not provided by any .service files
[3916:3996:0806/230454.192159:ERROR:object_proxy.cc(632)] Failed to call method: org.freedesktop.UPower.GetDisplayDevice: object_path= /org/freedesktop/UPower: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.UPower was not provided by any .service files
[3916:3996:0806/230454.192913:ERROR:object_proxy.cc(632)] Failed to call method: org.freedesktop.UPower.EnumerateDevices: object_path= /org/freedesktop/UPower: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.UPower was not provided by any .service files
libEGL warning: MESA-LOADER: failed to open iris (search paths /run/opengl-driver/lib/dri)

[3945:3945:0806/230454.233916:ERROR:gl_surface_egl.cc(698)] EGL Driver message (Critical) eglInitialize: DRI2: failed to load driver
libEGL warning: MESA-LOADER: failed to open swrast (search paths /run/opengl-driver/lib/dri)

libEGL warning: MESA-LOADER: failed to open swrast (search paths /run/opengl-driver/lib/dri)

[3945:3945:0806/230454.236343:ERROR:gl_surface_egl.cc(698)] EGL Driver message (Error) eglInitialize: eglInitialize
[3945:3945:0806/230454.236732:ERROR:gl_surface_egl.cc(1181)] eglInitialize Default failed with error EGL_NOT_INITIALIZED
[3945:3945:0806/230454.236880:ERROR:gl_ozone_egl.cc(20)] GLSurfaceEGL::InitializeOneOff failed.
[3945:3945:0806/230454.248689:ERROR:viz_main_impl.cc(152)] Exiting GPU process due to errors during initialization
MESA-LOADER: failed to open iris (search paths /run/opengl-driver/lib/dri)
failed to load driver: iris
MESA-LOADER: failed to open kms_swrast (search paths /run/opengl-driver/lib/dri)
failed to load driver: kms_swrast
MESA-LOADER: failed to open swrast (search paths /run/opengl-driver/lib/dri)
failed to load swrast driver

I'm closing this issue as the segfault is resolved (at least if one drops the old mesa override or after 9febe2f8fcf302b98643ea035824ba82b1b28d29 lands in nixos-unstable).

@MatthewCroughan Unfortunately your log doesn't provide much useful information. What you're seeing is a separate issue that doesn't result in a segfault, so feel free to open a new issue (edit: probably #94315) if you want to track it. The current log won't be sufficient for that, though, and this might only be a temporary problem due to "incompatible" versions (mesa + chromium).
