While I was handling some official business, my desktop suddenly crashed and restarted. I don't know what's wrong; I just collected some key journal logs, which I hope will help. This has happened several times within a few days.
Apr 08 14:29:41 aart-ThinkPad-X1 io.elementary.cerbere.desktop[31178]: [16014:16033:0408/142941.961758:ERROR:socket_stream.cc(218)] Closing stream with result -2
Apr 08 14:32:54 aart-ThinkPad-X1 kernel: gala[17631]: segfault at 188 ip 00007f114fd1ea00 sp 00007ffd5c502e38 error 4 in libmutter-2.so.0.0.0[7f114fc75000+159000]
Apr 08 14:32:54 aart-ThinkPad-X1 kernel: Code: 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 50 c3 90 66 2e 0f 1f 84 00 00 00 00 00 8b 87 90 00 00 00 c3 66 0f 1f 84 00 00 00 00 00
Apr 08 14:32:54 aart-ThinkPad-X1 gnome-session[30783]: gnome-session-binary[30783]: WARNING: Application 'gala.desktop' killed by signal 11
Apr 08 14:32:54 aart-ThinkPad-X1 gnome-session-binary[30783]: WARNING: Application 'gala.desktop' killed by signal 11
Apr 08 14:32:54 aart-ThinkPad-X1 gala.desktop[19305]: Window manager warning: Trying to re-add keybinding "switch-to-workspace-last".
Apr 08 14:32:54 aart-ThinkPad-X1 gala.desktop[19305]: Window manager warning: Trying to re-add keybinding "move-to-workspace-last".
Apr 08 14:32:54 aart-ThinkPad-X1 gala[19305]: Preferences.vala:192: '/usr/share/plank/themes/Transparent/dock.theme' is read-only!
Apr 08 14:32:54 aart-ThinkPad-X1 gala[19305]: Preferences.vala:378: Missing key 'OuterStrokeColor' for group 'PlankTheme' in preferences file '/usr/share/plank/themes/Transparent/dock.theme' - using default value
Apr 08 14:32:54 aart-ThinkPad-X1 gala[19305]: Preferences.vala:378: Missing key 'FillStartColor' for group 'PlankTheme' in preferences file '/usr/share/plank/themes/Transparent/dock.theme' - using default value
Apr 08 14:32:54 aart-ThinkPad-X1 gala[19305]: Preferences.vala:378: Missing key 'FillEndColor' for group 'PlankTheme' in preferences file '/usr/share/plank/themes/Transparent/dock.theme' - using default value
I thought I was the only one with this kind of issue. Sometimes it crashes when I swipe to another workspace very fast, and I have to restart the computer.
Same here. It happens very often when switching between desktops. Then, out of the blue, the desktop background image turns black and slowly rebuilds in about one or two seconds. You have to wait until everything is back to normal, because if you switch to another workspace during the build-up, the whole desktop crashes and you're back at the greeter.
After logging in, nothing is there, not even plank, and cmd + T does not work either. So, the only way out is a hard reset.
Another odd thing is that, if you get past the build-up without a crash, all full and split screen windows are stuck at the top of the screen. They can only be moved horizontally and it seems like their top and bottom are about 10px outside the screen. A fix is to click the full-screen toggle to make them floating again.
This all started happening about a week ago. It's coming to a point where I am getting frustrated over it. Currently, I am pulling long hours and sometimes I have to reboot multiple times a day.
UPDATE:
So far, three complete crashes today, requiring a reboot every time.
Yeah, I also think it happens when switching between desktops, and I suspect it is related to the workspace-switching animation or the window animations, so I used the elementary tweaks tool to turn off animations. I'm not sure whether this really works, but there seem to be no more crashes.
Yesterday, I turned off automatic hiding of the dock, which seems to help. Let's see if turning off animations helps today. Thanks for the tip!
I recompiled gala v3.3.0 and left the binary and lib without stripping them. I got the following lines in /var/log/syslog:
Apr 14 10:27:11 y700 kernel: [ 5355.174293] gala[18428]: segfault at 188 ip 00007f9cb3f2ca00 sp 00007fffd8c0afa8 error 4 in libmutter-2.so.0.0.0[7f9cb3e83000+159000]
Apr 14 10:30:07 y700 kernel: [ 5531.532485] gala[26473]: segfault at 188 ip 00007ff789aa3a00 sp 00007ffee9225898 error 4 in libmutter-2.so.0.0.0[7ff7899fa000+159000]
Apr 14 11:34:24 y700 kernel: [ 9388.088523] gala[26607]: segfault at 188 ip 00007fa5d33d5a00 sp 00007ffdbdc3e348 error 4 in libmutter-2.so.0.0.0[7fa5d332c000+159000]
Apr 14 14:07:28 y700 kernel: [14988.896399] gala[10923]: segfault at 188 ip 00007fe004b73a00 sp 00007ffd23abba28 error 4 in libmutter-2.so.0.0.0[7fe004aca000+159000]
Hopefully, this would help.
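The kernel lines above carry enough information to locate the faulting instruction inside the library: subtracting the mapping base (the number in brackets before the `+`) from the instruction pointer gives an offset that stays the same between runs, even though the library is loaded at a different address each time. A minimal sketch in Python, assuming the log format shown above; `fault_offset` is a hypothetical helper name, not part of any existing tool:

```python
import re

# Matches the kernel's segfault report in the form seen in the logs above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) .* in "
    r"(?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library name, offset of the faulting ip inside its mapping)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    offset = int(match["ip"], 16) - int(match["base"], 16)
    return match["lib"], offset


# Two of the lines quoted above, copied verbatim.
LINES = [
    "Apr 14 10:27:11 y700 kernel: [ 5355.174293] gala[18428]: segfault at 188 "
    "ip 00007f9cb3f2ca00 sp 00007fffd8c0afa8 error 4 in "
    "libmutter-2.so.0.0.0[7f9cb3e83000+159000]",
    "Apr 14 10:30:07 y700 kernel: [ 5531.532485] gala[26473]: segfault at 188 "
    "ip 00007ff789aa3a00 sp 00007ffee9225898 error 4 in "
    "libmutter-2.so.0.0.0[7ff7899fa000+159000]",
]

for line in LINES:
    lib, offset = fault_offset(line)
    print(f"{lib} +{offset:#x}")  # both lines give +0xa9a00
```

All four crashes above resolve to the same offset, which suggests they hit the same instruction. With an unstripped build, an offset like this can usually be handed to a tool such as `addr2line` (with `-f -e` and the path to the library), assuming the mapping reported by the kernel starts at the beginning of the file.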
Same here. Yesterday I updated my elementary OS (after one month). It has been happening since then.
With animations disabled in elementary tweaks, I've been working the whole day without glitches or crashes. Automatic showing and hiding of plank is enabled again as well, so it's most likely the workspace transition animations that cause the crashes.
UPDATE:
My top menu bar just disappeared, and I no longer have access to the application menu. So, not everything is well without animations.
Same here, clean / fresh elementary install done yesterday:
[ 489.451637] Code: 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 50 c3 90 66 2e 0f 1f 84 00 00 00 00 00 8b 87 90 00 00 00 c3 66 0f 1f 84 00 00 00 00 00
Happens once in a while (once or twice each hour, this being a productive all-day laptop). Disabled animations via Tweaks just to see whether that fixes things.
Any response from the devs?
Hi,
same here.
Please find a crashdump below
Gala Version: 3.3.0~r987+pkg54~ubuntu5.1.2.1
libmutter-2-0 Version: 3.28.4-0ubuntu18.04.2+elementary11~ubuntu5.0.1
Why isn't this pinned to the issues shortlist? It's one of the most annoying issues in elementary OS at the moment, and it's probably happening on every computer running it. By the way, I disabled dock hiding and desktop animations via 'elementary tweaks' and everything has been running OK for two days straight.

I'm running the latest elementary OS Hera but still have never faced this issue. Do people who're facing this issue use NVIDIA GPUs?
@ryonakano No, I'm using an XPS 13 with the same iGPU as yours.

At first I was running the latest nvidia-440 proprietary driver. After several of the session drops described in this topic, I tried reinstalling the system without manually installing any drivers at all, so my driver is nvidia-435 or something like it by default. I had this issue both times. Maybe this issue only affects users who have an NVIDIA GPU, but that does not mean it should not get a fix.
By the way, this is my config

Not just NVIDIA - I'm on straight Intel here (Thinkpad Carbon X1 G7) and get the same thing: occasional segfault in libmutter on workspace switch.
I observed that for me the crashes happen more often when I'm in a Skype or Zoom call (I know, but that's what I have to use, so don't judge me, ok? :)). Both of these open small, borderless windows, which contain the basic controls, while you're in a call. Perhaps they trigger the problem.
It happens even without any of them running, but much more rarely, at least for me.
@itoshkov I've been trying many combinations to see if it's related to an application. I even worked a whole day with just a terminal and a code editor open, but it does not help. I've got the feeling those crashes occur more often the longer you're in a session. But then again, it's a feeling, not a hard fact.
No NVIDIA here; I'm using an Intel CPU and GPU.
Does elementary ship dbgsym packages anywhere? I can get a backtrace of the crash no problem, but without dbgsym packages (especially libmutter-2-0-dbgsym) it's kind of useless, and of course the Ubuntu dbgsym packages don't match up.
@robn I couldn't find any. What one could do is rebuild libmutter from source with debug info included. The first steps are as follows:
apt source libmutter-2-0 # this will download and unpack the source code in the current directory
sudo apt build-dep libmutter-2-0 # this will download and install all the build dependencies for libmutter
cd mutter-3.28.4
debuild -i -us -uc -b # build the packages. options are "magical". I found them as an example in the debuild man page, but couldn't find what they mean, except for -b which is for binary package
cd ..
dpkg --install libmutter-* # this is the scary operation. I just did it and will now try to see if this broke everything.
I'll report what happened.
That works as before. That is, with crashes and everything. Here are some more things from the log:
Apr 22 13:25:27 y700 gala.desktop[12553]: Window manager warning: Trying to re-add keybinding "switch-to-workspace-last".
Apr 22 13:25:27 y700 gala.desktop[12553]: Window manager warning: Trying to re-add keybinding "move-to-workspace-last".
Apr 22 13:25:27 y700 gala[12553]: invalid cast from 'CoglTexture2D' to '(null)'
Apr 22 13:25:28 y700 gala.desktop[12553]: Window manager warning: Invalid WM_TRANSIENT_FOR window 0x3e0002d specified for 0x3e00079 (ViberPC).
...
Apr 22 13:25:31 y700 kernel: [ 2018.880536] gala[12553]: segfault at 188 ip 00007fc9776339f0 sp 00007ffc6be01278 error 4 in libmutter-2.so.0.0.0[7fc97758a000+159000]
Apr 22 13:25:31 y700 kernel: [ 2018.880556] Code: 66 2e 0f 1f 84 00 00 00 00 00 48 8b 47 50 c3 90 66 2e 0f 1f 84 00 00 00 00 00 8b 87 90 00 00 00 c3 66 0f 1f 84 00 00 00 00 00 <f6> 87 88 01 00 00 08 75 07 48 8b 47 38 c3 66 90 48 8b 47 20 48 8b
Apr 22 13:25:31 y700 gnome-session[2263]: gnome-session-binary[2263]: WARNING: Application 'gala.desktop' killed by signal 11
Apr 22 13:25:31 y700 gnome-session-binary[2263]: WARNING: Application 'gala.desktop' killed by signal 11
Apr 22 13:25:31 y700 gnome-session[2263]: gnome-session-binary[2263]: WARNING: App 'gala.desktop' respawning too quickly
Apr 22 13:25:31 y700 gnome-session[2263]: gnome-session-binary[2263]: CRITICAL: We failed, but the fail whale is dead. Sorry....
Apr 22 13:25:31 y700 gnome-session-binary[2263]: Unrecoverable failure in required component gala.desktop
Apr 22 13:25:31 y700 gnome-session-binary[2263]: WARNING: App 'gala.desktop' respawning too quickly
Apr 22 13:25:31 y700 gnome-session-binary[2263]: CRITICAL: We failed, but the fail whale is dead. Sorry....
...
Apr 22 13:25:31 y700 gala-daemon[2689]: gala-daemon: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
The invalid cast from 'CoglTexture2D' to '(null)' error looked like the culprit until I saw it in other places in the log, but without causing the crash.
BTW I'm looking at /var/log/syslog. Please tell me, if this is not the right place to look.
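For what it's worth, the `Code:` line in the kernel log above can be read by hand. The byte in angle brackets marks the instruction that faulted, and the bytes `f6 87 88 01 00 00 08` encode a test of one byte at `[rdi + 0x188]`. Since `rdi` carries the first function argument in the x86-64 System V calling convention, a fault address of 0x188 is what one would expect if that argument were a null pointer. A rough Python sketch that decodes just this one instruction form (the helper names are made up for illustration, and this is not a general disassembler):

```python
import struct

# Tail of the "Code:" line from the Apr 22 log above, copied verbatim.
CODE = ("c3 66 0f 1f 84 00 00 00 00 00 <f6> 87 88 01 00 00 08 75 07 "
        "48 8b 47 38 c3 66 90 48 8b 47 20 48 8b")

# 64-bit base registers selected by the ModRM r/m field (no REX prefix).
REGISTERS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line):
    """Return the bytes starting at the <xx> marker the kernel prints."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_test_disp32(insn):
    """Decode only `F6 /0 ib` with a register base and 32-bit displacement."""
    opcode, modrm = insn[0], insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0xF6 or reg != 0 or mod != 2 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    displacement = struct.unpack_from("<i", insn, 2)[0]  # little-endian disp32
    immediate = insn[6]
    return REGISTERS[rm], displacement, immediate


base, disp, imm = decode_test_disp32(faulting_bytes(CODE))
print(f"test byte [{base}+{disp:#x}], {imm:#x}")  # test byte [rdi+0x188], 0x8
```

The displacement matches the `segfault at 188` reported by the kernel, which is consistent with the function reading a field at offset 0x188 of an object whose pointer was null.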
After running all day I finally caught a crash (why do these things never happen when you want them to):
#0 0x00007f9bdb07c9f0 in meta_window_get_workspace (window=0x0) at core/window.c:7294
#1 0x000055e3b7559c28 in ()
#2 0x00007f9bdcfa910d in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#3 0x00007f9bdcfbc05e in () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#4 0x00007f9bdcfc4715 in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#5 0x00007f9bdcfc512f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#6 0x00007f9bdad2f22d in clutter_timeline_do_frame (timeline=0x55e3ba4fd0f0) at clutter-timeline.c:1092
#7 0x00007f9bdad073f0 in master_clock_advance_timelines (master_clock=0x55e3b967f180) at clutter-master-clock-default.c:414
#8 0x00007f9bdad073f0 in clutter_clock_dispatch (source=<optimised out>, callback=<optimised out>, user_data=<optimised out>) at clutter-master-clock-default.c:564
#9 0x00007f9bdd239417 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#10 0x00007f9bdd239650 in () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007f9bdd239962 in g_main_loop_run () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007f9bdb06ba1c in meta_run () at core/main.c:664
#13 0x000055e3b7548630 in gala_main ()
#14 0x00007f9bd9973b97 in __libc_start_main (main=0x55e3b753f2a0 <main>, argc=1, argv=0x7ffc91c89488, init=<optimised out>, fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7ffc91c89478) at ../csu/libc-start.c:310
#15 0x000055e3b753f2da in _start ()
This is with libmutter debug symbols, but unfortunately not source, nor symbols for gala or glib (but surely those won't quite be related).
I'll have another run at it, trying to hook up source this time.
Ok, with source around:
#0 0x00007f26e24e79f0 in meta_window_get_workspace (window=window@entry=0x0) at core/window.c:7294
#1 0x000055ba121e2c28 in gala_window_manager_gala_end_switch_workspace (self=0x55ba13eec1c0) at ./src/WindowManager.vala:2122
#2 0x00007f26e441410d in g_closure_invoke () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#3 0x00007f26e442705e in () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#4 0x00007f26e442f715 in g_signal_emit_valist () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#5 0x00007f26e443012f in g_signal_emit () at /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#6 0x00007f26e219a22d in clutter_timeline_do_frame (timeline=0x55ba13c01970) at clutter-timeline.c:1092
#7 0x00007f26e21723f0 in master_clock_advance_timelines (master_clock=0x55ba13e61380) at clutter-master-clock-default.c:414
#8 0x00007f26e21723f0 in clutter_clock_dispatch (source=<optimised out>, callback=<optimised out>, user_data=<optimised out>) at clutter-master-clock-default.c:564
#9 0x00007f26e46a4417 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#10 0x00007f26e46a4650 in () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007f26e46a4962 in g_main_loop_run () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007f26e24d6a1c in meta_run () at core/main.c:664
#13 0x000055ba121d1630 in gala_main (args=<optimised out>, args_length1=<optimised out>) at ./src/Main.vala:59
#14 0x00007f26e0ddeb97 in __libc_start_main (main=0x55ba121c82a0 <main>, argc=1, argv=0x7ffdc01f0218, init=<optimised out>, fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7ffdc01f0208) at ../csu/libc-start.c:310
#15 0x000055ba121c82da in _start ()
Frame 1 is where the crash really comes from:
#1 0x000055ba121e2c28 in gala_window_manager_gala_end_switch_workspace (self=0x55ba13eec1c0) at ./src/WindowManager.vala:2122
2122 if (meta_window.get_workspace () != active_workspace
(gdb) l
2117 continue;
2118 }
2119
2120 kill_window_effects (window);
2121
2122 if (meta_window.get_workspace () != active_workspace
2123 && !meta_window.is_on_all_workspaces ())
2124 window.hide ();
2125
2126 // some static windows may have been faded out
meta_window.get_workspace() is the troublesome call, because meta_window is NULL:
(gdb) p meta_window
$1 = 0x0
(I think that the effective crash in meta_window_get_workspace in frame 0 is a detail of the Vala->C conversion. Not sure, but it makes sense to me).
Line 2109 is where meta_window is obtained:
unowned Meta.Window meta_window = window.get_meta_window ();
The fact that it doesn't check if this is valid, and there hasn't been any change in that part of the code for months, suggests to me that the data model assumes that all windows will have a meta window, so I'm guessing there's been a change elsewhere to break this assumption. But I can't guess where that is; I don't actually know much about Gala or Mutter/Clutter.
I'd be happy to keep helping with debugging, but this will need someone that actually knows how Gala is put together to find a fix.
@robn wow! Thanks for that. I don't know the inner workings of gala either, but looking at the way meta_window is used, I created a small patch:
diff --git a/src/WindowManager.vala b/src/WindowManager.vala
index bc5942e..0f3d8c8 100644
--- a/src/WindowManager.vala
+++ b/src/WindowManager.vala
@@ -2108,7 +2108,8 @@ namespace Gala {
unowned Meta.Window meta_window = window.get_meta_window ();
if (!window.is_destroyed ()) {
- if (meta_window.get_window_type () == Meta.WindowType.NOTIFICATION) {
+ if (meta_window != null
+ && meta_window.get_window_type () == Meta.WindowType.NOTIFICATION) {
reparent_notification_window (actor, parents.nth_data (i));
} else {
clutter_actor_reparent (actor, parents.nth_data (i));
@@ -2119,8 +2120,9 @@ namespace Gala {
kill_window_effects (window);
- if (meta_window.get_workspace () != active_workspace
- && !meta_window.is_on_all_workspaces ())
+ if (meta_window == null
+ || meta_window.get_workspace () != active_workspace
+ && !meta_window.is_on_all_workspaces ())
window.hide ();
// some static windows may have been faded out
I've rebuilt gala as described on the main page and I'm running this version at the moment. I'll report back how it behaves.
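The second hunk of the patch relies on operator precedence: `&&` binds tighter than `||`, so the new condition reads as "the meta window is missing, or (it is on another workspace and not on all workspaces)", and the null check short-circuits before any method is called on the missing object. Here is a small Python model of that condition, offered only as a sketch; `FakeMetaWindow` and `should_hide` are invented names standing in for the Vala code, and Python's `or`/`and` have the same relative precedence:

```python
class FakeMetaWindow:
    """Stand-in for Meta.Window with only the two calls the condition uses."""

    def __init__(self, workspace, on_all_workspaces=False):
        self._workspace = workspace
        self._on_all_workspaces = on_all_workspaces

    def get_workspace(self):
        return self._workspace

    def is_on_all_workspaces(self):
        return self._on_all_workspaces


def should_hide(meta_window, active_workspace):
    # Same shape as the patched condition. `and` binds tighter than `or`,
    # so the None check is evaluated first and short-circuits the rest.
    return (meta_window is None
            or meta_window.get_workspace() != active_workspace
            and not meta_window.is_on_all_workspaces())


ACTIVE = 1
cases = {
    "missing meta window": None,
    "window on the active workspace": FakeMetaWindow(ACTIVE),
    "window on another workspace": FakeMetaWindow(0),
    "sticky window on another workspace": FakeMetaWindow(0, True),
}
for label, window in cases.items():
    print(f"{label}: hide={should_hide(window, ACTIVE)}")
```

In this model a window without a meta window is hidden, the same as a window that belongs to another workspace, while windows on the active workspace or on all workspaces stay visible.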
[EDIT] If you want to try that too, you can copy and paste the patch in a file, e.g. /tmp/gala-766.patch and then issue:
git apply /tmp/gala-766.patch
from the gala project directory.
It's a bit early to say for sure, but so far it looks like this fixes the problem. I tried making a call with viber, which before always created a problem. It worked without any problems, even though I was changing desktops, using the multi-task view like crazy.
I also modified the patch a bit:
````diff
diff --git a/src/WindowManager.vala b/src/WindowManager.vala
index bc5942e..525f557 100644
--- a/src/WindowManager.vala
+++ b/src/WindowManager.vala
@@ -2107,8 +2107,13 @@ namespace Gala {
}
unowned Meta.Window meta_window = window.get_meta_window ();
kill_window_effects (window);
````
There are two differences from the previous version:
- It logs a warning if meta_window is null. Perhaps I should also include the window ID in the message, but I'm not sure how to get it.
- [...] meta_window is null. What I observed before (when things were working normally) was that the small windows Skype or Zoom open while you're in a call tend to stay on one workspace. I think they are meant to follow the active workspace instead. I'm not sure if that would work as expected, but we'll see.
Strangely, so far I haven't seen the warning I added show up in the log.
I might have found out what triggers it:
My semi-educated guess is that if the window is open at the start of the transition but closes during it, as might happen if its app window loses focus, then it's no longer available to be repositioned once the transition completes and the new workspace is laid out. Without this case being checked for, we explode.
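The ordering described here can be mimicked with a toy model: a window actor that loses its meta window when it closes, and an end-of-transition handler that walks all actors. This is only a sketch of the hypothesis in Python; the class, the `close` method and the `guard` flag are invented for illustration and say nothing about how Gala or Mutter are really structured:

```python
from types import SimpleNamespace


class WindowActor:
    """Toy actor: closing it drops the meta window it used to wrap."""

    def __init__(self, name, workspace):
        self.name = name
        self._meta = SimpleNamespace(workspace=workspace)

    def get_meta_window(self):
        return self._meta

    def close(self):
        self._meta = None


def end_switch(actors, active_workspace, guard):
    """Return the names of actors to hide once the transition has ended."""
    hidden = []
    for actor in actors:
        meta = actor.get_meta_window()
        if guard and meta is None:
            continue  # window went away mid-transition; nothing to lay out
        if meta.workspace != active_workspace:  # fails when meta is None
            hidden.append(actor.name)
    return hidden


editor = WindowActor("editor", workspace=0)
popup = WindowActor("popup", workspace=0)
actors = [editor, popup]  # snapshot taken when the transition starts

popup.close()  # the popup closes while the animation is still running

print(end_switch(actors, active_workspace=1, guard=True))
try:
    end_switch(actors, active_workspace=1, guard=False)
except AttributeError as error:
    print("unguarded handler failed:", error)
```

Python raises an exception where the C code dereferences a null pointer, but the sequence is the same: the list of actors is captured at the start, one of them loses its meta window in between, and the handler at the end only survives if it checks for that.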
Great find, @robn. And the patch handles it.
The get_meta_window () method is used in several other places, and the result is not checked for null there either. I read a bit more, and it turns out the meta window is an object which wraps the actual X window. In most cases it really can't be null. My guess is that mutter changed the way it handles pop-up menus and similar windows when switching workspaces, which created this problem.
@itoshkov congrats!