Kitty: Occasional freezes while switching virtual desktops

Created on 8 May 2020 · 11 Comments · Source: kovidgoyal/kitty

Symptoms are very similar to #1681 and #2016, but this seems to have a different cause. It looks like many of the comments at #1681 are referring to this bug...

Occasionally, when switching desktops in BSPWM, kitty becomes completely unresponsive, and doesn't respond to any input but SIGKILL and SIGTERM.

[Screenshot: 2020-05-07-234921]

The frozen window turns into whatever the switched-from desktop had open (although this behavior seems to be just a general frozen-application-on-x symptom).

I hate to open this because I seemingly can't reproduce it reliably. It just happens often (every 1-3 days) to individual windows, and only when switching desktops.

[Screenshot: 2020-05-07-233151]

System details:
OS: Manjaro Linux 20.0
WM: bspwm 0.9.9
TERM: kitty 0.17.3
DISPLAY: X
DRIVER: Dell UHD Graphics?

Assuming I can't really get a debug log out of the frozen window, is it possible to have kitty passively log to a file until a crash, in the hope of being able to reproduce it?

All 11 comments

This will almost certainly be a video driver/window manager issue. You can build kitty from source with make debug-event-loop, which will give you a lot of output about events being processed by kitty and will allow you to pinpoint exactly where it is freezing.
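For anyone who wants to capture that output passively until the next freeze, here is a minimal sketch of a wrapper that runs a command and appends everything it prints to a log file, one line at a time. It is not part of kitty; the function name `run_with_log` and the log file name are made up for illustration, and the path to the debug build in the usage note is a placeholder that depends on where you built it.

```python
import subprocess
import time


def run_with_log(command, log_path):
    """Run `command`, appending each line of its output to `log_path`.

    The log file is line-buffered, so every complete line is handed to
    the OS as soon as it arrives and survives a later freeze or kill.
    """
    with open(log_path, "a", buffering=1) as log:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            text=True,
            errors="replace",  # tolerate non-UTF-8 bytes in the output
        )
        for line in proc.stdout:
            # Prefix with wall-clock time so entries can be matched to dmesg.
            log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {line}")
        return proc.wait()
```

Usage would be something like `run_with_log(["/path/to/debug/kitty"], "kitty-debug.log")`, started from a terminal other than the one under test. The call blocks until the wrapped process exits, and the log keeps the last lines written before the hang.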

I think I am also seeing this happen occasionally.

OS: Arch Linux
WM: i3
TERM: kitty 0.17.3
DISPLAY: X
DRIVER: amdgpu

Thanks @kovidgoyal , I got that set up and will post the debug log once it crashes again.

@kovidgoyal As requested: https://gist.github.com/j-james/eaf9b799c1dd4aa4d3bc9a177e7f446d.
I see some weird stuff starting at about line 1510.

Both the debug terminal and the one it was running from crashed.

That indicates process_global_state is not returning, either inside render or after it. Look in child-monitor.c line 936 onwards to track down exactly where it is hanging. My guess would be in the render_os_window function at swap_window_buffers line 608.
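One way to see which call the frozen process is sitting in is to take a backtrace of all threads with gdb while it is hung. The sketch below only wraps standard gdb options (`-p`, `-batch`, `-ex`); the helper names are made up for illustration, and it assumes gdb is installed and that you are allowed to attach to the process (ptrace restrictions may require root).

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb invocation that attaches, prints all stacks, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtrace(pid, timeout=30):
    """Return gdb's backtrace output for the process with the given pid."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

If the guess above is right, the main thread's stack should show the buffer swap with driver frames below it. Frames from kitty itself are only readable when the build includes debug symbols.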

I am having this happen to me as well on a similar system.

System details:
OS: Archlinux, kernel 5.6.11-arch1-1
WM: i3wm 4.18.1
TERM: kitty 0.17.4
DISPLAY: X
DRIVER: xf86-video-intel
from dmesg:

[    8.455395] [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[    8.521342] i915 0000:00:02.0: GuC firmware i915/kbl_guc_33.0.0.bin version 33.0 submission:disabled
[    8.521345] i915 0000:00:02.0: HuC firmware i915/kbl_huc_4.0.0.bin version 4.0 authenticated:yes

I built kitty from source with debug-event-loop and the results are not that interesting. Here's the end of the log before it froze:

[1239.3567] display_read_ok: 0
[1239.3572] other dispatch done
[1239.3573] --------- loop tick, wakeups_happened: 0 ----------
XIO:  fatal IO error 62 (Timer expired) on X server ":0"
      after 2021 requests (2021 known processed) with 0 events remaining.
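To compare logs like this across several freezes, a small parser can pull out the last stamped event and the largest pause between consecutive events. This is only a sketch based on the line format shown above; it assumes the bracketed number is a monotonically increasing timestamp, and the function names are made up for illustration.

```python
import re

# Matches lines such as "[1239.3567] display_read_ok: 0".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_events(lines):
    """Return (timestamp, message) pairs for every stamped line."""
    events = []
    for line in lines:
        match = STAMP.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def summarize(lines):
    """Return the last stamped event and the largest gap between events."""
    events = parse_events(lines)
    if not events:
        return None
    gaps = [(b[0] - a[0], a, b) for a, b in zip(events, events[1:])]
    largest = max(gaps, key=lambda gap: gap[0]) if gaps else None
    return {"last": events[-1], "largest_gap": largest}
```

Unstamped lines such as the XIO error are skipped, so `summarize(open("kitty-debug.log"))["last"]` gives the final event kitty logged before the X connection broke.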

I thought at first that it was a sleep or idle issue so I disabled automatic idle, but today I had kitty freeze while using another application for a while. I cannot consistently reproduce the bug, but I am trying to think of how to do so.

FWIW I have also seen a similar thing happen with electron applications (like signal desktop) and am not convinced it is 100% a kitty bug vs an i3 or X itself bug, but kitty is the only application I use that freezes regularly like this.

I'm motivated and able to help debug further, so let me know what I can do to help!

What to do next is described in the post just before yours.

Sorry, I somehow missed that!

I'm rusty with debugging C but no time like the present to get back into it.

Oh and since these all seem to happen on Intel GPUs, see https://gitlab.freedesktop.org/mesa/mesa/-/issues/2960

I'm fairly convinced that this isn't a kitty bug. While I have had a miserable time trying to reproduce the issue, I have had it happen to me a few times while attached to kitty with gdb or strace and I have noticed nothing out of the ordinary. The most I could find from one trace is this:

[pid 398187] poll([{fd=3, events=POLLIN}], 1, -1) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
[pid 398187] ioctl(8, DRM_IOCTL_I915_GEM_BUSY, 0x7fff662830d0) = 0
[pid 398187] getpid()                   = 398187
[pid 398187] getpid()                   = 398187
[pid 398187] ioctl(8, DRM_IOCTL_I915_GEM_EXECBUFFER2, 0x7fff66283270) = 0
[pid 398187] ioctl(8, DRM_IOCTL_I915_GEM_MADVISE, 0x7fff66283194) = 0
[pid 398187] munmap(0x7fc8e57a8000, 81920) = 0
[pid 398187] ioctl(8, DRM_IOCTL_I915_GEM_BUSY, 0x7fff66283130) = 0
[pid 398187] ioctl(8, DRM_IOCTL_GEM_CLOSE, 0x7fff66283140) = 0
[pid 398187] ioctl(8, DRM_IOCTL_I915_GEM_BUSY, 0x7fff66283030) = 0
[pid 398187] ioctl(8, DRM_IOCTL_I915_GEM_MADVISE, 0x7fff6628309c) = 0
[pid 398187] ioctl(8, DRM_IOCTL_SYNCOBJ_CREATE, 0x7fff66283200) = 0
[pid 398187] ioctl(8, DRM_IOCTL_SYNCOBJ_WAIT, 0x7fff662832c0) = 0
[pid 398187] ioctl(8, DRM_IOCTL_SYNCOBJ_WAIT, 0x7fff662832c0) = 0
[pid 398187] ioctl(8, DRM_IOCTL_SYNCOBJ_WAIT, 0x7fff66283390) = 0
[pid 398187] ioctl(8, DRM_IOCTL_SYNCOBJ_DESTROY, 0x7fff662833a0) = 0
[pid 398187] ioctl(3, FIONREAD, [0])    = 0
[pid 398187] write(2, "X connection to :0 broken (expli"..., 63) = 63

... though strangely enough I don't see that in two other traces I have, from kitty with debug-event-loop.
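For comparing several traces, it can help to reduce each one to a count of ioctl requests and then look at which requests appear in only one of them. The sketch below assumes strace lines in the format shown above; the function names are made up for illustration.

```python
import re
from collections import Counter

# Captures the request name, e.g. DRM_IOCTL_SYNCOBJ_WAIT, from an ioctl line.
IOCTL = re.compile(r"\bioctl\(\d+,\s*([A-Z0-9_]+)")


def count_ioctls(lines):
    """Count how often each ioctl request name occurs in a trace."""
    counts = Counter()
    for line in lines:
        match = IOCTL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def only_in_first(first, second):
    """Return request names present in `first` but absent from `second`."""
    return sorted(set(first) - set(second))
```

Running `count_ioctls` over each saved trace and passing two of the results to `only_in_first` shows quickly whether a sequence like the SYNCOBJ calls above is specific to one freeze.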

I can share the traces but I am not convinced they will point to anything that kitty is doing wrong. Will approach it from the video driver side next. Thanks for the link, that issue looks promisingly similar at a quick glance.

Yeah, I am very confident this is a driver bug, but it may be that kitty can do something to work around it. In any case, I am closing this issue since it's not really a kitty bug, but feel free to keep posting updates, and if you identify anything kitty can do differently, I will be most interested.
