Sway: Crash in handle_keyboard_repeat

Created on 20 Feb 2020  路  19Comments  路  Source: swaywm/sway

Most helpful comment

Today I had some time to look into this, and think I've mostly tracked it down. Short version: to work around this issue until it's fixed, add seat * keyboard_grouping none to Sway's config.

Long version:

On reload, Sway iterates over all devices, and for every seat, resets the device. Resetting a keyboard device calls sway_keyboard_disarm_key_repeat, which works as expected, and does clear the repeat binding.

I added a few pointer prints into Sway which I think show the issue.

The very first "keyboard" to come online is my "Power Button" device.

00:00:00.645 [sway/input/keyboard.c:588] created keyboard 0x61a000048c80                                                                                 
00:00:00.645 [sway/input/keyboard.c:841] Adding keyboard 0:1:Power_Button to group 0x614000070040 

It's added to a new keyboard group 0x614000070040. A while later, my actual keyboard is added:

00:00:00.778 [sway/input/keyboard.c:588] created keyboard 0x61a00006de80                                                                                 
00:00:00.786 [sway/input/keyboard.c:793] Adding keyboard 10730:258:Kinesis_Advantage2_Keyboard to group 0x614000070040

Oddly, it's added to the same keyboard group as the power button, under the default keyboard_grouping smart policy. (Minor note here: the definitions of keymaps_match in sway/input/keyboard.c and wlroots/types/wlr_keyboard_group.c do not match; the latter has a bugfix in https://github.com/swaywm/wlroots/commit/b1a63bcd84adad5bffa8ba73dbf64e05c8ce9bc9 that the former does not. This isn't the issue here at any rate, since adding it does not resolve the crashes.)

Another interesting observation is that whenever I press any key on my keyboard, it is handled twice. For instance, pressing "a":

00:00:01.427 [sway/input/keyboard.c:499] keyboard GROUP event                                                                                            
00:00:01.427 [sway/input/keyboard.c:341] keyboard device 0x61a000048c80 sent event, key=30, down=1                      
00:00:01.427 [sway/input/keyboard.c:491] keyboard event                                                                 
00:00:01.427 [sway/input/keyboard.c:341] keyboard device 0x61a00006de80 sent event, key=30, down=1

It seems the event first gets processed through the group, whose representative keyboard is 0x61a000048c80, the power button, before coming in as an independent keyboard event to 0x61a00006de80, the actual keyboard.

This means that when the key combo for reload is executed, Sway thinks it came from the power button, since that's the first event. This ought to be fine, except... printing out which keyboard devices get reset during reload does _not_ print the power button keyboard — so the repeat timer isn't disarmed. Instead, when Sway reloads, the repeat event on the power button fires, but the binding for it has already been freed. Sway then crashes due to the use-after-free.

I see four issues here with my limited understanding, though some may not actually be issues:

  • keymaps_match between Sway and wlroots do not match
  • events are sent twice, once through the keyboard group, and once to the actual keyboard
  • smart keyboard grouping thinks a power button and a keyboard should be grouped
  • power button keyboards do not get reset on reload, probably because they're not actually added to the seat (?)

(/cc @RedSoxFan based on git blame :)

All 19 comments

Please type bt full in the gdb command line interpreter. The backtrace here is not good enough.

Updated in 1st post. Thanks

Looks like I'm getting a similar crash. I can only reproduce this when launching through a display manager or the following line in ~/.zprofile (my preferred way of starting sway):

[[ -z $WAYLAND_DISPLAY && $XDG_VTNR -eq 1 ]] && exec sway -c ~/dotfiles/i3/config &> /dev/null

It looks like the problem is related to for_window rules, possibly those related to xwayland? This is what I have:

for_window [class="Pinentry"] floating enable
for_window [instance="mpdclient"] floating enable

Commenting these two lines out works around the problem for me.
Oddly enough OP doesn't have those at all in their config, but the backtrace is the same.
Edit: nevermind, it's not these.

https://gist.github.com/9ary/b5993741e4acdf9cd70817c047a5e31e
Edit 2: added dump with unstripped symbols. Points to this line: https://github.com/swaywm/sway/blob/f681d529397bd6b5ab3850008857d423415999c1/sway/commands.c#L214
Probably something with a broken pointer, will investigate shortly.

I've played around with this as I also got crashes with the same backtrace, looks like it had to do with repeated keyboard events. Maybe I held the reload hotkey for too long. Once I added the delay suggested on the wiki I could not reproduce this anymore.

Indeed, bindsym $mod+Shift+c exec "sleep 1; swaymsg reload" works fine for me as well. I guess the problem is that multiple reload commands are being dispatched, and then it tries to reload before it's done reloading, causing the crash.

The backtrace you provided doesn't contain debug symbols. This most likely happens because the Sway binary you're using doesn't have debug information bundled.

Can you try again with a manually compiled Sway binary? See https://github.com/swaywm/sway/wiki/Development-Setup#compiling-as-a-subproject

I'm also running into this issue (happens spuriously alongside #5081). I have a debug build; here is the gdb output.

I've already posted backtraces with extensive debugging information as well, and I've just noticed they all have handle_keyboard_repeat in the stack.

Oops, must've missed that. The line my trace points to is a vprintf; I noticed the code was recently changed, so I re-ran on latest master: https://gist.github.com/Xyene/163389a13b3e30692fd993e00752eef6

The error happens here: https://github.com/swaywm/sway/blob/master/common/log.c#L89

Yeah and it'll point to a different location if you reproduce without debug logging enabled, but it's still being called while handling keyboard repeat.

Cases like these are most easily solved using the address sanitizer, can you rebuild and try to reproduce with meson build -Db_sanitize=address?

I'll give it a try tonight.

Well I just gave it a shot but it looks like I can't reproduce the crash with asan enabled...

I was able to reproduce with asan on: https://gist.github.com/Xyene/1e5602cf52e8ac7af1598fe81d154bdb

==27187==ERROR: AddressSanitizer: heap-use-after-free on address 0x60600015d7f0 at pc 0x56527d0a014e bp 0x7ffe3c249810 sp 0x7ffe3c249808
READ of size 8 at 0x60600015d7f0 thread T0
    #0 0x56527d0a014d in seat_execute_command ../sway/commands/bind.c:610
    #1 0x56527d07078b in handle_keyboard_repeat ../sway/input/keyboard.c:485
    #2 0x7febb9e9e63c in wl_event_loop_dispatch (/lib/x86_64-linux-gnu/libwayland-server.so.0+0xb63c)
    #3 0x7febb9e9c9e4 in wl_display_run (/lib/x86_64-linux-gnu/libwayland-server.so.0+0x99e4)
    #4 0x56527d03da30 in server_run ../sway/server.c:209
    #5 0x56527d03c163 in main ../sway/main.c:403
    #6 0x7febb9a83bba in __libc_start_main ../csu/libc-start.c:308
    #7 0x56527d01c3c9 in _start (/usr/local/bin/sway+0x3d3c9)

0x60600015d7f0 is located 48 bytes inside of 56-byte region [0x60600015d7c0,0x60600015d7f8)
freed by thread T0 here:
    #0 0x7febba55ffb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
    #1 0x56527d09c10b in free_sway_binding ../sway/commands/bind.c:30
    #2 0x56527d020055 in free_mode ../sway/config.c:63
    #3 0x56527d02072e in free_config ../sway/config.c:104
    #4 0x56527d024409 in load_main_config ../sway/config.c:547
    #5 0x56527d0b5e86 in do_reload ../sway/commands/reload.c:25
    #6 0x7febb9e9e31a in wl_event_loop_dispatch_idle (/lib/x86_64-linux-gnu/libwayland-server.so.0+0xb31a)

previously allocated by thread T0 here:
    #0 0x7febba560518 in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0xe9518)
    #1 0x56527d09dffd in cmd_bindsym_or_bindcode ../sway/commands/bind.c:338
    #2 0x56527d09fe1e in cmd_bindsym ../sway/commands/bind.c:570
    #3 0x56527d01ebc6 in config_command ../sway/commands.c:413
    #4 0x56527d0264a4 in read_config ../sway/config.c:823
    #5 0x56527d023a3d in load_config ../sway/config.c:400
    #6 0x56527d023fc2 in load_main_config ../sway/config.c:510
    #7 0x56527d0b5e86 in do_reload ../sway/commands/reload.c:25
    #8 0x7febb9e9e31a in wl_event_loop_dispatch_idle (/lib/x86_64-linux-gnu/libwayland-server.so.0+0xb31a)

I also ran into what looks like another issue while at it, when I unplugged my tablet device: https://gist.github.com/Xyene/0b13b4c7a64fe406f7b45900b9d31745

(I'm happy to file this separately if it looks like it should be separate.)

Can you report the tablet crash on the wlroots issue tracker?

Edit: I know nothing about Wayland or Sway, but it seems like the keyboard repeat timer event should be disarmed and removed from the event loop when the configuration file is reloaded. The timer event is created in sway_keyboard_create(), the configuration is reloaded in do_reload() which frees all of the bindings (but doesn't update the keyboard object which maintains its own pointer to the binding), and later the timer event triggers, accessing the old freed binding object. Instead sway should probably disarm/remove the key_repeat_source timer event when the configuration file is reloaded.


https://paste.debian.net/hidden/ad119c96/
https://paste.debian.net/hidden/27ca7aa0/

I've attached an additional stack trace with sway 1.4 if it's helpful.

You can see that the binding command string is corrupt on the following line:

buf = "[sway/commands/bind.c:610] running command for binding: \240x\000;\275\177\000\000\300u\000;\275\177\000\000\": { \"top\": 0, \"right\": [sway/desktop/transaction.c:376] Transaction 0x5613d198ac50 timed out (2 waiting)adding\": 3, \"wrap_scroll"...

I can reproduce this very easily (sway crashes almost every time I reload the config) so let me know if there's any way I can help debug this. I'm not currently able to rebuild any packages, but I can get more details from the sway I have installed.

Today I had some time to look into this, and think I've mostly tracked it down. Short version: to work around this issue until it's fixed, add seat * keyboard_grouping none to Sway's config.

Long version:

On reload, Sway iterates over all devices, and for every seat, resets the device. Resetting a keyboard device calls sway_keyboard_disarm_key_repeat, which works as expected, and does clear the repeat binding.

I added a few pointer prints into Sway which I think show the issue.

The very first "keyboard" to come online is my "Power Button" device.

00:00:00.645 [sway/input/keyboard.c:588] created keyboard 0x61a000048c80                                                                                 
00:00:00.645 [sway/input/keyboard.c:841] Adding keyboard 0:1:Power_Button to group 0x614000070040 

It's added to a new keyboard group 0x614000070040. A while later, my actual keyboard is added:

00:00:00.778 [sway/input/keyboard.c:588] created keyboard 0x61a00006de80                                                                                 
00:00:00.786 [sway/input/keyboard.c:793] Adding keyboard 10730:258:Kinesis_Advantage2_Keyboard to group 0x614000070040

Oddly, it's added to the same keyboard group as the power button, under the default keyboard_grouping smart policy. (Minor note here: the definitions of keymaps_match in sway/input/keyboard.c and wlroots/types/wlr_keyboard_group.c do not match; the latter has a bugfix in https://github.com/swaywm/wlroots/commit/b1a63bcd84adad5bffa8ba73dbf64e05c8ce9bc9 that the former does not. This isn't the issue here at any rate, since adding it does not resolve the crashes.)

Another interesting observation is that whenever I press any key on my keyboard, it is handled twice. For instance, pressing "a":

00:00:01.427 [sway/input/keyboard.c:499] keyboard GROUP event                                                                                            
00:00:01.427 [sway/input/keyboard.c:341] keyboard device 0x61a000048c80 sent event, key=30, down=1                      
00:00:01.427 [sway/input/keyboard.c:491] keyboard event                                                                 
00:00:01.427 [sway/input/keyboard.c:341] keyboard device 0x61a00006de80 sent event, key=30, down=1

It seems the event first gets processed through the group, whose representative keyboard is 0x61a000048c80, the power button, before coming in as an independent keyboard event to 0x61a00006de80, the actual keyboard.

This means that when the key combo for reload is executed, Sway thinks it came from the power button, since that's the first event. This ought to be fine, except... printing out which keyboard devices get reset during reload does _not_ print the power button keyboard — so the repeat timer isn't disarmed. Instead, when Sway reloads, the repeat event on the power button fires, but the binding for it has already been freed. Sway then crashes due to the use-after-free.

I see four issues here with my limited understanding, though some may not actually be issues:

  • keymaps_match between Sway and wlroots do not match
  • events are sent twice, once through the keyboard group, and once to the actual keyboard
  • smart keyboard grouping thinks a power button and a keyboard should be grouped
  • power button keyboards do not get reset on reload, probably because they're not actually added to the seat (?)

(/cc @RedSoxFan based on git blame :)

@Xyene Thanks for digging into this.

To touch on some of your points:

It seems the event first gets processed through the group, whose representative keyboard is 0x61a000048c80, the power button

The keyboard group actually has it's own keyboard. The log lines that you posted are for the creation of the keyboard group's sway_keyboard, not the power button's. There should be a separate line for the power button.

keymaps_match between Sway and wlroots do not match

This should be fine since none of the calls in Sway should be passing in NULL. However, it wouldn't hurt to add the extra checks.

events are sent twice, once through the keyboard group, and once to the actual keyboard

This is expected and allows for the --input-device=<device> option for bindcode/bindsym. For any keyboard that is in a group, only bindings with a matching --input-device are handled. The rest is handled by the keyboard group event. However, there are some issues that I plan to look into soon (probably this weekend) - #4975 and #4955.

smart keyboard grouping thinks a power button and a keyboard should be grouped

Any input device that is a keyboard that is in the same seat, has the same keymap set, and the same repeat delay and repeat rate set will be grouped together. A power button does emit the event code KEY_POWER (116), which is a keyboard event code. Restricting this to "real keyboards" is non-trivial and is unlikely to be worth the effort.

power button keyboards do not get reset on reload, probably because they're not actually added to the seat (?)

I'm pretty sure that every input device should be in a seat. If I remember correct, it is actually required that there be a fallback seat and if one doesn't exist, one will be created. By default seat0 is the implicit fallback seat. You can use swaymsg -t get_seats to see which devices are in each seat.

I believe this is also related to the confusion above about the keyboard group's keyboard. All keyboard group are destroyed and recreated on reload so there will not be a reset for it. This should be calling sway_keyboard_destroy (which in turn calls sway_keyboard_disarm_key_repeat) during the destroy.

I'll do some digging and see what I can find.

All keyboard group are destroyed and recreated on reload

So it turns out this is actually incorrect. The keyboard group associated with the default keymap, default repeat rate, and default repeat delay is not destroyed on reload when there is at least one keyboard that was in the group prior to the reload. All other keyboard groups are destroyed since the inputs gets reset. The use-after-free will only occur if the keyboard that triggered the reload is in that default keyboard group. Submitted #5317.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

J0nnyMak0 picture J0nnyMak0  路  3Comments

ddevault picture ddevault  路  4Comments

RyanDwyer picture RyanDwyer  路  3Comments

Alphare picture Alphare  路  3Comments

ddevault picture ddevault  路  3Comments