Sway: Sway does not handle subsurfaces that extend beyond the parent's bounding box very well (common in gtk applications)

Created on 18 Jan 2020  路  27Comments  路  Source: swaywm/sway

  • Sway Version: 1.2

  • Debug Log:

Relevant portion:

2020-01-18 02:15:28 - [sway/desktop/transaction.c:489] Transaction 0x558676e2c7f0 is ready
2020-01-18 02:15:28 - [sway/desktop/transaction.c:280] Applying transaction 0x558676e2c7f0
2020-01-18 02:15:32 - [types/seat/wlr_seat_pointer.c:362] button_count=1 grab_serial=0 serial=132
2020-01-18 02:15:32 - [sway/desktop/transaction.c:411] Transaction 0x558676e2c7f0 committing with 1 instructions
2020-01-18 02:15:32 - [sway/desktop/transaction.c:280] Applying transaction 0x558676e2c7f0
2020-01-18 02:15:32 - [types/seat/wlr_seat_pointer.c:362] button_count=0 grab_serial=132 serial=133
2020-01-18 02:15:38 - [sway/input/cursor.c:576] denying request to set cursor from unfocused client
2020-01-18 02:15:46 - [sway/ipc-server.c:334] Sending window::focus event
  • Configuration File: Happens with default configuration

Steps to reproduce:

Run the popover example code found here: https://python-gtk-3-tutorial.readthedocs.io/en/latest/popover.html#menu-popover (reproduced here):

It is more obvious if you change the border width on line 46 (for the app window) to something smaller, like 10.

Once the window opens switch to floating mode (Mod+Shift+space in default config). Then click on the button to make the popover appear. Although the popover displays outside of the parent window, it doesn't seem to receive mouse events, and if focus_follows_mouse is enabled, as soon as you move over the popover, the parent window (and the popover itself) loses focus. Clicking also seems to send the click event to the window underneath the popover.

Update

After more investigation I've determined this is because GTK uses subsurfaces, rather than popup surfaces or toplevel surfaces for menus and popovers. And sway seems to have some assumptions that subsurfaces fit inside the bounding box of the parent top-level surface. This results in the following undesirable behavior:

  • If a subsurface extends beyond the bounding box it does not receive cursor events
  • If a subsurface extends beyond the bounding box of a floating window, the border is drawn over the subsurface
  • If a subsurface extends beyond the bounding box of a tiling window, it is cut off by the bounding box.
bug

Most helpful comment

Good news: I found the problem. Bad news: I'm not sure what the best way to fix it is (mostly because of my unfamiliarity with sway's codebase).

So the issue is in the container_at function, and the functions it calls. So, surface_is_popup returns false for these subsurfaces. At first I thought that was wrong, but then I realized, that might not actually be what we want since I don't know if these popovers should have higher priority than floating windows above them. So, the problem is actually in how it checks the location for normal containers.

I think floating_container_at is definitely broken. It should probably call surface_at_view instead of just relying on the bounding box of the floating container itself. But what if the floating container isn't a view, and has a subsurface that extends beyond the floating container's bounding box?

And then I'm not sure about tiling windows. If a tiling window has a subsurface that extend beyond it's bounding box, should that receive events events only if the parent view is focused?

I'd be happy to implement a fix, but would like a little direction from a maintainer on what my approach should be.

I haven't figured out why the border is drawn on top of the subsurface yet, maybe that should be a different issue?

All 27 comments

Can you try master?

Yep. Still happens on master.

Oh, I also forgot to mention, if the popover overlaps with the parent window, then the window border of the parent window is drawn on top of the popover.

Is there other information that would be helpful? I've tried looking at the gtk/gdk source code but haven't been able to figure out how the overlay maps to the wayland protocols yet, or how it is different than a GtkMenu (which doesn't have this problem).

It does appear to be sway or wlroots related. The popup is not receiving any wl_pointer (or any) wayland messages with WAYLAND_DEBUG=1.

It also seems to work as you would expect in weston.

It's not just popovers. I think it is any "popup" type window. I can reproduce the issue with the following program, which just creates a new window with the "popup" type and transient for the main window:

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gio, Gtk

class AppWindow(Gtk.ApplicationWindow):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        outerbox = Gtk.Box(spacing=6, orientation=Gtk.Orientation.VERTICAL)
        self.add(outerbox)
        outerbox.show()

        button = Gtk.Button(label="Click me")
        button.connect('clicked', open_window)
        button.show()

        outerbox.pack_end(button, False, True, 0)

        self.set_border_width(10)

def open_window(attachto):
    button = Gtk.Button(label="Click me 2")
    button.set_size_request(300, 30)
    button.connect('clicked', lambda x: print("Clicked 2"))
    win = Gtk.Window.new(Gtk.WindowType.POPUP)
    win.attached_to = attachto
    win.set_transient_for(attachto.get_toplevel())
    win.add(button)
    win.show_all()


if __name__ == "__main__":
    mainwindow = AppWindow(title="Main Window")
    mainwindow.connect('destroy', Gtk.main_quit)
    mainwindow.show_all()

    Gtk.main()

It appears to be for all transient windows that extend beyond the parent window.

After some debugging, I've discovered that in wlr_xdg_surface_surface_at, the surface->popups list is empty. Which seems strange to me. Even after looking through the gtk and gdk source code I can't figure out what that would be. At first I though maybe it was creating a subsurface instead of a popup, but I can't find any subsurfaces that don't have a width or height other than 0 for the test application. Setting breakpoints on the xdg_surface_handle_get_popup functions also doesn't ever break.

Ok, so I think it is creating a subsurface, not a popup surface. Here's where I can I think it is creating the new window in the output with WAYLAND_DEBUG=1 (on the application):

[644382.633]  -> [email protected]_surface(new id wl_surface@32)
[644382.771]  -> [email protected]_subsurface(new id wl_subsurface@33, wl_surface@32, wl_surface@24)
[644382.800]  -> [email protected]_position(0, 0)
[644383.073]  -> [email protected]_pool(new id wl_shm_pool@38, fd 16, 40800)
[644383.101]  -> [email protected]_buffer(new id wl_buffer@39, 0, 300, 34, 1200, 0)
[644384.005]  -> [email protected](wl_buffer@39, 0, 0)
[644384.042]  -> [email protected]_buffer_scale(1)
[644384.050]  -> [email protected](0, 0, 300, 34)
[644384.066]  -> [email protected]_region(new id wl_region@40)
[644384.074]  -> [email protected](0, 0, 300, 34)
[644384.091]  -> [email protected]_opaque_region(wl_region@40)
[644384.098]  -> [email protected]()
[644384.105]  -> [email protected]_input_region(nil)
[644384.132]  -> [email protected](new id wl_callback@41)
[644384.149]  -> [email protected]()

Good news: I found the problem. Bad news: I'm not sure what the best way to fix it is (mostly because of my unfamiliarity with sway's codebase).

So the issue is in the container_at function, and the functions it calls. So, surface_is_popup returns false for these subsurfaces. At first I thought that was wrong, but then I realized, that might not actually be what we want since I don't know if these popovers should have higher priority than floating windows above them. So, the problem is actually in how it checks the location for normal containers.

I think floating_container_at is definitely broken. It should probably call surface_at_view instead of just relying on the bounding box of the floating container itself. But what if the floating container isn't a view, and has a subsurface that extends beyond the floating container's bounding box?

And then I'm not sure about tiling windows. If a tiling window has a subsurface that extend beyond it's bounding box, should that receive events events only if the parent view is focused?

I'd be happy to implement a fix, but would like a little direction from a maintainer on what my approach should be.

I haven't figured out why the border is drawn on top of the subsurface yet, maybe that should be a different issue?

I'm also confused about this line:

    // Tiling (focused)
    if (focus && focus->view && !is_floating) {

Why is there a !is_floating there?

This gnome bug is related: https://bugzilla.gnome.org/show_bug.cgi?id=759738

Apparantly gtk now uses subsurfaces rather than popups in many cases. (for menus, tooltips, popovers, etc.)

I also discovered while playing around with this some more that if a subsurface of a tiled window extends beyond the applications bounding box it isn't rendered outside the bounding box at all. This can mean that menus aren't fully visible.

This seems like it might be a bigger task to fix than I thought. Maybe subsurfaces that extend beyond the bounding box of the parents should be marked somehow, and then treated kind of like popup windows?

@RyanDwyer @ddevault what are your thoughts?

We've always had problems with this. It's been difficult to get right for all cases.

I wonder if it would be easier with a floating window manager style approach for this part. Eg. store pointers to all surfaces, no matter their type, in a list along with their global coordinates. Include containers in the list too. Keep the list sorted by z-index similar to how a floating window manager would do it. Then for surface_at just traverse the list linearly from the top until you find something that intersects the coordinate. Keeping the list up to date could be tricky though, especially with transactions involved.

Why is there a !is_floating there?

Views are either floating or tiling. As per the comment on the line above it's checking if the focused view is tiling, or in other words, not floating.

Views are either floating or tiling. As per the comment on the line above it's checking if the focused view is tiling, or in other words, not floating.

Sorry, I should have been more specific. I understand that the !is_floating is checking if it is tiling. My question is, why do we call surface_at_view on the focused view if it is tiling, but not if it is floating? This seems especially strange since subsurfaces apparantly aren't even drawn beyond their bounding boxes.

Because everything floating has already been handled by the call to floating_container_at just before it. If it reaches the tiling checks then it is certain that the coordinates are not over a floating container. The !is_floating condition is just an optimisation to avoid rechecking the focused view if it happens to be floating.

My PR was closed in favor of clipping surfaces to the rectangle of the view (#5451). I think that will make this problem even worse.

This may be related to https://github.com/Alexays/Waybar/issues/771. where if the layer is waybar is "bottom", then tooltips are hidden and systray popups are truncated (see screenshot). I'm worried that if #5451 is merged, even if the layer is set to top these pop-ups, which I think use subsurfaces won't work.

image

Yes, this is the correct behaviour. Waybar needs to switch to xdg-popup instead of subsurfaces.

Waybar needs to switch to xdg-popup instead of subsurfaces.

That means it can't use GTK3 (at least not for popups). This behavior will be fixed in GTK4, but it sounds like they aren't willing to backport using xdg-popup to GTK3, and GTK4 hasn't been released yet. And it will probably take while for applications to migrate to GTK4, and some might not migrate at all.

  1. I don't see why that's the case. gtk-layer-shell supports xdg-popup.
  2. It doesn't seem like a good idea to fix GTK's limitations by adding workarounds in compositors.

gtk-layer-shell supports xdg-popup

I'm not sure what you mean by that. I don't see anything in gtk-layer-shell that explicitly supports xdg-popup. It just takes a GtkWindow, and GtkWindows created using API's for Popovers and tooltips use subsurfaces, not xdg-popup in gtk3. I've looked into what it would take to force Gtk to use xdg-popup instead of subsurfaces in an application, and it would basically require re-implementing the tooltip and popover apis.

It doesn't seem like a good idea to fix GTK's limitations by adding workarounds in compositors.

This particular behaviour seems to be underspecified by wayland. From my reading both Gtk and sway's interpretations are valid. I can try bringing this up with Gtk again. But the problem is Gtk can make a similar argument "It doesn't seem like a good idea to fix Sway's limitations by making a large change that might break applications in other compositors [that we care about more]."

From the above issue:

Note that popovers must move alongside their parent widgets, unlike menus, which in practice don't. Working around bugs in compositors unlikely to be a motivation for spending time on this for anyone in the gtk team.

https://gitlab.gnome.org/GNOME/gtk/-/issues/1097#note_867752

I'm not sure what you mean by that. I don't see anything in gtk-layer-shell that explicitly supports xdg-popup.

https://github.com/wmww/gtk-layer-shell/blob/ae088382b4be876e1bba0432e977ff60d1b1ffda/src/xdg-popup-surface.c#L24

This particular behaviour seems to be underspecified by wayland.

Yes: there's no guarantee subsurfaces outside of the window geometry will be displayed. It may be hidden by another surface, or may be hidden because off-screen, or may be hidden because any other reason.

From my reading both Gtk and sway's interpretations are valid.

No. If GTK needs to display a surface outside of the window geometry and over other surfaces, using a subsurface is incorrect.

Well this is embarrassing. 'After reading, the discussion in the Gtk issue I looked at the output from WAYLAND_DEBUG=1 waybar and it looks like it may actually be using xdg-popup anyway, which means this particular behavior may be a different bug, possibly relating to how popups are rendered for layers? I'll open a seperate bug for that.

The options at this point:

  1. Change Gtk+ to create popovers in a way that by the spec guarantees the expected behavior (e.g. xdg-popup).
  2. Change the spec to make Gtk+'s assumptions guaranteed, and update sway to comply with the spec change.
  3. Leave the spec, and thus sway, as is. In other words, do nothing.

It seems like this issue is not priority for the Gtk team, as they're understandably focused on Gtk4 (which already fixes this). Things also started off on the wrong foot, which does not help.

If the spec is not changed, and sway's current interpretation remains correct/passable, then it is healthier for the Wayland community to leave it as is in order to create a more heterogenous compositor environment which helps applications and toolkits identify incorrect assumptions about protocol details.

Wayland applications and toolkits must never assume anything other than what is explicitly written in the protocol specifications, and a compositor should never concern itself with applications and frameworks but instead just implement the exact details of its supported protocols.

Change the spec to make Gtk+'s assumptions guaranteed, and update sway to comply with the spec change.

I've created an issue with wayland-protocols to clarify this issue: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/issues/24

Was this page helpful?
0 / 5 - 0 ratings

Related issues

StephenBrown2 picture StephenBrown2  路  4Comments

cauebs picture cauebs  路  3Comments

johanhelsing picture johanhelsing  路  3Comments

DpoBoceka picture DpoBoceka  路  4Comments

ddevault picture ddevault  路  3Comments