Winit: X11 crash: `[xcb] Unknown sequence number while processing queue`

Created on 11 Apr 2018 · 23 Comments · Source: rust-windowing/winit

The following error occurred at startup when running a native GUI program I'm working on. I've run a lot of GUI windows on X11 in the past year and I've never seen this error before, so thought I'd post it:

    Finished release [optimized + debuginfo] target(s) in 18.95 secs
     Running `target/release/audio_server`
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
audio_server: xcb_io.c:259: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.

rustc 1.26.0-nightly (9c9424de5 2018-03-27)
winit 0.12.0

It has only occurred once and I'm not entirely sure how to recreate it yet.

Curiously, the error suggests calling XInitThreads; however, it looks to me like we already do this.

The only thing I can think to note is that I do use an EventsLoopProxy to wake up the main (GUI) thread from a separate audio monitoring thread. There's a chance that EventsLoopProxy::wakeup gets called prior to a call to EventsLoop::run_forever. I have no idea if this has anything to do with the issue yet, just a thought.

Labels: hard, X11, low, help wanted, needs investigation, bug

All 23 comments

Hiya, I have a reliable way of reproducing it (albeit I haven't made a minimal example).

  • Create an app that has multiple threads, and make each thread spawn a Window.
  • Start a new shell session, and run the app.
  • It should fail with the error.

Subsequent runs of the app in the same shell session work.

So, I hit this every time I run my project's automated tests, from a new shell session. In my setup, I have unit tests within the same module spawning their own Windows. If I run my tests with cargo test -- --test-threads=1, then it works.

Before upgrading to winit 0.12 I just had those tests under an #[ignore]. The version I was using previously predates the merge of #416 (0.10?).

@azriel91 thanks for the info. My winit test app has one thread per window, but I can't reproduce following those instructions. I'm not using EventsLoopProxy though; are you?

(Also, this is sort of embarrassing, but I don't really understand what EventsLoopProxy is used for. @mitchmindtree, can you help fill me in on that?)

I'm not using EventsLoopProxy either.
I managed to make a minimal example that does reproduce it (not always, but easily when run in a loop):
https://github.com/azriel91/multithread_window

The assumption about "open a new shell" is incorrect; the issue can be reproduced if you launch the exe enough times. Here's some output using winit 0.13:

Spawned threads.
[xcb] Unknown sequence number while appending request
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
multithread_window: ../../src/xcb_io.c:147: append_pending_request: Assertion `!xcb_xlib_unknown_seq_number' failed.
Spawned threads.
Waited for 100 ms.
Spawned threads.
Waited for 100 ms.
Spawned threads.
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":1"
      after 175 requests (174 known processed) with 0 events remaining.
[xcb] Unknown request in queue while dequeuing
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
Spawned threads.
[xcb] Unknown sequence number while appending request
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
multithread_window: ../../src/xcb_io.c:147: append_pending_request: Assertion `!xcb_xlib_unknown_seq_number' failed.

I also managed to get another error, but maybe defer it to a different issue:

Spawned threads.
thread '<unnamed>' panicked at 'Failed to open input method: PotentialInputMethods {
    xmodifiers: Some(
        PotentialInputMethod {
            name: "@im=ibus",
            successful: Some(
                false
            )
        }
    ),
    fallbacks: [
        PotentialInputMethod {
            name: "@im=local",
            successful: Some(
                false
            )
        },
        PotentialInputMethod {
            name: "@im=",
            successful: Some(
                false
            )
        }
    ],
    _xim_servers: Ok(
        [
            "@im=ibus"
        ]
    )
}', /home/azriel/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.13.0/src/platform/linux/x11/mod.rs:82:17
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Running with RUST_BACKTRACE=1 makes this error take a while to happen, but when it does, it gives this:

note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at libstd/sys_common/backtrace.rs:59
             at libstd/panicking.rs:380
   3: std::panicking::default_hook
             at libstd/panicking.rs:396
   4: std::panicking::rust_panic_with_hook
             at libstd/panicking.rs:576
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:537
   6: winit::platform::platform::x11::EventsLoop::new
             at /home/azriel/.cargo/registry/src/github.com-1ecc6299db9ec823/winit-0.13.0/src/platform/linux/x11/mod.rs:82
   7: winit::platform::platform::EventsLoop::new_x11

Ah, awesome, that works! I get a mixture of the XIO error, the XCB error, the XIM error, and the occasional segfault.

> I also managed to get another error, but maybe defer it to a different issue:

I think that error would probably have the same general root cause here.

This doesn't seem like it will be easy to track down. I figured the segfaults would be the easiest to investigate (and with some luck, would be related), so I ran with valgrind until I eventually got this: https://gist.github.com/francesca64/5e8512e58d5f728d429bdd28003b4c1a

Honestly, it's also possible there's nothing we can do about this. Fundamentally speaking, X11 isn't thread-safe, and we could be hitting some race condition somewhere.

Anyway, I looked at your example and noticed something. You create a new EventsLoop per thread, whereas my program uses one EventsLoop, then sends windows to their own threads and uses channels to forward events. With that approach, I've never had these issues. You also need to only have one EventsLoop if you want your application to work on macOS, where the EventsLoop needs to live on the main thread.

That said, doing things the way I suggest leads to any X11 call made from that thread blocking until you receive another event.
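The single-event-loop layout described above can be sketched with the standard library alone. This is only an illustration of the routing idea, not winit code: `WindowId`, `Event`, and `route_events` are hypothetical stand-ins, and the loop over `events` plays the role of the one event loop on the main thread.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical stand-ins for winit's window id and event types.
type WindowId = u32;

#[derive(Debug, Clone, PartialEq)]
enum Event {
    Resized(u32, u32),
    CloseRequested,
}

/// Spawns one worker thread per window id and routes each event to the
/// thread that owns the matching window. Returns how many events each
/// window thread handled (counting the `CloseRequested` that stops it).
fn route_events(
    window_ids: &[WindowId],
    events: Vec<(WindowId, Event)>,
) -> HashMap<WindowId, usize> {
    let mut senders: HashMap<WindowId, Sender<Event>> = HashMap::new();
    let mut workers = Vec::new();

    for &id in window_ids {
        let (tx, rx) = channel();
        senders.insert(id, tx);
        let worker = thread::spawn(move || {
            let mut handled = 0usize;
            // The window thread only ever sees events for its own window.
            for event in rx {
                handled += 1;
                if event == Event::CloseRequested {
                    break;
                }
            }
            handled
        });
        workers.push((id, worker));
    }

    // In a real application this loop would be the single event loop
    // running on the main thread.
    for (id, event) in events {
        if let Some(tx) = senders.get(&id) {
            // A send error only means the window thread already exited.
            let _ = tx.send(event);
        }
    }
    drop(senders); // Closing the channels lets idle workers finish.

    workers
        .into_iter()
        .map(|(id, worker)| (id, worker.join().unwrap()))
        .collect()
}
```

Events addressed to an unknown window id are dropped here; a real application would decide for itself how to handle them.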

Okay, I got the XCB error once while running Alacritty today. I don't know much about Alacritty's internals, but it has me wondering if this can happen in every winit application.

> my program uses one EventsLoop, then sends windows to their own threads and uses channels to forward events. ... You also need to only have one EventsLoop if you want your application to work on macOS, where the EventsLoop needs to live on the main thread.

Oh I see, good to know, thanks! :slightly_smiling_face:

I don't have control over the thread creation in my case (cargo test), but perhaps being clever with lazy_static and a Mutex would enable that level of automated testing.
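A minimal sketch of that idea, assuming a Rust 1.63+ toolchain where `Mutex::new` is a const fn (so a plain `static` replaces `lazy_static`). The names `WINDOW_TEST_LOCK`, `lock_window_tests`, and `max_concurrent_window_tests` are hypothetical, and the window work itself is only simulated with counters:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};
use std::thread;

// Process-wide lock. On toolchains older than Rust 1.63 this static would
// need to be wrapped in lazy_static! instead.
static WINDOW_TEST_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical helper: every test that opens a window calls this first
/// and keeps the guard alive until the window is gone.
fn lock_window_tests() -> MutexGuard<'static, ()> {
    // A test that panicked while holding the lock poisons it; the guarded
    // data is `()`, so recovering the guard is harmless.
    WINDOW_TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Runs `n` simulated window tests on separate threads and returns the
/// highest number of tests that were inside the critical section at once.
fn max_concurrent_window_tests(n: usize) -> usize {
    static ACTIVE: AtomicUsize = AtomicUsize::new(0);
    static MAX_SEEN: AtomicUsize = AtomicUsize::new(0);

    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(|| {
                let _guard = lock_window_tests();
                let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
                MAX_SEEN.fetch_max(now, Ordering::SeqCst);
                // A real test would create and use its window here.
                ACTIVE.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    MAX_SEEN.load(Ordering::SeqCst)
}
```

This serializes only the tests that take the guard, so the rest of the suite can still run in parallel under cargo test.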

I think I ran into the same problem using the amethyst game engine.
Also I attached a stacktrace of my coredump in the associated issue.
If you want me to provide a coredump to debug I am happy to help!

@MrMinimal thanks. However, if you want this fixed soon, someone other than me will likely have to work on this.

  • Is anything different if you run winit from master?
  • Does this only happen if you open winit applications in quick succession?

> (Also, this is sort of embarrassing, but I don't really understand what EventsLoopProxy is used for. @mitchmindtree, can you help fill me in on that?)

Alacritty uses EventsLoopProxy to wake up the event loop / render thread when the terminal state is updated from our I/O thread. This is only necessary in our case since the render thread is also the events loop thread, and there's no other way to wake up the event loop from another thread without EventsLoopProxy.

@jwilm thanks, though fortunately I already got clarification on that. https://github.com/tomaka/winit/issues/462#issuecomment-385109937

@francesca64 Running the fullscreen example from master, I can't seem to reproduce the issue. Is there a certain example which would be prone to producing this one?

@MrMinimal did you try here? https://github.com/azriel91/multithread_window (you'll need to change the Cargo.toml)

@francesca64 Thanks for the fast responses! Just tried the link you provided but could not reproduce it with that example. The amethyst examples using winit still reliably reproduce it.

Maybe this was fixed by #491? I haven't tested it since then. I'll check when I get a chance.

Amethyst is still on winit 0.13.1, which doesn't include that PR.

@francesca64 Oh that might explain it! Thank you for the input, I have not too much experience with amethyst. I'd be interested in your findings, tell me if I can help in any way!

@francesca64

> @jwilm thanks, though fortunately I already got clarification on that.

My bad; sorry! I searched this issue as a precaution but my search ended there.

@MrMinimal I can still reproduce this easily with `while true; do target/debug/multithread_window; done`. Was this something that happened to you frequently on older versions? It's supposed to be extremely rare in normal usage (at least, that's why I have the priority set to low).

@jwilm there's no need to apologize. There's an awful lot to keep track of.

@francesca64 You are right, I can still reproduce it that way, but it happens once in 60 runs. With amethyst it happens more like once in 5 runs. So yes, it happens on older versions more frequently.

@MrMinimal I've only ever encountered this once in normal usage, which is fortunate, because I'm not optimistic about being able to fix this soon. Also, this is off-topic, but I saw you over on gilrs (I watch that repo) asking about DirectInput. Do you want to implement that yourself? I was planning on doing it when I get a chance; it looks straightforward judging by the SDL source that was linked to.

@francesca64 offtopic answer:
I have not yet had the time to look into it. Currently amethyst has an issue which tracks available input solutions and their advantages/disadvantages: https://github.com/amethyst/amethyst/issues/414.
I also found a repo which does everything we want, https://github.com/Jonesey13/multiinput-rust, but only for Windows (Raw Input seems superior to DirectInput in terms of supporting more hardware). So I was thinking about merging multiinput-rust's features into gilrs but don't have the time at all.

More people keep encountering this, so I thought it would be a good idea to actually try to fix it!

Running off of this branch #554, multithread_window generates no errors for me.

@francesca64 Your branch works for me as well. This is really great! It must have been frustrating trying to find the culprit here.

It was actually surprisingly okay! I really did expect to sink my whole day into it, though.

What immediately caught my attention was the fact that a single application-global XConnection was created via lazy_static. winit is big enough that there are still parts of the codebase I haven't really seen, and seeing this design choice surprised me.

I thought to myself "what happens if two threads try to initialize that concurrently?" so I switched it to thread_local. I also tried using Once instead, but that didn't help.
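As an illustration of the difference, here is a sketch of a per-thread connection using `thread_local!`. `Connection`, `connection_id`, and `ids_from_two_threads` are hypothetical stand-ins (the real XConnection wraps an Xlib display), and this is not the code from #554:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical stand-in for winit's XConnection.
struct Connection {
    id: usize,
}

// Counts how many connections have been opened in this process.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

impl Connection {
    fn open() -> Connection {
        Connection {
            id: NEXT_ID.fetch_add(1, Ordering::SeqCst),
        }
    }
}

thread_local! {
    // Initialized lazily, once per thread, on that thread's first access.
    static CONNECTION: Connection = Connection::open();
}

/// Returns the id of the calling thread's own connection.
fn connection_id() -> usize {
    CONNECTION.with(|conn| conn.id)
}

/// Returns the connection ids seen by the calling thread and by a
/// freshly spawned thread.
fn ids_from_two_threads() -> (usize, usize) {
    let here = connection_id();
    let there = thread::spawn(connection_id).join().unwrap();
    (here, there)
}
```

With a single `lazy_static` value, both calls would observe the same connection; with `thread_local!`, each thread opens and keeps its own.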

That only fixed the XCB and XIO errors, but not the XIM errors and segfaults. Introducing a global lock guarding XOpenIM was an easy guess, since we used to have one before my XIM rewrite. I had removed it in the rewrite, since it didn't seem to solve any of the myriad thread-safety problems XIM has, but I guess I never considered the case of multiple concurrent event loops. Actually, I should probably guard all XIM calls; maybe doing that would finally fix this other weird issue: https://github.com/tomaka/winit/issues/347#issuecomment-386902673
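The shape of such a guard might look like the following sketch. `XIM_LOCK` and `with_xim_lock` are hypothetical names, the closure stands in for a call such as XOpenIM, and no real Xlib function is invoked. It assumes Rust 1.63+ for the const `Mutex::new`.

```rust
use std::sync::Mutex;

// One lock shared by every event loop in the process.
static XIM_LOCK: Mutex<()> = Mutex::new(());

/// Runs `f` while holding the global lock, so that at most one thread at a
/// time executes the guarded call.
fn with_xim_lock<R>(f: impl FnOnce() -> R) -> R {
    // The guarded data is `()`, so a poisoned lock can be reused safely.
    let _guard = XIM_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    f()
}
```

Routing every XIM call through one helper like this would be one way to "guard all XIM calls"; `f` must not call `with_xim_lock` again, since the standard library mutex is not reentrant.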
