libgdx 🚀 - [Improvement] Lwjgl3 winproc handling thread/main loop thread separation

I've tested resizing and moving the window on macOS with the latest libGDX version from master and it works as intended. The rendering loop is being called as intended.

badlogic on 4 Dec 2016

The problem is on Windows (I haven't tested Linux). I PRed a threaded audio update implementation, but since I'm using a custom backend I can't contribute the renderer (mine does away with multiple-window support).

voidburn on 4 Dec 2016

Will give it a try on Windows then.

badlogic on 4 Dec 2016

I implemented my separate thread for the renderer pretty much like shown in the example above.

A word of caution if you embark on this: almost everything obvious works out of the box, but I had to proxy the *Graphics.setCursor() calls, since GLFW states they only work when called from the main thread (the window process thread, if you decouple it from the renderer). Until I did that, cursor behavior was in fact unreliable.

I was also too much of a noob to retain multi-window support. If you come up with a threaded renderer that includes it, I look forward to checking out your implementation! I kept getting complaints when trying to switch GL contexts.

voidburn on 4 Dec 2016

The stop happens on OS X when clicking and holding the mouse on the window border (like resize, without moving the mouse), or when navigating the application menu. On Windows it also happens when just clicking and holding the mouse button on the window title bar.

I believe it would be overkill (and out of scope) for libGDX to do a separate thread per window. It's already bad enough to properly handle GL resources shared among contexts, it'd be even worse if we invite multi-threaded windowing to the party.

What I could see working is to do GLFW setup on the main thread, then spawn a render thread, and do everything else - from window creation to application listener calls to window shutdown - inside this thread. Since glfwPollEvents()[*], and therefore the GLFW callbacks, would still need to happen on the main thread, events would have to be passed to the render thread somehow before calling the appropriate ApplicationListener functions.

[*] Actually, the main thread could happily slumber away in glfwWaitEvents().
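A minimal sketch of that hand-off, in plain Java with no GLFW dependency. The names EventRelay, Event, post() and drain() are hypothetical and only illustrate the idea: the main thread enqueues what its callbacks receive, and the render thread drains the queue once per frame before calling the listener.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical relay between the event (main) thread and the render thread. */
final class EventRelay {
    /** Illustrative event type; a real backend would carry key codes, sizes, and so on. */
    static final class Event {
        final String type;
        final int a;
        final int b;

        Event(String type, int a, int b) {
            this.type = type;
            this.a = a;
            this.b = b;
        }
    }

    // Lock-free queue: safe for one producer thread and one consumer thread.
    private final ConcurrentLinkedQueue<Event> pending = new ConcurrentLinkedQueue<>();

    /** Main thread: called from inside a windowing callback. Never blocks. */
    void post(Event event) {
        pending.add(event);
    }

    /** Render thread: called at the start of a frame; returns events in arrival order. */
    List<Event> drain() {
        List<Event> out = new ArrayList<>();
        Event event;
        while ((event = pending.poll()) != null) {
            out.add(event);
        }
        return out;
    }
}
```

The render thread would then translate each drained event into the matching listener call, so application code only ever runs on one thread.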

code-disaster on 9 Dec 2016

@code-disaster Regarding your [*], I implemented it exactly like that, but I had issues when the window close request came in from the rendering thread while the main loop was in the wait state. Ultimately I had to switch to polling.

If retaining multi-window support is feasible, I see it just sharing the render thread, together with the audio update (if this pans out I'll close my PR for that). One thread per window would definitely be pointless, just additional overhead. (Who uses multi-window, by the way? Was it just a proof of concept since LWJGL3 implemented it?)

voidburn on 9 Dec 2016

You can wake it up from glfwWaitEvents() with glfwPostEmptyEvent().
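The pattern can be illustrated without GLFW at all. The sketch below is a plain-Java stand-in, not the GLFW API: a BlockingQueue plays the role of the blocking wait, and postEmptyEvent() is a hypothetical method that wakes the waiting thread so it can re-check its exit condition.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Plain-Java stand-in for a blocking event wait that another thread can wake up. */
final class WakeableEventLoop {
    private static final Runnable EMPTY_EVENT = () -> { };

    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    private final AtomicBoolean closeRequested = new AtomicBoolean(false);

    /** Any thread: queue a real event. */
    void postEvent(Runnable event) {
        events.add(event);
    }

    /** Any thread: queue a no-op whose only purpose is to wake the waiting thread. */
    void postEmptyEvent() {
        events.add(EMPTY_EVENT);
    }

    /** Any thread: set the flag first, then wake the loop so it notices the flag. */
    void requestClose() {
        closeRequested.set(true);
        postEmptyEvent();
    }

    /** "Main" thread: sleeps until an event arrives; returns how often the wait returned. */
    int run() {
        int wakeUps = 0;
        while (!closeRequested.get()) {
            try {
                Runnable event = events.take(); // blocks without burning CPU
                wakeUps++;
                event.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return wakeUps;
    }
}
```

Without the empty event in requestClose(), a close request arriving while the loop is parked in take() would go unnoticed until some unrelated event happened to come in.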

We are using multiple windows for our level editor. We moved some content creation tools, asset browsers, and debug views to separate windows.

code-disaster on 9 Dec 2016

@code-disaster Thanks to your input I just cut average CPU usage from ~13-19% to a steady 1.5% on an i7 6700k :D And it works flawlessly, I just had to post that empty event in a couple of cases.

I read the input handling page on the GLFW docs twice over, and I don't understand how I missed that, it's right there front and center.

Well, thanks for making me take a second look at it! I always assumed the relatively high average CPU usage in libGDX was due to the JVM. Well, now I know what it was. Multi-threading or not, and based on what I'm seeing, I'd say that's low-hanging fruit for a big optimization :D

voidburn on 9 Dec 2016

Any reason why this cannot be implemented?

In public class Lwjgl3Window implements Disposable, switch private boolean iconified = false; to public (or add a getWindowStatus() that reports foreground, background, or minimised).

Then, in the main loop in Lwjgl3Application:

    boolean haveWindowsRendered = false;
    closedWindows.clear();
    boolean is_minimised=false;
    boolean window_zero = true;
    for (Lwjgl3Window window : windows) {
        window.makeCurrent();
        currentWindow = window;
        if(window_zero && window.iconified) is_minimised = true;

        window_zero = false;
        synchronized (lifecycleListeners) {
            haveWindowsRendered |= window.update();
        }
        if (window.shouldClose()) {
            closedWindows.add(window);
        }
    }

    if (is_minimised)
    {
        GLFW.glfwWaitEvents();
    } else {
        GLFW.glfwPollEvents();
    }

It drastically reduces CPU usage while minimised without affecting it when not minimised.
This is not for multithreaded usage; it is merely to reduce CPU usage in the background. Otherwise Java goes CPU-hog crazy, at least on OSX: 95% of my CPU and about 100% of my GPU for no reason.
Since there is no need to render when an application is minimised, I don't see the impact on a general run cycle.
If there is a need to render as mentioned above, we can trigger a glfwPostEmptyEvent(). I guess that would be application specific, and the API would probably have to be extended to expose this call.

bruceloco on 14 Jan 2017

Any updates on this?

bruceloco on 27 Jan 2017

This issue is about decoupling the main thread from the render loop, not about event polling. Also, if you are hitting 90+% on both CPU and GPU, you are probably not limiting your FPS in any way, and that has very little to do with the additional (CPU-only) usage caused by polling vs waiting.

By the way, I think the reason the code you wrote reduces overall usage is that it parks the main thread waiting for events while the window is minimized and, I bet, very few come through when it is.
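On the FPS-limiting point, here is a rough sketch of what a frame cap boils down to, independent of any backend. FramePacer and nanosToWait() are hypothetical names; the class only does the arithmetic, and the caller decides how to sleep.

```java
/** Hypothetical frame pacer: computes how long to idle so frames land on a fixed schedule. */
final class FramePacer {
    private final long frameNanos;
    private long nextFrameAt;

    FramePacer(int targetFps, long nowNanos) {
        this.frameNanos = 1_000_000_000L / targetFps;
        this.nextFrameAt = nowNanos + frameNanos;
    }

    /** Returns the nanoseconds to wait before the next frame and schedules the one after it. */
    long nanosToWait(long nowNanos) {
        long wait = Math.max(0L, nextFrameAt - nowNanos);
        // If we fell behind, restart the schedule from "now" instead of bursting to catch up.
        nextFrameAt = Math.max(nextFrameAt, nowNanos) + frameNanos;
        return wait;
    }
}
```

In a loop this could be used as LockSupport.parkNanos(pacer.nanosToWait(System.nanoTime())) at the end of each frame. Enabling vsync achieves a similar effect by blocking in the buffer swap instead.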

voidburn on 28 Jan 2017

@voidburn Apparently this only happens on OSX.
According to what I read in:
http://www.glfw.org/docs/latest/group__window.html#ga37bd57223967b4211d60ca1a0bf3c832
the poll returns immediately if nothing is happening, so if none of the conditions in the loop are satisfied (i.e. there is nothing to do) and the app is not set to sleep, it behaves like a while loop without conditions.
This bit:
haveWindowsRendered |= window.update();
I have not dug deep enough, but I suspect that it always returns true when the window is minimised, hence the loop just keeps going and going at full throttle.

bruceloco on 28 Jan 2017

> What I could see working is to do GLFW setup on the main thread, then spawn a render thread, and do everything else - from window creation to application listener calls to window shutdown - inside this thread.

Small update to my original idea above: I had another look last night, and what I didn't consider was that window creation/destruction, and virtually every GLFW function except a few, must be called from the main thread.

I believe this is possible to implement, but only with some major detours:

- Some serious synchronization between main thread and render thread, in both directions. Any GLFW callbacks, including input, would need to be forwarded to the render thread. Any user call which wants to execute a GLFW function would need to be forwarded to the main thread. This could be done mostly postRunnable()-style, but requires careful implementation.
- The caller of any window function must be aware that the call may not be executed immediately, but at a later point, on a different thread.
- The backend must ensure that any listener callbacks are run from the render thread.
- Many user-facing query functions in Graphics (for example, getPrimaryMonitor()) wouldn't work that way. Either they'd need to use some callback mechanism instead of an immediate return value (hard to understand), or the backend would need to cache the information.
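The caching option from the last point could look roughly like this. It is a sketch under assumptions: MonitorCache and MonitorInfo are hypothetical types, not libGDX's, and the main thread is assumed to republish a snapshot whenever the monitor setup changes.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical cache: the main thread publishes snapshots, any thread reads them. */
final class MonitorCache {
    /** Immutable snapshot; the fields are illustrative only. */
    static final class MonitorInfo {
        final String name;
        final int width;
        final int height;

        MonitorInfo(String name, int width, int height) {
            this.name = name;
            this.width = width;
            this.height = height;
        }
    }

    private final AtomicReference<MonitorInfo> primary =
            new AtomicReference<>(new MonitorInfo("unknown", 0, 0));

    /** Main thread: call after querying the windowing system. */
    void publish(MonitorInfo info) {
        primary.set(info);
    }

    /** Any thread: returns the last published snapshot immediately, without blocking. */
    MonitorInfo getPrimaryMonitor() {
        return primary.get();
    }
}
```

The trade-off is staleness: a reader may see the previous snapshot for a short while after the real state changes, which is usually acceptable for monitor data but not for everything.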

code-disaster on 29 Mar 2017

I still think this is a complication that requires a step back to properly define the problem before going out searching for a solution.

1) Multiple windows make sense if you're developing an application that's primarily focused on building an editor.

2) Multithreaded main loop is something that's more important when you're shipping your game, will only ever use a single window, and don't want the main loop to be affected by window interactions when in windowed mode: network code, music, background AI processing, all those loops should never be interrupted unless a broken program state is encountered.

Given this premise, I'd rather have different backends to accomplish this (pretty much as I've done for myself) and pick the one that best suits the purpose for the given task. This is much easier, and I can already tell you that the only thing that must be synchronized manually because of the GLFW main thread dependency is cursor management: a blocking queue solved that issue admirably and does not affect performance at all (unless for some reason you change your cursor every frame, then I'd use a more performant queue).
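As an illustration of the queue approach for cursor changes, here is a minimal plain-Java sketch. CursorProxy and its methods are hypothetical names, cursors are represented by strings for simplicity, and the place where a real backend would call into the windowing library is only marked with a comment.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical proxy: the render thread requests cursors, the main thread applies them. */
final class CursorProxy {
    private final BlockingQueue<String> requests = new LinkedBlockingQueue<>();
    private String applied = "default"; // only touched by the main thread

    /** Render thread: record the request and return immediately. */
    void setCursor(String cursorName) {
        requests.add(cursorName);
    }

    /** Main thread: apply only the newest pending request; returns how many were consumed. */
    int applyPending() {
        String latest = null;
        int consumed = 0;
        String next;
        while ((next = requests.poll()) != null) {
            latest = next;
            consumed++;
        }
        if (latest != null) {
            applied = latest; // a real backend would call the windowing API here
        }
        return consumed;
    }

    /** Main thread: the cursor that was applied last. */
    String appliedCursor() {
        return applied;
    }
}
```

Coalescing to the newest request keeps the main thread's work bounded even if the application changes the cursor many times between two event-loop iterations.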

I would propose at least two sets of LwjglApplication/LwjglApplicationConfig, and optionally a third:

1) Fully multithreaded, single window (easy to implement)
2) Single threaded, multi window (what we have now, wouldn't change)
3) Optional -> What you proposed: Multithreaded, Multi window, with performance implications due to the inherent synchronization required to be clearly explained.

The rest of the Lwjgl backend can remain unchanged. This would cover possibly all usage scenarios each with its pros/cons/limitations. The important factor here is to provide the user with multiple choices, without imposing a single solution.

It involves more work to maintain, but I definitely think it's worth it for those of us who use libGDX on the desktop platform. I wouldn't have rolled my own backend otherwise.

voidburn on 29 Mar 2017

FYI: My fork at https://github.com/code-disaster/libgdx/tree/lwjgl3-mt-rebase2 contains a modified LWJGL3 back-end which implements my last idea.

The main thread runs window message loop(s), processes events, and synchronizes updates to/from the render thread. The render thread does all the rendering (doh!), and relays any GLFW calls which need synchronization to the main thread.

From the application's point of view, there aren't many changes besides a configuration flag to turn this behavior off.

The main thread sits idle most of the time, except if controllers are enabled - but even then, the CPU use is pretty low.

It's not a complete rewrite, but I pretty much started from a clean state, copied/modified anything I needed over, and deleted the remains afterwards, so it probably looks frightening if diff'd/compared to the original.

There are some features missing - e.g. I didn't bother with audio at all because I'm using a custom OpenAL + stb_vorbis implementation, which also runs on a separate thread.

This branch has been "in production" with our game for over a year now. It works ... most of the time. There are some strange freeze/slowdown issues in our tools which pretty much force our artists to turn multi-threading off for the editor.

code-disaster on 19 Sep 2018

Haven't experienced any freezes with my implementation (similar, only I rely on posting runnables to interact with the main thread), but I don't support multiple windows, so that might have something to do with it.

I'll check your code out, thanks!

voidburn on 19 Sep 2018

Hey @code-disaster, I noticed that your fork is still alive and kicking. Do you still use it? And did you find what caused the freeze/slowdown issues?

@voidburn Do you still have your implementation lying around somewhere? I would be very interested in taking a look.

crykn on 14 Sep 2020

@crykn I can't easily clean it up by stripping out the gazillion customizations I made to my GL backend, which raises the point that I should at some point make an effort to make it generic and publish it as a plug & play backend.

Hope the following sets you on the right path, but do not consider this copy/paste material: it's heavily out of context, and synchronization is really application specific. In general you can use atomics, mark members as volatile, and/or use re-entrant locks liberally. If you need high-performance queues, I use JCTools myself. In my case, for example, I hold RUN_LOCK during the main loop updates because they post cursor updates to the winproc thread, and I don't want to be in the middle of a shutdown when that happens; it can cause null pointer exceptions since resources are being destroyed.

You should absolutely only do this on Windows. It can cause random issues on Linux depending on the window manager in use (I discovered this well after these conversations), and I have no way of testing on OSX.

The RUN_LOCK mainly protects against shutting down in the middle of updating a frame; depending on your program's complexity it may or may not be needed.
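Reduced to its core, the RUN_LOCK idea is that a frame update and the teardown take the same lock, and the frame checks a flag before touching resources. A minimal sketch, assuming hypothetical names (GuardedLoop, renderFrame(), shutDown()) and a counter standing in for the real frame work:

```java
/** Hypothetical loop guard: a frame and the shutdown can never overlap. */
final class GuardedLoop {
    private final Object runLock = new Object();
    private boolean disposed = false; // guarded by runLock
    private int framesRendered = 0;   // guarded by runLock

    /** Render thread: runs one frame, or does nothing once resources are gone. */
    boolean renderFrame() {
        synchronized (runLock) {
            if (disposed) {
                return false;
            }
            framesRendered++; // stands in for input, audio, and render updates
            return true;
        }
    }

    /** Any thread: waits for an in-flight frame to finish, then marks resources destroyed. */
    void shutDown() {
        synchronized (runLock) {
            disposed = true; // a real backend would dispose its resources here
        }
    }

    int framesRendered() {
        synchronized (runLock) {
            return framesRendered;
        }
    }
}
```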

In your ApplicationListener (window process):

```Java
// Start the main loop
try {
    // Initialize
    initGlfwWindow();

    // Start the window loop
    winprocLoop();
} catch (Throwable t) {
    if (t instanceof RuntimeException) {
        throw (RuntimeException) t;
    } else {
        throw new GdxRuntimeException(t);
    }
} finally {
    // Destroy GLFW resources and terminate
    dispose();
}
```

The winproc loop is then in charge of initializing the GL context and launching the application main loop:

```Java
private void winprocLoop() {
    if (Platform.isWindows()) {
        // Main loop and GL context run on a dedicated thread so that window resize and drag
        // do not stall execution (this only happens on Windows)
        new Thread(() -> {
            init();
            mainLoop(false);

            // Shutdown is processed when the main loop exits
            synchronized (RUN_LOCK) {
                shutDownNow();
            }
        }, "MainLoop").start();

        // Event polling must occur on the main thread; events are queued for processing
        // on the next available frame
        while (!shouldClose()) {
            processSystemEvents();
        }
    } else {
        // On non-Windows platforms the main loop stays on the window thread and polls events itself
        init();
        mainLoop(true);

        // Shutdown
        synchronized (RUN_LOCK) {
            shutDownNow();
        }
    }
}
```

Main loop init:

```Java
private void init() {
    // Initialize uncaught exception handler (implement your own, but this is mandatory)
    ThreadUtils.handleUncaughtExceptions();

    // Audio instance
    this.openALAudio = null;

    synchronized (RUN_LOCK) {
        // If we have an audio instance bind it
        if (audio instanceof OpenALAudio) {
            this.openALAudio = (OpenALAudio) audio;
        }

        // Create GL context and capabilities in this thread
        GLFW.glfwMakeContextCurrent(glfwWindowHandle);
        GLFW.glfwSwapInterval(config.vSyncEnabled ? 1 : 0);
        glCapabilities = GL.createCapabilities();

        // Initialize gdx graphics
        gdxGraphicsInit();

        // Initialize gdx GL version info
        initGLVersion();

        // Initialize application listener
        initializeApplicationListener();

        // Check 1: minimum GL version
        if (!glVersion.isVersionEqualToOrHigher(3, 2)) {
            throw new GdxRuntimeException("OpenGL 3.2 or higher with the FBO extension is required. OpenGL version: " + GL11.glGetString(GL11.GL_VERSION) + " " + glVersion.getDebugVersionString());
        }

        // Check 2: FBO extension (message aligned with the 3.2 requirement of check 1)
        if (!supportsFBO()) {
            throw new GdxRuntimeException("OpenGL 3.2 or higher with the FBO extension is required. OpenGL version: " + GL11.glGetString(GL11.GL_VERSION) + ", FBO extension: false " + glVersion.getDebugVersionString());
        }

        // Setup GL debug callback
        if (config.debug) {
            glDebugCallback = GLUtil.setupDebugMessageCallback(config.debugStream);

            setGLDebugMessageControl(GLDebugMessageSeverity.NOTIFICATION, Settings.OPENGL_DEBUG_LEVEL_NOTIFICATIONS);
            setGLDebugMessageControl(GLDebugMessageSeverity.LOW, Settings.OPENGL_DEBUG_LEVEL_LOW);
            setGLDebugMessageControl(GLDebugMessageSeverity.MEDIUM, Settings.OPENGL_DEBUG_LEVEL_MEDIUM);
            setGLDebugMessageControl(GLDebugMessageSeverity.HIGH, Settings.OPENGL_DEBUG_LEVEL_HIGH);
        }

        // Show the window
        setVisible(config.initialVisible);

        // Preclear all buffers with the initial background color
        for (int i = 0; i < 2; i++) {
            GL11.glClearColor(config.initialBackgroundColor.r, config.initialBackgroundColor.g, config.initialBackgroundColor.b, config.initialBackgroundColor.a);
            GL11.glClear(GL11.GL_COLOR_BUFFER_BIT);
            GLFW.glfwSwapBuffers(glfwWindowHandle);
        }
    }
}
```

Finally, the main loop is pretty much the same (except for whatever you need to synchronize across threads):

```Java
private void mainLoop(final boolean pollEvents) {
    while (!shouldClose()) {
        // If we're tasked to poll for system events, do it now
        if (pollEvents) {
            processSystemEvents();
        }

        // Execute runnables posted for this frame
        synchronized (RUNNABLES_LOCK) {
            // Clear the old runnables and add all new ones to the execution queue
            executedRunnables.clear();
            executedRunnables.addAll(runnables);
            runnables.clear();

            // Run all queued runnables
            for (Runnable runnable : executedRunnables) {
                runnable.run();
            }
        }

        // Application listener update
        synchronized (RUN_LOCK) {
            if (!iconified) {
                input.update();
            }

            // Update audio
            if (openALAudio != null) {
                openALAudio.update();
            }

            // Update graphics unless iconified
            if (!iconified) {
                graphics.update();
                applicationListener.render();
            }

            // Update maximization states
            if (wasMaximized != maximized) {
                wasMaximized = maximized;
            }

            // Presentation (let's make sure this isn't called while we're shutting down)
            GLFW.glfwSwapBuffers(glfwWindowHandle);
            if (!iconified) {
                input.prepareNext();
            }
        }
    }
}
```
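A possible variation on the runnables step, not what the code above does: swap the two lists under the lock and run the runnables after releasing it, so other threads posting runnables are not blocked while a slow one executes. RunnableQueue and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical double-buffered runnable queue; runPending() is for the loop thread only. */
final class RunnableQueue {
    private final Object lock = new Object();
    private List<Runnable> pending = new ArrayList<>();   // guarded by lock
    private List<Runnable> executing = new ArrayList<>(); // swapped under lock, run by loop thread

    /** Any thread: queue a runnable for the next frame. */
    void postRunnable(Runnable runnable) {
        synchronized (lock) {
            pending.add(runnable);
        }
    }

    /** Loop thread: run everything posted so far; returns how many runnables ran. */
    int runPending() {
        synchronized (lock) {
            // Swap the buffers; "executing" is always empty at this point.
            List<Runnable> tmp = executing;
            executing = pending;
            pending = tmp;
        }
        // Runnables posted from here on (even by a running runnable) go to the other list.
        int count = executing.size();
        for (Runnable runnable : executing) {
            runnable.run();
        }
        executing.clear();
        return count;
    }
}
```

A runnable that posts another runnable therefore sees it executed on the following frame rather than in the same pass.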

Various utility methods:
```Java
private void requestShutDown() {
    synchronized (RUN_LOCK) {
        // Signal GLFW window it should close (will trigger exit from the main loop)
        GLFW.glfwSetWindowShouldClose(glfwWindowHandle, true);

        // Send an empty event, so we exit the input wait loop
        GLFW.glfwPostEmptyEvent();
    }
}

private void shutDownNow() {
    // Clear all lifecycle listeners
    for (LifecycleListener lifecycleListener : lifecycleListeners) {
        lifecycleListener.pause();
        lifecycleListener.dispose();
    }
    lifecycleListeners.clear();

    // Dispose application listener before the backend
    if (applicationListener != null) {
        applicationListener.pause();
        applicationListener.dispose();
    }

    // Dispose audio device if there was one
    if (openALAudio != null) {
        openALAudio.dispose();
    }

    // Graphics
    if (graphics != null) {
        graphics.dispose();
    }

    // Input
    if (input != null) {
        input.dispose();
    }

    // We're done cleaning up, exit immediately
    Runtime.getRuntime().exit(0);
}
```

voidburn on 14 Sep 2020

> Hey @code-disaster, I noticed that your fork is still alive and kicking. Do you still use it? And did you find what caused the freeze/slowdown issues?

"Alive" is a strong word. I haven't touched it for a year. But yes, it's in use and shipped with Pathway, so I'd call it stable. I never found the cause of the editor slowdowns/freezes; we just settled on disabling the separate render thread when running our tools.

The latest version sits in the lwjgl3-mt-rebase3 branch. Note that we are far behind libGDX master. As said above, I wasn't exactly editing stuff in a git-merge-friendly manner, so any changes to the LWJGL3 backend happen to create nasty merge conflicts. Also, since we used this to ship a game, we are now cautious with updates anyway.

In retrospect I'm not sure if this has been worth the additional work. Most _customers_ probably won't notice the difference anyway.

code-disaster on 16 Sep 2020
