Godot: ThreadPosix's get_thread_id only works with Godot threads

Created on 14 Nov 2018  路  12Comments  路  Source: godotengine/godot

Godot version:

d806ad4a3dcf7308147e1a243092d22091560d7d (3.0+)

Issue description:
ThreadPosix has a custom counter for thread IDs. This makes it not possible to reliably compare IDs between a Godot thread and a thread created by another library, for example Mono.

Steps to reproduce:
Suppose we want to check if the current thread is the main thread. We would do something like this:

if (Thread::get_caller_id() == Thread::get_main_id()) {
    print_line("This is the main thread");
}

What if the current thread was created by another library? The condition will always evaluate to true. That is because get_caller_id (on posix) will always return 0 for any thread created by another library, and 0 happens to be the ID of the main Godot thread.

The reason is the following:

ThreadPosix's get_caller_id is implemented as:

https://github.com/godotengine/godot/blob/88bfb27abf35b347b4fdb0def15b340f33c89209/drivers/unix/thread_posix.cpp#L90-L93

IINW pthread_getspecific returns 0 (NULL) if there is no value associated with the key, and that's the case if the thread wasn't created by Godot.

Godot's main thread ID (on posix) should be 0. Either because it's hard-coded:

https://github.com/godotengine/godot/blob/88bfb27abf35b347b4fdb0def15b340f33c89209/core/os/thread.cpp#L38
https://github.com/godotengine/godot/blob/88bfb27abf35b347b4fdb0def15b340f33c89209/main/main.h#L55

Or because in several places it may use the return value from get_caller_id, and since the main thread is not created by Godot, there will not be any value associated with the key, so get_caller_id (on posix) should return 0:

https://github.com/godotengine/godot/blob/88bfb27abf35b347b4fdb0def15b340f33c89209/main/main.cpp#L344
https://github.com/godotengine/godot/blob/88bfb27abf35b347b4fdb0def15b340f33c89209/platform/android/java_glue.cpp#L957

bug enhancement android ios linuxbsd macos core

Most helpful comment

I've talked about this with @reduz and @hpvb on IRC. Hopefully we can come up with a fix or workaround for 3.1.

All 12 comments

Kicking to 3.2, wont have time to do this properly as it needs to be tested.

This is the cause of #24108 and #17894.

Just as a data point, bumping this to 3.2 would prevent me from using 3.1 for my game High Hat. I'm sure other mono folks would be affected too.

Not that it affects me too much because I build from source :)

I've talked about this with @reduz and @hpvb on IRC. Hopefully we can come up with a fix or workaround for 3.1.

What if we do something like this?

Thread::ID ThreadPosix::get_thread_id_func_posix() { 
    void *tid = pthread_getspecific(thread_id_key);
    if (tid != NULL)
        return (ID)tid;
    ID new_tid = atomic_increment(&next_thread_id);
    pthread_setspecific(thread_id_key, (void *)new_tid);
    return new_tid;
} 

Basically, we create a new id for the thread if it doesn't have one.

Problem is the main thread id is 0, so the NULL check won't work with it. This solution would require the starting id to be 1 (initialize next_thread_id to 1 instead of 0).

Lets kick for after 3.1 is out.

Mmmm, let's please not? This is a serious issue...

Will discuss with @hpvb how to best fix this during Godot sprint (noted it) because its not simple and I would really prefer its done the best way as possible. The proper way to solve this on unixes involves using special APIs for every platform (OSX, Linux and BSD do it different). This is probably 3.1.1 material, but I dont think we have such tag :P

IMO, since this is mostly critical for Mono, if we can come up with a Mono-specific workaround for 3.1 that would be good. We can implement a better fix after 3.1 and backport it to 3.1.x afterwards, removing the workaround.

The short term fix (#25299) works perfectly for me. It resolves the Object.Dispose problem I reported in #24108 and I'm not noticing any other behavioral changes.

The long term fix (#25300) resolved my old cause of Object.Dispose, but resulted in some significant behavioral changes. The context won't mean much, but I'll share it anyway:

  1. can't pick up hats after about 2 seconds
  2. if i do pick them up within the first two seconds, i get an Object.Dispose error

Its possible that this new behavior is because "fixing" the threading issue revealed bugs in my code, but I think my use case is pretty typical and I wouldn't expect user code to cause the Object.Dispose errors we're seeing. I don't have time to poke around more, but I might be able to over the weekend.

@cart Did you test #25300 with the last changes I force pushed? Those may be critical.

Can you share the backtrace of the crash?

@neikeq I'm pretty sure I had the force push changes, but to be sure I reset everything, rebuilt, cleared my game's local files (.mono folder, .import folder), and the .local/share/godot/mono folder.

Everything works perfectly on #25300 and #25299. Not sure what my problem was last night. Sorry for the false alarm!

Was this page helpful?
0 / 5 - 0 ratings