Godot: Godot crashes upon removal of the original of a duplicated resource

Created on 10 Sep 2020 · 6 Comments · Source: godotengine/godot

Godot version:
3.22

OS/device including version:
Linux DESKTOP-7AAQ6QQ 4.4.0-18362 x86_64 x86_64 x86_64 GNU/Linux

Issue description:
After spawning a number of nodes in a Game scene, looping through the children (Game.get_children()) and collecting each child's .name gives the following array:
[cargo, PDC, @cargo@12, @PDC@13, @PDC@14, @PDC@15, @cargo@17] (These differing names are auto-generated by Godot Engine when siblings have identical names)

Normally, removing one of these child nodes is not a problem when paired with call_deferred. For example, calling call_deferred("remove_child", foo), where foo is the node named @PDC@15, works fine.

However, when we remove the "original node", i.e., PDC, and then try to pick up another node, the entire program either crashes without any error or prints the following error:

ERROR: remove: Condition "p_elem->_root != this" is true.
   At: ./core/self_list.h:84

This occurs even if I use call_deferred.

I suspect what's happening is that we are inadvertently deleting the "original node", and the later attempts to duplicate that node instance elsewhere fail because the original loaded node no longer exists. This is true even if I use .instance() or .duplicate():

For example, this is my dictionary of preloaded resources:

export(Dictionary) var pod_type_resources = {
    "CARGO": preload("res://common/game/pod/subsystem/Cargo.tscn"),
    "PDC": preload("res://common/game/pod/subsystem/PDC.tscn")
}

Even if I do .instance() on one of them or .duplicate().instance(), it seems like running remove_child on the "original instance" will cause a crash. What exactly is the right way to handle this? I don't want to just .hide() everything, that seems mega-janky.
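As a point of comparison for the question above, here is a minimal sketch of spawning and despawning pods from the preloaded dictionary, assuming Godot 3.x. The helper names spawn_pod and despawn_pod are hypothetical and do not come from the project. Nothing here is confirmed as the fix for the crash. It only shows that each .instance() call builds a separate node tree from the PackedScene, and that queue_free() is the usual way to dispose of one.

```gdscript
extends Node

# Same preloaded scenes as in the dictionary above.
var pod_type_resources = {
    "CARGO": preload("res://common/game/pod/subsystem/Cargo.tscn"),
    "PDC": preload("res://common/game/pod/subsystem/PDC.tscn")
}

# Hypothetical helper: each call builds a fresh node tree from the PackedScene.
func spawn_pod(pod_type: String) -> Node:
    var pod = pod_type_resources[pod_type].instance()
    # Deferred so the child is added once the current call stack has finished.
    call_deferred("add_child", pod)
    return pod

# Hypothetical helper: queue_free() deletes the node (and its children)
# at the end of the current frame, which also detaches it from its parent.
func despawn_pod(pod: Node) -> void:
    if is_instance_valid(pod):
        pod.queue_free()
```

The PackedScene values stay loaded for as long as the dictionary holds them, so freeing one instance does not unload the scene it came from.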

bug crash core

All 6 comments

@ZackingIt Please upload a minimal reproduction project to make this easier to troubleshoot.

@Calinou I've invited you to the repo (called aaa_clientserver) which contains the minimum reproducible crash which produces no error message. You will need to run a server.64 linux binary (https://godotengine.org/download/server), and then run the client separately -- this is a server-authoritative game -- unfortunately the zip file is too big to upload. The project is fairly tightly coupled so I can't find a great way to get to the minimally reproducible point without breaking something else, but as you'll see it's a pretty small codebase of only 1,000 lines (300 of which are comments and blanks).

The issue does not seem to stem from "original object deletion" anymore -- the crash is now very intermittent and fairly unpredictable. The memory footprint of the project is only 100 MB and the processor usage is not that high, so it's been very difficult to find what could be causing it. The crash seems to happen for about 5% of all pod collisions (you'll see what I mean when you run the repo, branch v2), but it does not give any error message whatsoever.

To run the server:
<godot_64_bit_binary> --main-pack <path_to>/Server.pck

To run our godot client:
./godot.exe (hitting Play in the debugger also works, once Server.pck is running)

Thanks so much in advance -- I was able to refactor enough code to resolve the

ERROR: remove: Condition "p_elem->_root != this" is true.
   At: ./core/self_list.h:84

issue but the no-error crash is really killing me.

@ZackingIt I would prefer to have a publicly available project, as I don't have time to troubleshoot this myself right now.

To get a helpful backtrace when it crashes, run the project with a debug build (such as the headless builds available here). If Godot's crash handler doesn't kick in, you need to run the binary using gdb <binary name> then enter run --main-pack <path_to>/Server.pck. Once it crashes, enter bt full and paste the backtrace here.

I should clarify something -- the client-side binary is x86/Windows while the server is a Linux binary -- and the crash appears to be client-side, so I'm not sure if gdb works. I'm currently using godot.windows.tools.64.exe for the client debug binary. Client and server are the same repo.

For simplicity's sake, you can assume I only have an x86 binary running in a Linux environment (using Bash on Windows as development environment).

This is what I get when I follow your instructions and use gdb on the server:

Thread 1 "godotserver.64" received signal SIGINT, Interrupt.
0x00007fffff0e4a30 in __GI___nanosleep (requested_time=0x7ffffffee350, remaining=0x7ffffffee350)
    at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28      ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.
(gdb) bt full
#0  0x00007fffff0e4a30 in __GI___nanosleep (requested_time=0x7ffffffee350, remaining=0x7ffffffee350)
    at ../sysdeps/unix/sysv/linux/nanosleep.c:28
        resultvar = 18446744073709551100
        sc_cancel_oldtype = 0
        sc_ret = <optimized out>
#1  0x00000000005e9d4d in OS_Unix::delay_usec(unsigned int) const [clone .constprop.12618] ()
No symbol table info available.
#2  0x0000000000000000 in ?? ()
No symbol table info available.

One other question which I suspect may be at the root of the issue:
How does one set an owner if using call_deferred on an add_child? Setting owners must come after add_child and call_deferred is necessary for thread-safety, but set_owner requires that the child be added first -- this seems contradictory to me. Is there a way to resolve this very fundamental issue?

I.e.

    podInstance.call_deferred("add_child", podFrame)
    podFrame.set_owner(podInstance) # This will throw an error, and using call_deferred on set_owner of course does not help either.

I think the ownership async issue may be the underlying issue: if I take ownership away completely, my reference "decays" into a null object, but I can't set ownership properly while using call_deferred per above comment.
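One way to keep add_child and set_owner in order while still deferring both is to wrap them in a single method and defer that method instead. The sketch below assumes Godot 3.x. The names _add_child_and_set_owner and attach_pod_frame are hypothetical. Whether ownership is actually related to the crash is not established in this thread.

```gdscript
extends Node

# Hypothetical helper: invoked later via call_deferred, so add_child and
# set_owner run back to back in the required order.
func _add_child_and_set_owner(parent: Node, child: Node) -> void:
    parent.add_child(child)
    # The owner must be an ancestor of the child; that holds once add_child has run.
    child.set_owner(parent)

# Hypothetical entry point mirroring the podInstance / podFrame snippet above.
func attach_pod_frame(podInstance: Node, podFrame: Node) -> void:
    call_deferred("_add_child_and_set_owner", podInstance, podFrame)
```

Deferring one wrapper keeps the two calls together, so set_owner never runs before the child has been added.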

After I set the dictionary's value for key 1002 to point to a node:
{1002:[Area2D:3472], 177799106200:[Area2D:1781]}

Later on, when running the same function again
{1002:[Object:null], 177799106200:[Area2D:1781]}
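An entry that prints as [Object:null] is what a dictionary typically shows in Godot 3.x once the stored node has been freed elsewhere, although the thread does not confirm that this is what happened here. Under that assumption, a lookup can guard against stale entries with is_instance_valid. The names pods_by_id and get_live_pod are hypothetical.

```gdscript
extends Node

# Hypothetical registry mapping an id (e.g. 1002) to a node, as in the output above.
var pods_by_id := {}

# Returns the node stored under pod_id, or null if it was never stored
# or has since been freed.
func get_live_pod(pod_id: int) -> Node:
    if not pods_by_id.has(pod_id):
        return null
    var pod = pods_by_id[pod_id]
    if not is_instance_valid(pod):
        # The node was freed elsewhere; drop the stale entry.
        pods_by_id.erase(pod_id)
        return null
    return pod
```

Callers then check the result for null before using the node instead of reading the dictionary directly.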
