Godot: GDScript vs C# performance

Created on 9 Feb 2020 · 45 Comments · Source: godotengine/godot

Godot export templates:
https://downloads.tuxfamily.org/godotengine/3.2/mono/Godot_v3.2-stable_mono_export_templates.tpz

Projects:

GDScript takes roughly 10x longer than C# to execute this benchmark. Times to render the first 1000 frames, in milliseconds:

Intel Core i5 650 @ 3210 MHz, GTX 950

# | GDScript (gdlife) | C# (gdlifenet)
-- | -- | --
1 | 17090 | 1743
2 | 17216 | 1711
3 | 17575 | 1700
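
As a quick sanity check on the "10x" figure, the three runs in the table can be averaged and compared. This is only arithmetic on the numbers above; the variable names are made up for illustration.

```python
# Timings from the table above, in milliseconds per 1000 frames.
gdscript_ms = [17090, 17216, 17575]
csharp_ms = [1743, 1711, 1700]

gdscript_mean = sum(gdscript_ms) / len(gdscript_ms)
csharp_mean = sum(csharp_ms) / len(csharp_ms)

# Ratio of mean run times: how many times longer the GDScript build took.
ratio = gdscript_mean / csharp_mean

# Average cost of a single frame in each build.
gdscript_frame_ms = gdscript_mean / 1000
csharp_frame_ms = csharp_mean / 1000

print(f"ratio: {ratio:.2f}x")
print(f"per frame: {gdscript_frame_ms:.2f} ms vs {csharp_frame_ms:.2f} ms")
```

On these numbers the GDScript build averages about 17.3 ms per frame against about 1.7 ms for C#, so the ratio comes out very close to 10.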

How to benchmark:

  1. Clone the project repositories
  2. Create exported builds of the projects
  3. Run the exported builds and wait for the run to complete - the result is printed to stdout

--

The purpose of this issue is to keep track of the performance in case any optimizations are implemented in the future.

discussion gdscript mono

All 45 comments

I am kind of curious why GDScript would be that much slower. Isn't it converted into bytecode?

@nathanfranke I don't remember if bytecode compilation is still present. If it is, keep in mind it only speeds up how fast the script loads, not how fast it runs (just like in Python).

I've been doing GDScript microbenchmarks for a while: https://github.com/Zylann/gdscript_performance

@nathanfranke C# uses a JIT to convert its bytecode into CPU instructions and is able to use ptrcalls (calling engine functions directly). It has also had years of optimizations and features that let it use CPU power and memory more efficiently (structs, string interning, GC...).
GDScript, on the other hand, uses a giant switch/case for its instructions, checks the types of everything at runtime before operating, accesses functions using maps, uses Variant for everything and does not use ptrcalls. So right now it makes sense that it's much slower. In theory its tight integration with the engine could make it faster, but everything else outweighs that benefit.
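
To make the costs listed above concrete, here is a minimal sketch of that style of interpreter. It is a toy stack VM written in Python, not Godot's actual VM: the opcodes, the `(type, payload)` tuples standing in for Variant, and the `BUILTINS` map are all invented for illustration.

```python
# Toy stack VM. Every value is a tagged (type, payload) pair,
# loosely standing in for a Variant.
INT, STR = "int", "str"

# Functions are looked up by name in a map at call time.
BUILTINS = {"abs": lambda v: (INT, abs(v[1]))}

def run(code):
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        # One long if/elif chain plays the role of the switch/case.
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            # Types are only known at run time, so they are checked on every ADD.
            if a[0] == INT and b[0] == INT:
                stack.append((INT, a[1] + b[1]))
            elif a[0] == STR and b[0] == STR:
                stack.append((STR, a[1] + b[1]))
            else:
                raise TypeError(f"cannot add {a[0]} and {b[0]}")
        elif op == "CALL":
            fn = BUILTINS[arg]          # map lookup on every call
            stack.append(fn(stack.pop()))
        elif op == "RET":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
    return None

program = [
    ("PUSH", (INT, 2)),
    ("PUSH", (INT, -5)),
    ("ADD", None),
    ("CALL", "abs"),
    ("RET", None),
]
result = run(program)
print(result)
```

Computing `abs(2 + -5)` takes five trips through the dispatch loop, one type check and one map lookup, where JIT-compiled code would typically need only a handful of machine instructions.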

Bad to hear. I really like GDScript.

As I understand, GDScript is not intended for use in CPU-bound situations, it's just a convenient language for beginners that's tightly coupled with the engine specifics. Most games' CPU-bound use-cases are intended to be covered by specific nodes.

For CPU-bound situations you can use C#, GDNative or write a module.

> As I understand, GDScript is not intended for use in CPU-bound situations, it's just a convenient language for beginners that's tightly coupled with the engine specifics. Most games' CPU-bound use-cases are intended to be covered by specific nodes.
>
> For CPU-bound situations you can use C#, GDNative or write a module.

Agreed, though it is an interesting topic.
I wonder if Native script (C++) is much faster than C#?

> I wonder if Native script (C++) is much faster than C#?

I've got a Godot module implementation of the algorithm available here: https://github.com/dragmz/gdlifemod (tested with MSVC 2019 only)

> I wonder if Native script (C++) is much faster than C#?

C++ is theoretically as fast as it can possibly be. Unless you want to write assembly :D

Seriously though, C# is pretty fast as long as you're not memory-bound. Otherwise you get the same problem as Java: lag spikes on garbage collection.

> Unless you want to write assembly :D

A bit off topic, but... due to the sheer level of sophistication of compilers, it is extremely likely that C++ will outperform handwritten assembly. For trivial cases, the compiler has those cases optimized. For non-trivial cases, the problems are likely too complex for a human to hand-optimize the assembly.

Back on topic, here are some more benchmarks: http://www.royaldonut.games/2019/03/29/cpu-voxel-benchmarks-of-most-popular-languages-in-godot/ Note that voxel generation is a use case which can be given many low-level optimizations, so C# and especially C++ benefit a lot here.

> Bad to hear. I really like GDScript.

Please don't let specific benchmarks like this change your decision on GDScript :)

GDScript is not just for beginners, and it's certainly not slow (relative to a smooth gaming experience). There are trade-offs with any language. The sheer power and rapid prototyping GDScript offers, while giving the developer the chance to build their dream game, should not be overlooked.

> Please don't let specific benchmarks like this change your decision on GDScript :)

Yes, I will use it :) I think that if necessary, I will rewrite the bottlenecks in C++.
The documentation also says that static typing in GDScript may increase performance in the future.

Note that GDScript uses real Godot Objects and the ObjectDB for managing them when creating sub-classes in a script. Also it will create ScriptInstances for each of them, and make every variable access on such a script go through a few levels of indirection (thankfully, it will not do string comparisons, but hashmap and red-black tree lookups are enough).
C# on the other hand, can get away with not boxing int-s and bool-s as Variant-s, not storing any excess information in the private classes, apart from what the garbage collector needs, and not doing any hashmap or tree lookups when reading variables from them. Also, it can place objects next to each other in memory, improving cache locality.
Thus, I suspect a lot of the slowdown in this benchmark comes from the fact that the GDScript version does a lot more computation than the C# version. Some of the difference may be eliminated by using a PoolByteArray or two to store the values of the tiles, instead of creating actual tile objects.
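
A rough sketch of the flat-array idea, in Python rather than GDScript. The benchmark projects' code is not shown in this thread, so this is not their implementation: a `bytearray` stands in for PoolByteArray, the grid wraps around at the edges, and the function and variable names are invented.

```python
def step(cells: bytearray, width: int, height: int) -> bytearray:
    """Advance a wrap-around Game of Life grid by one generation."""
    nxt = bytearray(width * height)          # all cells start dead (0)
    for y in range(height):
        for x in range(width):
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    nx = (x + dx) % width
                    ny = (y + dy) % height
                    neighbours += cells[ny * width + nx]
            alive = cells[y * width + x]
            # Standard rules: birth on 3, survival on 2 or 3.
            if neighbours == 3 or (alive and neighbours == 2):
                nxt[y * width + x] = 1
    return nxt

width = height = 5
grid = bytearray(width * height)
for x in (1, 2, 3):
    grid[2 * width + x] = 1                  # horizontal blinker in the middle row

after_one = step(grid, width, height)
after_two = step(after_one, width, height)
```

Each cell is one byte in a contiguous buffer, so a generation allocates a single array instead of one object per tile. The blinker flips to vertical after one step and returns to its starting state after two.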

@bojidar-bg both implementations are intentionally naive to mimic what a beginner would do

I accept that C# is more efficient than GDScript. But the game code is not necessarily what causes performance issues. A game's performance is more likely to be determined by the parts that the engine is already handling. The average use case does not include rendering 10k sprites or anything like that.
On the other hand, GDScript allows for rapid prototyping and makes it easy to get something ready quickly.
So don't think GDScript is a bad option just because it's not as optimized as something that has been optimized for years and years.

There are a number of performance improvements that can be made for GDScript, first and foremost by implementing some basic back-end optimization to the byte code. As of version 3.2 there is no optimization of the byte code beyond some basic constant folding that happens in the compiler front end (that I have observed, at least). Because GDScript VM executes code in a loop containing a giant switch, reducing the number of byte code instructions generated will go a long way to improving the performance.

I have a fork in the works that I've begun implementing the following optimizations of the GDScript byte code:

  • Dead code/store elimination
  • Jump threading
  • Constant propagation
  • Typed registers. This involves adding a number of virtual registers to the function state, plus instructions to address them. Access to virtual registers can be done without type checking for non-object types (int, real, vector, etc.). Not sure if it would make sense to extend this to further types.

Potential:

  • ptrcall registers - Same basic concept as typed registers for values, but with ptrcalls to avoid more expensive calls through script API.
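
Two of the listed passes, constant propagation and dead store elimination, can be sketched on a toy instruction list. This is an illustration in Python under simplifying assumptions (straight-line code only, no branches, no side effects); the tuple-based instruction format is invented and is neither GDScript bytecode nor the fork's implementation.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def propagate_constants(code):
    """Forward pass: fold operations whose inputs are all known constants."""
    known = {}                      # variable name -> constant value
    out = []
    for ins in code:
        op = ins[0]
        if op == "const":
            _, dst, value = ins
            known[dst] = value
            out.append(ins)
        elif op in OPS:
            _, dst, a, b = ins
            if a in known and b in known:
                value = OPS[op](known[a], known[b])
                known[dst] = value
                out.append(("const", dst, value))
            else:
                known.pop(dst, None)    # dst no longer holds a known constant
                out.append(ins)
        else:                           # "ret" passes through unchanged
            out.append(ins)
    return out

def eliminate_dead_stores(code):
    """Backward pass: drop writes to variables that are never read afterwards."""
    live = set()
    kept = []
    for ins in reversed(code):
        op = ins[0]
        if op == "ret":
            live.add(ins[1])
            kept.append(ins)
        elif op == "const":
            if ins[1] in live:
                live.discard(ins[1])
                kept.append(ins)
        else:
            _, dst, a, b = ins
            if dst in live:
                live.discard(dst)
                live.update((a, b))
                kept.append(ins)
    kept.reverse()
    return kept

program = [
    ("const", "a", 2),
    ("const", "b", 3),
    ("add", "c", "a", "b"),          # foldable: 2 + 3
    ("mul", "unused", "c", "c"),     # result is never read
    ("mul", "d", "c", "x"),          # x is a run-time input, cannot be folded
    ("ret", "d"),
]
optimized = eliminate_dead_stores(propagate_constants(program))
for ins in optimized:
    print(ins)
```

The six instructions shrink to three: `("const", "c", 5)`, `("mul", "d", "c", "x")` and `("ret", "d")`. In a VM whose cost is dominated by dispatch, fewer instructions means less time in the loop.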

Some of these optimizations would affect the semantics of the running application if GDScript is being run in multiple threads. Optimizations that store into typed temporaries in the virtual registers will not be visible externally until the values are stored, which may not be guaranteed until the exit of the function. C# has the volatile qualifier to tell the optimizer to always load the value before it's read and commit it to memory after it is modified. I don't think we'd want to add this additional level of complexity to GDScript, so the optimizations will need to be very conservative wherever they could introduce concurrency issues.

Even with all of these optimizations in place, there is very little chance that GDScript will perform as well as C#. Sure, you could build a JIT for GDScript, but I think it would be easier just to implement GDScript as a .NET language and use the mono tooling that is already in place.

> but I think it would be easier just to implement GDScript as a .NET language

Out of curiosity, do you know how such a thing would be done? It sounds like a fun experiment.

> Out of curiosity, do you know how such a thing would be done?

It's not something I've ever done in .NET, but it has been done. You would need to write your own front end to parse the GDScript, then transform that code to CIL instructions to build an assembly. Theoretically, at this point the assembly would be loaded and run no differently than one built with C#. Obviously it is easier said than done, but I think this solution would be easier than rolling your own JIT. Another approach is to compile GDScript to LLVM IR. I'm not sure which approach would be easier, but I suspect building the GDScript language for .NET would be, as the Mono integration is already in place.

A good place to start would be to have a look at the Boo source code. Boo is a Python-like .NET language. It used to be an option for scripting in Unity, but support was dropped long ago. Since then people seem to have mostly lost interest, but you may be able to use it as a jumping-off point.

If we're going to use a target for GDScript, it should probably be LLVM.
Dotnet is a bit big to bundle when targeting mobile phones, I think.

> If we're going to use a target for GDScript, it should probably be LLVM. Dotnet is a bit big to bundle when targeting mobile phones, I think.

Possibly... but the team has already decided that dotnet will be supported, and that support is implemented.

@pchasco Mono support is optional and the engine is officially and primarily compiled without it, since it about doubles the size of the engine to have Mono support.

> @pchasco Mono support is optional and the engine is officially and primarily compiled without it, since it about doubles the size of the engine to have Mono support.

That is true. If I am not mistaken LLVM has been discussed in the past and was decided against because it adds a rather sizable build dependency.

Personally, I think the best approach is to make GDScript as fast as it can be without going to JIT. I have done a significant amount of work on a GDScript to C compiler, but I have largely abandoned that effort in favor of improvements to the VM. After some initially encouraging results, real-world performance wasn't that much improved over GDScript to make it worth my while. The compiler targeted GDNative, which was the primary performance bottleneck. I could have it generate an engine module to avoid GDNative, but then it would be much less convenient to build.

Note that GDScript performance improvements are planned, as discussed in a Google doc.

Just a few cents on ideas listed previously:

> Dead code/store elimination
> Constant propagation

We were thinking of using SSA (static single assignment) in the compiler, which might alleviate this issue.

> Jump threading

Definitely sounds useful, and I can see how the compiler might sometimes generate bytecode which chains jumps needlessly.
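
A minimal sketch of jump threading on a toy bytecode, assuming jumps carry absolute instruction indices. The opcode names and the `thread_jumps` function are hypothetical, written in Python for illustration; they are not taken from the GDScript compiler.

```python
def thread_jumps(code):
    """Retarget jumps that land on an unconditional jump to its final destination."""
    def final_target(target):
        seen = set()
        # Follow chains of unconditional jumps; 'seen' guards against jump cycles.
        while target < len(code) and code[target][0] == "jmp" and target not in seen:
            seen.add(target)
            target = code[target][1]
        return target

    out = []
    for ins in code:
        if ins[0] in ("jmp", "jmp_if"):
            out.append((ins[0], final_target(ins[1])))
        else:
            out.append(ins)
    return out

program = [
    ("jmp_if", 3),   # 0: lands on a jump ...
    ("load", "a"),   # 1
    ("jmp", 4),      # 2: ... and so does this one
    ("jmp", 4),      # 3: jumps to another jump
    ("jmp", 6),      # 4
    ("load", "b"),   # 5
    ("ret",),        # 6
]
threaded = thread_jumps(program)
for index, ins in enumerate(threaded):
    print(index, ins)
```

After the pass every jump goes straight to instruction 6, so a taken branch costs one dispatch instead of three. Instructions 3 and 4 are no longer jumped to, which leaves them for a dead code pass to remove.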

> Typed registers

We were thinking of going for typed instructions instead. They will still use Variant, but they will not check its type, or unbox it, but will just take the bytes out.
With typed instructions, I feel there might not be much performance gain from this one, except for cache optimizations (due to less "padding" around the data from Variant).
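
The typed-instruction idea can be sketched as follows. This is a loose Python analogy with invented opcode names: Python cannot show the "take the bytes out" part, only the way a compiler that knows the operand types can emit an instruction that skips the run-time check.

```python
def compile_add(lhs_type, rhs_type):
    """Pick an opcode at compile time from declared types (None means untyped)."""
    if lhs_type == "int" and rhs_type == "int":
        return "ADD_INT"
    return "ADD"

def execute(op, a, b):
    if op == "ADD_INT":
        # Typed instruction: the compiler vouched for the types, so no check here.
        return a + b
    if op == "ADD":
        # Generic instruction: inspect the operands on every execution.
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            return a + b
        if isinstance(a, str) and isinstance(b, str):
            return a + b
        raise TypeError("unsupported operand types")
    raise ValueError(f"unknown opcode {op}")

typed_op = compile_add("int", "int")      # both operands declared as int
untyped_op = compile_add("int", None)     # one operand has no declared type

print(typed_op, execute(typed_op, 2, 3))
print(untyped_op, execute(untyped_op, 2, 3))
```

Both calls return 5, but only the untyped path pays for the type inspection each time it runs.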

> ptrcall registers

Not sure about this one either. Maybe there could be some instructions which do a ptrcall using a pointer stored within the bytecode directly?

> [..] I think it would be easier just to implement GDScript as a .NET

As already mentioned, non-mono builds of Godot should have working GDScript.
Not sure how well .NET will perform for untyped GDScript where everything might have to be dynamic, instead of having a type.

> I think the best approach is to make GDScript as fast as it can be without going to JIT.

I agree. Going to JIT can only eliminate the time spent dispatching instructions. As currently implemented by #11518, the dispatching code is pretty fast, and can even be branch predicted by the CPU. (It is not a giant switch in a loop, except on compilers which do not support jumptables.)

@bojidar-bg Thanks for the share! I've been trying to find a summary of the planned improvements on github but I haven't had much luck to this point.

> We were thinking of using SSA (static single assignment) in the compiler, which might alleviate this issue.

You could go that way... But I think it's overkill. Deconstructing SSA form correctly is not trivial as it can introduce hard-to-find bugs in the code when not done correctly. You can still implement dead code elimination, constant folding, common subexpression elimination, and most other beneficial optimizations without it.

> We were thinking of going for typed instructions instead. They will still use Variant, but they will not check its type, or unbox it, but will just take the bytes out.

I suppose this could work... In my fork, where I have been trying some of these things out, I went in the direction of registers, mostly because I did not want to suggest breaking encapsulation of the Variant structure to work on its data directly. Getting directly into the Variant data might be a bit faster because it would avoid the load/store operations for the registers.

> the dispatching code is pretty fast, and can even be branch predicted by the CPU.

Technically true but I wonder if this actually happens in practice.

Edit:
I should qualify the branch prediction statement... Branch predictor is not involved here as it is used when the decision is to branch or not to branch. The CPU uses the indirect branch predictor, if available, to keep the pipeline full in the case of a jump table. I don’t know whether the indirect predictor will do much good in the dispatch loop because the patterns will be difficult for the CPU to predict, and the amount of code being run between each iteration for many bytecode instructions may replace the history buffer before the next iteration.

Edit 2:
If you really wanted to try to get something out of the branch predictor, it would be best to schedule instructions with others of the same opcode (same branch taken). This is quite possible for sequences of arithmetic operations where the dependency order is known.

I have a partial implementation of some of these optimizations in my fork:

https://github.com/pchasco/godot

Notably missing is support for the iterate and yield instructions, and functions with default arguments. Also incomplete is typed instructions. This is published now as only a PoC. In the next week or so I will implement the missing instructions. There is no challenge there beyond finding the time; there is no significant difference to them versus the standard jump instructions.

Typed instructions will take somewhat longer.

I also plan to implement optimization for built-ins that are pure functions. Calls to pure functions are candidates for removal via common subexpression elimination.
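
As a rough illustration of why purity matters here, the sketch below removes repeated calls to pure built-ins in straight-line code. It is written in Python with an invented instruction format, and `PURE_BUILTINS` is a hypothetical whitelist, not a list taken from Godot.

```python
PURE_BUILTINS = {"sqrt", "abs"}   # hypothetical set of side-effect-free built-ins

def invalidate(available, name):
    """Forget cached calls that read from, or are stored in, an overwritten variable."""
    stale = [key for key, holder in available.items()
             if holder == name or name in key[1]]
    for key in stale:
        del available[key]

def eliminate_common_calls(code):
    """Replace a repeated pure call with a copy of the earlier result."""
    available = {}                # (function, args) -> variable holding the result
    out = []
    for ins in code:
        if ins[0] == "call":
            _, dst, fn, args = ins
            key = (fn, args)
            reuse = available.get(key) if fn in PURE_BUILTINS else None
            out.append(("copy", dst, reuse) if reuse else ins)
            invalidate(available, dst)        # dst is overwritten either way
            if fn in PURE_BUILTINS and reuse is None and dst not in args:
                available[key] = dst
        else:
            out.append(ins)
            if ins[0] == "copy":
                invalidate(available, ins[1])
    return out

program = [
    ("call", "a", "sqrt", ("x",)),
    ("call", "b", "rand", ()),       # not pure: must stay a call
    ("call", "c", "sqrt", ("x",)),   # same pure call on the same input
    ("call", "d", "rand", ()),       # still a real call
    ("copy", "x", "b"),              # x changes here
    ("call", "e", "sqrt", ("x",)),   # input changed, so it is recomputed
]
optimized = eliminate_common_calls(program)
for ins in optimized:
    print(ins)
```

Only the third instruction changes, becoming `("copy", "c", "a")`. The `rand` calls are left alone because repeating them is observable, and the last `sqrt` survives because `x` was reassigned in between.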

> > Bad to hear. I really like GDScript.
>
> Please don't let specific benchmarks like this change your decision on GDScript :)
>
> GDScript is not just for beginners, and it's certainly not slow (relative to a smooth gaming experience). There are trade-offs with any language. The sheer power and rapid prototyping GDScript offers, while giving the developer the chance to build their dream game, should not be overlooked.

On the other hand, I know Python and I know C#, but I still need to learn GDScript almost from scratch. While it's not that hard, it messes with my brain when I come back to regular Python development at work. It's similar to Python in some cases, but it's not Python.
I guess I will switch to C# for now, to spare myself the Python mental context switching 😅

> If we're going to use a target for GDScript, it should probably be LLVM. Dotnet is a bit big to bundle when targeting mobile phones, I think.

How small is LLVM? As far as I can tell it'd still more than double the current download size.

But honestly, going from 20MB to 50MB isn't exactly much. Sure, a 150% increase in download size sounds _huge_, but in the context of modern-day phones, computers, and available bandwidth it is absurdly tiny.

> > If we're going to use a target for GDScript, it should probably be LLVM. Dotnet is a bit big to bundle when targeting mobile phones, I think.
>
> How small is LLVM? As far as I can tell it'd still more than double the current download size.
>
> But honestly, going from 20MB to 50MB isn't exactly much. Sure, a 150% increase in download size sounds _huge_, but in the context of modern-day phones, computers, and available bandwidth it is absurdly tiny.

I was talking about the export's size, not the editor's. Games that use C# need to ship the virtual machine too on targets that don't have dotnet.

LLVM outputs native binary so it becomes a non-issue.

For what it's worth, I'd also make it optional, like the Mono version is.

If GDScript can be compiled using LLVM, all GDScript users will download the LLVM version.

@DriNeo I don't think that'd be quite the case. LLVM is notoriously difficult to compile from source, so any user who's using an OS without a recent prepackaged LLVM will likely not bother about it. (What about Raspberry Pis and the like?)

> But honestly, going from 20MB to 50MB isn't exactly much.

Godot 4.0's editor binary size will likely be a fair bit larger than 3.2. The new baseline will probably be around 40 MB compressed by the time Godot 4.0 is released. Either way, it's still quite small -- for instance, it's smaller than AssaultCube. Just keep in mind that binary sizes generally keep growing as software evolves :slightly_smiling_face:

> @DriNeo I don't think that'd be quite the case. LLVM is notoriously difficult to compile from source, so any user who's using an OS without a recent prepackaged LLVM will likely not bother about it.

Most users don't compile Godot from source, though. If somebody figures out the whole LLVM compilation issue and adds the source code to the repo (like we do with basically every external lib), all the end user will have to do is download the proper version from the website.

It would also be possible to implement LLVM support through an addon that outputs GDNative binaries, so this wouldn't necessarily need to be baked into the engine.

@jabcross The LLVM source code repository weighs dozens of gigabytes. It'll never be added to the Godot source repository at this point :stuck_out_tongue:

> It would also be possible to implement LLVM support through an addon that outputs GDNative binaries, so this wouldn't necessarily need to be baked into the engine.

This is arguably a better course of action. I think we should focus on improving GDNative integration instead (and possibly add WebAssembly support as a replacement in the long-term, but that's another discussion).

> I think we should focus on improving GDNative integration instead (and possibly add WebAssembly support as a replacement in the long-term).

It would be awesome. Currently GDNative is very inconvenient.

I'm not 100% updated on this, but the problem with GDNative currently is the interface bottleneck, right? Function calls still need string comparison and whatnot?

> The LLVM source code repository weighs dozens of gigabytes. It'll never be added to the Godot source repository at this point :stuck_out_tongue:

Yeah, but a considerable portion is non-essential, like tests and benchmarks. It could be trimmed down. MLIR is also in the LLVM repository, after all.

We should also consider MLIR in the future, it's a really cool project.

> I'm not 100% updated on this, but the problem with GDNative currently is the interface bottleneck, right? Function calls still need string comparison and whatnot?

That's not entirely correct. You can basically acquire a handle to a script method and use ptrcall to invoke it, eliding any lookup.

Still, lots of hoop jumping for something arguably simple. It happens with variable assignment too, right?

Are there any known established games that used GDNative so far?

> Still, lots of hoop jumping for something arguably simple. It happens with variable assignment too, right?

I don't think it's a lot of hoops. And I wouldn't recommend using ptrcall for every method; premature optimization adages still apply.

> Are there any known established games that used GDNative so far?

Karroffel mentioned one on my blog but he didn't give the name of it...

What is the current state of GDScript right now, and could you guys improve performance, be it GDNative and GDScript?

GDScript is being remade by vnen which will hopefully improve performance.

> be it GDNative and GDScript?

GDNative is a completely different language so I don't understand the second part

You can likely achieve a performance increase by porting your GDScript to compiled GDNative scripts.

I am not confident that the currently proposed changes being made to GDScript will significantly improve performance, but time will tell.

Out of curiosity, according to the Godot documentation:

> .NET / C#
>
> As Microsoft's C# is a favorite amongst game developers, we have added official support for it. C# is a mature language with tons of code written for it, and support was added thanks to a generous donation from Microsoft.
>
> It has an excellent tradeoff between performance and ease of use, although one must be aware of its garbage collector.

I'm curious what manual care I'd have to have when coding with C#. Is there any recommendation to where and when I should manually call garbage collection operations in C# scripts?

> I'm curious what manual care I'd have to have when coding with C#. Is there any recommendation to where and when I should manually call garbage collection operations in C# scripts?

I don’t know why you would need to manually force a collection. I definitely would not do it during gameplay. The only time when it may be beneficial to do a manual collection would be after loading or saving a game, or after dismissing a menu when you can get away with some stutter. But you can generally just ignore the GC altogether if your game is performing well.

You'd be better off asking your question either on the subreddit or Godot Q&A.

Thanks! Well, truth be told, the way it is put in the documentation made it seem that some sort of manual handling would be needed, or that memory leaks should be expected, if things are left 'automatic'.

All in all, I was just curious whether instantiations and real-time calculations could leave garbage behind without any manual handling. I'll ask about those on Reddit. Thanks!

There is a new kid on the block: MIR https://github.com/vnmakarov/mir, which aims to be a lightweight JIT for multiple intermediate representations https://developers.redhat.com/blog/2020/01/20/mir-a-lightweight-jit-compiler-project/ and is initially being implemented as a JIT for Ruby 3.0.
It seems it could also be an interesting target for typed GDScript.

@milkowski so would you then use GDScript or GDNative code in combination with this new M-JIT?
