Godot: Add support for Clang ThinLTO

Created on 5 Jan 2018 · 7 Comments · Source: godotengine/godot

ThinLTO is a link-time optimization method supported by Clang 3.9 and newer. It works similarly to LTO, bringing noticeable performance improvements, but is faster to link and uses less RAM during the linking process.

enhancement buildsystem

Most helpful comment

Some tests:

macOS:

macOS 10.13.2 (17C88), Intel Xeon E5-1620v2, 12 GB RAM
clang: Apple/900.0.39.2, Xcode 9.2 (9C40B)
Godot: fab0d53

Full LTO:

Build command: scons -j4 p=osx tools=yes target=release_debug CCFLAGS=-flto=full LINKFLAGS=-flto=full
Full build time: 17 min
Linking time: 11 min, 1 thread, peak memory usage: 6.32 GB

Thin LTO:

Build command: scons -j4 p=osx tools=yes target=release_debug CCFLAGS=-flto=thin LINKFLAGS=-flto=thin
Full build time: 11 min
Linking time: 5 min, 5 threads, peak memory usage: 0.62 GB

Linux:

Debian sid/4.14.0-2-amd64, same hardware
clang: 4.0.1-8
Godot: fab0d53

Full LTO:

Build command: scons -j4 p=x11 tools=yes use_llvm=yes target=release_debug CCFLAGS=-flto=full LINKFLAGS=-flto=full
Full build time: 30 min
Linking time: 21 min, 1 thread, peak memory usage: 7.21 GB

Thin LTO:

Build command: scons -j4 p=x11 tools=yes use_llvm=yes target=release_debug CCFLAGS=-flto=thin LINKFLAGS=-flto=thin
Full build time: 15 min
Linking time: 6 min, 5 threads, peak memory usage: 1.22 GB
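
To put the figures above side by side, here is a small sketch that computes how full LTO compares to ThinLTO at link time. The numbers are copied from the measurements above; the `results` layout and the `ratios` helper are made up for illustration. Keep in mind that these are wall-clock link times, and the ThinLTO runs used 5 threads against 1 for full LTO.

```python
# Figures copied from the measurements above:
# (linking time in minutes, peak linker memory in GB).
results = {
    "macOS": {"full": (11, 6.32), "thin": (5, 0.62)},
    "Linux": {"full": (21, 7.21), "thin": (6, 1.22)},
}


def ratios(full, thin):
    """Return (link-time ratio, peak-memory ratio) of full LTO over ThinLTO."""
    return full[0] / thin[0], full[1] / thin[1]


for platform, r in results.items():
    time_ratio, mem_ratio = ratios(r["full"], r["thin"])
    print(f"{platform}: ThinLTO links {time_ratio:.1f}x faster "
          f"and peaks at {mem_ratio:.1f}x less memory")
```

On these measurements, ThinLTO links roughly 2.2x faster on macOS and 3.5x faster on Linux, with about 10x and 6x lower peak memory respectively.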

All 7 comments

How about something like that?

diff --git a/platform/x11/detect.py b/platform/x11/detect.py
index 478b42f9f..fa557b483 100644
--- a/platform/x11/detect.py
+++ b/platform/x11/detect.py
@@ -123,12 +123,12 @@ def configure(env):
             env.Append(LINKFLAGS=['-fsanitize=leak'])

     if env['use_lto']:
-        env.Append(CCFLAGS=['-flto'])
-        if not env['use_llvm'] and env.GetOption("num_jobs") > 1:
+        if env['use_llvm']: # use ThinLTO, multithreaded by default
+            env.Append(CCFLAGS=['-flto=thin'])
+            env.Append(LINKFLAGS=['-flto=thin'])
+        else: # GCC
+            env.Append(CCFLAGS=['-flto'])
             env.Append(LINKFLAGS=['-flto=' + str(env.GetOption("num_jobs"))])
-        else:
-            env.Append(LINKFLAGS=['-flto'])
-        if not env['use_llvm']:
             env['RANLIB'] = 'gcc-ranlib'
             env['AR'] = 'gcc-ar'
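
For readers who want to see the resulting logic outside the diff, here is a minimal standalone sketch of the flag selection it proposes. The `lto_flags` function and its return shape are hypothetical and only mirror the diff; in the real `detect.py` these values are applied through `env.Append(...)` and `env[...]`.

```python
def lto_flags(use_llvm, num_jobs):
    """Return (ccflags, linkflags, tool_overrides) for an LTO build.

    Hypothetical helper mirroring the proposed diff; not part of Godot's
    build system.
    """
    if use_llvm:
        # Clang: ThinLTO runs its backend in parallel on its own, so no
        # job count is passed.
        return ['-flto=thin'], ['-flto=thin'], {}

    # GCC: -flto=N at link time sets the number of parallel LTO jobs.
    ccflags = ['-flto']
    linkflags = ['-flto=' + str(num_jobs)]
    # gcc-ar / gcc-ranlib are the GCC wrappers that load the LTO plugin.
    tools = {'RANLIB': 'gcc-ranlib', 'AR': 'gcc-ar'}
    return ccflags, linkflags, tools


# Example: Clang build vs. GCC build with 4 jobs.
print(lto_flags(True, 4))
print(lto_flags(False, 4))
```

One difference from the code being replaced: the old branch only passed a job count to GCC when `num_jobs` was greater than 1, while this version always emits `-flto=N`, including `-flto=1`.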

BTW @hpvb, I read this in the GCC docs: "Link-time optimization does not work well with generation of debugging information. Combining -flto with -g is currently experimental and expected to produce unexpected results."

The macOS build with LTO enabled also produces empty debug information: godot.osx.opt.tools.64.dSYM is 4 MB instead of 82 MB (it is valid and the debugger loads it normally, but it contains no symbols).

...
warning: (x86_64) /tmp/lto.o unable to open object file: No such file or directory
warning: (x86_64) /tmp/lto.o unable to open object file: No such file or directory
warning: (x86_64) /tmp/lto.o unable to open object file: No such file or directory
warning: (x86_64) /tmp/lto.o unable to open object file: No such file or directory
warning: no debug symbols in executable (-arch x86_64)

Thanks for checking. There's the same issue with the 3.0 beta 2 Linux binaries, which come with an empty .debugsymbols file. So I think I'll disable LTO for debug binaries and keep it only for the release export templates.

Fixed by #28402.

Hi! I ended up here after googling "/tmp/lto.o". I wrote an explanation of why this error happens on macOS here: https://github.com/conda-forge/gdb-feedstock/pull/23/#issuecomment-643008755
Short version: add -Wl,-object_path_lto,lto.o to the linking command.
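
As a rough sketch of how that could be wired into an SCons-based build like the one discussed in this thread: the helper name `macos_lto_link_flags` and the default `lto.o` path are assumptions for illustration. The ld64 option itself is spelled `-object_path_lto` with a leading dash, and `-Wl,` forwards it from the clang driver to the linker. It makes the linker keep its temporary LTO object at the given path instead of deleting it, so that dsymutil can still read the DWARF data afterwards.

```python
def macos_lto_link_flags(lto_object="lto.o"):
    """Return extra link flags that keep the LTO object file on macOS.

    Hypothetical helper; the default path is only an example.
    """
    # -Wl,<opt>,<arg> passes "<opt> <arg>" through to ld64.
    return ["-Wl,-object_path_lto," + lto_object]


# In an SConstruct or detect.py this might be applied as (assumed usage):
#     env.Append(LINKFLAGS=macos_lto_link_flags())
print(macos_lto_link_flags())
```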
