Node: `-flto` on clang/gcc?

Created on 24 Jun 2016  ·  27 Comments  ·  Source: nodejs/node

I wonder if we should try the Link Time Optimization flag (`-flto`) in clang. It appears to be used when building V8 (though, as far as I can tell, not on Mac), and may be useful for building OpenSSL and/or core itself.

Thoughts?

cc @nodejs/collaborators

build stalled

All 27 comments

cc @bnoordhuis

Here is a tls-throughput result comparison on my Mac (10 runs):

tls/throughput.js dur="5" type="buf" size="2": ./out/Release/node-flto: 6.1295 ./out/Release/node: 5.6876 ....... 7.77%
tls/throughput.js dur="5" type="buf" size="1024": ./out/Release/node-flto: 1567 ./out/Release/node: 1455 ........ 7.70%
tls/throughput.js dur="5" type="buf" size="1048576": ./out/Release/node-flto: 4167.8 ./out/Release/node: 4026.7 . 3.50%
tls/throughput.js dur="5" type="asc" size="2": ./out/Release/node-flto: 5.5664 ./out/Release/node: 5.0363 ...... 10.52%
tls/throughput.js dur="5" type="asc" size="1024": ./out/Release/node-flto: 1462.2 ./out/Release/node: 1336.8 .... 9.38%
tls/throughput.js dur="5" type="asc" size="1048576": ./out/Release/node-flto: 3947.5 ./out/Release/node: 3847.4 . 2.60%
tls/throughput.js dur="5" type="utf" size="2": ./out/Release/node-flto: 5.5544 ./out/Release/node: 5.0786 ....... 9.37%
tls/throughput.js dur="5" type="utf" size="1024": ./out/Release/node-flto: 1328.5 ./out/Release/node: 1255.7 .... 5.80%
tls/throughput.js dur="5" type="utf" size="1048576": ./out/Release/node-flto: 3051.1 ./out/Release/node: 2985.1 . 2.21%
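
The percentage column appears to be the relative difference between the two binaries, (flto - baseline) / baseline. A minimal Python sketch that recomputes it from the reported throughput figures (the function name `improvement_pct` is made up for illustration and is not part of the benchmark tooling):

```python
def improvement_pct(flto: float, baseline: float) -> float:
    """Relative throughput gain of the -flto build over the baseline, in percent."""
    return (flto - baseline) / baseline * 100.0


# Values copied from the first two rows of the comparison above.
rows = [
    ("buf size=2", 6.1295, 5.6876),
    ("buf size=1024", 1567.0, 1455.0),
]
for label, flto, baseline in rows:
    print(f"{label}: {improvement_pct(flto, baseline):.2f}%")
```

This reproduces the 7.77% and 7.70% figures reported for those two rows.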

I suspect that it will also slightly improve startup time and overall performance.

Out of curiosity, was this prompted by news of ThinLTO?

@indutny Did you build with CXX=clang++? If yes, can you pass LINK=clang++ as well? GYP links with g++ by default and in that case you won't get cross-compilation unit LTO.

@mscdex yup

@bnoordhuis yep, and also I am on OS X, so gcc is not an option anyway.

@bnoordhuis btw, g++ has -flto option too. Though, it accepts a number of threads there.

Something to consider too: http://clang.llvm.org/docs/UsersManual.html#cmdoption-fwhole-program-vtables

> btw, g++ has -flto option too.

What I mean is that when you compile with clang++ but link with g++, then g++ won't know how to do LTO across files, because the .o files contain LLVM bitcode rather than GIMPLE.

@bnoordhuis oh, of course. Do you think we need to do something about it?

FWIW I tried -flto last night with gcc/g++ (5 and 6) and didn't see any performance change (although the binary was a few MB smaller). I even tried adding other lto-related arguments and changing linkers, but still nothing. I still could have been doing something wrong though.

@indutny The PR probably needs to enforce that LINK and LINK.target == CXX.target, unless explicitly overridden.
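
A rough sketch of that check, in Python since Node's configure script is written in Python. This is not the code from the PR; `resolve_linker` and its behaviour are assumptions for illustration: an unset LINK falls back to the C++ compiler, and a mismatch is rejected unless the caller explicitly allows it. It compares the names as plain strings only.

```python
def resolve_linker(env, allow_mismatch=False):
    """Pick the link driver so it matches the C++ compiler (hypothetical helper)."""
    cxx = env.get("CXX", "c++")
    link = env.get("LINK", cxx)  # assumption: an unset LINK follows CXX
    if link != cxx and not allow_mismatch:
        raise ValueError(
            f"LINK={link!r} differs from CXX={cxx!r}; "
            "cross-file LTO needs a matching compiler and link driver"
        )
    return link


# LINK is not set here, so it falls back to the compiler.
print(resolve_linker({"CXX": "clang++"}))
```

Passing `allow_mismatch=True` would correspond to the "unless explicitly overridden" case.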

> and didn't see any performance change (although the binary was a few MB smaller)

In fact, that's a pretty common case: from what I've observed, LTO in gcc usually gives only cross-object dead code elimination (which is already a good thing, even though in theory it can do more).

Alright, so clang only then?

Well, I think it's still favorable for release builds, no matter what compiler we're using.

So, if we wanted to play around with release binaries for 64-bit linux we'd want to set up a recent clang on centos5?

@RReverser I'm not so sure this would be good to have in release builds (and/or as a default) if -flto strips out symbols that may be used by third party addons.
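
One way to test that concern would be to list the symbols the linked binary still exports and compare them against what an addon needs. The sketch below is in Python and only parses text in the three-column layout printed by `nm -D --defined-only`; the sample text, the addresses, and the helper names are made up for illustration and are not output from a real build.

```python
def exported_symbols(nm_output: str) -> set:
    """Collect symbol names from nm-style lines of the form 'ADDRESS TYPE NAME'."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            names.add(parts[2])
    return names


def missing_symbols(nm_output: str, required) -> list:
    """Return the required symbols that are absent from the export list, sorted."""
    return sorted(set(required) - exported_symbols(nm_output))


# Illustrative input only; real input would come from running nm on out/Release/node.
sample = """\
0000000000001000 T _ZN4node5AbortEv
0000000000002000 T _ZN4node5StartEiPPc
"""
print(missing_symbols(sample, ["_ZN4node5AbortEv", "uv_timer_init"]))
```

With this sample the script reports `uv_timer_init` as missing, because it is the only required name not present in the listing.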

@jbergstroem gcc supports this too.

This should stay open, yes? (Hi! I'm triaging inactive issues!)

#7408 is the accompanying (but stalled) pull request.

@addaleax @indutny Is this still something you want to do?

Hello

I am trying to build Node.js with LTO. However, I get the following error when executing the tests, i.e. when running `make test`:


Path: abort/test-addon-uv-handle-leak
assert.js:270
    throw err;
    ^

AssertionError [ERR_ASSERTION]: uv loop at [0x3882800] has active handles
[0x7f59f800b490] timer
    Close callback: 0x7f5a08d1a130 CloseCallback(uv_handle_s*) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
    Data: 0x7f5a08f1b120 example_instance [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
    (First field): 0x7f5a08f1adc0  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
[0x7f59f800b530] timer
    Close callback: 0x7f5a08d1a130 CloseCallback(uv_handle_s*) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
    Data: (nil) 
[0x7f59f800b5d0] timer
    Close callback: 0x7f5a08d1a130 CloseCallback(uv_handle_s*) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
    Data: 0x42 
/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node[40943]: ../src/debug_utils.cc:218:void node::CheckedUvLoopClose(uv_loop_t*): Assertion `0 && "uv_loop_close() while having open handles"' failed.
 1: 0x728c80 node::Abort() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 2: 0x74bf30  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 3: 0x718525  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 4: 0x82fc16 node::worker::Worker::~Worker() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 5: 0x82fe01 node::worker::Worker::~Worker() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 6: 0x743c13 node::Environment::RunCleanup() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 7: 0x1062131  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 8: 0x7acf4e node::Start(int, char**) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 9: 0x7f5a0c40a830 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
10: 0x71de89 _start [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]

    at Object.<anonymous> (/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/abort/test-addon-uv-handle-leak.js:56:5)
    at Module._compile (internal/modules/cjs/loader.js:689:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:700:10)
    at Module.load (internal/modules/cjs/loader.js:599:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:538:12)
    at Function.Module._load (internal/modules/cjs/loader.js:530:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:742:12)
    at startup (internal/bootstrap/node.js:260:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:584:3)
Command: out/Release/node --experimental-worker /home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/abort/test-addon-uv-handle-leak.js

Could you please help? Has anybody encountered the same error? I saw that there was some recent activity regarding memory leaks, so I am wondering whether this is related to a newer version of Node.js.

I am looking forward to hearing from you.

Thank you in advance,

@octaviansoldea

(edit by @addaleax: formatting)

@octaviansoldea Which compiler/OS/libc are you using?

Generally, that test is very new, and you shouldn’t worry about it failing with this error at this point.

It might also help to have something like “steps to reproduce”?

@addaleax

Thank you for your message and feedback. Here are the steps I am using when compiling Node.js.

The versions of gcc and g++ used are
gcc (Ubuntu 5.4.1-2ubuntu1~16.04) 5.4.1 20160904
g++ (Ubuntu 5.4.1-2ubuntu1~16.04) 5.4.1 20160904

The Linux version I am using is
50~16.04.1-Ubuntu SMP Wed May 30 11:18:27 UTC 2018

The modifications I am introducing touch two files,
configure and common.gypi, and are described in the attached file
lto_code_ changes.txt.

For compilation, please use

./configure --enable-lto

at the configuration step, and

make -j number

at the compilation step. In this context, I recommend using

number = 1.5 * number_of_processors_available.

Moreover, please note that the line

'lto': ' -flto=4 -fuse-linker-plugin -ffat-lto-objects ',

influences the linking time. One can use "... -flto=number ...", where number is a value other
than 4, provided there are enough processors. In this context, I recommend using

number = 1.5 * number_of_processors_available

as mentioned above.
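
The rule of thumb above can be sketched in Python as follows. The helper names are hypothetical, and rounding the product down (with a floor of 1) is an assumption, since the comment does not say how to round. The flag string is the one quoted above, without its surrounding spaces.

```python
import os


def parallel_jobs(processors: int) -> int:
    """1.5 x processors, rounded down, never below 1 (the rounding is an assumption)."""
    return max(1, int(1.5 * processors))


def lto_flags(threads: int) -> str:
    """Build the quoted flag string with a custom -flto thread count."""
    return f"-flto={threads} -fuse-linker-plugin -ffat-lto-objects"


cpus = os.cpu_count() or 1  # os.cpu_count() can return None
print("make -j", parallel_jobs(cpus))
print(lto_flags(parallel_jobs(cpus)))
```

On a 4-processor machine this would suggest `make -j 6` and `-flto=6`.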

Any help and feedback is greatly appreciated.

@octaviansoldea

Ping... what's the status on this?

#21677 landed and does roughly the same as #7408, but with a different flag name and only for Linux.

Perhaps that should be enough to mark this resolved?

Or should the discussion for other OS and/or enabling it by default happen in this issue?

Closing due to inactivity.
