ClickHouse: Slow build times

Created on 16 May 2019 · 18 Comments · Source: ClickHouse/ClickHouse

The ClickHouse build is pretty slow: on a 40-core machine I can build it in ~7 min, but on a local VM, which is usually handier, it takes ~2 h.

Someone suggested an out-of-the-box solution: https://github.com/sakra/cotire

I'm curious whether you have experience with it or might want to look into it.

build

All 18 comments

ClickHouse uses a lot of C++ templates. It also has a lot of built-in dependencies, so building it is quite a hard task for the compiler and linker.

Do you use ccache?

Maybe you have already tried adding cotire (it looks like it should be simple)? The request will be more convincing if you show 'before' and 'after' results :)

We have not tried cotire.

Slow build time is mostly unavoidable, although some parts can be sped up: for example, the translation units for cache dictionaries use an excessive number of template instantiations.

To speed up builds on a developer machine, you can:

  • install ccache;
  • use a debug build;
  • use clang instead of gcc;
  • disable as many optional components as possible (look at the ENABLE_* CMake variables).

We also have support for distcc.

Time spent on linking can be reduced by switching to shared linking, but that's not recommended.

Assigned to @proller for reviewing cotire.

@filimonov I didn't test it and have no real incentive to; besides, build systems are not something I want to spend time on. It's more of a suggestion for the team to look into if they have the incentive, or for someone else who has the time and motivation to contribute :)

@alexey-milovidov thanks for the suggestions, I'll adopt them.

My 5 cents. I just finished building from a clean clone: 40 minutes on 8 cores, including all the ./contrib/ stuff. AFAIR cotire is for precompiled headers; in any case, there is another approach to speeding up the build. Why not push all third-party libraries to a package manager like vcpkg or Conan (upstream or your own fork)? Then, when I build ClickHouse, it would use ready-made packages from the package manager (saving build time), and the git submodules could be dropped from the main project (saving clone time).
WDYT?

It's possible with unbundled mode: cmake -DUNBUNDLED=1
An example for Debian/Ubuntu:
https://github.com/yandex/ClickHouse/blob/master/utils/build/build_debian_unbundled.sh

Do you see any obvious drawbacks to this approach? Isn't ClickHouse tightly coupled to specific versions of the libraries in contrib?

Yes, ClickHouse requires specific versions (more exactly, fixed source code) of its libraries for reproducible builds. This is needed to rebuild the code with different options, a different ABI, or sanitizers, to ease debugging, etc.

I'm a bit confused: so what is the unbundled mode good for?

Non-production builds, and builds for partially supported platforms where libraries are used from OS packages.

Oh... I see... I was afraid that would be your answer. We don't see this approach as acceptable. We want the CI, production, and dev environments to yield exactly the same results (given, of course, that the build tools are the same in all cases), so I guess packaging every single submodule into Conan or vcpkg is the only solution.

My experience:

For development I build the debug version with clang on my laptop; I use ccache, and it helps a lot when switching branches. For performance testing I ssh to one of the development servers and build and test there.

In our CI we build on multi-core servers; we use ccache, and the build time varies from 4 minutes to 48 minutes. The latter, however, is caused by slow compression of the deb and rpm packages, not by the build itself.

Ways to make build faster:

  • use ccache;
  • use debug build (for development);
  • use clang instead of gcc if acceptable;
  • disable debug info (not recommended but linking is faster);
  • disable all unneeded libraries (with -D ENABLE_xxx=0 in cmake); examples: LLVM, Hyperscan;
  • build on server instead of workstation/laptop;
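
The configure-time tips in this list can be sketched as a small helper that assembles a cmake command line. This is only an illustration under assumptions: the helper name and its parameters are made up, the ENABLE_LLVM / ENABLE_HYPERSCAN option names are inferred from the ENABLE_xxx pattern and the examples above, and the CMAKE_*_COMPILER_LAUNCHER variables are the generic CMake way to use ccache (the project's own CMake files may already handle ccache differently).

```python
import shlex


def cmake_configure_args(source_dir, disabled=(), debug=True, clang=True, ccache=True):
    """Assemble a cmake configure command line from the tips listed above."""
    args = ["cmake", source_dir]
    if debug:
        # Debug build, as suggested for development.
        args.append("-DCMAKE_BUILD_TYPE=Debug")
    if clang:
        args += ["-DCMAKE_C_COMPILER=clang", "-DCMAKE_CXX_COMPILER=clang++"]
    if ccache:
        # Generic CMake mechanism for routing compiler calls through ccache.
        args += ["-DCMAKE_C_COMPILER_LAUNCHER=ccache",
                 "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache"]
    # ENABLE_<NAME> option names are an assumption based on the ENABLE_xxx pattern.
    args += [f"-DENABLE_{name.upper()}=0" for name in disabled]
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(cmake_configure_args("..", disabled=["llvm", "hyperscan"])))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; check the option names against the actual CMake files before using the result.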

Using Conan or vcpkg is most likely not going to happen. But you can try it and report the results.

Somehow I got back to cotire, and on my laptop:
Clean build, clean ccache: 1 h 16 m
Clean build, clean ccache, cotire: 57 m 11 s

Is it worth creating a PR for it?

Yes, a ~20% difference makes sense.
If it's not too hard to add, let's do it.
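
For reference, the two laptop timings reported above (1 h 16 m without cotire, 57 m 11 s with it) can be turned into a reduction and a speedup factor. The short calculation below uses only those two numbers; the variable and function names are illustrative. It comes out slightly above the rounded ~20%.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a wall-clock duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


baseline = to_seconds(hours=1, minutes=16)        # clean build, clean ccache
with_cotire = to_seconds(minutes=57, seconds=11)  # same, with cotire

reduction = (baseline - with_cotire) / baseline   # fraction of time saved
speedup = baseline / with_cotire                  # how many times faster

print(f"time saved: {baseline - with_cotire} s")  # 1129 s
print(f"reduction:  {reduction:.1%}")             # 24.8%
print(f"speedup:    {speedup:.2f}x")              # 1.33x
```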

From https://github.com/sakra/cotire

The functionality provided by cotire has been superseded by features added to CMake 3.16. Support for pre-compiling and unity builds is now built into CMake. Thus, there will not be any further updates or support for this project.
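
The CMake 3.16 features referred to here are the `target_precompile_headers` command and the `CMAKE_UNITY_BUILD` variable. As a rough sketch, a build script could check whether the installed CMake is new enough before relying on them. The function name is hypothetical, and the only assumption about the tool is that the first line of `cmake --version` has the form "cmake version X.Y.Z".

```python
import re


def has_builtin_pch_and_unity(version_output):
    """Return True if `cmake --version` output reports CMake >= 3.16."""
    match = re.search(r"cmake version (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized cmake --version output")
    # Tuple comparison handles major and minor versions together.
    return (int(match.group(1)), int(match.group(2))) >= (3, 16)


if __name__ == "__main__":
    print(has_builtin_pch_and_unity("cmake version 3.16.3"))  # True
    print(has_builtin_pch_and_unity("cmake version 3.10.2"))  # False
```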

8894

It will be interesting to see how precompiled headers help with modifications to headers that are included in lots of places (e.g. Context.h), but I guess the gain could not be more than 20%.

To whom it may concern:
RFC PR for a PCH build (which does not work):
https://github.com/ClickHouse/ClickHouse/pull/9162
