Envoy: Reduce resource consumption of test compilation by simplifying mock library inclusions

Created on 23 Apr 2020 · 17 Comments · Source: envoyproxy/envoy

Description:
Many tests and test libraries over-include headers, especially mock libraries, which increases memory usage, disk usage, and compilation time. This is especially a problem on Windows with cl.exe, but we have also observed significant resource usage on Linux with clang that can be reduced by as much as half.

Attached to this issue is a file showing the peak working set memory usage of cl.exe compiling various tests/test libraries (units in KB). We can see many of the tests cause the compiler to use multiple GB of memory, often for simple tests that do not actually require many of the includes they are being built with. We have been able to reduce this usage (see linked PR) by cleaning up these includes. Our primary focus has been enabling faster/less resource intensive compilation of tests in Windows CI but this endeavor seems like it would be relevant to compilation performance on other platforms. We can potentially reduce compilation time, disk usage of artifacts/parameter files, and memory usage of the compiler (by orders of magnitude in some cases, multiple GB reduction in others).

While we have had some success empirically measuring the memory usage of compiling tests and finding the most expensive ones, it is not feasible to go one by one and manually assess and fix the includes to pare down each test. We are hoping to get help from other contributors in reducing the overhead of compiling these tests, through changes similar to those in the PR linked below, and in finding a programmatic solution, potentially automated in CI or otherwise, that could identify over-inclusion of mocks and other large libraries that are not used. Another strategy would be to further break down larger libraries with many classes into their component parts, so that tests include only the interfaces they need.
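As a rough illustration of what such a programmatic check could look like, here is a minimal Python sketch. It is a purely textual heuristic, not an existing Envoy tool and not a replacement for a real include analyzer: the function name `unused_mock_includes` and the header-to-symbol mapping passed to it are hypothetical and would have to be written or generated separately.

```python
import re

# Matches lines such as: #include "test/mocks/server/mocks.h"
INCLUDE_RE = re.compile(r'^\s*#include\s+"(test/mocks/[^"]+)"', re.MULTILINE)


def unused_mock_includes(source, header_symbols):
    """Return mock headers included by `source` whose known symbols never appear in it.

    `header_symbols` is a hypothetical mapping from header path to the class
    names that header declares. Headers missing from the mapping are skipped,
    because nothing can be claimed about them.
    """
    body = INCLUDE_RE.sub("", source)  # ignore the include lines themselves
    unused = []
    for header in INCLUDE_RE.findall(source):
        symbols = header_symbols.get(header)
        if symbols is None:
            continue
        used = any(re.search(r"\b" + re.escape(s) + r"\b", body) for s in symbols)
        if not used:
            unused.append(header)
    return unused


# Illustrative input only; the symbol lists are assumptions, not taken from Envoy.
example = '''
#include "test/mocks/server/mocks.h"
#include "test/mocks/upstream/mocks.h"

TEST(Foo, Bar) { NiceMock<Server::MockInstance> server; }
'''
symbols = {
    "test/mocks/server/mocks.h": ["MockInstance"],
    "test/mocks/upstream/mocks.h": ["MockClusterManager"],
}
print(unused_mock_includes(example, symbols))
```

A check like this can only flag candidates for manual review, since macros, transitive includes, and type aliases can hide real uses from a text search.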

Relevant Links:

See https://github.com/envoyproxy/envoy/pull/10915 for a relevant PR that addresses this for the kafka network filter extension tests
Gist with peak working set memory usage for tests that exceed 2GB: https://gist.github.com/sunjayBhatia/d2d4a0af1eab16777f94a4dc4959b342
Powershell script used to collect data: https://github.com/greenhouse-org/envoy/blob/84fca545722432a48371fca1ced02253cddb86d0/get-cl-workingset-peak.ps1

area/build · help wanted · tech debt

sunjayBhatia

Most helpful comment

Nice!! I just saw this. I have been thinking about this in the context of our frequent timeouts and OOMs in OSS-Fuzz. I ran a few Bazel performance analyses on some fuzz targets and found bottlenecks in server/mocks.cc and in gperftools:

   39.021 s   42.47%   action 'CcConfigureMakeRule external/envoy/bazel/foreign_cc/gperftools_build/include'
   40.090 s   43.63%   action 'Compiling test/mocks/server/mocks.cc'

I had some draft patches splitting up server/mocks.cc as well, and I'm glad to see this affects others too, as I will happily help review and break down mock libraries.

asraa on 19 Jun 2020

👍2

All 17 comments

cc @wrowe

sunjayBhatia on 23 Apr 2020

cc @mattklein123 @lizan as you have been working on improving build performance in CI etc.

sunjayBhatia on 23 Apr 2020

See https://github.com/envoyproxy/envoy/issues/8770. cc also @htuch as we have noted understanding and improving compile performance as a possible intern project for this summer. I would love to make progress on this.

mattklein123 on 23 Apr 2020

I think this is also related to getting https://include-what-you-use.org/ working for Envoy. Last time I looked it needed a custom toolchain but I think this has changed.

htuch on 23 Apr 2020

/assign ahedberg

ahedberg on 15 May 2020

👍1

/assign foreseeable

ahedberg on 29 May 2020

😕1

foreseeable cannot be assigned to this issue.

:cat:

Caused by: a comment (https://github.com/envoyproxy/envoy/issues/10917#issuecomment-636021878) was created by @ahedberg.

repokitteh[bot] on 29 May 2020

/assign foreseeable

ahedberg on 29 May 2020

👍1


So far, we have split the huge test/mocks/server/mocks.h into ~25 separate mock class headers (#11797), and refactored all test libraries that referred to this header to include only the mock class headers they use (#11912, #11952).

This refactoring improves build time for most of the tests changed. Here's a spreadsheet comparing build times before and after the refactoring (some refactored test libraries have not been pushed yet; we will send PRs for them later):

https://drive.google.com/file/d/18EfskReKtPQawBdI_XiP-Aw0hgAmztNK/view?usp=sharing

Most of the affected tests now build faster, but for a few tests the build time has actually increased. I am looking into the reason for this right now.

foreseeable on 9 Jul 2020

cc @alyssawilk @jmarantz

ahedberg on 9 Jul 2020

Very cool. Are the longer builds repeatable (you could just check on one or two tests)? I notice a lot of noise in build and test times even under fairly ideal circumstances, so I wouldn't worry too much as long as the trend is positive.
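One crude way to check repeatability is to time the same build several times before and after the change and compare the difference in means against the run-to-run spread. The sketch below assumes that setup; the function name, the threshold of two standard deviations, and the timings are all hypothetical, and this is a rule of thumb rather than a proper statistical test.

```python
from statistics import mean, stdev


def slowdown_exceeds_noise(before, after, threshold=2.0):
    """Return True if `after` is slower than `before` by more than the observed noise.

    `before` and `after` are lists of build times in seconds (at least two
    samples each). Noise is taken as the larger of the two sample standard
    deviations, which is a deliberately simple choice.
    """
    noise = max(stdev(before), stdev(after))
    slowdown = mean(after) - mean(before)
    return slowdown > threshold * noise


# Made-up timings for illustration only.
print(slowdown_exceeds_noise([40.0, 41.0, 39.0], [40.5, 41.5, 39.5]))  # small shift
print(slowdown_exceeds_noise([40.0, 41.0, 39.0], [50.0, 51.0, 49.0]))  # large shift
```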

alyssawilk on 9 Jul 2020

So far, we have split the huge test/mocks/server/mocks.h into ~25 separate mock class headers (#11797), and refactored all test libraries that referred to this header to include only the mock class headers they use (#11912, #11952).

This refactoring improves build time for most of the tests changed. Here's a spreadsheet comparing build times before and after the refactoring (some refactored test libraries have not been pushed yet; we will send PRs for them later):

https://docs.google.com/spreadsheets/d/e/2PACX-1vTx4OuTC1H_KFaEOxLxeUlEU70s8N8W6VDC7P5Vj9nk3mvA0263gPE-1hwFPgYwfmusTsUHyUjdyoZe/pubhtml

Most of the affected tests now build faster, but for a few tests the build time has actually increased. I am looking into the reason for this right now.

@foreseeable the spreadsheet is not publicly accessible

lizan on 14 Jul 2020

@lizan now the link should be fixed

foreseeable on 15 Jul 2020

I used Bazel to generate a few performance profiles for some tests after simplifying some mock library inclusions:
https://drive.google.com/file/d/1O8B0mLy-VjNXLRSqquNHYaa7izp4Sk7b/view?usp=sharing

(They can be opened at chrome://tracing in Chrome.)

Previously, the performance bottleneck for those tests was building the monolithic mock headers. Now I cannot figure out what to optimize next. Could someone take a look at them? It would be appreciated.

foreseeable on 28 Jul 2020

@foreseeable Great job. I think we can call this issue done.

For overall build performance, it would be nice if you could take a look at why compiling test.cc takes that long. If you pass -ftime-trace to the compiler (via bazel --copt or --per_file_copt), you should be able to get details on that.
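Once a trace file exists, the most expensive headers can be listed with a short script. The following Python sketch assumes the JSON layout I understand clang's -ftime-trace to produce: a top-level `traceEvents` list in which header parsing shows up as events named `Source`, with a `dur` in microseconds and the file path in `args.detail`. That layout is an assumption and may differ between clang versions; the helper names and the sample data are hypothetical.

```python
import json
from collections import defaultdict


def load_trace(path):
    """Load a trace JSON file from disk."""
    with open(path) as f:
        return json.load(f)


def slowest_includes(trace, top=5):
    """Return the `top` (header, total_microseconds) pairs, slowest first.

    Durations are inclusive: time spent in a header also counts the headers
    it includes, so the totals overlap and should not be summed.
    """
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        if event.get("name") == "Source" and "dur" in event:
            header = event.get("args", {}).get("detail", "<unknown>")
            totals[header] += event["dur"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Hand-written sample in the assumed layout, for illustration only.
sample = {"traceEvents": [
    {"ph": "X", "name": "Source", "dur": 900, "args": {"detail": "test/mocks/server/mocks.h"}},
    {"ph": "X", "name": "Source", "dur": 100, "args": {"detail": "gtest/gtest.h"}},
    {"ph": "X", "name": "Frontend", "dur": 1500},
    {"ph": "X", "name": "Source", "dur": 50, "args": {"detail": "gtest/gtest.h"}},
]}
for header, micros in slowest_includes(sample):
    print(header, micros)
```

For a real file, `slowest_includes(load_trace(path))` would be the equivalent call, where `path` points at the trace JSON the compiler wrote.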

lizan on 29 Jul 2020