Pants: Improve python tool startup overhead.

Created on 6 Feb 2021  ·  3 Comments  ·  Source: pantsbuild/pants

Tools are created once and used many times. When created as a traditional PEX file, they pay the traditional PEX startup cost on every run. We should switch to using pex --venv --seed to achieve ~0 overhead, native virtual environment startup times. Since the venv created with --venv --seed is housed in the PEX_ROOT, the resulting venv will also persist .pyc compilation.
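To make the "create once, use many times" idea concrete, here is a rough analogy using only the standard library's venv module. It is not how Pex implements --venv --seed; the function name seeded_venv, the cache layout, and the hashing scheme are all assumptions made up for illustration, and installing the requirements into the venv is left out.

```python
import hashlib
import tempfile
import venv
from pathlib import Path


def seeded_venv(cache_root: Path, requirements: tuple) -> tuple:
    """Return (venv_dir, created) for a requirement set.

    Hypothetical helper: the venv is built only the first time a given
    requirement set is seen; later calls just return the cached directory.
    """
    # Key the cache on the sorted requirement strings (an assumed scheme).
    key = hashlib.sha256("\n".join(sorted(requirements)).encode()).hexdigest()
    venv_dir = cache_root / "venvs" / key
    if (venv_dir / "pyvenv.cfg").exists():
        # Already seeded: nothing to pay at startup beyond the lookup.
        return venv_dir, False
    # First use: pay the creation cost once (requirement install omitted).
    venv.create(venv_dir, with_pip=False)
    return venv_dir, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for attempt in range(2):
            path, created = seeded_venv(Path(tmp), ("pytest",))
            print(attempt, created, path.name[:12])
```

Because the cached directory survives between runs, any .pyc files written inside it survive as well, which is the persistence property described above.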

This should help with Pytest in #11169 as well as all other Python tools Pants uses to implement goals.

performance python

All 3 comments

OK, 1st test after adding this feature against a very small test.

Before:

$ time ./pants test --force --output=all src/python/pants/util/strutil_test.py
17:50:09.86 [INFO] Completed: test - src/python/pants/util/strutil_test.py:tests succeeded.
============================= test session starts ==============================
collected 7 items

src/python/pants/util/strutil_test.py .......                            [100%]

============================== 7 passed in 0.09s ===============================


✓ src/python/pants/util/strutil_test.py:tests succeeded.

real    0m1.984s
user    0m0.471s
sys     0m0.055s

After:

$ time ./pants test --force --output=all src/python/pants/util/strutil_test.py
17:50:39.69 [INFO] Completed: test - src/python/pants/util/strutil_test.py:tests succeeded.
============================= test session starts ==============================
collected 7 items

src/python/pants/util/strutil_test.py .......                            [100%]

============================== 7 passed in 0.08s ===============================


✓ src/python/pants/util/strutil_test.py:tests succeeded.

real    0m0.921s
user    0m0.488s
sys     0m0.043s

So pytest shows 80ms and Pants now takes ~925ms.

Given that just getting the version back from Pants takes ~675ms:

$ time ./pants --version
2.3.0.dev3

real    0m0.677s
user    0m0.504s
sys     0m0.036s

This means 675 + 80 = 755ms of unavoidable time vs 925ms wall time. So we still have ~175ms of overhead, but that's down from ~1225ms of overhead (1984ms - 755ms) before the change.
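The arithmetic above can be re-derived from the raw `real` values with a small script. This is only a sketch: the helper names real_to_ms and overhead_ms are made up here, and it treats `./pants --version` plus the pytest-reported duration as the floor, which is the same assumption the estimate above makes.

```python
import re

# Matches bash `time` values such as "0m1.984s".
_REAL = re.compile(r"(\d+)m(\d+(?:\.\d+)?)s")


def real_to_ms(value: str) -> float:
    """Convert a bash `time` value like '0m1.984s' to milliseconds."""
    match = _REAL.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a bash `time` value: {value!r}")
    minutes, seconds = match.groups()
    return (int(minutes) * 60 + float(seconds)) * 1000.0


def overhead_ms(wall: str, pants_version: str, pytest_seconds: float) -> float:
    """Wall time minus the floor of `./pants --version` plus pytest's own time."""
    floor = real_to_ms(pants_version) + pytest_seconds * 1000.0
    return real_to_ms(wall) - floor


if __name__ == "__main__":
    before = overhead_ms("0m1.984s", "0m0.677s", 0.09)
    after = overhead_ms("0m0.921s", "0m0.677s", 0.08)
    print(f"before: ~{before:.0f}ms, after: ~{after:.0f}ms")
```

Using the unrounded timings, both results come out slightly below the rounded ~1225ms and ~175ms quoted above.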

And to really get at where the time goes, just running the sandbox directly nets:

$ time ./__run.sh 
==================================================================================================================================== test session starts =====================================================================================================================================
collected 7 items                                                                                                                                                                                                                                                                            

src/python/pants/util/strutil_test.py .......                                                                                                                                                                                                                                          [100%]

===================================================================================================================================== 7 passed in 0.07s ======================================================================================================================================

real    0m0.285s
user    0m0.269s
sys     0m0.015s

So, for all practical purposes, 100% of the remaining overhead over the native tool is now in the Pants client connecting to pantsd and asking the engine for the result, i.e. the overhead of ./pants -V.
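Single `time` runs are noisy, so a comparison like this is easier to trust when each command is repeated a few times. Below is a minimal harness sketch; time_command is a hypothetical helper, and the commands you would actually pass (for example ["./pants", "--version"] or ["./__run.sh"]) are assumed to be run from the right directory.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv: list, runs: int = 5) -> float:
    """Return the median wall time in milliseconds of running argv `runs` times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so terminal rendering does not skew the timing.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere: a bare interpreter start.
    print(f"{time_command([sys.executable, '-c', 'pass']):.1f}ms")
```

The median is used rather than the mean so that one slow outlier (say, a cold pantsd) does not dominate the result.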

And, for trivia, 70ms of the 285ms __run.sh run above is due to the fact that src/python/pants/__init__.py declares a namespace package using pkg_resources, which does a full sys.path "scan". That overhead, though, is of course the user's choice and not due to any Pants issue.
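For context, a pkg_resources-style namespace package typically has an __init__.py containing `__import__("pkg_resources").declare_namespace(__name__)`, so importing the package pulls in pkg_resources. The alternative that avoids this import is a PEP 420 implicit namespace package, which has no __init__.py at all. The sketch below shows one working across two separate sys.path entries; the names demo_ns, dist_a, and dist_b are invented for illustration, and nothing here says whether that layout suits any particular repo.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_portion(root: Path, namespace: str, module: str, value: str) -> None:
    """Write one portion of a namespace package, with no __init__.py."""
    pkg_dir = root / namespace
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text(f"VALUE = {value!r}\n")


def demo() -> list:
    """Import two modules of one namespace package from two sys.path roots."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = [Path(tmp) / "dist_a", Path(tmp) / "dist_b"]
        make_portion(roots[0], "demo_ns", "alpha", "from dist_a")
        make_portion(roots[1], "demo_ns", "beta", "from dist_b")
        sys.path[:0] = [str(root) for root in roots]
        try:
            importlib.invalidate_caches()
            alpha = importlib.import_module("demo_ns.alpha")
            beta = importlib.import_module("demo_ns.beta")
            return [alpha.VALUE, beta.VALUE]
        finally:
            # Undo the sys.path and sys.modules changes made for the demo.
            for root in roots:
                sys.path.remove(str(root))
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_ns"]:
                del sys.modules[name]


if __name__ == "__main__":
    print(demo())
```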
