Run make check or even make -j.. check. Observe how long it takes to finish the tests.
I don't think it's broken, but the way it is now, it's really annoying.
I have a 12-core system, but the Python tests seem to be mostly single-threaded. If we could run different tests in parallel on different cores, that could really speed up the test suite. Of course, care must be taken that tests don't interfere with each other, e.g. on the file system or in binding to port numbers.
Are there any ideas on how to do this?
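One common way to keep parallel tests from colliding, sketched here under the assumption that a runner such as pytest-xdist distributes the tests: give every worker its own scratch directory and let the OS hand out free ports. pytest-xdist exports PYTEST_XDIST_WORKER in each worker process. The helper names below are hypothetical and are not taken from the project's test suite.

```python
import os
import socket
import tempfile


def worker_id():
    # pytest-xdist sets PYTEST_XDIST_WORKER ("gw0", "gw1", ...) in each worker.
    # "main" is a fallback chosen for this sketch when xdist is not in use.
    return os.environ.get("PYTEST_XDIST_WORKER", "main")


def reserve_port():
    # Binding to port 0 asks the OS for any currently free port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def make_test_dir(base=None):
    # mkdtemp guarantees a unique directory, so two workers never share one.
    base = base or tempfile.gettempdir()
    return tempfile.mkdtemp(prefix="ltest-{}-".format(worker_id()), dir=base)
```

Note that the port is released again before the daemon under test binds it, so another process could in principle grab it in between; this only narrows the window for collisions.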
We just updated the testing section of docs/HACKING.md to address this (#1725). Basically once you install:
pip3 install pytest-xdist
you can then run something like:
make -j12 check PYTEST_PAR=24 DEVELOPER=1 VALGRIND=0
Adjust PYTEST_PAR for your hardware. My testing shows it's mostly memory-dependent.
With 2 cores I regularly use PYTEST_PAR=10 since we spend a lot of time just waiting for timers to time out, or blocks to be propagated. Disabling valgrind is another important one, and should be safe since we run it on travis on every PR.
Some tests fail when running in parallel but pass when run individually, for example using:
make -j6 check PYTEST_PAR=12 DEVELOPER=1 VALGRIND=0 results in
[gw4] [ 28%] FAILED tests/test_lightningd.py::LightningDTests::test_closing_while_disconnected
===================================================================================================== FAILURES ======================================================================================================
__________________________________________________________________________________ LightningDTests.test_closing_while_disconnected __________________________________________________________________________________
[gw4] linux -- Python 3.5.3 /usr/bin/python3
But make -j12 check PYTEST_PAR=1 DEVELOPER=1 VALGRIND=0 passes all tests
===================================================================================== 111 passed, 1 skipped in 2350.97 seconds ======================================================================================
Running the test separately also passes:
DEVELOPER=1 VALGRIND=1 PYTHONPATH=contrib/pylightning python3 tests/test_lightningd.py -f LightningDTests.test_closing_while_disconnected
test_closing_while_disconnected (__main__.LightningDTests) ... /home/simon/.local/lib/python3.5/site-packages/ephemeral_port_reserve.py:47: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 33233), raddr=('127.0.0.1', 49782)>
s.accept()
/home/simon/.local/lib/python3.5/site-packages/ephemeral_port_reserve.py:47: ResourceWarning: unclosed <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 43307), raddr=('127.0.0.1', 59276)>
s.accept()
/home/simon/.local/lib/python3.5/site-packages/ephemeral_port_reserve.py:47: ResourceWarning: unclosed <socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 42253), raddr=('127.0.0.1', 47728)>
s.accept()
ok
----------------------------------------------------------------------
Ran 1 test in 82.740s
OK
Another thing I noticed is that _after_ a (presumably failed) test, ps -ax | grep bitcoind still shows a long list of bitcoind processes.
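Leftover daemons like these usually mean a test died before its teardown ran. A minimal sketch of one way to guard against that, assuming the daemon is started directly from Python: wrap the process in a context manager that always terminates it. managed_daemon is a hypothetical helper, not something from the test suite, and the child process below is only a stand-in for a real daemon.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_daemon(argv, grace_seconds=10.0):
    # Start the process and make sure it is gone when the block exits,
    # even if the test body raises.
    proc = subprocess.Popen(argv)
    try:
        yield proc
    finally:
        if proc.poll() is None:
            proc.terminate()  # polite shutdown request first
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()  # force it if the grace period runs out
                proc.wait()


# Stand-in for a real daemon: a child Python process that would sleep for a minute.
stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]
with managed_daemon(stand_in) as proc:
    print("child running:", proc.poll() is None)
print("child exited:", proc.poll() is not None)
```

This only covers processes the test started itself; anything those processes spawn on their own would need separate handling.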
After calling pkill -f bitcoind and running the tests with a different value, PYTEST_PAR=6,
make -j6 check PYTEST_PAR=6 DEVELOPER=1 VALGRIND=0 now fails on a _different_ test
[gw2] [ 36%] FAILED tests/test_lightningd.py::LightningDTests::test_fundee_forget_funding_tx_unconfirmed
===================================================================================================== FAILURES ======================================================================================================
_____________________________________________________________________________ LightningDTests.test_fundee_forget_funding_tx_unconfirmed _____________________________________________________________________________
Using make -j6 check PYTEST_PAR=3 DEVELOPER=1 VALGRIND=0 passes all tests, and ps -ax | grep bitcoind is empty afterward. My system has a 4-core i5-7200U CPU @ 2.50GHz and 4GB RAM.
So I guess some tests fail when you set PYTEST_PAR larger than the number of cores?
I mentioned using PYTEST_PAR of 24. But this is on a 6 core i7 desktop with 64GB RAM. I still get the occasional intermittent failure, so I just rerun.
I think it has more to do with RAM size than core count. Swapping RAM leads to timeout errors.
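If memory really is the limit, a rough starting value for PYTEST_PAR can be derived from RAM rather than core count. The sketch below is only that: suggest_pytest_par is a hypothetical helper, and the per-worker memory figure is a placeholder you would have to measure on your own machine, not a number from this thread.

```python
def suggest_pytest_par(total_ram_gb, ram_per_worker_gb, reserve_gb=1.0, cap=None):
    # Keep some RAM back for the OS and the build itself (reserve_gb is an assumption).
    usable = max(total_ram_gb - reserve_gb, 0.0)
    # How many workers fit into the remaining memory without swapping.
    by_memory = int(usable // ram_per_worker_gb)
    workers = max(by_memory, 1)  # always run at least one worker
    if cap is not None:
        workers = min(workers, cap)
    return workers


# Placeholder figure: assumes each worker peaks at about 1 GB.
print(suggest_pytest_par(4, 1.0))  # prints 3
```

Treat the result as a first guess and lower it if tests start timing out.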
The docs can be further improved to address these issues.
Setting PYTEST_PAR to a higher value definitely speeds things up, but my experience so far:
PYTEST_PAR > 3 on my 4-core machine fails some tests
[...] bitcoind, valgrind,...)
my system: Debian stretch 9.4
EDIT: using PYTEST_PAR=5 PASSED and reboot is not always needed. So it remains a bit of a mystery to me. FWIW, 3 out of 4 tests using different PYTEST_PAR values failed on this test:
test_fundee_forget_funding_tx_unconfirmed so maybe that is the culprit.
Since we have PYTEST_PAR, this can be closed, right?
sure
I'll also work on speeding these up a bit more :-)
Another trick is to run these on a ramdisc by setting TEST_DIR like this:
TEST_DIR=/dev/shm/ltest PYTEST_PAR=10 make pytest
On my machine this results in the following timings:
PYTEST_PAR | TEST_DIR | Time
--|--|--
10 | /dev/shm/ltest | 251.73 seconds
1 | /dev/shm/ltest | 1454.22 seconds
1 | /tmp/ltest | 1479.40 seconds
10 | /tmp/ltest | 247.99 seconds
On the other hand the ramdisc doesn't seem to do all that much...
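To put numbers on that, here is a small calculation over the timings in the table above. It only uses the four measured values; the function names are made up for this example.

```python
# Wall-clock times in seconds, keyed by (PYTEST_PAR, TEST_DIR), copied from the table.
timings = {
    (10, "/dev/shm/ltest"): 251.73,
    (1, "/dev/shm/ltest"): 1454.22,
    (1, "/tmp/ltest"): 1479.40,
    (10, "/tmp/ltest"): 247.99,
}


def speedup(test_dir):
    # How many times faster PYTEST_PAR=10 is than PYTEST_PAR=1 for one directory.
    return timings[(1, test_dir)] / timings[(10, test_dir)]


def ramdisk_gain(par):
    # Fraction of time saved by /dev/shm relative to /tmp; negative means slower.
    return 1 - timings[(par, "/dev/shm/ltest")] / timings[(par, "/tmp/ltest")]


for d in ("/dev/shm/ltest", "/tmp/ltest"):
    print("{}: {:.2f}x faster with PYTEST_PAR=10".format(d, speedup(d)))
for par in (1, 10):
    print("PYTEST_PAR={}: ramdisk saves {:+.1%}".format(par, ramdisk_gain(par)))
```

Parallelism gives a speedup of a bit under 6x in both directories, while the ramdisk changes the runtime by less than 2% in either direction, which is plausibly within run-to-run noise.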