Pip: Speeding up Tests

Created on 19 May 2017 · 14 comments · Source: pypa/pip

The test suite is _slow_. Running it and getting results takes a long time, and because of this, CI builds occasionally time out before they complete.

There are various issues that affect the speed of the test suite. This issue is an overall tracking issue for them. I'll take a shot at each of them as time permits. The intent is to speed up the test suite without compromising the degree of confidence in it.


  • [x] Investigate CI improvements

    • [x] Containers for Travis (#4590)

    • [x] Break up tests across 2 workers (#5436)

  • [x] Look into speeding up the virtualenv fixture (#4706)
  • [ ] Require all non-pip-invoking tests to be unit tests
  • [ ] Categorize all non-script tests as unit tests
  • [ ] Increasing unit test coverage
  • [ ] Remove redundant tests -- #2640
  • [ ] Simplify or break up slow tests -- #1721
  • [ ] Speed up pip's startup time (#4768)
  • [ ] In-Memory tests where appropriate
  • [ ] Removing exhaustive file checks in scripttest (#7141)

(I'll add more as they come)

Labels: tests, maintenance


All 14 comments

Just for self-reference later, here are the integration test run times (as reported by pytest) of the last 5 runs of the tests on Travis CI (on master):

| Version | #6914 | #6913 | #6901 | #6899 | #6898 |
| :-----: | :-----: | :-----: | :-----: | :-----: | :-----: |
| 2.7 | 1491.73 | 1499.83 | 1461.50 | 1465.82 | 1465.74 |
| 3.3 | 1758.89 | 1662.34 | 1653.14 | 1648.96 | 1588.04 |
| 3.4 | 1589.49 | 1757.04 | 1696.77 | 1687.61 | 1608.33 |
| 3.5 | 1797.19 | 1795.63 | 1645.96 | 1603.81 | 1658.20 |
| 3.6 | 1669.28 | 1759.57 | 1814.60 | 1669.06 | 1695.59 |
| pypy | 2566.34 | 2579.24 | 2633.35 | 2575.63 | 2518.47 |

#4586 shows a huge speedup in the CI builds. That doesn't mean this issue isn't useful. :)

It seems that the virtualenv fixture takes up a huge part of the test time... A rough measurement shows that a 98-second test run spends 35 seconds in that fixture.

Would it be worth using venv when available as a quick fix? I don't know if venv is quicker, but it might be worth a try.

Oh, I missed that last comment. I'll look into it soon. :)


FTR - now that YAML tests have been added, a bunch of installation tests can be removed and made into YAML fixtures. That would reduce the clutter in data/packages. :)

Update on status quo:

| Version | #6914 | #7594 |
| :-----: | :-----: | :-----: |
| 2.7 | 1491.73 | 645.50 |
| 3.6 | 1669.28 | 767.25 |
| pypy | 2566.34 | 1500.26 |

Here are the start-to-finish "Ran for X min Y sec" waiting times of a full build, for the three master builds before https://github.com/pypa/pip/pull/5436 was merged, and the three after.

| Build | Time |
| :--: | -- |
| 8769 | 34 min 46 sec |
| 8771 | 35 min 25 sec |
| 8779 | 34 min 54 sec |
| _Average_ | _35 min 02 sec_ |
| 8786 | 20 min 27 sec |
| 8787 | 19 min 47 sec |
| 8793 | 19 min 49 sec |
| _Average_ | _20 min 01 sec_ |

@hugovk Not to suggest that the CI improvements aren't awesome; this issue is for tracking improving the test-suite speed -- which affects both local development and CI.

Hence the timings noted above being of "test run times", not CI run times. :)

It might be handy to look at https://github.com/kvas-it/pytest-console-scripts, either as a replacement for our use of scripttest or as a way to extend scripttest.

I prefer the former.

We talked a bit about this issue in a call last week (and a little in a call on the 8th) as part of our donor-funded work on pip. @pradyunsg, as I understand it from our call last week, going through the entire execution cycle across all our providers/pipelines for a single pull request currently takes ~1 hour. Is that right?

Each PR takes up to 30 minutes from submission until the checks are done. See for example:

  1. #7653 - 26m
  2. #7652 - 30m
  3. #7651 - 28m

which can be calculated as the max of the times for:

  1. "Linux" Azure Pipelines check (visible in check UI)
  2. "Windows" Azure Pipelines check (visible in check UI)
  3. "macOS" Azure Pipelines check (visible in check UI)
  4. Travis CI check (visible on Travis site)
  5. GitHub actions builds

In general the Windows checks on Azure Pipelines take the longest time, at up to 30m.

The next slowest after the Windows Azure Pipelines checks is Travis, at around 22m.

We can generally save time by:

  1. Increasing parallelization

    1. Splitting jobs across more workers - if we split up tests into multiple "jobs" that each can run on a CI worker then the individual jobs will complete faster

    2. Allocate workers with more cores - this would only really be an option for self-hosted runners on Azure Pipelines

  2. Don't wait to run long jobs - on Travis CI, if we kick off everything at the same time, the build would take as long as the single longest job, instead of the longest job from "Primary" plus the longest job from "Secondary" (see here for a specific example)
  3. Detailed analysis of where time is being spent during tests - from the investigation for #7263 I have experience and tools that can help with this on Windows that I can communicate to someone.

Generally with 1 and 2 we need to keep in mind the maximum number of concurrent jobs we can have executing at any time, otherwise we may run into limitations that would cause one PR to delay another one. At the moment that is not a concern because each CI provider gives us enough concurrent executors that we could have (I think) 3 PRs submitted at the same time without them delaying each other.
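The effect of point 2 can be sketched with some hypothetical job times (the numbers below are made up for illustration): in a staged build the longest job of each stage adds up, while kicking everything off at once costs only the single longest job.

```python
# Hypothetical per-job runtimes in minutes (illustrative only).
primary = [22, 18, 20]
secondary = [15, 12]

# Staged: the "Secondary" stage starts only after every "Primary"
# job has finished, so the stages' longest jobs add up.
staged_wall_clock = max(primary) + max(secondary)

# Flat: all jobs start together; the build ends with the slowest one.
flat_wall_clock = max(primary + secondary)

print(staged_wall_clock, flat_wall_clock)  # 37 22
```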

It does vary a fair bit depending on the load/queue times, but a little over 30 minutes sounds about right to me -- taking a look at the more recent CI runs. Seems like my understanding of the CI run times was a bit outdated.

Looks like we've had a significant improvement recently: removing a bottleneck service (AppVeyor) in #7564 brought the time down.

> 3. I have experience and tools that can help with this on Windows that I can communicate to someone.

@chrahunt Color me interested. :)

@pradyunsg, let's hope I saved my notes. :)
