Pyro: Slow test clean up time

Created on 2 Mar 2018  路  13Comments  路  Source: pyro-ppl/pyro

Here is our winning list of tests taking longer than 2 minutes. Let us figure out a plan for these - remove them (I can run them as a cron job on my workstation weekly), or modify them to run within 2 minutes ideally.

I am taking the liberty to remove the first one, which was being skipped but was added back by me in #806. Addressing these slowest tests will give a respite to all Pyro developers who have been facing build timeouts due to slow tests.

More than 5 mins - should definitely be removed, or changed

  • [ ] 519.81s tests/infer/test_inference.py::PoissonGammaTests::test_elbo_nonreparameterized
  • [ ] 406.33s tests/infer/test_inference.py::NormalNormalTests::test_elbo_nonreparameterized
  • [x] 341.14s tests/infer/test_sampling.py::ImportanceTest::test_importance_prior
  • [ ] 328.93s tests/infer/test_sampling.py::ImportanceTest::test_importance_guide
  • [ ] 302.56s tests/infer/test_inference.py::LogNormalNormalTests::test_elbo_nonreparameterized

More than 3 mins

  • [ ] 274.57s tests/integration_tests/test_conjugate_gaussian_models.py::GaussianPyramidTests::test_elbo_nonreparameterized_three_layers

    • [ ] 268.61s tests/infer/test_inference.py::LogNormalNormalTests::test_elbo_with_transformed_distribution

    • [ ] 264.82s tests/infer/test_inference.py::LogNormalNormalTests::test_elbo_reparameterized

    • [ ] 264.01s tests/infer/test_inference.py::ExponentialGammaTests::test_elbo_nonreparameterized

    • [ ] 229.62s tests/infer/test_inference.py::BernoulliBetaTests::test_elbo_reparameterized

    • [ ] 223.50s tests/infer/test_inference.py::PoissonGammaTests::test_elbo_reparameterized

    • [ ] 206.49s tests/infer/test_inference.py::BernoulliBetaTests::test_elbo_nonreparameterized

    • [ ] 181.70s tests/infer/test_inference.py::ExponentialGammaTests::test_elbo_rsvi

More than 2 mins

  • [ ] 153.72s tests/integration_tests/test_conjugate_gaussian_models.py::GaussianPyramidTests::test_elbo_reparameterized_three_layers
  • [ ] 136.95s tests/infer/test_inference.py::ExponentialGammaTests::test_elbo_reparameterized
  • [ ] 136.70s tests/infer/test_inference.py::NormalNormalTests::test_elbo_reparameterized

Note that some of this slowdown might be due to https://github.com/uber/pyro/issues/839.

testing

All 13 comments

I propose moving most of these tests to Travis cron that I鈥檒l set up to run daily. However, the top five should be fixed anyhow as they are known to time out individually. Also cron jobs will have time limits too.

Before we remove these I'd like to see if our fixes to enumeration will help.

Note that many of these can be sped up by converting data from a list to a tensor. I've done that in many such tests in the past.

Also, I'd prefer doing profiling work to speed up Pyro on these tests, rather than removing or moving or reclassifying the tests. The real problem is that Pyro is slow 馃槈

Also, I'd prefer doing profiling work to speed up Pyro on these tests, rather than removing or moving or reclassifying the tests. The real problem is that Pyro is slow

I鈥檓 all for profiling and making Pyro faster but some of these tests have historically been slow (like the sampling tests). I am not sure if there is an easy fix for all of them (if there is we should certainly fix them), but I don鈥檛 think tests taking longer than 3 minutes should be running on every commit. That costs more dev time cumulatively than any benefits from the protections these tests offer on a per commit basis, and can be safely delegated to Travis cron.

842 is an easy fix

tests/infer/test_sampling.py::ImportanceTest::test_importance_prior

Can be removed, and

tests/infer/test_sampling.py::ImportanceTest::test_importance_guide

Can probably be replaced with some unit tests for the components, or sped up by #825

The real problem is that Pyro is slow

To add to @neerajprad's point, the test_elbo_nonreparameterized tests are inherently statistically slow. We could speed them up with #791 if we're not worried about introducing too much complexity into a single test, but even then it would still take many steps for SVI to converge. I think moving many of those to a cron job is a fine solution.

843 seems to have fixed a lot of the worst contenders, so for the time being, I'm happy to just skip the test_importance_prior test in #841 which does seem to timeout occasionally. I have started a daily cron too, where I will be adding some perf regression tests so that we can detect any major slowdown early. I do not want to add it to regular CI because the detection process will be noisy, and will require some tune-up. If there are other expensive tests that we are comfortable delegating to cron, we should consider adding them too.

test_elbo_nonreparameterized tests are inherently statistically slow.

I was able to speed up tests/infer/test_enum.py in #828 from >10 minutes to <7 seconds by hand-parallelizing num_particles via iarange. @martinjankowiak do you think can we apply the same technique to the test_elbo_nonreparameterized SVI tests? The idea is that each step of SVI would parallelize over say 100 or 1000 particles to reduce variance at negligible wallclock cost, and thereby reduce the number of inference.step()s by a factor of 10?

i'm definitely for moving some tests to a cron job. the conjugate pyramid models are a good target for example. but i think we shouldn't be too aggressive in doing so quite yet. on the one hand because we need to make sure that the cron job is stable/mature/usable/legible/etc. and also because, as @fritzo points out, we still have room for speed-ups. the reality is that we should do a reasonably systematic test overhaul before we do the 0.2 release. (as for the svi integration tests, yeah, we can probably do something like that with an increased number of particles, but there are diminishing returns and i don't know if we'll gain more than a factor of ~5-10)

we should do a reasonably systematic test overhaul before we do the 0.2 release

I think it would be fine to target this for 0.2.1.

well either way but in any case we should start making progress towards it

Closing this. I will create specific task items as needed - #844, #845 are some candidate items.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

neerajprad picture neerajprad  路  4Comments

fritzo picture fritzo  路  5Comments

tristandeleu picture tristandeleu  路  3Comments

fritzo picture fritzo  路  4Comments

neerajprad picture neerajprad  路  4Comments