I appreciate having a global timeout to catch unexpected behavior in a test and avoid consuming too many CI resources. However, some tests are intentionally long-running, and I would like to be able to use t.timeout() to explicitly override the global timeout for just one test. Right now, t.timeout() only takes effect when it is shorter than the global timeout. It would be nice if it also took effect when it is longer.
I think I am not the only one, as I found this question referring to the same problem.
There is no way to have a global timeout that is something like 10s and a timeout for a particular test that is longer, like 30s.
I'm not sure how the timeouts are coded, it might be a one-liner :)
The global timeout is a "stop tests in case they're stuck" kind of timeout.
I suppose we could interpret that as "stop tests in case they're stuck, but don't worry for the next 30 seconds since the test will fail if it's stuck anyway".
The workers will have to send a message to the main process whenever a per-test timeout starts, with the timeout duration. And then in the main process we'd have to make sure we don't stop tests because they're stuck until that duration has passed.
This should work with multiple t.timeout() calls, of course.
I don't think it's quite a one-liner, but this approach is consistent with what the global timeout is there for, so let's do it.
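The main-process bookkeeping described above could be sketched roughly like this (the class and method names are hypothetical, not AVA's actual worker protocol):

```javascript
// Sketch: main-process side of the proposal. Workers report when a
// per-test timeout starts; the global "tests are stuck" timeout is
// suppressed until the longest pending per-test timeout has elapsed.
class TimeoutCoordinator {
  constructor(globalTimeoutMs) {
    this.globalTimeoutMs = globalTimeoutMs;
    this.suppressUntil = 0;          // epoch ms until which the global timeout must not fire
    this.lastActivity = Date.now();  // updated whenever the suite makes progress
  }

  // Called when a worker reports that t.timeout(ms) started in some test.
  onTestTimeoutStarted(durationMs, now = Date.now()) {
    // Multiple t.timeout() calls are handled by keeping the latest deadline.
    this.suppressUntil = Math.max(this.suppressUntil, now + durationMs);
  }

  onActivity(now = Date.now()) {
    this.lastActivity = now;
  }

  // Should the global timeout fire right now?
  shouldFireGlobalTimeout(now = Date.now()) {
    if (now < this.suppressUntil) return false; // a longer per-test timeout is still pending
    return now - this.lastActivity >= this.globalTimeoutMs;
  }
}
```

With a 10 s global timeout, a reported t.timeout(30000) keeps the coordinator from declaring the suite stuck until those 30 seconds have passed.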
How is the described behavior different from removing the global timeout and setting a default timeout on every test? Basically, calling t.timeout(10000) at the start of each test. The net effect is the same IMHO.
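For illustration, "a default timeout on every test" could be approximated with a small wrapper (a runner-agnostic sketch; withDefaultTimeout is a hypothetical helper, not an AVA API):

```javascript
// Hypothetical helper: give every test a default t.timeout(),
// while still letting an individual test override it.
const withDefaultTimeout = (ms, impl) => async t => {
  t.timeout(ms); // a test body may call t.timeout() again to override
  return impl(t);
};

// Usage with AVA would look like:
//   test('fast test', withDefaultTimeout(10000, async t => { /* ... */ }));
```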
When tests run concurrently, within and across worker processes, their duration may vary wildly. The global timeout (misnamed as it may be) is only concerned with making sure something is happening in your test suite.
Individual timeouts can be useful in certain cases, but far from all.
Disclaimer: I have now found my peace with the nested timeout and can live with the timeout change.
However, it seems my initial concerns are being experienced/reported by others.
Hopefully, the example below provides some insight. The output is meant to be slightly hyperbolic, but it conveys the confusion I felt with the timeout change at first blush.
$ npx ava dist/sdk/test/*.js
⠴ Server listening at http://127.0.0.1:54369
✖ Timed out while running tests
3 tests were pending in dist/sdk/test/01_core-wait-for-registry.js
◌ 01_core-wait-for-registry › inProgress
◌ 01_core-wait-for-registry › isErrored
◌ 01_core-wait-for-registry › isAlive
18 tests were pending in dist/sdk/test/01_core.js
◌ 01_core › waitFor server-state tests
◌ 01_core › getAllProviderMetadata !refresh
◌ 01_core › getAllProviderMetadata
◌ 01_core › getProviderMetadata aws
◌ 01_core › getRelativePath 0.0.2-rc1
◌ 01_core › getRelativePath abra
◌ 01_core › getRelativePath git-url#0.0.2-rc1
◌ 01_core › getRelativePath git-url#hash
◌ 01_core › getRelativePath file#hash
◌ 01_core › getRelativePath non-existant version
◌ 01_core › getRelativePath non-existant hash
◌ 01_core › getProviderMetadata aws w/ refresh
◌ 01_core › getProviderMetadata bad provider
◌ 01_core › getProviderMetadata [email protected]
◌ 01_core › getProviderMetadata [email protected] (non-existant version)
◌ 01_core › publish invalid type
◌ 01_core › shutdown server - work from cache after
◌ 01_core › getProviderMetadata github from cache
This output is completely useless in diagnosing any problem that might exist.
I know there is a timeout, I just have no idea where to look for the cause.
The only way to debug this is to re-run with progressively larger global timeouts:
npx ava dist/sdk/test/*.js -T 30000
npx ava dist/sdk/test/*.js -T 120000
3 tests were pending in dist/sdk/test/01_core-wait-for-registry.js
◌ 01_core-wait-for-registry › inProgress
◌ 01_core-wait-for-registry › isErrored
◌ 01_core-wait-for-registry › isAlive
18 tests were pending in dist/sdk/test/01_core.js
✖ 01_core › waitFor server-state tests - Timed out after 10 seconds
◌ 01_core › getAllProviderMetadata !refresh
◌ 01_core › getAllProviderMetadata
◌ 01_core › getProviderMetadata aws
◌ 01_core › getRelativePath 0.0.2-rc1
◌ 01_core › getRelativePath abra
◌ 01_core › getRelativePath git-url#0.0.2-rc1
◌ 01_core › getRelativePath git-url#hash
◌ 01_core › getRelativePath file#hash
◌ 01_core › getRelativePath non-existant version
◌ 01_core › getRelativePath non-existant hash
◌ 01_core › getProviderMetadata aws w/ refresh
◌ 01_core › getProviderMetadata bad provider
◌ 01_core › getProviderMetadata [email protected]
◌ 01_core › getProviderMetadata [email protected] (non-existant version)
◌ 01_core › publish invalid type
◌ 01_core › shutdown server - work from cache after
◌ 01_core › getProviderMetadata github from cache
Getting output like `✖ 01_core › waitFor server-state tests - Timed out after 10 seconds` by default, without having to hunt for the right -T value, would result in a substantial quality-of-life improvement.
I appreciate that schedulers managing workers is a serious and complex problem,
but this nested-timer business makes things harder to grok from the test writer's point of view.
Hopefully this helps. I'll go back to holding my peace....
@sramam are those tests using serial? Are they async? It's quite possible that the list here does not actually take into account which tests started, just those which haven't completed yet.
Yes - you caught that one quickly!
They are serial and async.
Each test is async, because that allows multiple suites to run in parallel.
Each test suite is serial, since its tests manipulate state on disk in a coordinated fashion. This allows me to use a single test.before() for setup, reducing overall run time by a very large factor.
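The setup pattern described above can be sketched runner-agnostically (runSuite and the setup/test names are hypothetical, not AVA APIs):

```javascript
// Each suite prepares shared state once, then runs its tests serially
// against it; different suites can still run concurrently, much like
// AVA test files in separate worker processes.
async function runSuite(setUp, tests) {
  const context = await setUp(); // plays the role of a single test.before()
  for (const test of tests) {
    await test(context);         // serial within the suite
  }
  return context;
}

// Suites themselves run in parallel:
//   await Promise.all([runSuite(setUpCore, coreTests), runSuite(setUpRegistry, registryTests)]);
```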
I've opened https://github.com/avajs/ava/issues/2421 to track. I'm going to mark our conversation here as "off-topic", so it doesn't distract from what this issue is hoping to address.
I just ran into this, and it took longer than I'd like to admit to realize this was the issue. We have two tests that are much slower (~2m), while the other tests stay below the default 10s. I wanted to mark only the "slow" tests to use a higher timeout, instead of needing to set t.timeout() for every test.
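Conceptually, a per-test timeout is a race between the test body and a timer. A minimal standalone sketch (this is not AVA's implementation; AVA also resets its timers when the test makes assertions):

```javascript
// Reject if the test body does not settle within `ms` milliseconds.
function runWithTimeout(ms, testBody) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([Promise.resolve(testBody()), timeout])
    .finally(() => clearTimeout(timer)); // don't keep the process alive
}
```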
Would be awesome to have this issue solved! 😄
@csvn would you be interested in taking this on?