Example:
https://github.com/denoland/deno/pull/4173
http_bench.ts errors out but benchmark CI step does not fail.
https://github.com/denoland/deno/issues/4149
deno_tcp.ts segfaults and yet the benchmark CI step does not fail.
If the CI step failed in such cases, it would help us catch bugs like those described above faster.
This requires changing the run function in tools/http_benchmark.py so that it checks the exit code of the spawned server; even better, it could make one more request after the benchmark is done to ensure the server is still alive.
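To make the suggestion concrete, here is a minimal sketch of that kind of check. This is not the real run function from tools/http_benchmark.py; the function name, the address, and the command arguments are assumptions for illustration only.

```python
# Hypothetical sketch: run a benchmark against a spawned server, then
# fail if either the benchmark command or the server itself died.
import subprocess
import urllib.request


def run_with_checks(server_cmd, benchmark_cmd, addr="http://127.0.0.1:4544/"):
    server = subprocess.Popen(server_cmd)
    try:
        # Raises CalledProcessError on a non-zero benchmark exit code.
        subprocess.check_call(benchmark_cmd)
        # Fail fast if the server exited mid-benchmark (e.g. a segfault).
        if server.poll() is not None:
            raise RuntimeError(
                "server exited early with code %s" % server.returncode)
        # One extra request to ensure the server is still responsive.
        urllib.request.urlopen(addr, timeout=5).read()
    finally:
        server.terminate()
        server.wait()
```

With this shape, a segfaulting deno_tcp.ts would surface as a RuntimeError instead of passing silently.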
It looks like workers_round_robin_bench.ts suffers from the same problem:
```
Compile file:///home/runner/work/deno/deno/cli/tests/workers_round_robin_bench.ts
error TS2532: Object is possibly 'undefined'.
► file:///home/runner/work/deno/deno/cli/tests/workers_round_robin_bench.ts:34:3

34 promise.resolve(data);
   ~~~~~~~
```
https://github.com/denoland/deno/pull/4180/checks?check_run_id=476742985
deno_tcp.ts segfaults and yet the benchmark CI step does not fail.
correction: no longer segfaults
I'm marking this as a "good first issue" because it can be solved without building Deno from source.
One can substitute these lines to use a downloaded deno binary:
https://github.com/denoland/deno/blob/3ed6ccc905394ed9c5d9cbcb8fa2426151780788/tools/benchmark.py#L238-L240
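For anyone picking this up, the substitution could look roughly like the sketch below. The helper name, the candidate paths, and the fallback order are all assumptions; check the linked lines in tools/benchmark.py for the actual code.

```python
# Hypothetical sketch: resolve a deno binary to benchmark against,
# preferring a locally built one but falling back to a binary
# installed by the official installer script.
import os


def find_deno_exe(candidates=None):
    """Return the first existing deno binary from a list of candidate paths."""
    if candidates is None:
        candidates = [
            "target/release/deno",                   # locally built
            os.path.expanduser("~/.deno/bin/deno"),  # downloaded install
        ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    raise FileNotFoundError("no deno binary found; build or download one")
```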
Hi @bartlomieju, I would like to try to solve this issue.
@avillega sure, go ahead!
I've been playing with the test. Any tips on how to reproduce this issue locally?
I have created a draft PR; please let me know what you think of it.
I was trying to make the test fail when the command returns a non-zero exit code.
We could remove the --ignore-failure flag when running hyperfine, but that would
cause problems with the tests that check for errors, like error_001.ts.
We could wrap those errors in a try/catch block, but I am not sure about the implications for the meaning of the test. Please let me know what you think about this approach.
The described problem still happens;
e.g. the worker benchmarks broke here, but CI was green: https://github.com/denoland/deno/runs/663033560
Since some of the benchmarks are testing how long it takes to produce a stack trace, we can't always assume that benchmark scripts will exit with a zero code. Two options:
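One way to reconcile a failing CI step with benchmarks that intentionally error is to compare each script's exit code against a per-benchmark expected value instead of requiring zero everywhere. This is only a sketch, not how #5422 actually fixed it, and the table contents are illustrative assumptions.

```python
# Hypothetical sketch: per-benchmark expected exit codes, defaulting to 0.
import subprocess

EXPECTED_EXIT_CODES = {
    "deno_tcp.ts": 0,
    "error_001.ts": 1,  # intentionally errors to benchmark stack traces
}


def check_benchmark(script, cmd):
    expected = EXPECTED_EXIT_CODES.get(script, 0)
    code = subprocess.call(cmd)
    if code != expected:
        raise RuntimeError(
            "%s exited with %d, expected %d" % (script, code, expected))
```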
Fixed in #5422