At transformers there are multiple CIs, multiple test suites, and multiple developers who all want to see different reports. Dumping all reports together makes it very difficult to find what one wants. So I put together a hacked version of the report generator that generates multiple reports, which can then be uploaded as artifacts, both in CircleCI and GitHub Actions.
CIs are special in that most of the time one can't just log into the machine and re-run failing tests, and some of those test suites run for many hours, so getting as much information as possible while still being able to locate what one needs quickly is essential. Currently this is not the case - it's very difficult to find the relevant error messages on failure, as they are hidden in a multitude of warnings, logs and other reports.
Request:
The following code generates multiple reports, putting them all in one directory with a custom prefix (as some test suites are run multiple times with different setups on the same CI job and we want those reports to be kept separate). There are probably other reports that could be added; currently I have:
- --durations
- --tb=long
- --tb=short
- --tb=line
- -rA

each ending up in its own file, e.g. reports/some_id_durations.txt.
All these get generated without needing to add any arguments to pytest, while letting it function normally with any other reports one chooses to dump to the terminal.
# transformers/testing_utils.py
import re
from pathlib import Path


def pytest_terminal_summary_main(tr, id):
    """
    Generate multiple reports at the end of the test suite run - each report goes into a dedicated file in the
    current directory. The report files are prefixed with the test suite name.

    This function emulates the --durations and -rA pytest arguments.

    This function is to be called from `conftest.py` via a `pytest_terminal_summary` wrapper that has to be
    defined there.

    Args:
    - tr: `terminalreporter` passed from `conftest.py`
    - id: unique id like `tests` or `examples` that will be incorporated into the final report
      filenames - this is needed as some jobs have multiple runs of pytest, so we can't have them overwrite each other.

    NB: this function taps into a private _pytest API and, while unlikely, it could break should
    pytest make internal changes - it also calls default internal methods of terminalreporter which
    can be hijacked by various `pytest-` plugins and interfere.
    """
    from _pytest.config import create_terminal_writer

    if not len(id):
        id = "tests"

    config = tr.config
    orig_writer = config.get_terminal_writer()
    orig_tbstyle = config.option.tbstyle
    orig_reportchars = tr.reportchars

    dir = "reports"
    Path(dir).mkdir(parents=True, exist_ok=True)
    report_files = {
        k: f"{dir}/report_{id}_{k}.txt"
        for k in [
            "durations",
            "errors",
            "failures_long",
            "failures_short",
            "failures_line",
            "passes",
            "stats",
            "summary_short",
            "warnings",
        ]
    }

    # custom durations report
    # note: there is no need to call pytest --durations=XX to get this separate report
    # adapted from https://github.com/pytest-dev/pytest/blob/897f151e/src/_pytest/runner.py#L66
    dlist = []
    for replist in tr.stats.values():
        for rep in replist:
            if hasattr(rep, "duration"):
                dlist.append(rep)
    if dlist:
        dlist.sort(key=lambda x: x.duration, reverse=True)
        with open(report_files["durations"], "w") as f:
            durations_min = 0.05  # sec
            f.write("slowest durations\n")
            for i, rep in enumerate(dlist):
                if rep.duration < durations_min:
                    f.write(f"{len(dlist)-i} durations < {durations_min} secs were omitted\n")
                    break
                f.write(f"{rep.duration:02.2f}s {rep.when:<8} {rep.nodeid}\n")

    def summary_failures_short(tr):
        # expecting that the reports were --tb=long (default) so we chop them off here to the last frame
        reports = tr.getreports("failed")
        if not reports:
            return
        tr.write_sep("=", "FAILURES SHORT STACK")
        for rep in reports:
            msg = tr._getfailureheadline(rep)
            tr.write_sep("_", msg, red=True, bold=True)
            # chop off the optional leading extra frames, leaving only the last one
            longrepr = re.sub(r".*_ _ _ (_ ){10,}_ _ ", "", rep.longreprtext, 0, re.M | re.S)
            tr._tw.line(longrepr)
            # note: not printing out any rep.sections to keep the report short

    # use ready-made report funcs, we are just hijacking the filehandle to log to a dedicated file each
    # adapted from https://github.com/pytest-dev/pytest/blob/897f151e/src/_pytest/terminal.py#L814
    # note: some pytest plugins may interfere by hijacking the default `terminalreporter` (e.g.
    # pytest-instafail does that)

    # report failures with line/short/long styles
    config.option.tbstyle = "auto"  # full tb
    with open(report_files["failures_long"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.summary_failures()

    # config.option.tbstyle = "short" # short tb
    with open(report_files["failures_short"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        summary_failures_short(tr)

    config.option.tbstyle = "line"  # one line per error
    with open(report_files["failures_line"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.summary_failures()

    with open(report_files["errors"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.summary_errors()

    with open(report_files["warnings"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.summary_warnings()  # normal warnings
        tr.summary_warnings()  # final warnings

    tr.reportchars = "wPpsxXEf"  # emulate -rA (used in summary_passes() and short_test_summary())
    with open(report_files["passes"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.summary_passes()

    with open(report_files["summary_short"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.short_test_summary()

    with open(report_files["stats"], "w") as f:
        tr._tw = create_terminal_writer(config, f)
        tr.summary_stats()

    # restore:
    tr._tw = orig_writer
    tr.reportchars = orig_reportchars
    config.option.tbstyle = orig_tbstyle
and conftest.py:
def pytest_addoption(parser):
    parser.addoption(
        "--make_reports",
        action="store",
        default=False,
        help="generate report files - the value is used in the report filenames: `report_<val>_<reportname>.txt`",
    )


def pytest_terminal_summary(terminalreporter):
    from transformers.testing_utils import pytest_terminal_summary_main

    make_reports = terminalreporter.config.getoption("--make_reports")
    if make_reports:
        pytest_terminal_summary_main(terminalreporter, id=make_reports)
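With both pieces in place, a run like the following (using tests as an example id value):

pytest --make_reports=tests tests

produces, per the report_files mapping above, reports/report_tests_durations.txt, reports/report_tests_failures_long.txt, reports/report_tests_failures_short.txt, and so on - one file per key - while the console output stays whatever the other flags dictate.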
I'd probably call it --ci-mode, as this would be a killer feature for CIs.
Thank you!
Hi @stas00,
That's an interesting idea actually. I think this would also be useful at work where we find ourselves in the same situation.
However I'm not sure this is "core" material, because there are a bunch of questions, like 1) which reports to generate, 2) the file format of each report, and perhaps others if we bring more people into the discussion.
Given that this doesn't need any changes to pytest itself, it is actually a great candidate for a plugin. You can then play around with how the reports are generated (suiting your needs better) without having to make formal changes to pytest itself.
As for how to make the code less reliant on internals: I see you deal with the stats from the terminal reporter directly. If you take a look at the terminal reporter code, you will see that it builds its stats/reports using the hook implementations. While the terminal reporter is not really that stable, the hooks definitely are, so I would go down the route of computing the stats using hooks as well. This will duplicate a bit of work, but will be more stable and also add some flexibility there.
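For instance, a minimal sketch of the hook-based approach (hypothetical code, shown only to illustrate the idea; pytest_runtest_logreport and pytest_sessionfinish are stable hooks) could collect the durations without touching the terminal reporter at all:

# conftest.py or a plugin module - hypothetical sketch, not part of the proposal above
from pathlib import Path

_durations = []

def pytest_runtest_logreport(report):
    # invoked once per setup/call/teardown phase of every test
    _durations.append((report.duration, report.when, report.nodeid))

def pytest_sessionfinish(session, exitstatus):
    # write the durations report without reading terminalreporter.stats
    Path("reports").mkdir(parents=True, exist_ok=True)
    with open("reports/durations.txt", "w") as f:
        f.write("slowest durations\n")
        for duration, when, nodeid in sorted(_durations, reverse=True):
            f.write(f"{duration:02.2f}s {when:<8} {nodeid}\n")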
IMHO this should be a post-processing step on a normal result log
Certainly not a part of the core framework
+1 to @RonnyPfannschmidt and @nicoddemus. I think this is a great plugin idea
If I may argue, I'm not proposing anything different from what the core does already. The only request is to be able to log all the reports pytest is able to generate into separate log/report files rather than dumping them all into the console. As you can see in my code, with the exception of the long vs. short report, all I'm doing is redirecting the writing from the console to a separate file for each.
And since, as I explained originally, in large projects with huge test suites different developers will want to see different reports, let's dump them all - have a dedicated file for each possible report. Then each dev can go directly to the report(s) they want and won't need to add --whatever because they need yet another report, causing grief to the rest of the team as it'll become even harder to find things.
So --ci would activate all possible reports pytest can generate, following the formats it already has. If a new feature that prints to the terminal gets added to pytest, it will be added to the --ci mode and a new report file will appear.
> However I'm not sure this is "core" material, because there are a bunch of questions, like 1) which reports to generate, 2) the file format of each report, and perhaps others if we bring more people into the discussion.
To summarize:
- pytest has a set of features it logs by default to the console
- a --ci mode would log each of these reports to a dedicated file instead
- it would not interfere with the rest of the pytest functionality - i.e. all other flags will operate normally
- nothing new is proposed report-wise - it's all in the pytest reports now

The only nuance I struggled with is short vs. long tbstyle, which are currently mutually exclusive - but this too can be easily solved: always save both styles in tr.getreports("failed") - this won't be a breaking change, as it adds things and doesn't change/remove anything, and then the short and the long failure reports can be logged separately.
My personal answer to that one is a plain NO
FWIW I also agree this is something which seems a bit too exotic to be in the core.
> My personal answer to that one is a plain NO
While I agree with @RonnyPfannschmidt and @The-Compiler that this should not be in the core, I would like to ask @RonnyPfannschmidt to elaborate a bit more when giving a hard NO like that; after all @stas00 did present his case with a number of arguments that should be considered. It's the polite thing to do.
> As for how to make the code less reliant on internals: I see you deal with the stats from the terminal reporter directly. If you take a look at the terminal reporter code, you will see that it builds its stats/reports using the hook implementations. While the terminal reporter is not really that stable, the hooks definitely are, so I would go down the route of computing the stats using hooks as well. This will duplicate a bit of work, but will be more stable and also add some flexibility there.
I appreciate that bit of useful advice answering my "if not core" part of the question, @nicoddemus!
I guess with the vehement opposition from @RonnyPfannschmidt, this can be safely closed.