Meson: Add TAP tests support

Created on 15 Jan 2018  ·  19 Comments  ·  Source: mesonbuild/meson

It would be very useful if Meson supported the TAP protocol for tests.
That would make it easy to write tests that do not have to be split up into individual binaries.

I could not find anything about that in the documentation, so I am quite sure this is not supported currently.

enhancement help wanted

All 19 comments

Hmm... I kind of like this idea. I have an external testsuite that runs a number of individual tests, and this protocol would work well for communicating the results, assuming that it can express XFAIL (expected failures). We could also make the 30-second time-out function as a per-test timeout if there was a way for testsuites to communicate with the test harness like this.

This overlaps with #186

@bredelings XFAIL is the TODO directive. The test is run and fails, but the TODO means it appears in the XFAIL section of the results, e.g.
not ok 10 # TODO See issue #123456789

If the test passes then it is XPASS:
ok 10 # TODO See issue #123456789
Reference https://testanything.org/tap-version-13-specification.html#todo-tests
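
To make the TODO rules above concrete, here is a minimal Python sketch that classifies TAP test lines and counts the results. It is an illustration only, not how Meson parses TAP: the names `classify` and `summarize` and the result labels are assumptions made for this example, and the regular expression covers only plain test lines with an optional TODO or SKIP directive.

```python
import re
from collections import Counter

# A TAP test line: "ok" or "not ok", an optional test number, an optional
# description, and an optional "# TODO ..." or "# SKIP ..." directive.
TEST_LINE = re.compile(
    r"^(?P<status>ok|not ok)\b"
    r"(?:\s+(?P<number>\d+))?"
    r"(?P<description>.*?)"
    r"(?:#\s*(?P<directive>TODO|SKIP)\b(?P<reason>.*))?$",
    re.IGNORECASE,
)


def classify(line):
    """Return 'OK', 'FAIL', 'XFAIL', 'XPASS' or 'SKIP', or None for non-test lines."""
    match = TEST_LINE.match(line.strip())
    if match is None:
        return None  # plan line, diagnostic, or anything else
    passed = match.group("status").lower() == "ok"
    directive = (match.group("directive") or "").upper()
    if directive == "SKIP":
        return "SKIP"
    if directive == "TODO":
        # A failing TODO test is an expected failure; a passing one is a bonus.
        return "XPASS" if passed else "XFAIL"
    return "OK" if passed else "FAIL"


def summarize(tap_output):
    """Count results per TAP test line rather than per executable."""
    counts = Counter()
    for line in tap_output.splitlines():
        result = classify(line)
        if result is not None:
            counts[result] += 1
    return counts
```

Calling `classify()` on the two example lines above gives `XFAIL` for the `not ok` line and `XPASS` for the `ok` line.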

TAP is a streaming protocol so there should be no problem with a 30 second delay for a particular test result to be sent. Additional diagnostic information on the test failure can be sent in a YAMLish block: https://testanything.org/tap-version-13-specification.html#yaml-blocks

At present there is no spec for sub-tests. I think one of the JavaScript frameworks came up with a good solution. I can't recall if it was this one: http://www.node-tap.org/subtests/
Discussion: https://github.com/TestAnything/Specification/issues/2

TAP has a large number of tools like pretty printers:
https://github.com/substack/faucet
https://github.com/axross/tap-notify
and converters. For example converters to JUnit for integration with other CI tools:
https://github.com/dhershman1/tap-junit
https://github.com/jmason/tap-to-junit-xml

I have been using https://github.com/endlessm/webhelper/blob/master/test/tap.py to integrate a test harness that outputs TAP (not version 13 though) with Meson. However, Meson only supports outputting the results on a per-file level, not per-test. I'd really like to see per-test, since I find the per-test output in Autotools useful.

This is currently the only blocker for switching one of my projects to Meson.

FWIW this is how I get a list of tests from my test runner script and pass it to the Meson test harness. Each Meson test() call runs one test:

test_runner = find_program( 'test_runner' )

env = environment()
env.set( 'MESON_SOURCE_ROOT', meson.source_root() )
env.set( 'MESON_BUILD_ROOT', meson.build_root() )

test_list = run_command( test_runner, 'list' ).stdout().split()
foreach test : test_list
    description = run_command( test_runner, 'describe', test ).stdout()
    test( description, test_runner, args: ['run', test], env: env )
endforeach

So my test_runner script has three commands: list, describe and run. I can post the test script if it is of any interest, but it is longer.
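
For readers who want to try this layout, the following is a rough Python sketch of what a runner with those three commands could look like. It is not the script referred to above: the `TESTS` registry, the test names, and the trivial checks are all hypothetical placeholders, and a real runner would dispatch to actual test code.

```python
#!/usr/bin/env python3
import argparse
import sys

# Hypothetical registry: test name -> (description, callable returning True on success).
TESTS = {
    "open_input": ("Input file opened", lambda: True),
    "first_line": ("First line of the input valid", lambda: "foo".startswith("foo")),
}


def list_tests():
    """Names of all tests, in registration order."""
    return list(TESTS)


def describe(name):
    """Human-readable description, used as the Meson test name."""
    return TESTS[name][0]


def run(name):
    """Run one test and return a process exit code (0 means success)."""
    return 0 if TESTS[name][1]() else 1


def main(argv=None):
    parser = argparse.ArgumentParser(prog="test_runner")
    commands = parser.add_subparsers(dest="command", required=True)
    commands.add_parser("list")
    for command in ("describe", "run"):
        commands.add_parser(command).add_argument("test", choices=list_tests())
    args = parser.parse_args(argv)

    if args.command == "list":
        print("\n".join(list_tests()))
        return 0
    if args.command == "describe":
        # No trailing newline, so the captured stdout can be used as a name directly.
        print(describe(args.test), end="")
        return 0
    return run(args.test)


# Only act as a CLI when arguments are given, so the module stays importable.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main())
```

With a script like this, `test_runner list` prints one test name per line, which is what the `split()` call in the Meson snippet above relies on.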

This approach has led me to wonder whether Meson's test() function should be left pretty much as is, with two new functions introduced: test_runner() and test_harness(). test_runner() would declare an executable that responds to the Meson test protocol and set up the environment for the test runner. test_harness() would take the object returned by test_runner() and use it to interact with the project's tests. I'd like it to be able to interact with test cases, test suites and test tags.

I've also noted that changing a test's description doesn't currently update in the test harness. I assume it is cached by Meson.

@jpakkane I believe this is not fully addressed by #4958, as meson does not really integrate the tests the way it does with "native" tests:

$ ninja test
[0/1] Running all tests.
1/1 Example                                 FAIL    0.005408048629760742 s (exit status (0,))

Ok:                 0   
Expected Fail:      0   
Fail:               1   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            0   


The output from the failed tests:

1/1 Example                                 FAIL    0.005408048629760742 s (exit status (0,))

--- command ---
12:57:58 /Users/epirat/Desktop/tap-test/build/tap_test
--- stdout ---
1..4
ok 1 - Input file opened
not ok 2 - First line of the input valid
# Expected line to start with foo but found food!
ok 3 - Read the rest of the file
not ok 4 - Suammarized correctly # TODO Not written yet
-------

Full log written to /Users/epirat/Desktop/tap-test/build/meson-logs/testlog.txt
FAILED: meson-test 
/Users/epirat/Library/Python/3.7/bin/meson test --no-rebuild --print-errorlogs
ninja: build stopped: subcommand failed.

While it should look somewhat like:

$ ninja test
[0/1] Running all tests.
1/4 Example: Input file opened              SUCCESS
2/4 Example: First line of the input valid  FAIL
3/4 Example: Read the rest of the file      SUCCESS
4/4 Example: Suammarized correctly          FAIL (TODO)

Ok:                 2   
Expected Fail:      1   
Fail:               1   
Unexpected Pass:    0   
Skipped:            0   
Timeout:            0   


The output from the failed tests:

2/4 Example: First line of the input valid  FAIL
--- Diagnostics ---
Expected line to start with foo but found food!
-------

TODO Tests:
4/4 Example: Suammarized correctly - Not written yet


Full log written to /Users/epirat/Desktop/tap-test/build/meson-logs/testlog.txt
FAILED: meson-test 
/Users/epirat/Library/Python/3.7/bin/meson test --no-rebuild --print-errorlogs
ninja: build stopped: subcommand failed.

Basically, meson currently records only a single success/failure for the whole TAP suite, depending on whether any of its tests failed. It does not report the individual tests the way I would expect, and it does not print diagnostics properly either; it just dumps the full TAP output.

It seems TODO is currently not handled correctly either, as TODO indicates it is not expected to succeed currently:

These tests represent a feature to be implemented or a bug to be fixed and act as something of an executable “things to do” list. They are not expected to succeed. Should a todo test point begin succeeding, the harness should report it as a bonus. This indicates that whatever you were supposed to do has been done and you should promote this to a normal test point.

Another issue is #6810

Each test is handled as a single Meson test, because it's normal for a single TAP output to produce hundreds, or even thousands of output lines. If for example you're using gtest, the way TAP was implemented means you have the same ninja test output for

  • protocol: 'exitcode' where the test executables are launched without arguments

  • protocol: 'tap' where the test executables are launched with the --tap argument.

TODO works as expected: it makes the overall test fail if you have an ok # TODO, and pass if you have a not ok # TODO.

I don't really see it being useful at all in its current form. What's the point of TAP reporting detailed test output if meson does not properly report it either?

I compare the behavior to how autotools does it, which is similar to how I explained it.

If not having detailed output, but instead just a basic success/fail parsing of a TAP-output test is a needed feature for some cases, we could probably add a new kwarg for that? But lots of people that used TAP in autotools probably want more detailed output and not just a raw TAP dump on failure.

The detailed output is available if you use meson test --verbose.

Yes, but I believe the current behavior is not how it should work. It's ok if you disagree with that, as people have different use cases, and thanks a lot for the initial support. I just think it is not integrated well enough yet and does not work out of the box the way one would expect it to…

Maybe it would make more sense to be able to specify that as testsuite() or something though, so it's clear that it will be multiple tests vs one test…

Personally I prefer to see grouped results, because the group leads me to which executable to run in order to reproduce. So what I do is run "meson test -v", which lets me see which subtests failed and, ideally, cut-and-paste the command line into a shell (see #5025).

But I can see how people can feel differently; these are things that affect the workflow quite directly.

In other words, I would keep the current overall behavior, but add support to mtest for reporting subtests individually.

That sounds good. My main issue at the moment is that, on failure, it just dumps the whole raw TAP output, which can be hard to read when there are a lot of tests. It would be nicer to have it show only the subtests that actually failed.
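
As a rough illustration of that idea, here is a small Python sketch that extracts only the unexpectedly failing subtests from raw TAP output, together with their diagnostics. It is not Meson code: the function name `failed_subtests` is made up for this example, and it assumes that `#` diagnostic lines belong to the test line immediately before them, as in the sample output earlier in this thread.

```python
import re

# "ok"/"not ok", optional test number, optional "- " separator, then the rest.
TEST_LINE = re.compile(r"^(ok|not ok)\b(?:\s+(\d+))?\s*(?:-\s*)?(.*)$")
TODO_OR_SKIP = re.compile(r"#\s*(TODO|SKIP)\b", re.IGNORECASE)


def failed_subtests(tap_output):
    """Return [(number, description, [diagnostic lines])] for unexpected failures."""
    failures = []
    current = None  # diagnostics list of the most recent failing test, if any
    count = 0
    for raw in tap_output.splitlines():
        line = raw.strip()
        match = TEST_LINE.match(line)
        if match:
            count += 1
            status, number, rest = match.groups()
            number = int(number) if number else count
            current = None
            # "not ok" with a TODO or SKIP directive is not an unexpected failure.
            if status == "not ok" and not TODO_OR_SKIP.search(rest):
                current = []
                failures.append((number, rest.strip(), current))
        elif line.startswith("#") and current is not None:
            # Assumption: a diagnostic line belongs to the preceding test line.
            current.append(line.lstrip("#").strip())
    return failures


EXAMPLE = """\
1..4
ok 1 - Input file opened
not ok 2 - First line of the input valid
# Expected line to start with foo but found food!
ok 3 - Read the rest of the file
not ok 4 - Suammarized correctly # TODO Not written yet
"""

if __name__ == "__main__":
    for number, description, diagnostics in failed_subtests(EXAMPLE):
        print(f"{number} {description}: FAIL")
        for text in diagnostics:
            print(f"    {text}")
```

On the sample output from this thread, only test 2 is reported, with its single diagnostic line; test 4 is left out because its TODO directive marks it as an expected failure.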

I too would like to see an explicit and pretty list of the failed tests.

I can understand not listing individual TAP lines when things go as expected, but it would be good to see the counts in a summary, where you count each TAP test line rather than only each executable. Seeing the counts of XFAIL, SKIP etc. separately from just OK and FAIL would be really good. The SKIP count especially: if a few cases inside one test executable skip, that might be unexpected.

@ppaalanen, @ePirat, see #7830 for the plan around further TAP improvements.
