Rust: Stabilize unit tests with non-`()` return type

Created on 8 Mar 2018 · 47Comments · Source: rust-lang/rust

CURRENT STATUS:

Awaiting someone to write the stabilization PR! Mentoring instructions here.

This is a sub-issue of the larger tracking issue devoted to the ? in main RFC. This issue corresponds to stabilizing the use of unit tests whose return type is something other than () -- it basically an extension of https://github.com/rust-lang/rust/issues/48453, which was discussing the same thing but for the main function.

What would be stabilized

As before, unit tests that return ():

Succeed unless they panic
Can be annotated with #[should_panic]

However, unit tests can now have other return types:

The return type must implement the Termination trait
- Test and expected error
The test passes if the return value is considered a "success" (e.g., an Ok from Result)
- Test
If the type is not (), then #[should_panic] is disallowed
- Test and expected error

Unknowns

[ ] Question: what's up with #[should_panic] and #[bench] anyway? (Example)
[ ] Can we use this with rustdoc yet?

Older proposal

Possible changes needed before stabilizing

[ ] Adjust libtest runner to distinguish #[should_panic] from #[should_error], as suggested by @scottmcm?
[ ] Successful examples of using with rustdoc

What would be stabilized

Unit tests (and #[bench] functions) would join the main function in being allowed to return any value that implements the Termination trait.

This commits us to the following (each link is to a test):

tests that return Result and other Termination types. In the case of Result:
- Ok means test passes
- Err causes test failure
- more generally, success or failure is determined by invoking report(), although that detail is not being stabilized
benchmarks work the same way (Ok, Err)
- note that #[bench] is still effectively unstable though in general

What remains unstable

The library details: where the trait is and its methods
- Which methods the test runner invokes on the trait -- this is unstable as the trait itself is unstable

C-tracking-issue E-easy E-mentor T-dev-tools T-lang finished-final-comment-period

Source

nikomatsakis

❤1 😕1 👍1

Most helpful comment

I agree #[should_panic] is confusing

I'd use stronger language than "confusing". If the only test framework we can use (and will forever be the default test framework) says that "returning a Result is the same as a panic", that's going to work against any kind of teaching that we have ever done that they are different methods of error handling with different use cases.

Tests are something that should check specific conditions. #[should_panic] is already fairly loose in that you can't use it to identify a specific line of code that should generate a panic:

#[test]
#[should_panic(expected = "index out of bounds")]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    let c = t(&a, b);
}

fn t(a: &[usize], idx: usize) {
    a[idx];
}

It's my belief that should_panic was basically a workaround for the fact that we didn't have a lightweight, stable way of catching panics at Rust 1.0. In a magical world of test frameworks, I'd want such an assertion to be part of the test, not just metadata:

#[test]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    assert_that(|| {
        let c = t(&a, b);
    }).panics()
        .with("index out of bounds")
}

IMO, returning a Result::Err from a test should always cause the test to fail. If it's expected that the operation fails, that should be part of the test:

assert!(f.is_err());

shepmaster on 12 Mar 2018

👍7

All 47 comments

@rfcbot fcp merge

I propose that we further stabilize unit tests etc that return non-() return types.

nikomatsakis on 8 Mar 2018

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

[x] @aturon
[x] @cramertj
[x] @eddyb
[x] @nikomatsakis
[ ] @nrc
[ ] @pnkfelix
[ ] @withoutboats

Concerns:

experience (https://github.com/rust-lang/rust/issues/48854#issuecomment-371706185)

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot on 8 Mar 2018

Thanks for including the test links! Super helpful.

I think letting Ok(()) (and ExitCode::SUCCESS 🙂) be test pass is great.

This one surprised me, though:

#[test]
#[should_panic]
fn not_a_num() -> Result<(), ParseIntError> {
    let _: u32 = "abc".parse()?;
    Ok(())
}

I get cognitive dissonance from that, because I my brain says "what do you mean parse should panic? I thought the point was that it doesn't panic!", and I think it would be unfortunate if the test were to continue to pass should it actually start panicking.

I think I was expecting

(unannotated): test pass is .report()==SUCCESS, everything else is test fail
[should_panic]: catch_panic caught a panic is test pass, everything else is test fail
[should_error] (hypothetical): test pass is .report()!=SUCCESS, everything else is test fail

(Unimportant thought: a -> ! test seems fundamentally useless. But probably not worth prohibiting?)

scottmcm on 9 Mar 2018

👍5

Unimportant thought: a -> ! test seems fundamentally useless. But probably not worth prohibiting?

#[test]
#[should_panic]
fn panic_panics() -> ! { panic!() }

:stuck_out_tongue:

cramertj on 9 Mar 2018

😄6

I think this definitely warrants eventually adding an alias for should_panic (perhaps should_error or should_fail), but that seems like it's strictly orthogonal to stabilizing this

sgrif on 9 Mar 2018

cc @jonhoo re: test frameworks

aturon on 9 Mar 2018

👍1

@sgrif Note that I'm explicitly _not_ suggesting that it should be an alias: I would like should_panic to only result in a test pass if there was an _actual panic_, which is something that I don't think we'd want to change after stabilization if there were a bunch of tests using should_panic to check for Err.

scottmcm on 9 Mar 2018

👍4

@scottmcm Fair point. I agree.

sgrif on 9 Mar 2018

In the context of custom test frameworks, this seems fine. The path we're taking going forward is that what ships with Rust by default is the same thing that libtest does at the time of the change, so committing to this now just means that the libtest we build using ctf will also have to support it (which seems fine).

Note that with ctfs, new test frameworks can implement entirely new testing annotations, and are not bound to the annotations, signatures, and semantics as the current built-in test support.

Also cc @Manishearth .

jonhoo on 9 Mar 2018

@rfcbot concern experience

Has this had much use? It's the first I knew about it even being a thing (I've not been following the main return types stuff) so I wouldn't be surprised if this has not had much usage. In which case I'm wary of stabilising without getting some experience of it in practice.

nrc on 9 Mar 2018

Yeah, I didn't even know this had landed on nightly yet? Super excited about the feature though!

jonhoo on 9 Mar 2018

I don't think it's had much experience. I agree with @scottmcm that #[should_panic] surprised me a bit -- so perhaps I was premature in moving to stabilize. (I'd welcome a PR to adjust that, though.)

I'm ok with waiting a bit, although it makes stabilizing the "main" case a bit harder, since we have to split the two. But I would encourage those who have doubts about the feature to experiment -- there really isn't much more to it, I don't think.

In other words: under this feature, there are really three (and a half) distinct possible outcomes:

Test panics
Test returns a "successful" result (libc::EXIT_SUCCESS)
Test returns a "failed" result
- This could be subdivided to depend on the specific return code

I'm personally not inclined to subdivide the final case: I feel like if you want to test for a specific return code, you ought to just add an assert into your test.

(One other reason for holding back on this feature: a major motivation -- perhaps the motivation -- was using it from within rustdoc, so that those examples can show idiomatic code -- I'm not sure if that works yet or what it takes to make it work.)

nikomatsakis on 12 Mar 2018

@rfcbot fcp cancel

I'm going to cancel the FCP. I added @scottmcm's concern to the header, as well as the desire to see a rustdoc example in practice. =)

It might however be hard to make #[should_panic] not cover the case of a failed return. Or, well, maybe not hard, but that will require deeper changes to libtest.

Thinking more about it, I'm not sure how desirable it is. I agree #[should_panic] is confusing but I wonder if it makes sense to introduce more than one way sense of unit test "failure". Is that making things overly complex? (We could rename #[should_panic])

nikomatsakis on 12 Mar 2018

@nikomatsakis proposal cancelled.

rfcbot on 12 Mar 2018

I agree #[should_panic] is confusing

Tests are something that should check specific conditions. #[should_panic] is already fairly loose in that you can't use it to identify a specific line of code that should generate a panic:

#[test]
#[should_panic(expected = "index out of bounds")]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    let c = t(&a, b);
}

fn t(a: &[usize], idx: usize) {
    a[idx];
}

#[test]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    assert_that(|| {
        let c = t(&a, b);
    }).panics()
        .with("index out of bounds")
}

IMO, returning a Result::Err from a test should always cause the test to fail. If it's expected that the operation fails, that should be part of the test:

assert!(f.is_err());

shepmaster on 12 Mar 2018

👍7

IMO, returning a Result::Err from a test should always cause the test to fail. If it's expected that the operation fails, that should be part of the test:

Hmm. Plausible. Maybe that's how we should refactor the test runner's internal workings, actually. That is, we could refactor #[should_panic] tests so that they catch the panic (and make the test runner consider any uncaught panic as a failure).

nikomatsakis on 13 Mar 2018

I've been thinking about this a bit. Here is one simple possible design:

#[should_panic] can only be applied to tests with () return type and continues to have the same meaning.
For non-should-panic tests, the return type T defines the error result using the Termination trait.

Under this version, if we ignore #[should_panic] then:

A panic is always a failure.
The "non-Ok" variant of the return is also a failure.
- If you want that to be considered success, you should write assert!(foo.is_err());
This seems nicely biased towards "all the code you see in the test executed without panic'ing or returning an error".

With #[should_panic], the test either returns () (fails) or panics (succeeds).

Thoughts?

nikomatsakis on 22 Mar 2018

👍4

all the code you see in the test executed without panic'ing or returning an error

I think this is the right mindset. I've always envisioned ? in tests to be a shorthand for what is currently an unwrap/expect: "theoretically this can fail, but it shouldn't in this test".

shepmaster on 23 Mar 2018

That sounds great, @nikomatsakis; it resolves all my concerns about #[should_panic].

And it allows the great "replace .unwrap() with ?" crusade to continue :slightly_smiling_face: I added unit tests as another place a "whatever, I just want to use ?" type would be nice (https://github.com/rust-lang/rfcs/issues/2367).

scottmcm on 25 Mar 2018

👍1

I opened this issue https://github.com/rust-lang/rust/issues/49909 describing exactly the work to be done.

nikomatsakis on 12 Apr 2018

@rfcbot fcp merge

Now that we have made the proposed changes, I'd like to move to stabilize again. I've updated the issue header with the latest status, but I'm reproducing it here:

What would be stabilized

As before, unit tests that return ():

Succeed unless they panic
Can be annotated with #[should_panic]

However, unit tests can now have other return types:

The return type must implement the Termination trait
- ~~Test and expected error~~
- Test and expected error
The test passes if the return value is considered a "success" (e.g., an Ok from Result)
- Test
If the type is not (), then #[should_panic] is disallowed
- Test and expected error

Unknowns

[ ] Question: what's up with #[should_panic] and #[bench] anyway? (Example)
[ ] Can we use this with rustdoc yet?

nikomatsakis on 25 Apr 2018

❤1 🎉1

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

[x] @aturon
[x] @cramertj
[x] @eddyb
[x] @fitzgen
[x] @japaric
[x] @joshtriplett
[x] @killercup
[x] @michaelwoerister
[x] @nikomatsakis
[x] @nrc
[x] @pnkfelix
[x] @scottmcm
[x] @withoutboats

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot on 25 Apr 2018

Yes!!

@rfcbot reviewed

killercup on 25 Apr 2018

@rfcbot reviewed

nit: the expected error for "The return type must implement the Termination trait" is pointing to an error about fn main, not a test. Should there be a test for a #[test] that returns non-Termination?

scottmcm on 25 Apr 2018

👍1

I think we need two additional bits of documentation for this feature.

First, the examples of this all use Result<(), SomeError>. Does it make any sense to allow a success type other than () in such a Result? We should document that any such success type will get ignored and thrown away, unless a custom test framework makes use of them.

And second, we should document the intended replacement for #[should_panic] tests: if you have a test that you expect to fail, such as a test of error handling, then you should check the Result yourself against the error you expect, and then return success if you find what you expect or failure if you don't. We might want to show a couple of examples of doing that idiomatically. (And I'd expect custom test frameworks to make that easier.)

joshtriplett on 25 Apr 2018

@rfcbot reviewed

fitzgen on 25 Apr 2018

@scottmcm

nit: the expected error for "The return type must implement the Termination trait" is pointing to an error about fn main, not a test. Should there be a test for a #[test] that returns non-Termination?

Good catch! Must have gotten overlooked.... care to open a PR? =)

nikomatsakis on 26 Apr 2018

@joshtriplett

First, the examples of this all use Result<(), SomeError>. Does it make any sense to allow a success type other than () in such a Result? We should document that any such success type will get ignored and thrown away, unless a custom test framework makes use of them.

Actually, Termination is implemented for Result<T, _> where T: Terminaton at present, and that T does get used to determine success or failure. So it is not ignored.

(Actually, I thought we said we'd only implement it for Result<()>, but I might be remembering wrong?)

And second, we should document the intended replacement for #[should_panic] tests

Sounds reasonable.

nikomatsakis on 26 Apr 2018

In any case, those comments are important for doc writing -- cc @steveklabnik, how can we ensure these concerns don't get forgotten?

nikomatsakis on 26 Apr 2018

@nikomatsakis

Actually, Termination is implemented for Result where T: Terminaton at present, and that T does get used to determine success or failure. So it is not ignored.

Ah, right, I'd forgotten that! We definitely need to document that, then, and give an example of it.

@nikomatsakis

In any case, those comments are important for doc writing -- cc @steveklabnik, how can we ensure these concerns don't get forgotten?

Hopefully https://github.com/rust-lang/book/issues/1284 .

joshtriplett on 26 Apr 2018

Actually, Termination is implemented for Result where T: Terminaton at present, and that T does get used to determine success or failure. So it is not ignored.

(Actually, I thought we said we'd only implement it for Result<()>, but I might be remembering wrong?)

Yes, that changed:

https://doc.rust-lang.org/nightly/std/process/trait.Termination.html

impl Termination for ()
impl<E: Debug> Termination for Result<(), E>
impl Termination for !
impl<E: Debug> Termination for Result<!, E>
impl Termination for ExitCode

So the other success type is !, which is unstable and pretty much useless for tests, so I don't know if it needs to be mentioned in any detail.

scottmcm on 26 Apr 2018

Opened PR https://github.com/rust-lang/rust/pull/50272; the message is ungreat:

error[E0277]: `main` has invalid return type `std::result::Result<f32, std::num::ParseIntError>`
  --> $DIR/termination-trait-test-wrong-type.rs:18:1
   |
LL | / fn can_parse_zero_as_f32() -> Result<f32, ParseIntError> { //~ ERROR
LL | |     "0".parse()
LL | | }
   | |_^ `main` can only return types that implement `std::process::Termination`
   |
   = help: the trait `std::process::Termination` is not implemented for `std::result::Result<f32, std::num::ParseIntError>`
   = note: required by `__test::test::assert_test_result`

scottmcm on 27 Apr 2018

@scottmcm I think unsupported instead of invalid would make it pretty good.

eddyb on 27 Apr 2018

@eddyb well, the span is poor, and the function is not called main -- it's a unit test. =) The message is obviously targeting the main case.

nikomatsakis on 27 Apr 2018

😄1

@scottmcm do you feel the message is bad enough to hold up stabilization?

nikomatsakis on 27 Apr 2018

Filed https://github.com/rust-lang/rust/issues/50291, in any case

nikomatsakis on 27 Apr 2018

@nikomatsakis Nope, I'm leaving my box checked.

scottmcm on 27 Apr 2018

@rfcbot reviewed

pnkfelix on 30 Apr 2018

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot on 30 Apr 2018

The final comment period, with a disposition to merge, as per the review above, is now complete.

rfcbot on 10 May 2018

🎉2

The final comment period is up, let's do this! There are general instructions on how to stabilize a feature given here:

https://forge.rust-lang.org/stabilization-guide.html

In this case, the feature name in question is termination_trait_test.

nikomatsakis on 24 May 2018

hi @nikomatsakis I can work on this

Dylan-DPC on 24 May 2018

👍5

Since #51298 is merged, we can close this issue?

Dylan-DPC on 11 Jun 2018

Pretty sure this can be closed now that this is stabilized, thanks @Dylan-DPC!

frewsxcv on 3 Jul 2018

I'm missing something. I went to use this, but what's all this junk about 1 and 0 that has nothing to do with my test? This is what we stabilized?

#[test]
fn example() -> Result<(), Box<dyn std::error::Error>> {
    Err(String::from("boom"))?;
    Ok(())
}

(Playground)

Output:

running 1 test
test example ... FAILED

failures:

---- example stdout ----
Error: StringError("boom")
thread 'example' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `0`', libtest/lib.rs:326:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.


failures:
    example

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

shepmaster on 25 Jul 2018

@shepmaster I think there is a PR at #52453 to fix that.