Rust: Stabilize unit tests with non-`()` return type

Created on 8 Mar 2018  路  47Comments  路  Source: rust-lang/rust

CURRENT STATUS:

Awaiting someone to write the stabilization PR! Mentoring instructions here.


This is a sub-issue of the larger tracking issue devoted to the ? in main RFC. This issue corresponds to stabilizing the use of unit tests whose return type is something other than () -- it basically an extension of https://github.com/rust-lang/rust/issues/48453, which was discussing the same thing but for the main function.

What would be stabilized

As before, unit tests that return ():

  • Succeed unless they panic
  • Can be annotated with #[should_panic]

However, unit tests can now have other return types:

  • The return type must implement the Termination trait
  • The test passes if the return value is considered a "success" (e.g., an Ok from Result)
  • If the type is not (), then #[should_panic] is disallowed

Unknowns

  • [ ] Question: what's up with #[should_panic] and #[bench] anyway? (Example)
  • [ ] Can we use this with rustdoc yet?


Older proposal

Possible changes needed before stabilizing

  • [ ] Adjust libtest runner to distinguish #[should_panic] from #[should_error], as suggested by @scottmcm?
  • [ ] Successful examples of using with rustdoc

What would be stabilized

Unit tests (and #[bench] functions) would join the main function in being allowed to return any value that implements the Termination trait.

This commits us to the following (each link is to a test):

  • tests that return Result and other Termination types. In the case of Result:
  • benchmarks work the same way (Ok, Err)

    • note that #[bench] is still effectively unstable though in general

What remains unstable

  • The library details: where the trait is and its methods

    • Which methods the test runner invokes on the trait -- this is unstable as the trait itself is unstable

C-tracking-issue E-easy E-mentor T-dev-tools T-lang finished-final-comment-period

Most helpful comment

I agree #[should_panic] is confusing

I'd use stronger language than "confusing". If the only test framework we can use (and will forever be the default test framework) says that "returning a Result is the same as a panic", that's going to work against any kind of teaching that we have ever done that they are different methods of error handling with different use cases.

Tests are something that should check specific conditions. #[should_panic] is already fairly loose in that you can't use it to identify a specific line of code that should generate a panic:

#[test]
#[should_panic(expected = "index out of bounds")]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    let c = t(&a, b);
}

fn t(a: &[usize], idx: usize) {
    a[idx];
}

It's my belief that should_panic was basically a workaround for the fact that we didn't have a lightweight, stable way of catching panics at Rust 1.0. In a magical world of test frameworks, I'd want such an assertion to be part of the test, not just metadata:

#[test]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    assert_that(|| {
        let c = t(&a, b);
    }).panics()
        .with("index out of bounds")
}

IMO, returning a Result::Err from a test should always cause the test to fail. If it's expected that the operation fails, that should be part of the test:

assert!(f.is_err());

All 47 comments

@rfcbot fcp merge

I propose that we further stabilize unit tests etc that return non-() return types.

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

  • [x] @aturon
  • [x] @cramertj
  • [x] @eddyb
  • [x] @nikomatsakis
  • [ ] @nrc
  • [ ] @pnkfelix
  • [ ] @withoutboats

Concerns:

  • experience (https://github.com/rust-lang/rust/issues/48854#issuecomment-371706185)

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Thanks for including the test links! Super helpful.

I think letting Ok(()) (and ExitCode::SUCCESS 馃檪) be test pass is great.

This one surprised me, though:

#[test]
#[should_panic]
fn not_a_num() -> Result<(), ParseIntError> {
    let _: u32 = "abc".parse()?;
    Ok(())
}

I get cognitive dissonance from that, because I my brain says "what do you mean parse should panic? I thought the point was that it doesn't panic!", and I think it would be unfortunate if the test were to continue to pass should it actually start panicking.

I think I was expecting

  • (unannotated): test pass is .report()==SUCCESS, everything else is test fail
  • [should_panic]: catch_panic caught a panic is test pass, everything else is test fail
  • [should_error] (hypothetical): test pass is .report()!=SUCCESS, everything else is test fail

(Unimportant thought: a -> ! test seems fundamentally useless. But probably not worth prohibiting?)

Unimportant thought: a -> ! test seems fundamentally useless. But probably not worth prohibiting?

#[test]
#[should_panic]
fn panic_panics() -> ! { panic!() }

:stuck_out_tongue:

I think this definitely warrants eventually adding an alias for should_panic (perhaps should_error or should_fail), but that seems like it's strictly orthogonal to stabilizing this

cc @jonhoo re: test frameworks

@sgrif Note that I'm explicitly _not_ suggesting that it should be an alias: I would like should_panic to only result in a test pass if there was an _actual panic_, which is something that I don't think we'd want to change after stabilization if there were a bunch of tests using should_panic to check for Err.

@scottmcm Fair point. I agree.

In the context of custom test frameworks, this seems fine. The path we're taking going forward is that what ships with Rust by default is the same thing that libtest does at the time of the change, so committing to this now just means that the libtest we build using ctf will also have to support it (which seems fine).

Note that with ctfs, new test frameworks can implement entirely new testing annotations, and are not bound to the annotations, signatures, and semantics as the current built-in test support.

Also cc @Manishearth .

@rfcbot concern experience

Has this had much use? It's the first I knew about it even being a thing (I've not been following the main return types stuff) so I wouldn't be surprised if this has not had much usage. In which case I'm wary of stabilising without getting some experience of it in practice.

Yeah, I didn't even know this had landed on nightly yet? Super excited about the feature though!

I don't think it's had much experience. I agree with @scottmcm that #[should_panic] surprised me a bit -- so perhaps I was premature in moving to stabilize. (I'd welcome a PR to adjust that, though.)

I'm ok with waiting a bit, although it makes stabilizing the "main" case a bit harder, since we have to split the two. But I would encourage those who have doubts about the feature to experiment -- there really isn't much more to it, I don't think.

In other words: under this feature, there are really three (and a half) distinct possible outcomes:

  • Test panics
  • Test returns a "successful" result (libc::EXIT_SUCCESS)
  • Test returns a "failed" result

    • This could be subdivided to depend on the specific return code

I'm personally not inclined to subdivide the final case: I feel like if you want to test for a specific return code, you ought to just add an assert into your test.

(One other reason for holding back on this feature: a major motivation -- perhaps the motivation -- was using it from within rustdoc, so that those examples can show idiomatic code -- I'm not sure if that works yet or what it takes to make it work.)

@rfcbot fcp cancel

I'm going to cancel the FCP. I added @scottmcm's concern to the header, as well as the desire to see a rustdoc example in practice. =)

It might however be hard to make #[should_panic] not cover the case of a failed return. Or, well, maybe not hard, but that will require deeper changes to libtest.

Thinking more about it, I'm not sure how desirable it is. I agree #[should_panic] is confusing but I wonder if it makes sense to introduce more than one way sense of unit test "failure". Is that making things overly complex? (We could rename #[should_panic])

@nikomatsakis proposal cancelled.

I agree #[should_panic] is confusing

I'd use stronger language than "confusing". If the only test framework we can use (and will forever be the default test framework) says that "returning a Result is the same as a panic", that's going to work against any kind of teaching that we have ever done that they are different methods of error handling with different use cases.

Tests are something that should check specific conditions. #[should_panic] is already fairly loose in that you can't use it to identify a specific line of code that should generate a panic:

#[test]
#[should_panic(expected = "index out of bounds")]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    let c = t(&a, b);
}

fn t(a: &[usize], idx: usize) {
    a[idx];
}

It's my belief that should_panic was basically a workaround for the fact that we didn't have a lightweight, stable way of catching panics at Rust 1.0. In a magical world of test frameworks, I'd want such an assertion to be part of the test, not just metadata:

#[test]
fn e() {
    let a = [1, 2, 100];
    let b = a[3]; // oops
    assert_that(|| {
        let c = t(&a, b);
    }).panics()
        .with("index out of bounds")
}

IMO, returning a Result::Err from a test should always cause the test to fail. If it's expected that the operation fails, that should be part of the test:

assert!(f.is_err());

IMO, returning a Result::Err from a test should always cause the test to fail. If it's expected that the operation fails, that should be part of the test:

Hmm. Plausible. Maybe that's how we should refactor the test runner's internal workings, actually. That is, we could refactor #[should_panic] tests so that they catch the panic (and make the test runner consider any uncaught panic as a failure).

I've been thinking about this a bit. Here is one simple possible design:

  • #[should_panic] can only be applied to tests with () return type and continues to have the same meaning.
  • For non-should-panic tests, the return type T defines the error result using the Termination trait.

Under this version, if we ignore #[should_panic] then:

  • A panic is always a failure.
  • The "non-Ok" variant of the return is also a failure.

    • If you want that to be considered success, you should write assert!(foo.is_err());

  • This seems nicely biased towards "all the code you see in the test executed without panic'ing or returning an error".

With #[should_panic], the test either returns () (fails) or panics (succeeds).

Thoughts?

all the code you see in the test executed without panic'ing or returning an error

I think this is the right mindset. I've always envisioned ? in tests to be a shorthand for what is currently an unwrap/expect: "theoretically this can fail, but it shouldn't in this test".

That sounds great, @nikomatsakis; it resolves all my concerns about #[should_panic].

And it allows the great "replace .unwrap() with ?" crusade to continue :slightly_smiling_face: I added unit tests as another place a "whatever, I just want to use ?" type would be nice (https://github.com/rust-lang/rfcs/issues/2367).

I opened this issue https://github.com/rust-lang/rust/issues/49909 describing exactly the work to be done.

@rfcbot fcp merge

Now that we have made the proposed changes, I'd like to move to stabilize again. I've updated the issue header with the latest status, but I'm reproducing it here:

What would be stabilized

As before, unit tests that return ():

  • Succeed unless they panic
  • Can be annotated with #[should_panic]

However, unit tests can now have other return types:

Unknowns

  • [ ] Question: what's up with #[should_panic] and #[bench] anyway? (Example)
  • [ ] Can we use this with rustdoc yet?

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:

  • [x] @aturon
  • [x] @cramertj
  • [x] @eddyb
  • [x] @fitzgen
  • [x] @japaric
  • [x] @joshtriplett
  • [x] @killercup
  • [x] @michaelwoerister
  • [x] @nikomatsakis
  • [x] @nrc
  • [x] @pnkfelix
  • [x] @scottmcm
  • [x] @withoutboats

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Yes!!

@rfcbot reviewed

@rfcbot reviewed

nit: the expected error for "The return type must implement the Termination trait" is pointing to an error about fn main, not a test. Should there be a test for a #[test] that returns non-Termination?

I think we need two additional bits of documentation for this feature.

First, the examples of this all use Result<(), SomeError>. Does it make any sense to allow a success type other than () in such a Result? We should document that any such success type will get ignored and thrown away, unless a custom test framework makes use of them.

And second, we should document the intended replacement for #[should_panic] tests: if you have a test that you expect to fail, such as a test of error handling, then you should check the Result yourself against the error you expect, and then return success if you find what you expect or failure if you don't. We might want to show a couple of examples of doing that idiomatically. (And I'd expect custom test frameworks to make that easier.)

@rfcbot reviewed

@scottmcm

nit: the expected error for "The return type must implement the Termination trait" is pointing to an error about fn main, not a test. Should there be a test for a #[test] that returns non-Termination?

Good catch! Must have gotten overlooked.... care to open a PR? =)

@joshtriplett

First, the examples of this all use Result<(), SomeError>. Does it make any sense to allow a success type other than () in such a Result? We should document that any such success type will get ignored and thrown away, unless a custom test framework makes use of them.

Actually, Termination is implemented for Result<T, _> where T: Terminaton at present, and that T does get used to determine success or failure. So it is not ignored.

(Actually, I thought we said we'd only implement it for Result<()>, but I might be remembering wrong?)

And second, we should document the intended replacement for #[should_panic] tests

Sounds reasonable.

In any case, those comments are important for doc writing -- cc @steveklabnik, how can we ensure these concerns don't get forgotten?

@nikomatsakis

Actually, Termination is implemented for Result where T: Terminaton at present, and that T does get used to determine success or failure. So it is not ignored.

Ah, right, I'd forgotten that! We definitely need to document that, then, and give an example of it.

@nikomatsakis

In any case, those comments are important for doc writing -- cc @steveklabnik, how can we ensure these concerns don't get forgotten?

Hopefully https://github.com/rust-lang/book/issues/1284 .

Actually, Termination is implemented for Result where T: Terminaton at present, and that T does get used to determine success or failure. So it is not ignored.

(Actually, I thought we said we'd only implement it for Result<()>, but I might be remembering wrong?)

Yes, that changed:

https://doc.rust-lang.org/nightly/std/process/trait.Termination.html

impl Termination for ()
impl<E: Debug> Termination for Result<(), E>
impl Termination for !
impl<E: Debug> Termination for Result<!, E>
impl Termination for ExitCode

So the other success type is !, which is unstable and pretty much useless for tests, so I don't know if it needs to be mentioned in any detail.

Opened PR https://github.com/rust-lang/rust/pull/50272; the message is ungreat:

error[E0277]: `main` has invalid return type `std::result::Result<f32, std::num::ParseIntError>`
  --> $DIR/termination-trait-test-wrong-type.rs:18:1
   |
LL | / fn can_parse_zero_as_f32() -> Result<f32, ParseIntError> { //~ ERROR
LL | |     "0".parse()
LL | | }
   | |_^ `main` can only return types that implement `std::process::Termination`
   |
   = help: the trait `std::process::Termination` is not implemented for `std::result::Result<f32, std::num::ParseIntError>`
   = note: required by `__test::test::assert_test_result`

@scottmcm I think unsupported instead of invalid would make it pretty good.

@eddyb well, the span is poor, and the function is not called main -- it's a unit test. =) The message is obviously targeting the main case.

@scottmcm do you feel the message is bad enough to hold up stabilization?

@nikomatsakis Nope, I'm leaving my box checked.

@rfcbot reviewed

:bell: This is now entering its final comment period, as per the review above. :bell:

The final comment period, with a disposition to merge, as per the review above, is now complete.

The final comment period is up, let's do this! There are general instructions on how to stabilize a feature given here:

https://forge.rust-lang.org/stabilization-guide.html

In this case, the feature name in question is termination_trait_test.

hi @nikomatsakis I can work on this

Since #51298 is merged, we can close this issue?

Pretty sure this can be closed now that this is stabilized, thanks @Dylan-DPC!

I'm missing something. I went to use this, but what's all this junk about 1 and 0 that has nothing to do with my test? This is what we stabilized?

#[test]
fn example() -> Result<(), Box<dyn std::error::Error>> {
    Err(String::from("boom"))?;
    Ok(())
}

(Playground)

Output:

running 1 test
test example ... FAILED

failures:

---- example stdout ----
Error: StringError("boom")
thread 'example' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `0`', libtest/lib.rs:326:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.


failures:
    example

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

@shepmaster I think there is a PR at #52453 to fix that.

Thanks. It's a shame this will go out in the suboptimal form in 1.28, and presumably 1.29, but hopefully we will see it in 1.30!

Was this page helpful?
0 / 5 - 0 ratings