Rubberduck: Difficulties unit testing with loops

Created on 17 Jul 2019 · 19 Comments · Source: rubberduck-vba/Rubberduck

Rubberduck version information
Version 2.4.1.4796
OS: Microsoft Windows NT 10.0.17763.0, x64
Host Product: Microsoft Office x64
Host Version: 16.0.11727.20244
Host Executable: EXCEL.EXE

Description
Not entirely sure whether this is a feature request, change request, bug, or
something else, but it generally concerns an issue I am having when writing
tests that use loops to test systematically. First, a bit of explanation of how
I am used to Assert functioning.

I've mostly been doing unit testing in C++ before this, more specifically using
the googletest unit testing suite and they have two very sensible types of
tests:

  • Assert That: These tests check something, and if the check fails, the
    entire test currently running is aborted.
  • Expect That: These tests also check something, but if they fail, the test
    will continue to run.

In most scenarios I use the first of these, but it is nice having the
distinction; it makes testing cleaner. For example, if you are testing the value
of a Variant, you could first assert that it is not empty, and then expect what
its value should be given that it is not empty. That way it is easy to write
test code that doesn't throw runtime errors even when the checks fail.
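
For illustration, here is a hypothetical sketch of how that distinction could read in VBA. Note that Rubberduck has no Expect object; that line, and the GetValue function, are made up for the example:

'@TestMethod
Public Sub ValueIsFortyTwo()
    Dim value As Variant
    value = GetValue()              ' hypothetical function under test
    Assert.IsFalse IsEmpty(value)   ' hard check: abort the test if the value is Empty
    Expect.AreEqual 42, value       ' soft check: report a failure but keep running (hypothetical API)
End Sub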

This choice only really manifests itself in Rubberduck if you are debug-stepping
through the execution of your test, because the test runner only reports the
first failure anyway.

This is not really my issue, more of a reason why I encounter it. My real issue
is with the existence of a failure category called "Catastrophic failure". When
I searched the Rubberduck source for references, it seems to be an easter
egg status (?). As far as I can tell it triggers if there are more than 11
failures in a test, but I might be wrong.

This is fine in most situations, but if you are testing with a loop and you
encounter this, then you are left with no information about what went wrong or
in which iteration it happened. The way to find this information is to step
through the test code, or, as I do quite frequently, add the following line to
my loop:

If Result <> Expected Then Exit For

This latter solution is of course just code duplication which has all the issues
of copy-pasting code we all know and love.
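
Concretely, such a loop test ends up looking something like this (a minimal sketch; ExpectedValue and FunctionUnderTest are placeholder names):

'@TestMethod
Public Sub TestManyInputs()
    Dim i As Long
    Dim Expected As Long
    Dim Result As Long
    For i = 1 To 1000
        Expected = ExpectedValue(i)           ' placeholder: lookup table or reference calculation
        Result = FunctionUnderTest(i)         ' placeholder: the procedure under test
        Assert.AreEqual Expected, Result
        If Result <> Expected Then Exit For   ' duplicated check, only there to bail out before "Catastrophic failure"
    Next
End Sub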

I have a couple of requests and possible solutions:

  • Discard the whole Catastrophic Failure state, unless you have a good argument
    for keeping it. I don't really understand why it exists.

  • Implement Assert vs Expect; this is of course a bit overkill for the specific
    issue, but maybe you find the idea appealing.

  • Add a member to the Assert object, e.g. .State, or some other way for
    me to see what the state of the test is. In that case I could remove the code
    duplication with

    If Assert.State = Failed Then Exit For
    

Testing with loops can be really useful for testing a lot of expected scenarios
that follow a certain logic, but with the existence of Catastrophic Failure it
is really hard to find out what went wrong so that you can fix it.

discussion feature-unit-testing

Most helpful comment

Ok, so dev chat recap:

  • Easter egg test outcome in its current form is harmful and needs to be removed;
  • @TestCase annotation and adding support for parameterized test methods needs to be added;
  • A new inspection would be flagging tests with more than a single Assert call (excluding Assert.Inconclusive and Assert.Fail), and/or tests that Assert in a loop. The inspection's xml-doc and meta-description would justify the result with the widely-accepted best practice of single test == single assert, and code examples would illustrate how to turn a loop-assert test into a DDT parameterized test with @TestCase annotations; as with all inspections, that could easily be turned off.

All 19 comments

When I have searched the Rubberduck source for references it seems to be an easter
egg status (?). As far as I can tell it triggers if there are more than 11
failures in a test, but I might be wrong.

That is correct... the idea being that a useful unit test should Assert one thing, i.e. have no more than a single reason to fail.

I'm not against discarding it, but I would think implementing data-driven tests (ref. #1229) would be a better approach than using Assert in a loop. Of course without DDT implemented, the "catastrophic failure" easter egg is kind of spoiled. The current-best idea would look something like this (leveraging a new @TestCase annotation to supply the parameter values):

'@TestMethod
'@TestCase("foo", True)
'@TestCase("bar", False)
Public Sub TestMethod1(parameter As String, result As Boolean)
    '...
End Sub

I think having @TestCaseSource(path) and an external file with a specific format (as presented in the linked issue) is probably way overkill, and a lot of work too (there's a reason it hasn't been implemented yet), but supporting multiple explicit @TestCase annotations seems a good compromise.

Exposing Assert.State feels wrong though, and I fear Expect would create confusion and complicate the framework/tooling.

Note that a failed Assert call cannot stop test execution (it's... complicated), so reporting the first failure / failing a test given a single failed Assert call is pretty much the only way we could simulate the behavior of an AssertFailedException.

Any thoughts on proper data-driven tests? DDT wouldn't require any changes to the Assert logic, would remove the need for explicit loops in the tests themselves, would run all iterations, and report success/failure state of each individual one.

Hmm, well, having some sort of way of supporting DDT would obviously help, but it seems to me like you are adding unnecessary restrictions to dictate how people use your tool.

I agree that in a perfect world with perfect code a unit test should test one thing, hence the name, but sometimes, when your code isn't perfect, you'd rather go for testing one concept that feels small enough that the test makes sense. Say there are 10 conditions you feel should be fulfilled in one scenario, and these 10 things depend on each other. Splitting them up might be the more correct thing to do, but it could also bloat your testing environment. Testing something in a non-optimal way is still better than testing nothing.

Exposing Assert.State feels wrong though, and I fear Expect would create confusion and complicate the framework/tooling.

I totally agree, I was just trying to think of ways of circumventing my issue.

But yeah, improved support for DDT would obviously help, but I still don't really see the value of catastrophic failure. To test imperfect code you sometimes need imperfect tests, and I'd rather have imperfect tests testing old code before I refactor, than having to refactor my code so that it works with the unit testing framework I am using.

As for ideas around DDT, I don't have that many. It seems to me like the easiest would be to implement a '@TestFunction annotation that tells the system that the following is a function used for testing, and you then have a syntax for calling this function multiple times with different data. I'm not all that familiar with DDT; I mostly write unit tests where my unit is the smallest independent concept I can think of, not really restricting myself to one point of failure, because in some situations I don't think that necessarily improves anything.

Edit But yes I agree that DDT is probably always preferable to unit tests with loops.

Ok, so dev chat recap:

  • Easter egg test outcome in its current form is harmful and needs to be removed;
  • @TestCase annotation and adding support for parameterized test methods needs to be added;
  • A new inspection would be flagging tests with more than a single Assert call (excluding Assert.Inconclusive and Assert.Fail), and/or tests that Assert in a loop. The inspection's xml-doc and meta-description would justify the result with the widely-accepted best practice of single test == single assert, and code examples would illustrate how to turn a loop-assert test into a DDT parameterized test with @TestCase annotations; as with all inspections, that could easily be turned off.

Sounds great. I did a bit of reading and couldn't find any good arguments for single test == single assert; isn't it better to test concepts than semantics? Sure, you should test one thing, but that one thing will in many cases be much more readable if it is done with multiple asserts. E.g. testing whether a value is within a range is much more readable with two less-than asserts than with one big logic blob. Oh well, this is not the place for these discussions, and if you make it an inspection check I can always turn it off :> This is the best solution I think, because then the users can choose how to use the tool, which should mostly strive to be helpful, not restrictive.

Really looking forward to seeing how DDT could be implemented, that would clean up some of my tests for sure.

  • Official Microsoft guidelines: "When writing your tests, try to only include one Assert per test."
  • A Stackify blog post (no direct link to the "best practices" section): "I won’t go so far as to say that no test should ever contain a number of assertions other than one. But I will say that your unit test suite should have a test to assert ratio pretty darned near 1"
  • Stack Overflow: "Multiple assertions are okay as long as they are all testing one feature/behavior."

But you also have this pretty popular Stack Exchange post where almost every answer agrees that they do multiple asserts in an effort to test one feature/behaviour. One assert just seems like a really unnecessary restriction that doesn't necessarily correlate with what you want to achieve: testing one behaviour.

Again, it seems like something to strive for, because striving for it might give you some clarity on what it is you are actually doing, but the goal itself is nonsense.

Just saying that if you implement that inspection then a lot of people are going to have a lot of useless inspection results because they are testing one behaviour with multiple asserts.

But anyway, I still think it is a good choice, because I can't really think of a better way than to write your own best practices and let the users do what they will do anyway. As I said, it can always be disabled if one disagrees.

I think the Stackify article says it best: "Unit testing newbies commonly make a mistake of testing all of the things in one test method." -- basically the "one test, one assert" guideline is there to remind beginners that the Single Responsibility Principle also applies to test methods, so that when a test fails, it's immediately apparent why.

The accepted answer on that SE thread also puts it nicely:

but I do think we should strive towards only having single asserts in our tests.

Static code analysis really feels like the best way to educate users new to unit testing about this, and a test could always be annotated with @Ignore MultipleAsserts (that would be the quickfix for the inspection), ignoring a specific test without turning off the inspection.
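
As a concrete illustration (a hypothetical sketch: ComputeValue is made up, and MultipleAsserts is the assumed inspection name from the discussion above):

'@TestMethod
'@Ignore MultipleAsserts
Public Sub ValueIsWithinRange()
    Dim result As Long
    result = ComputeValue()       ' hypothetical function under test
    Assert.IsTrue result >= 0     ' two asserts, one behaviour: "result is within range"
    Assert.IsTrue result <= 100
End Sub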

One assert [...] doesn't necessarily correlate with what you want to achieve, testing one behaviour.

True, not necessarily - but the correlation is strong enough in beginner code that flagging multiple asserts would be useful in most situations, given @TestCase parameterized test support.

I agree. Sorry, I didn't mean to argue; I can get quite passionate when discussing rules for best practices, as I quite often feel they should be guidelines rather than rules.

Anyway, reading through what I said, I totally agree that the inspection should be there; I was in essence mostly arguing about whether it should be turned on by default or not. This is obviously not something I should have strong opinions on, as it is very easy for me as an (overly) opinionated user to flip that switch anyway. You know much better than I do what is best for your users, as you are developing a tool used by thousands and I am not. And the fact that you allow me to change the inspections myself, even on single tests/lines with an annotation, shows me that you respect my opinions, and I am grateful.

Thank you for all you do and for looking at my at times unreasonable requests :smile:

@Irubataru thanks for your invaluable input, and keep that passion burning!

You know much better than I do what is best for your users

Not necessarily. The balance is very delicate between promoting best practices, and shoving them down users' throats :wink:

Rubberduck has always been a pretty opinionated tool. I suspect it will stay that way until it becomes absolutely clear that no one is going to step up and create an alternative. That is life.

@retailcoder in regards to data driven tests, can we think about how we might implement them so the user has compiler assistance? I like the annotation as a first step, but it’s not as friendly as it could be. I’ll be honest, I don’t really know how we’d do that. I recently helped a friend implement DDT in Elixir, but Elixir has hygienic macros for meta programming.

DDT is usually implemented by generating tests prior to executing them. I know we’re capable of permanently writing code to the project, but can we temporarily create tests?

If it was me, I would want to write something like this.

'@TestMethod(BarData)
Public Sub BarTest(a, b)
    Assert.AreEqual b, Bar(a)
End Sub

'@TestData
Public Function BarData() As Variant
    BarData = [{1,2; 3,4; 5,6}]    ' pairs of (a, b) test cases
End Function

Which would then generate tests like this.

'@TestMethod
Public Sub BarTest_1_2()
    BarTest 1, 2
End Sub

This gets complicated fast though, because not only do we have to parse the data function, but we need to maybe execute it from the C# backend? I don’t know how feasible it is. Just thinking out loud.

@rubberduck203 the @TestMethod annotation is already parameterized and takes an optional "category" argument, so DDT will require a new annotation; I like @TestCase , with the arguments matching the signature of the test method.

It could look like this:

'@TestMethod("test category")
'@TestCase(1, "foo")
'@TestCase(0, "foo")
'@TestCase(-1, "foo")
'@TestCase(1, "bar")
'@TestCase(0, "bar")
'@TestCase(-1, "bar")
Public Sub SomeTestMethod(ByVal arg1 As Long, ByVal arg2 As String)
   'setup SUT, consume provided parameters, assert once
End Sub

RD would simply iterate all @TestCase annotations [, validate them] and use the annotation parameters to supply the arguments for a test run. Test results could be grouped by default, and results could be expanded by individual test case as needed.
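
For example, the body of such a parameterized test might read like this (a hypothetical sketch; Classify is a made-up function under test):

'@TestMethod("Classification")
'@TestCase(1, "positive")
'@TestCase(0, "zero")
'@TestCase(-1, "negative")
Public Sub ClassifyReturnsExpectedLabel(ByVal number As Long, ByVal expected As String)
    Dim actual As String
    actual = Classify(number)         ' hypothetical function under test
    Assert.AreEqual expected, actual  ' consume the provided parameters, assert once
End Sub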

My initial thought is: what if you have a prime generator and a lookup for, say, the first 1000 primes, and want the test to just run through all of them as a sanity check? Or, in my case, I have a random user-input generator to try to catch some scenarios I can't think of myself, and want to run this 10000 times to see if I am missing something.

I would say that the question of a random data generator is a separate question from the question of data driven test. That definitely needs its own issue & discussion, apart from this one here.

I think allowing parameterized tests with the parameter values supplied by an annotation is a good first step towards enabling data-driven tests. Further enhancements could include adding a @TestCaseSource annotation that can be substituted to a plethora of @TestCase annotations to load the test cases from a file with a specified format - but the first step is really to just enable parameterized tests.

The only way to eat an elephant is one bite at a time.

@retailcoder in regards to data driven tests, can we think about how we might implement them so the user has compiler assistance?

Looking at NUnit as an example, the only compile-time validation we get is whether the attributes are well-formed. I doubt those will get flagged as a compile-time error:

[TestCase("ha ha I am a number!", 42)]
...
public void derp(int someNumber, string someString)

But I appreciate the desire to have compile-time validation of the TestCase annotation in VB. Here's what I think, though - I think it's actually easier to just hijack the existing Compile [project name] menu options and provide our own Compile command, which can be then made to "compile" all the found annotations so that we can validate that those are well-formed just as NUnit's attributes are in Visual Studio.

hm, I'd just make it an inspection - it's not really different than any other IllegalAnnotation IMO.

I would say that the question of a random data generator is a separate question from the question of data driven test. That definitely needs its own issue & discussion, apart from this one here.

Agreed, but wouldn't it be useful, to some extent, if you could find a solution that solves both DDT and some sort of computer-generated testing (not entirely sure what this paradigm is called when you have computer-generated input with bounds for validity)?

Also (sorry for double posting), I am looking at solving the titular issue, i.e. "Unit testing with for loops", and there are generally four scenarios where I find myself doing that:

  1. When I have a large lookup table of the results (sure, I could run a regex on it to generate test cases, but that isn't exactly ideal)
  2. When I am simply replacing a SequenceEquals because it is easier to check the elements individually than create arrays that this method accepts, or if I only want to check a subrange, etc.
  3. When I have random test data
  4. When I have two algorithms that are supposed to do the same thing, or at least close to the same thing, where e.g. the slow algorithm is a tried-and-tested method that I am fairly sure :tm: works, while I want one test that checks that the fast algorithm at least gives the same result as the slow one for a range of inputs (see the sketch at the end of this comment). Again, I am trying to test results, not methods, and if I have a way of generating results with another safe method, this is a good test.

I am not sure how I would implement any of these with an annotation system, except #2 which would be solved by removing the easter egg result.
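
For reference, here is roughly what scenario 4 looks like as a loop test today (a minimal sketch; SlowSum and FastSum are hypothetical implementations):

'@TestMethod
Public Sub FastMatchesSlowImplementation()
    Dim n As Long
    Dim expected As Long
    Dim actual As Long
    For n = 1 To 1000
        expected = SlowSum(n)                         ' trusted but slow reference implementation (hypothetical)
        actual = FastSum(n)                           ' optimized implementation under test (hypothetical)
        Assert.AreEqual expected, actual, "n = " & n  ' message identifies the failing iteration
        If actual <> expected Then Exit For           ' stop at the first mismatch
    Next
End Sub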

PR #4956 was merged just now, and removes the easter egg result - the AppVeyor CI build and subsequent pre-release tag will be available imminently.
