Go: testing: exclude unreachable code in test coverage

Created on 5 Apr 2019 · 18 comments · Source: golang/go

What version of Go are you using (go version)?

$ go version
go version go1.12.1 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output

$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/santhosh/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/santhosh/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/v5/44ncgqws3dg3pmjnktpt_l2w0000gq/T/go-build063045597=/tmp/go-build -gno-record-gcc-switches -fno-common"

I have code like this in my project:

~~~go
package mypkg

import "fmt"

const trace = false

func doSomething() {
	if trace {
		fmt.Println("some information")
	}
}
~~~

Because trace is false, the coverage report marks the tracing code as not covered. Can such code be shown as "not tracked" instead?

I enable trace via build tags when we are debugging some issue in our code. I could enable trace during coverage generation, but the tracing output is large and it takes significant time.

I have seen that the types package from the standard library also uses this if trace pattern, so maybe this can be taken as an enhancement.

FeatureRequest NeedsDecision

All 18 comments

Flagging statically-unreachable code seems like a useful signal to me, and whether the compiler is able to determine statically that it is unreachable is an implementation detail.

100% code coverage is usually not an appropriate goal for Go projects. Can you give some more detail as to why this particular behavior is important?

I am not targeting 100% coverage. The code coverage numbers do not reflect the expected ones: the generated HTML report shows all tracing code as red, so when reading reports I have to mentally discard all those tracing lines highlighted in red.

Currently my workaround is to enable the trace variable but not generate any trace during tests, using build tags. This keeps code coverage tracking those lines.
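
A sketch of how such a workaround might look (the names, and the idea of routing output through a writer variable, are my assumptions rather than the poster's actual code):

~~~go
package mypkg

import (
	"fmt"
	"io"
)

// Hypothetical reconstruction of the workaround: for coverage runs, trace is
// forced on (e.g. via a build-tagged file) and output goes to io.Discard, so
// the guarded lines execute and are tracked without producing any output.
const trace = true

var traceOut io.Writer = io.Discard

func doSomething() {
	if trace {
		fmt.Fprintln(traceOut, "some information")
	}
}
~~~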

I was looking for a similar feature, not so much to reach 100% coverage, but because when reviewing coverage to determine if uncovered lines _should_ be covered, it's useful to have an annotation that says "not reachable, expected to be red". I guess I could literally put such a comment in my code, but I think it'd be pretty nice to get a report that said "92% covered, 6% uncovered, and 2% marked unreachable".

I think of uncovered code a bit like lint output or compiler warnings. Developers shouldn't have to re-evaluate, every time they check out a project and run lint, whether the output is relevant. If it's been decided that it is irrelevant, it's handy to suppress it with in-line markup, and then nobody has to see the output again and go check whether it's valid.

@sam-github I agree with your rationale; this would be nice to have.

totally nice to have

@maxim-ge, @faraonc: please see http://golang.org/wiki/NoPlusOne.

I'm not sure I understand the rationale behind "100% code coverage is usually not an appropriate goal for Go projects". What causes a project written in Go to need different coverage metrics than the same project written in another language such as Python or Java? Those languages support annotations to mark uncoverable regions (like untestable error handling), such that you can tell at a glance that 100% of your coverable code is in fact covered.

Uncovered code is code that has no evidence whatsoever that it does what it's supposed to under any circumstances, nor is it robust against regressions.

@kstenerud, a project written in Go doesn't need different coverage metrics than the same project written in Java. (100% coverage is not usually an appropriate goal for Java projects either.)

However, a Go project usually does not require the same coverage as a project written in Python, unless the Python project is making extensive use of analyzers such as PyChecker or mypy. That is because you can rely more on the Go compiler itself to flag common issues on rare paths (such as misspelled variable names in panic statements). In that sense, the type-checker itself is a form of “test coverage”, and necessarily already covers 100% of lines — so if you're comparing to a Python project without an additional analyzer, the comparable Go coverage is effectively already 100%.

(100% coverage is not usually an appropriate goal for Java projects either.)

100% code coverage is typically not worth the time investment required, regardless of language. I agree that higher code coverage is a good goal in non-typed languages.

All that being said, I think it would be helpful to mark some lines as ignored for code coverage purposes. It can be very difficult from a unit test perspective to get the code into some states.

This ticket is over a year old and still open. That's a shame.

@bcmills Static type checking is not code testing. They are completely different things. If tests and coverage were only for type checking, then there'd be no need for unit tests in go at all.

The point of 100% test coverage with "unreachable" annotation support is so that you can run your coverage tool and see at a glance (or with CI rules) that 100% of the code that you WANT to have covered has in fact been covered. As it is now, you have to manually step through the entire report, checking every red line to make sure that it's a code path you actually don't intend to test, ON EVERY CHECKIN (and doing this automatically in your CI is impossible). This leads to human error, and libraries that aren't properly tested and will instead blow up in the wild on some edge case because getBaseFactor() deep in your library should have returned 5113 instead of 5118, but someone forgot to write a test covering it, and it got missed in the 10,000 line coverage report (that nobody reads anymore because they don't want to spend 20 minutes scanning it on every checkin).

@kstenerud

Static type checking is not code testing. They are completely different things.

They are different points in a continuum of “tools that verify assertions about program behavior”.

A type-checker verifies assertions, written inline in the program source, that the program does not deviate from the type system. Note that the same program may compile with type declarations of different strengths: for example, a program that uses direction-annotated channels will also compile, but with less type information, if the direction annotations are all removed.

A unit-test or regression-test verifies assertions written in the test function about program deviations from the behavior expected by the test.

You can fairly easily trade off between the two: for example, you could use the type interface{} everywhere and type-assert variables at all points of use, then write unit-tests to ensure that the assertions are correct.
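
A tiny illustration of that trade-off (the names are mine):

~~~go
package tradeoff

// With static typing, the compiler proves on every path that n is an int;
// no test is needed for that property.
func doubleTyped(n int) int { return n * 2 }

// With interface{}, the same guarantee becomes a runtime assertion, and only
// a test that actually executes this line gives any confidence in it.
func doubleDynamic(v interface{}) interface{} {
	n := v.(int) // panics at runtime if v is not an int
	return n * 2
}
~~~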

So what about 100% coverage? 100% test coverage asserts a very trivial property: namely, “there exists some condition under which the line does not panic without a corresponding recover”. That is a very, very weak property — on par with “the line does not contain a type error”, and certainly not enough to be confident that the code is actually adequately-tested. So either way you end up needing to think about which scenarios you've tested — not just which lines.

Some further reading:

@santhosh-tekuri, I've been thinking about your use-case some more.

I think the const-guarded lines really should be marked as uncovered. If you have untested lines of code that are unreachable, but that could be _made_ reachable by changing some constant (which you do intend to change), then that code _really is_ untested. When you change that constant and re-run the code, you might get arbitrary panics from that portion of the code, and it might not do at all what you expected it to do.

So for that kind of use-case, perhaps it would be better to use a build constraint to adjust the constant. Then you can run the tests once with and once without the matching tag, and merge the resulting coverage to find out what code is _actually_ covered by the net effect of running the test both ways.
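
For illustration, a minimal sketch of that setup (the tag and file names are my own choices; `//go:build` is the modern spelling of build constraints, while releases of this era used `// +build` lines):

~~~go
// trace_off.go: the default build, used when the "trace" tag is absent.

//go:build !trace

package mypkg

const trace = false
~~~

~~~go
// trace_on.go: compiled only when testing with -tags trace.

//go:build trace

package mypkg

const trace = true
~~~

Running go test -coverprofile=off.out once without the tag and go test -tags trace -coverprofile=on.out once with it yields two profiles whose union is the coverage described above; merging them automatically is the subject of #28235 below.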

There are still a few cases that are legitimately _genuinely_ unreachable — such as lines that panic due to the violation of some invariant that should be guaranteed to always hold — but even those can often be tested by intentionally violating the invariant in a program run as a subprocess.

(Merging coverage profiles from multiple processes is #28235.)

(For genuinely-unreachable code that diagnoses invariant violations, see also #30582.)
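
A sketch of that subprocess pattern (every name here is illustrative, and the child's coverage data only reaches the report once multi-process profiles can be merged, per #28235 above):

~~~go
package invariant_test

import (
	"os"
	"os/exec"
	"testing"
)

// mustBeNonNegative stands in for a function guarding an internal invariant.
func mustBeNonNegative(w int) {
	if w < 0 {
		panic("negative width")
	}
}

// The test re-runs its own binary as a subprocess with a marker environment
// variable set; the child deliberately violates the invariant and must die.
func TestInvariantPanics(t *testing.T) {
	if os.Getenv("BE_CRASHER") == "1" {
		mustBeNonNegative(-1) // should panic
		return
	}
	cmd := exec.Command(os.Args[0], "-test.run=TestInvariantPanics")
	cmd.Env = append(os.Environ(), "BE_CRASHER=1")
	err := cmd.Run()
	if e, ok := err.(*exec.ExitError); ok && !e.Success() {
		return // the child crashed, as expected
	}
	t.Fatalf("subprocess ran without crashing; err = %v", err)
}
~~~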

@bcmills please don't patronize me; I've been writing unit tests for 20 years. Any system can be cheated. You can write good tests, and you can write bad tests, and no policy or system is going to force you to write good ones. All you have at the end of the day is your own discipline, and the tools you use to help you maintain that discipline.

This isn't about some magical "100% green" badge of approval. It's about catching human error via good hygiene. If you keep a "100% coverage" policy in your CI system, you'll automatically be notified when a code change is probably not covered by unit tests exercising the reason for the change. THAT's why Python does it, and THAT's why Java does it. The type system doesn't even enter into it.

Of course 100% coverage can be a weak property if that's how you treat it. But as part of a disciplined team it's a godsend. Your tools warn you of potential problems, and of course how you react to those warnings is your own business. You CAN ignore that the canary has died, or maybe nail it to its perch and say everything's fine, but some of us would rather take more proactive measures when we see warning signs, or make an informed decision that it's a false alarm (add a "don't report test coverage for this" annotation). All we want are the tools so that we CAN practice good hygiene in go, like being notified whenever we might have forgotten to test some parts of the code changes we'd planned to.

Besides agreeing with everything that @kstenerud just said, I'd like to emphasize that the following statement by @bcmills is in my opinion a very inaccurate description of what test coverage can tell us:

100% test coverage asserts a very trivial property: namely, “there exists some condition under which the line does not panic without a corresponding recover”. That is a very, very weak property — on par with “the line does not contain a type error”, and certainly not enough to be confident that the code is actually adequately-tested.

The point of covering a code path is not _just_ to prove that it doesn't panic. It's also to prove that the code path was executed, _and therefore if there is a logic mistake within that path, there is a chance that tests will notice it_. If a code path has no coverage, you may wrongly think you are testing it when you're not.

It's not hard to think of situations where what appeared to be a successful test actually just means that you have not accounted for every relevant permutation of preconditions. We've probably all done this quite a few times. Take this deliberately silly example:

~~~go
func F(n int, b bool) int {
	if n == 5 && b {
		return 0
	}
	return n + 1
}
~~~

If none of our tests happen to invoke F with the parameters (5, true), then we can't say whether every related piece of our code was written with a correct understanding of how F behaves, so something might break under those conditions. Of course, it's a basic principle of test design that if you _know_ there is a specific conditional behavior like this, then you should make sure your tests cover all relevant permutations. But it is very easy to forget that. And even if you don't forget, it's also easy to write a test that _at the time you wrote it_ exercised all the code paths... but then due to a logic change somewhere else, those preconditions no longer result in calling F(5, true).

So it is very handy to be able to look at a coverage report and see that the return 0 block is never being called in tests. That has nothing to do with thinking it might panic.
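
To make the failure mode concrete, here is a hypothetical test for F that passes cleanly yet never executes the return 0 branch, so only the coverage report reveals the gap:

~~~go
package f

import "testing"

// Every case passes, and the test "covers F" in the casual sense, yet
// F(5, true) is never invoked: the return 0 branch stays red.
func TestF(t *testing.T) {
	cases := []struct {
		n    int
		b    bool
		want int
	}{
		{0, false, 1},
		{5, false, 6}, // n == 5, but b is false
		{4, true, 5},  // b is true, but n != 5
	}
	for _, c := range cases {
		if got := F(c.n, c.b); got != c.want {
			t.Errorf("F(%d, %v) = %d; want %d", c.n, c.b, got, c.want)
		}
	}
}
~~~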

No one here has claimed that perfect coverage means your tests are perfect. The point is the converse: if there's a gap in coverage that you did not anticipate, then you _know_ your tests are _not_ perfect. And therefore, it's desirable to be able to exclude gaps that you're already aware of and have determined are either not significant or impossible to cover in tests, so that any new gaps stand out.

I agree with the utility, and ultimately it'd be nice to have something like this in Go itself, but does that have to be the first step? The impression I get isn't that Go has excess developer resources, so an external tool seems more appropriate. If that gets lots of attention and usage, I'd assume the chances will also be higher of getting changes about this issue into Go itself. I am currently only aware of https://github.com/dave/courtney, which already supports e.g. marking code as not tested; maybe there are other tools that could be used and extended.

Ah perfect! I'll use courtney then. TBH I'm rather surprised that Google hasn't put some effort into this kind of code coverage tool. A tool that alerts you when your code changes probably haven't been adequately tested will reduce your fail rate in production, and that always makes the bean counters happy.
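
For reference, if I read courtney's README correctly, exclusion is driven by a notest comment inside the block; applied to the original example it would look roughly like this (treat the exact marker as an assumption to verify against the tool's documentation):

~~~go
func doSomething() {
	if trace {
		// notest
		fmt.Println("some information")
	}
}
~~~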

FWIW, I agree there should be a mechanism for explicitly marking some statements as never reachable when evaluating test coverage. In any non-trivial program there will always be some code paths that will only ever be executed in what are effectively impossible scenarios. Yet you don’t want to continue as if everything was okay if the impossible happens. So you attempt to handle the impossible case. Injecting a mock to exercise those scenarios can be difficult to justify. Note that such cases should be a negligible percentage, well under 0.001%, of all testable statements. This isn't, in my opinion, about gaming the numbers. It's about making it easy for a project to annotate the code so it is clear what code falls into the "we don't test this because it can't happen in practice, but theoretically could" bucket while making the coverage data more useful.

Update since I wrote the above on Aug 8:

I was looking at the Elvish shell test coverage and noticed one file had exactly one line that was not covered: https://codecov.io/gh/elves/elvish/src/master/pkg/wcwidth/wcwidth.go. Specifically the panic in this block:

~~~go
if w < 0 {
	panic("negative width")
}
~~~

It's essentially an assertion that validates that the caller of the function is not violating its contract. There is no meaningful way, or reason, to test that. Having a way to mark that block as excluded from coverage reports would be useful.
