There are ways to more or less easily obtain code coverage information from Rust binaries, some of which are in widespread use (e.g. travis-cargo + gcov + coveralls.io). However, these are either platform-specific (gcov/kcov are Linux-only) or incomplete (coverage for documentation tests is not collected).
It would be better if rustc could instrument Rust binaries and tests using LLVM Coverage Instrumentation. This would allow code coverage information to work portably across the supported platforms, as well as produce reliable coverage information.
Ideally, such support would be integrated into an easy-to-use `cargo coverage` subcommand and the Rust Language Server, so that IDEs can use this information to guide the user while writing tests.
I know I'd love to see this work! Do you know what it would entail, i.e. what hoops a prototype would have to jump through? Also, does this work reliably on platforms like Windows? Or is it still best on Linux and "works mostly elsewhere"?
A prototype would need to generate the LLVM IR for the instrumentation; for that, it needs to generate a mapping between source ranges and performance counters (I would start with just functions). Then it needs to generate and embed the IR in the object files. It would be a good idea to look at how clang then outputs this into a file so we can use the same format (which is documented), and test that we can read it with `llvm-cov` on Linux and macOS (I don't know about Windows support, but since clang is gaining full Windows support, if it's not there it will be there soon).
I think that would be enough for a prototype; from there we can move on to generating instrumentation for conditionals (match/loop/jumps/ifs...) and blocks (how many times a loop body was executed). We could go down to expressions, but then it would be wise to offer users a way to control how much instrumentation is generated: functions, branches, and loops (which are branches) are typically enough. We should then support skipped regions (for conditional compilation), expansions (for macros, maybe plugins), and dealing with unreachable code (there is a performance counter that is always zero for that).
The meat of the work is in the source => performance counter mapping, generating the IR, and generating the conforming output. `llvm-cov` works at least on Mac and Linux, but IIRC clang has options to generate coverage information even for particular gcov versions. The source code for all this is available, so taking a look wouldn't hurt.
> Also, does this work reliably on platforms like Windows? Or is it still best on Linux and "works mostly elsewhere"?
It works on Mac and Linux; I don't know about Windows. Even if it doesn't work everywhere, this is probably the way to make it work on as many platforms as possible anyway. Clang support on Windows is pretty good already and it is only getting better.
@sujayakar's done some awesome work to prototype this on OSX at least, discovering that compiling with `-C passes=insert-gcov-profiling` and linking with `-C link-args=-lclang_rt.profile_osx` works, and that that library can be located via `xcrun --find clang`, in a path like `usr/lib/clang/7.3.0/lib/darwin/libclang_rt.profile_osx.a`. Running the instrumented binary produces a `.gcda` file, and there's an `llvm-cov` tool that can be run over the output file to generate coverage information. That should at least get code coverage working on OSX in some respect! This is also a pretty strong case that it may not be too hard to actually get this working for _all_ platforms if LLVM's standard tool suite "just works".
@alexcrichton That's gcov coverage, which is completely different from what @gnzlbg meant in this issue.
The correct pass name is `instrprof`, implemented in lib/Transforms/Instrumentation/InstrProfiling.cpp. See also LDC's LLVM profiling instrumentation.
@sanxiyn is correct.
I just want to add that I don't think we should implement clang-like gcov coverage generation.
The LLVM format can be used for many more things (e.g. profile-guided optimization), and LLVM's `llvm-cov` tool can generate gcov files from it.
AFAIK, there's a PR open for this: https://github.com/rust-lang/rust/pull/38608
As far as I can tell, #38608 is still gcov coverage and not instrprof.
I believe this is now implemented as `-Z profile` and tracked at https://github.com/rust-lang/rust/issues/42524, so closing in favor of that.
This is not implemented yet. LLVM coverage is different from gcov coverage, which is also different from sanitizer coverage. LLVM implements three separate coverage mechanisms.
Would LLVM coverage (`instrprof`) enable usable code coverage of generics-heavy code? None of the existing code coverage solutions work well with heavy use of generics.
My project (a gameboy emulator) makes very heavy usage of generics and it'd be wonderful to get code coverage working on integration tests but I'm not really sure how.
I think good code coverage support is the single most important piece of tooling missing in Rust.
When I run `cargo test` and everything is green, I get a good feeling, but this feeling actually means nothing at all. To make `cargo test` mean something, it would need to at least tell me how many code-paths of my application have been exercised.
This information does not mean much either (just because a code-path was exercised does not mean that it was exercised for all inputs), but it would mean more than nothing, and it would make writing tests and assessing the value of tests way easier.
IMO adding this kind of capability to `cargo test` (and yes, it should be part of `cargo test`; people should always know their code coverage) would be extremely valuable.
There is only one other piece of tooling infrastructure that would come close to this in value, and that would be an undefined behavior detector for all Rust code.
This might sound like a rant, but I just want to raise awareness that this is a very important issue, at least for me (and, from the other issues being filed, for others as well), because it directly affects the correctness of Rust programs (we don't want them to only be memory safe, but also to have correct logic).
Maybe if `rustfmt`, clippy, and the RLS advance enough during the impl period, the Rust tools team could prioritize these two tools next?
If I don't have precise auto-completion information or my source code is not perfectly formatted, well, I can live with that. But if my program has undefined behavior and I am not catching it because I am only testing 40% of my code-paths then... I am screwed. Does this make sense?
> This is not implemented yet. LLVM coverage is different from gcov coverage, which is also different from sanitizer coverage. LLVM implements three separate coverage mechanisms.
Hi, can anyone explain what the difference is between "LLVM coverage" (`profile-instr-generate`) and "gcov" (`insert-gcov-profiling`)? I don't know the detailed implementations of these two passes, but will `profile-instr-generate` be more accurate than `insert-gcov-profiling`?
I found @kennytm's answer to my question. Let me put it here in case someone like me gets lost in the overwhelming number of references: https://github.com/rust-lang/rust/pull/44673#issuecomment-330435323
I've done a bit of prototyping on this to see what would be involved in using `instrprof` for coverage, and it's revealed (to me) that the coverage aspect of this feature is secondary to the "Profile Guided Optimization" feature that `instrprof` was originally written for.
So, here's a vague "plan of action" (with progress so far) to enable PGO and LLVM-assisted coverage in `cargo`:
This is currently implemented in its most basic form as `-Zpgo-gen`, which adds the `-pgo-instr-gen` and `instrprof` passes to LLVM to add instrumentation of all functions and basic blocks (loop bodies, if statements, etc.).
There may be other places where it's useful to add instrumentation (e.g. adding vtable-downcast counts to enable `-pgo-icall-prom` passes, which turn dynamic dispatch into static dispatch in fast paths).
:+1:
With the above work complete, the generated binaries will count (in-memory) each time an instrumented code path is hit. To extract this information, the profile-rt library should walk the various `__llvm_prf_X` sections of the binary/memory and dump the values out to a file (Clang dumps to `default.profraw`). The format of this file is (as far as I can see) "documented" in LLVM/Clang's codebase. On the other hand, the current implementation of Rustc's profile-rt library is taken wholesale from LLVM and thus already supports this function.
:+1:
With a `default.profraw`, one can create a `profdata` file (using `llvm-profdata merge`). This file can be passed to the `pgo-instr-use` (and other) passes to enable optimisations based on gathered profile information. Again, this is supported in nightly Rustc today (through `-Zpgo-use`).
:+1:
LLVM's coverage tooling can use a "Coverage Map" to map between instrumentation counters and blocks of code. The format of the map is well documented at http://llvm.org/docs/CoverageMappingFormat.html and there's code in LLVM (http://llvm.org/doxygen/classllvm_1_1coverage_1_1CoverageMappingWriter.html) for creating the magic format. Note that this may require Rustc manually inserting instrumentation (rather than using `-pgo-instr-gen`) since it needs to know which counter corresponds to which code block. The code in clang (https://github.com/llvm-mirror/clang/blob/master/lib/CodeGen/CodeGenPGO.cpp) that does this is probably a good place to start.
Note that the LLVM coverage map format supports various useful features, not well handled in GCOV-style coverage:
❌
Creating a `cargo coverage` subcommand (or, possibly better, just creating coverage from `cargo test`) by adding profiling to the `test` cargo profile and running the resulting profile data through LLVM's tooling would be really cool. Especially if the output format is compatible with tools like Coveralls.io and can produce useful graphical outputs for analysis afterwards.
Likely requires stabilising `-Zpgo-gen`, or at least exposing enough of it that cargo can create a profiled build.
:x:
There are some bits of Rust code for which coverage is non-obvious:
* How do we handle non-instantiated generics? If a generic is never instantiated then no code is created for it, so there are no counters for it.
* Similarly, should separate instantiations of the same generic be covered separately? If I test `Foo<i32>` in my UTs but use `Foo<String>` in my real code, have I really tested my code (in this case, `String` needs `drop` so maybe there's a subtlety there)? I don't think `llvm-cov` knows how to handle this in the coverage map at the moment, but it might make sense for it to do so to support this for C++ templates too.
* Proc-macros (and `#derive`s) need care too - the expanded code coverage is potentially uninteresting (if you trust the `proc-macro`) or critical (if you don't).

> How do we handle non-instantiated generics? If a generic is never instantiated then no code is created for it, so there are no counters for it.
I think Rust should behave the same as Clang here. And if I remember correctly, non-instantiated templates are reported.
> Similarly, should separate instantiations of the same generic be covered separately? If I test `Foo<i32>` in my UTs but use `Foo<String>` in my real code, have I really tested my code (in this case, `String` needs `drop` so maybe there's a subtlety there)?
`llvm-cov` at least has the option to show coverage per instantiation (or merged, via `-show-instantiations`). So I think it makes sense to provide this as well.
Great write-up @bossmc! I'll take a stab at adding LLVM coverage map generation, unless it's already being worked on elsewhere?
Whilst I'm happy to try and figure out a solution on my own, a little direction from the compiler team would no doubt ensure that I arrive at something that's acceptable far sooner! Would anyone be willing to mentor this?
Cc @michaelwoerister
I don't have time to mentor this but feel free to ask specific questions!
@bossmc Are you sure that `-Zpgo-gen` (now stabilised as `-C profile-generate`) adds LLVM's `-pgo-instr-gen` and `-instrprof` passes? My understanding is that it "merely" instruments via gcov, which is the very issue under discussion here? Have I missed something?
`-C profile-generate` is exactly the same as Clang's `-fprofile-generate`.
Thanks @michaelwoerister, that's what I thought: the difference being, AIUI, that `-fprofile-generate` instruments the IR whereas `-fprofile-instr-generate` (which this issue seeks) effectively instruments the AST (via source mapping).
Maybe debuginfo can help there?
Unfortunately, for the reasons @kennytm gave in the comment to which @mssun linked above, debuginfo is not sufficient.
@eggyal yes, `profile-generate` enables two passes:
* `pgo-instr-gen`, which heuristically adds http://llvm.org/docs/LangRef.html#llvm-instrprof-increment-intrinsic (and related intrinsics) to the LLVM IR, basically by adding them to the beginning of code blocks.
* `instrprof`, which lowers the above intrinsics to "real code". I assume these are intrinsics so the implementation of profile counting can be smart (thread-safe in threaded programs, using native types where needed, etc.).

Together (with the profile-rt library) these allow generation and subsequent use of profiling data during compilation. Unfortunately, together these two passes are insufficient for code-coverage usage, since the LLVM IR is too far away from the AST. Part 4 of my write-up discusses writing the coverage mapping code that maps AST nodes to instrumentation intrinsics - this is what then enables frontend-guided coverage.
I suspect that adding the coverage map will require disabling the `pgo-instr-gen` pass and adding the intrinsics manually (this seems to be what clang calls `profile-instr-generate`?), because there's Rust-specific knowledge that can/should be applied (e.g. generic expansions, semantically equivalent AST changes, etc.) and because there's no way to guess the profiling entries that will be generated by `pgo-instr-gen`.
https://github.com/rust-lang/rust/pull/48346/files#diff-2030f049f8df2454e3720ea9403bc14aR446 is where we started adding the `pgo-instr-gen` pass.
I am very eager to see this feature work for Rust. I was the person who designed and originally implemented Clang's profile-instr-generate feature (for both PGO and code coverage). I'm currently working on a project that is implemented in Rust and I miss the ability to get good code coverage data. I'm not yet familiar with the internal workings of the Rust compiler, I'd be happy to consult with anyone who is working on it. I can start by confirming that to get the best results, the instrumentation should be added in the Rust frontend by walking the AST, prior to any internal lowering so that the AST corresponds closely to the source constructs. Adding instrumentation in an LLVM pass can work OK for PGO but it's not what you want for code coverage.
Thank you both @bossmc and @bob-wilson — all very helpful! I've spent some time over the past week getting to grips with how instrprof is implemented in Clang, and have begun prototyping some broad structures of a Rust implementation.
I had pretty much come to exactly the conclusion you mention, @bob-wilson: in order to closely align coverage with the source, instrumentation will need to be added before the AST is lowered (or at least, before HIR is lowered to MIR—this seems slightly better to me as macros will then have been expanded). However, in order to maintain backend-independence, this presumably means that new instrumentation intrinsics will need to be introduced into the MIR (and possibly the HIR)?
As an intermediate step before working on that goal, I am currently thinking that I will first instrument the MIR on lowering to LLVM IR—and thereby prove that I have both instrumentation and coverage map generation working at that level first. But perhaps that would be unnecessarily arduous and I would be best to start with the AST/HIR?
Otherwise I do think that I have a reasonable feel right now for my next steps, but am more than happy to work on this with anyone else who'd like to contribute! Also very grateful to those who've offered their input, especially since this is my first foray into the internals of rustc.
I think the instrumentation and coverage map should be generated from the ASTs after macro expansion and before lowering to HIR. Adding the instrumentation is relatively easier than generating the coverage mapping, so I wouldn't bother doing that on MIR first -- you would probably end up throwing most of it away. I do agree with you that it would be best to start with the instrumentation, and adding instrumentation intrinsics to HIR and MIR seems like the right thing to do.
Another thing that I've been pondering is exactly what should be excluded from the coverage map? Clang excludes system headers, but perhaps we should exclude all dependencies from outside the package/workspace being built?
Can anyone confirm the semantics of `syntax::source_map::SourceFile::is_imported(&self) -> bool`?
I think (though I suspect @bob-wilson will know better and can feel free to correct me) that the coverage format has the capability to represent "macros" (remote code) sensibly (https://llvm.org/docs/CoverageMappingFormat.html#file-id) so maybe we'd want to apply instrumentation directly to the parse tree?
@bossmc: Aye, expansion regions are perfect for macros—and they can be fully generated from the Span info in the HIR, whereas the parse tree would not have the expanded code to reference (thus requiring subsequent modification of any found expansion regions to complete the referencing).
Ah yes, I'm being slow :) - so long as the expanded code knows it came from a macro then HIR is the correct point indeed.
Based on the RustC guide, I had the impression that HIR has a number of lowering changes that diverge from a 1-1 mapping to the source code. Trying to map back to the exact source locations for the original constructs seems like it may be difficult. The RustC guide also describes macro expansion as happening prior to lowering to HIR. Is that out of date? It would definitely be best to insert the instrumentation and build the coverage mapping at a point where macros have been expanded, but I was hoping that could be done before other lowering transformations.
Perhaps one of the seasoned rustc developers could chip in on this, as I'm not entirely certain, but after digging through the source for a bit it seems to me that MIR is perfect for instrumentation and source-mapping: I suspect that the coverage regions for a given function are exactly the MIR `Statement` descendants of its `Body` that are `StatementKind::Assign`, and their source locations can readily be obtained from their Span data (yielding either a code region or an expansion region); skipped regions might even be identifiable by `Statement` descendants that are `StatementKind::Nop`, although perhaps some such regions will not have any presence whatsoever in the MIR?... which then might only leave gap regions (that I confess I don't yet entirely understand).
> which then might only leave gap regions (that I confess I don't yet entirely understand)

I understood it to be for empty statements, e.g. `if (condition) {/*gap*/}`?
Edit: see below.
At least that's what I'm gathering from CoverageMapping.h:
/// A GapRegion is like a CodeRegion, but its count is only set as the
/// line execution count when its the only region in the line.
Although you may very well be able to implement some kind of code coverage support at the MIR level, that would be a different approach than what we did in Clang. One key idea from Clang's implementation is that both the instrumentation and the coverage map are directly tied to the source constructs. As one example, take an "if" statement. The instrumentation is done by inserting a counter at the start of the "then" block. If there is an "else" block, there's no need to put a counter there because you can calculate the execution count from the parent count and the "then" block count. This also holds true in the coverage map, where the goal is to precisely map the execution counts back to source locations.

So why should this be done before any lowering? As an example, take one thing that I saw in the RustC Guide: "if let" is lowered to HIR as a "match". A match statement can have any number of arms, so it generally needs to be instrumented differently than an "if" statement. When lowering from an "if let", presumably there is only one arm (or two, if there was an "else") but even if there is a way to recognize that, it would need to be a special case and I suspect it would get quite complicated. Could you just go ahead and instrument it as a "match"? Sure, but then you're likely to run into challenges mapping back to the exact source locations.

I can give an example of a gap region that is somewhat relevant here. Say that you put a blank line between the "if let" condition and the start of the "then" block. What execution count should be displayed for that blank line? Clang treats this as a gap region from the end of the condition to the start of the "then", and it chooses to use the "then" block count for that region. I haven't given it a lot of thought, but I think the answer may be different for a "match" statement. It would be strange for that blank line to show the execution count of the first arm of the match.
That's really helpful, thanks @bob-wilson.
I'm definitely not averse to injecting the instrumentation earlier (it was what I had earlier thought would be needed), but I'm still not completely convinced that MIR lacks the necessary information to produce the desired result. Taking the example you mention, of `if` being lowered to a `match`, the generated MIR `TerminatorKind::SwitchInt` will have the relevant number of `.values`/`.targets`, and a counter expression can be used in place of an outright counter if appropriate, irrespective of whether the original source was an `if` or a `match`. So far from being a special case, it actually enables `match` expressions with fewer arms to be handled with counter expressions in exactly the same way as an `if`. Nor do I think there's any issue mapping back to source locations, as the containing `Terminator` structure has `.source_info` that contains the generating Span.
Part of my comment with that example is that I suspect you might want different behavior for the gap region after the "if let" condition than for the gap region after a match condition with 1 arm, and you may not be able to distinguish them easily after lowering.
The other missing part of this discussion relates to using these profiles for PGO in the context where someone wants to save a profile as part of their project (e.g., in a case where collecting a good profile is labor intensive). We had a design goal that the profile should remain usable in that scenario even if you had updated your compiler version so that the internal lowering changed. We did that by tying the profile structure to the source code structure as reflected in the pre-lowering ASTs. I don't know if Rust will ever support that style of instrumentation-based PGO, but I think it is still a valuable feature for some styles of development and it would make sense to me to implement the coverage profiling in a way that would align with that in case someone wants to support it in the future.
More generally, I think you can probably follow Clang's lead in many aspects of this work. It's an approach that has been tried and works well in practice. If there's a compelling reason to do it differently in Rust, then it might make sense, but otherwise, my advice is to follow Clang's approach.
Is there even a need to leave out the counter on the `else` block? Isn't that one of the reasons that Clang's coverage does not work reliably when exceptions are thrown?
@pJunger: It's not "leaving out" the counter, but rather providing means for deducing its value with zero runtime overhead.
@bob-wilson: I'm not sure I'm following the thinking re the gap region differences following `if let` and `match`; would you be able to elaborate with an example? You're no doubt right about the PGO side of things (especially when rustc's internal structures are still quite fluid) and the principle of following Clang generally... I'll give it some more thought! My only inkling for starting from MIR is that it has essentially done a lot of the grunt work already, and starting from the AST would be duplicating much of that work.
Leaving out the "else" counter reduces the runtime overhead of the instrumentation and produces smaller profile files. There might be some 2nd order effects related to throwing exceptions, but the primary trouble with exceptions is that we thought it would be too expensive to insert a counter after every function call. The issue here is that throwing an exception can essentially cause a difference between the execution count going into a function call and the count coming out of it. The brute force approach to detect that would be to put a counter after every call. You might be able to do something clever with automatically generated exception handlers, but I'm not aware of anyone who has looked into that.
@eggyal : Consider "if let condition" followed by a blank line vs. "match condition" followed by a blank line. What execution count should be displayed for that blank line? For the "if", Clang shows the "then" count. The corresponding behavior for the "match" (or a match lowered from "if let") would be to show the count for the first arm of the match. That just seems wrong to me.
Okay, I understand... although wouldn't this be more a coverage tool issue than an instrumentation issue? Indeed, shouldn't the coverage tool show the execution count of any lines between if/match arms as being either skipped or that of the containing `if { }`/`match { }` region itself, i.e. without any need for explicit gap regions?
This is definitely outside my area of expertise, but some observations:
* the `if let` vs `match` distinction is captured in the `MatchSource` field of each `hir::ExprKind::Match` value.

Yes, it's a coverage tool issue. There are a lot of cases like this where there is no clear right answer and the best compromises require understanding the semantics of the code. (I.e., you want the compiler to figure it out instead of relying on some pattern matching in an IDE editor.) The instrumentation part is relatively easier than the coverage mapping, and you definitely want the instrumentation and coverage mapping to be generated from the same ASTs, so be sure to think about the coverage tool issues when deciding where to insert instrumentation.
Regarding the gap regions, I'm not the expert on that part, but I remember that there was quite a lot of iteration trying to come up with solutions that most people were happy with. It's really not easy. For example, for blank lines between match arms, you don't want them to show up as being unexecuted, but using the count from the containing region can give very surprising results, e.g., if none of the match arms ever execute but the blank lines between them have large non-zero counts from the enclosing region. Most people seem to expect that those blank lines will be included as part of the code region right before them, but then you run into problems with gaps after control flow.
@rkruppe Clang does not generate coverage instrumentation from LLVM IR. That's way too low level to do a good job. For gcov-style instrumentation, we used to rely on debug info to get back to source locations, and that really doesn't work well in practice. Debug info tends to be optimized to work well for single stepping in a debugger, and it's not so good for precisely identifying code coverage regions.
@bob-wilson Obviously this is better for runtime performance & data sizes.
I'm just asking because in the automotive industry every decision & condition has to be covered (including those that throw/panic), so taking over the design from Clang 1:1 would make this instrumentation useless for that. (And this is a bit sad, because otherwise you could interpolate C/D coverage from those counters and compete with Bullseye Coverage and the like.)
@bob-wilson
> @rkruppe Clang does not generate coverage instrumentation from LLVM IR. That's way too low level to do a good job. For gcov-style instrumentation, we used to rely on debug info to get back to source locations, and that really doesn't work well in practice. Debug info tends to be optimized to work well for single stepping in a debugger, and it's not so good for precisely identifying code coverage regions.
Yes, I draw the analogy between generating instrumentation from MIR and generating it from LLVM IR to suggest that generating it from MIR won't be any better.
> Regarding the gap regions, I'm not the expert on that part, but I remember that there was quite a lot of iteration trying to come up with solutions that most people were happy with. It's really not easy. For example, for blank lines between match arms, you don't want them to show up as being unexecuted, but using the count from the containing region can give very surprising results, e.g., if none of the match arms ever execute but the blank lines between them have large non-zero counts from the enclosing region. Most people seem to expect that those blank lines will be included as part of the code region right before them, but then you run into problems with gaps after control flow.
Not that I propose addressing it, but this does rather sound like a UI issue: the execution counts relate to regions rather than lines, but (presumably for historical reasons) the tools only display line counts, and these problems seem to stem from that disparity? Anyway, I almost have both instrumentation and mapping generation working from MIR and hope from that to glean a much better understanding of these problems! I will then work on moving to the AST instead.
@bob-wilson: I'm delighted to say that I now recognise the folly of trying to generate these mappings from the MIR: it's not practical to comprehend program flow through the BasicBlocks well enough to establish the code regions. Nevertheless I'm pleased that I undertook the exercise to understand the problem better—thank you for patiently indulging me whilst I caught up with you!
By the way, maybe this goes without saying, but I definitely recommend following the Clang implementation as closely as makes sense. For the instrumentation code, see lib/CodeGen/CodeGenPGO.{h,cpp}. The coverage mapping code is in lib/CodeGen/CoverageMappingGen.{h,cpp}
I guess one difference will/should be that `return` and `?` need to be handled?
Any news about this? 💔
@erikjara I've made some good progress but then some other projects demanded quite a bit of focus and I haven't looked at this for a good month or so—hoping to get back on it after Christmas.
(For others who hadn't noticed this, the first part of this has now been merged in #73011!)
So I just saw the help text for the flag:

> `-Z instrument-coverage=val` -- instrument the generated code to support LLVM source-based code coverage reports (note, the compiler build config must include `profiler = true`, and is mutually exclusive with `-C profile-generate`/`-C profile-use`); implies `-C link-dead-code` (unless explicitly disabled) and `-Z symbol-mangling-version=v0`; and disables/overrides some optimization options (default: no)
And there is an open issue that can make `-C link-dead-code` problematic if you have a crate doing C bindings in your build tree, which would definitely impact the usability/accuracy of the results: https://github.com/rust-lang/rust/issues/64685
> there is an open issue that can make `-C link-dead-code` problematic
@xd009642 Thank you!! ❤️
You provided a key clue to resolving an issue running Rust binaries instrumented with `-Zinstrument-coverage` on MSVC!
Linking never failed (or at least, it didn't generate error messages), but when running the program, the program would usually SegFault during the post-exit phase that saves the profiling counters to the `*.profraw` file. (If it didn't crash, it would produce an empty file.)
I inspected the generated instructions in the binary in "IDA Pro" and in the Visual Studio debugger, and could see some major problems with missing or misaligned symbols/addresses, resulting in some instructions being optimized out, which explained the SegFaults, but I was not able to understand why it was linked wrong.
I disabled the dead code option with `-Clink-dead-code=no` and everything worked!
I haven't had an issue with `link-dead-code` on Linux or Mac, or even on Windows under GNU, but it was consistently failing under MSVC.
I'll change the option to enable `link-dead-code` ONLY if not building for MSVC, and hopefully that will be good enough.
I've collected some debugging notes, which you can see here. I'll also link them in the bug report, but I'll update them with this information as well.
FYI: @tmandry @wesleywiser @petrhosek
Looks like I have the fix for `-Zinstrument-coverage` under MSVC! 🎉
@richkadel profiling does not work with windows-gnu when building profiler library with GCC (but it does with Clang) and is not tested on the CI. With GCC it segfaults like that https://reviews.llvm.org/D86405#2232228
I'll open PR to Rust side of the issue today, LLVM side is waiting on https://reviews.llvm.org/D86405.
@mati865 Thanks for the additional info. When you say "profiling" here, you're referring to PGO, right?
Just to be clear, my comment was not about Rust's PGO, but was about the Clang InstrProf coverage profiling feature, enabled by `rustc -Zinstrument-coverage`. I believe the InstrProf profiling depends on Clang and (I assume) would not work with GCC.
(I should say, InstrProf depends on LLVM, that is.)
@richkadel I'm talking about `llvm-project/compiler-rt/lib/profile`, which is used for both PGO and code coverage.
GCC somehow miscompiles that library so we cannot test it on CI. When that library is built by Clang (and used by rustc) code coverage doesn't cause segfaults in the resulting binaries.
Ok, yes. (The notes I linked above are pretty extensive, but include disassembled views of the results from both Clang and rustc.) At any rate, rustc appears to be working with that library now, as long as I disable `link-dead-code`. (Granted, I've only tested a few small binaries, but given all of the things I've tried, with no significant difference or improvement, I'm optimistic that `link-dead-code` was the reason. Whether this helps PGO or not, I cannot say.)
@richkadel glad my comment was helpful :+1:
So the approach I'm taking with tarpaulin is to use syn to find coverable lines so I don't need to link dead code and can get the same results. I've been partially taking this approach for code which can be missed out regardless of dead code. But a combination of linker issues with `-C link-dead-code` and users having issues with compile times and target directory sizes makes me feel this is the best solution (at least for tarpaulin).
Just to elaborate on "code which can be missed out regardless of dead code". If there's a generic function that's unused no code is generated so traditional coverage tools will miss it out without using source analysis to catch the code.
I am really looking forward to this llvm coverage stuff making it into stable rust, because then I can use this as a coverage collection backend and keep the source-analysis and reporting code on either end and run it without dead-code. :grin:
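The "unused generic" point can be sketched like this (the `largest` function is invented for illustration): a generic function that is never instantiated produces no monomorphized machine code, so a binary-level coverage tool has nothing to attribute to it.

```rust
// Hypothetical illustration: if `largest` is never called anywhere in
// the crate, no monomorphized copy of it is emitted into the binary,
// so a purely binary-level coverage tool cannot see that it is
// uncovered. Only source analysis (e.g. with `syn`) can still flag it.
#[allow(dead_code)]
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    items.iter().copied().fold(None, |best, x| match best {
        Some(b) if b >= x => Some(b),
        _ => Some(x),
    })
}
```

(If some other code in the crate did instantiate `largest::<i32>`, that one instantiation would appear in the binary and be countable; the uninstantiated cases would still be invisible.)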
I'm sorry if this is clearly mentioned in the llvm coverage notes (if so I missed it), but does llvm coverage account for condition coverage? Instead of just branch coverage (evaluating the boolean condition for a branch), condition coverage looks at the boolean subconditions: for `if a && b` it will report on `a` being true/false and `b` being true/false/X (X for short-circuiting).
Yes, Boolean short circuiting will definitely be "covered" (no pun intended) 😄
so just to be extra explicit if I have:

```rust
if a || b {
    println!("foo");
} else {
    println!("bar");
}
```

And test with only `a` toggling between true and false and everything else false, that would be 50% condition coverage, but 100% branch coverage (and also 100% line coverage).
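A minimal sketch of that scenario (the `choose` helper is invented for illustration) makes the distinction concrete:

```rust
// Hypothetical helper wrapping the `if a || b` example above. A test
// that only toggles `a` (with `b` always false) executes both branches,
// giving 100% branch and line coverage. However, `b` is either
// short-circuited (when `a` is true) or only ever observed as false,
// so its true case is never exercised and condition coverage is
// incomplete.
fn choose(a: bool, b: bool) -> &'static str {
    if a || b { "foo" } else { "bar" }
}
```

Calling `choose(true, false)` and `choose(false, false)` hits both the "foo" and "bar" arms without ever evaluating `b` as true.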
Doesn't this mean that you're missing one or more references that cause the linker to drop symbols that are being used? Note that profile instrumentation is currently incompatible with `--gc-sections`, which is why this issue wouldn't manifest on Linux even if it was present there.
@xd009642 - Some of the specifics are part of what I'm finally nailing down now, as the final phase of the implementation.
I expect to produce results that resemble the coverage spans produced by Clang, for cases where the syntax is similar (like an if/else, as in your example).
But (at least in my opinion) accurate coverage of the executable code regions is what's most important. Deciding what surrounding syntax elements should also be included is partly a question of style and readability.
It's easier for me to address your specific example in terms of the code regions and their coverage counts, which is how LLVM InstrProf encodes the coverage, than it is percentages and line counts, which are computed by LLVM tools after the fact. So hopefully you can extrapolate from this similar example:
```rust
if a == b || b == c {
    println!("foo");
} else {
    println!("bar");
}
```
Assuming `a` and `b` are the same value, then it doesn't matter what `c` is, so LLVM's coverage tools would report:

- `println!("foo")` was executed 1 time. (Like with the `||`, the code region may also include the `else` keyword, but I'll check out Clang's answer to this and probably replicate its approach.)
- `println!("bar")` was executed 0 times

Note, if this snippet represented the entire function, there's an implicit `return ()` here that might need to be accounted for, but without any actual source code to construct a "code region" for that, it may not show up in the `llvm-cov show` reports. Nevertheless, I think it may be important to report this since an early `return` (say, if there was one right after `println!("foo")`) would bypass the implicit return, and someone might want to know that.
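The early-`return` situation can be sketched with a hypothetical variant of the snippet that returns a value, so the two exit paths are visible as distinct regions:

```rust
// Hypothetical variant: an explicit early `return` in the "foo" arm
// bypasses the tail of the function, so the early-return path and the
// fall-through (implicit-return-like) path can carry different
// execution counts in a coverage report.
fn report(a: u32, b: u32, c: u32) -> &'static str {
    if a == b || b == c {
        return "foo"; // early return: skips the rest of the body
    }
    "bar" // tail expression, analogous to the implicit return path
}
```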
@petrhosek - That's really interesting. Now that you mentioned it, I finally see that a Rust, linker-dependent function called `gc_sections()` is called/enabled based on the value of `-Clink-dead-code` (but note, it's only enabled if link-dead-code is turned off; see my follow-up comment):
There are only about 3 or 4 places that the link-dead-code option affects `rustc`. I'm not sure what would happen if I just enabled or disabled the `gc_sections()` option, without changing the other link-dead-code dependencies.
I'm not sure I understand what you're implying about `--gc-sections` compatibility and Linux. The fact is, the `-Z instrument-coverage` implementation does work on Linux and Mac with `-C link-dead-code`, _and_ I've seen the coverage results on Linux with link-dead-code both enabled and disabled. Both worked, and they produced different coverage results consistent with expectations (showing zero coverage in one example, for dead code, and showing no coverage results for the dead code when link-dead-code is disabled).
I should also add that the `gc_sections()` function in the referenced `link.rs` snippet does not directly translate to the `--gc-sections` flag. It depends on which linker is used, and on the value of `keep_metadata`. Here's what rustc does for the MSVC linker:
(FYI, I had to edit the comment before last. I made a couple of incorrect statements originally.)
@richkadel awesome that makes sense and I can see how to derive other coverage metrics from the info :+1:.
Regarding dead code not causing issues on linux, maybe try with https://github.com/rust-ndarray/ndarray-linalg — I've found I can get issues with crates that use FFI libraries in their dependency tree. Although, there's always a chance this is some interaction of link-dead-code and the relocation-model I need to use in order to instrument the binaries with breakpoints (an issue not present with LLVM instrumentation).
I wonder if a possible solution is to apply link-dead-code only to the crate's libraries/binaries and not the dependencies. That would at least isolate the problem if (like my issue) it only applies to a subset of crates - but if one of them is in the dependency tree it impacts the whole project. EDIT: `cargo rustc` seems to be the solution to experiment with it.
FYI: As of PR #78267, LLVM InstrProf-based code coverage is "feature complete"! (I believe this last PR will be in the Rust `nightly` distribution channel tomorrow, Nov 7.) It has been tested on a few large crate implementations, but will need more real world testing. Please cc me on any bug reports.
Usage is now documented in The Rust Unstable Book.
@richkadel which version of `llvm-profdata` does it require to work? I installed the one in my distro's package manager and all it gives is an unhelpful message saying it's an incompatible version. When I'm on my machine later I can get the versions and error message.
Rust builds with LLVM 11 currently, so use that if you can. 10 may or may not also work.
Now `llvm-cov` should be shipped through `llvm-tools-preview`?
Yes! Thanks for the PR @dalance !
Most helpful comment
I think good code coverage support is the single most important piece of tooling missing in Rust.
When I run
cargo test
, and everything is green, I get a good feeling, but this feeling actually means nothing at all. To makecargo test
mean something, it would need to at least tell me how many code-paths of my application have been exercised.This information does not mean much either (just because a code-path was exercised does not mean that it was exercised for all inputs), but it would mean more than nothing, and it would make writing tests and asserting the value of test way easier.
IMO adding this kind of capability to `cargo test` (and yes, it should be part of cargo test; people should always know their code coverage) would be extremely valuable. There is only one other part of tooling infrastructure that would come close to this in value, and that would be an undefined behavior detector for all Rust code.
This might sound like a rant, but I just want to raise awareness that this is a very important issue at least for me (and, from the other issues being filed, for others as well) because it directly affects the correctness of Rust programs (we don't want them to only be memory safe, but also to have correct logic).
Maybe if `rustfmt`, clippy, and the `RLS` advance enough during the impl period, the Rust tool team could prioritize these two tools next? If I don't have precise auto-completion information or my source code is not perfectly formatted, well, I can live with that. But if my program has undefined behavior and I am not catching it because I am only testing 40% of my code-paths then... I am screwed. Does this make sense?