Hi everyone,
I'd like to write a suite of tests purely to verify that methods I've written do not make heap allocations.
Utilising BenchmarkDotNet's MemoryDiagnoser seems like an easy way to achieve this.
However, benchmarks can take minutes to run, which is not ideal when I'm only after the memory usage insights.
Is there a way that I can configure my benchmarks such that they're only executed a few times? This would result in MemoryDiagnoser output, but the benchmarks would conclude almost immediately.
Thanks.
Dan
hi @Lothy
We always perform one run for the accurate time measurements and a separate one for the diagnosers (they could otherwise affect the results).
You should be able to use MemoryDiagnoser with our InProcessToolchain, which does not build and run a separate process for every benchmark. If that is still too slow, you can play with the settings to reduce the time. @ig-sinicyn could tell you more about it.
An alternative could be to use the dotMemory Unit framework.
@Lothy
Hi! Yes, it should be possible to perform short runs using InProcessToolchain together with a job that has a hardcoded iteration count / target count.
Something like this:
    public static readonly Job DefaultJob = new Job("Competition")
    {
        Env =
        {
            Gc =
            {
                Force = false
            }
        },
        Run =
        {
            LaunchCount = 1,
            WarmupCount = 128,
            TargetCount = 256,
            RunStrategy = RunStrategy.Throughput,
            UnrollFactor = 16,
            InvocationCount = 256
        },
        Infrastructure =
        {
            Toolchain = InProcessToolchain.DontLogOutput
        }
    }.Freeze();
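For completeness, here is a rough sketch of how a job like this might be wired up with MemoryDiagnoser and run. It assumes the 0.10.x API used in this thread (later versions renamed some members, e.g. TargetCount). `SampleBenchmarks`, `SumArray`, and the "ShortAllocationCheck" job id are hypothetical names made up for illustration, and the counts are the reduced ones reported later in the thread.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Engines;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Toolchains.InProcess;

// Hypothetical benchmark class; replace with the methods you want to check.
public class SampleBenchmarks
{
    private readonly int[] _data = new int[1024];

    [Benchmark]
    public int SumArray()
    {
        int sum = 0;
        for (int i = 0; i < _data.Length; i++)
            sum += _data[i];
        return sum;
    }
}

public static class Program
{
    // Short job: few iterations, in-process, no forced GC between runs.
    private static readonly Job ShortJob = new Job("ShortAllocationCheck")
    {
        Env = { Gc = { Force = false } },
        Run =
        {
            LaunchCount = 1,
            WarmupCount = 16,
            TargetCount = 16,
            RunStrategy = RunStrategy.Throughput,
            UnrollFactor = 16,
            InvocationCount = 16 // must be a multiple of UnrollFactor
        },
        Infrastructure = { Toolchain = InProcessToolchain.DontLogOutput }
    }.Freeze();

    public static void Main()
    {
        // Start from the default loggers/columns/exporters, then add the
        // short job and the memory diagnoser (adds the "Allocated" column).
        var config = ManualConfig.Create(DefaultConfig.Instance);
        config.Add(ShortJob);
        config.Add(MemoryDiagnoser.Default);

        BenchmarkRunner.Run<SampleBenchmarks>(config);
    }
}
```

Because everything runs in the host process, make sure the host itself is built in Release mode, otherwise the timings (though not necessarily the allocation figures) will be misleading.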
If you want to perform checks on a regular basis, you may also try the CodeJam.PerfTests project. Here is an example of a perf test that checks some allocations:
    [Category("PerfTests: NUnit examples")]
    public class ListCapacityPerfTest
    {
        private const int Count = 10;

        [Test]
        public void RunListCapacityPerfTest() => Competition.Run(this);

        [CompetitionBaseline]
        [GcAllocations(224, BinarySizeUnit.Byte)]
        public int ListWithoutCapacity()
        {
            var data = new List<int>();
            for (int i = 0; i < Count; i++)
                data.Add(i);
            return data.Count;
        }

        [CompetitionBenchmark(0.20, 0.50)]
        [GcAllocations(104, BinarySizeUnit.Byte)]
        public int ListWithCapacity()
        {
            var data = new List<int>(Count);
            for (int i = 0; i < Count; i++)
                data.Add(i);
            return data.Count;
        }
    }
Please note that measuring GC allocations correctly is a tricky thing. The GC allocates memory in chunks of at least 8 KB, so byte-level accuracy for small allocations can only be achieved by performing multiple runs.
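To illustrate why multiple runs help, here is a minimal sketch that does not use BenchmarkDotNet or CodeJam.PerfTests at all. It reads the app-domain allocation counter before and after many invocations and divides by the invocation count, so any coarse-grained rounding in the counter gets spread across all the calls. `AllocationProbe` and `BytesPerInvocation` are hypothetical names, and the exact granularity of the counter depends on the runtime, so treat the result as an estimate.

```csharp
using System;

// Hypothetical helper, not part of any library mentioned in this thread.
public static class AllocationProbe
{
    // Returns the average number of bytes allocated per call to `action`.
    public static double BytesPerInvocation(Action action, int invocations)
    {
        if (invocations <= 0)
            throw new ArgumentOutOfRangeException(nameof(invocations));

        // Allocation monitoring must be switched on before the counter is read.
        AppDomain.MonitoringIsEnabled = true;

        // Warm up once so one-time costs (JIT, static init) are not counted.
        action();

        long before = AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize;
        for (int i = 0; i < invocations; i++)
            action();
        long after = AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize;

        // If the counter only moves in coarse steps, the error per call
        // shrinks as the number of invocations grows.
        return (after - before) / (double)invocations;
    }
}
```

For a "no allocations at all" check, something like `AllocationProbe.BytesPerInvocation(MyMethod, 100000)` should come out at, or very close to, zero; other threads allocating in the same app domain during the measurement can add noise.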
Hi @adamsitnik , @ig-sinicyn ,
I've just gotten back to this!
Thanks for taking the time to reply, and sorry for my own belated reply.
I took the example code and reduced the iteration counts further to get this result:
BenchmarkDotNet=v0.10.8, OS=Windows 10 Redstone 2 (10.0.15063)
Processor=Intel Core i7-2600K CPU 3.40GHz (Sandy Bridge), ProcessorCount=8
Frequency=3331181 Hz, Resolution=300.1938 ns, Timer=TSC
[Host] : Clr 4.0.30319.42000, 32bit LegacyJIT-v4.7.2101.1
Job=Competition Force=False Toolchain=InProcessToolchain
InvocationCount=16 LaunchCount=1 RunStrategy=Throughput
TargetCount=16 UnrollFactor=16 WarmupCount=16
| Method | Mean | Error | StdDev | Scaled | ScaledSD | Allocated |
|---------------------- |-----------:|-----------:|-----------:|-------:|---------:|----------:|
| OneDimensionOneLoop | 526.5 us | 1.5587 us | 1.4580 us | 0.35 | 0.00 | 0 B |
| OneDimensionA | 562.6 us | 1.5392 us | 1.4397 us | 0.38 | 0.00 | 0 B |
| OneDimensionB | 694.6 us | 0.5958 us | 0.5281 us | 0.47 | 0.00 | 0 B |
| TwoDimensionsD1ThenD2 | 1,493.0 us | 11.4924 us | 11.2871 us | 1.00 | 0.00 | 0 B |
| ThreeDimensionsD1D2D3 | 2,271.7 us | 1.9948 us | 1.8659 us | 1.52 | 0.01 | 0 B |
| ThreeDimensionsD3D2D1 | 3,470.5 us | 8.4323 us | 8.2817 us | 2.32 | 0.02 | 0 B |
| TwoDimensionsD2ThenD1 | 5,756.8 us | 10.2958 us | 9.6307 us | 3.86 | 0.03 | 0 B |
As per your advice concerning the 8 KB allocation block size, I retained multiple iterations to ensure that it had a chance to correctly determine the memory allocation rate.
With that said though, the big draw for me is quickly verifying that a given method performs no allocation whatsoever. This particular benchmark run took 15 seconds to tell me precisely what I wanted to know.
I think it's great that this can be used as an ad-hoc tool to achieve that.
Thanks again, and have a great weekend (when you get there in your respective timezones)!
Dan
@Lothy I am glad to hear that!
Have a nice weekend too!