Benchmarkdotnet: [Bug] Incorrect results for targetCount:Auto

Created on 28 Oct 2016  路  7Comments  路  Source: dotnet/BenchmarkDotNet

Current codebase, benchmark:

    public class Program
    {
        private const int Count = 10 * 1000;

        static void Main(string[] args)
        {
            BenchmarkRunner.Run<Program>();
        }

        [Benchmark(Baseline = true)]
        public void Baseline() => Thread.SpinWait(Count);

        [Benchmark]
        public void SlowerX3() => Thread.SpinWait(3 * Count);
    }

part of output:

MainWarmup 1: 2048 op, 283982469.15 ns, 138.6633 us/op

MainTarget 1: 2048 op, 283552921.72 ns, 138.4536 us/op
MainTarget 2: 2048 op, 285226436.09 ns, 139.2707 us/op
MainTarget 3: 2048 op, 282084974.32 ns, 137.7368 us/op
MainTarget 4: 2048 op, 284826169.13 ns, 139.0753 us/op
MainTarget 5: 2048 op, 287191547.67 ns, 140.2302 us/op
MainTarget 6: 2048 op, 283267663.59 ns, 138.3143 us/op
MainTarget 7: 2048 op, 283271285.92 ns, 138.3161 us/op
MainTarget 8: 2048 op, 287422772.78 ns, 140.3432 us/op
MainTarget 9: 2048 op, 286220160.72 ns, 139.7559 us/op
MainTarget 10: 2048 op, 285216172.84 ns, 139.2657 us/op
MainTarget 11: 2048 op, 289273479.23 ns, 141.2468 us/op
MainTarget 12: 2048 op, 282808533.84 ns, 138.0901 us/op
MainTarget 13: 2048 op, 288167764.38 ns, 140.7069 us/op
MainTarget 14: 2048 op, 286347545.83 ns, 139.8181 us/op
MainTarget 15: 2048 op, 284229391.01 ns, 138.7839 us/op
MainTarget 16: 2048 op, 286027573.75 ns, 139.6619 us/op
MainTarget 17: 2048 op, 285252697.95 ns, 139.2835 us/op
...


Result 1: 2048 op, 0 ns, 0.0000 ns/op
Result 2: 2048 op, 0 ns, 0.0000 ns/op
Result 3: 2048 op, 0 ns, 0.0000 ns/op
Result 4: 2048 op, 0 ns, 0.0000 ns/op
Result 5: 2048 op, 0 ns, 0.0000 ns/op
Result 6: 2048 op, 0 ns, 0.0000 ns/op
Result 7: 2048 op, 0 ns, 0.0000 ns/op
Result 8: 2048 op, 0 ns, 0.0000 ns/op
Result 9: 2048 op, 0 ns, 0.0000 ns/op
Result 10: 2048 op, 0 ns, 0.0000 ns/op
Result 11: 2048 op, 0 ns, 0.0000 ns/op
Result 12: 2048 op, 0 ns, 0.0000 ns/op
Result 13: 2048 op, 0 ns, 0.0000 ns/op
Result 14: 2048 op, 0 ns, 0.0000 ns/op
Result 15: 2048 op, 0 ns, 0.0000 ns/op
Result 16: 2048 op, 65685984.36 ns, 32.0732 us/op
Result 17: 2048 op, 66107079.7 ns, 32.2788 us/op
Result 18: 2048 op, 66544173.64 ns, 32.4923 us/op
Result 19: 2048 op, 66640768.99 ns, 32.5394 us/op
Result 20: 2048 op, 66662804.8 ns, 32.5502 us/op
Result 21: 2048 op, 66793208.52 ns, 32.6139 us/op
Result 22: 2048 op, 66925423.4 ns, 32.6784 us/op

Looks like there's something wrong with the result calculation (first 16 samples as Unroll factor == 16),

And therefore:

// BaselineScaledColumn.cs
            bool invalidResults = baseline == null ||
                                 summary[baseline] == null ||
                                 summary[baseline].ResultStatistics == null ||
                                 !summary[baseline].ResultStatistics.CanBeInverted() ||
                                 summary[benchmark] == null ||
                                 summary[benchmark].ResultStatistics == null;

// ResultStatistics.CanBeInverted() 
        public bool CanBeInverted() => Min > 1e-9;

Min == 0, CanBeInverted() returns true and the output is:

   Method |       Mean |    StdErr |     StdDev | Scaled | Scaled-StdDev |
--------- |----------- |---------- |----------- |------- |-------------- |
 Baseline |  8.4592 us | 0.6132 us |  4.8286 us |      ? |             ? |
 SlowerX3 | 25.4619 us | 1.8444 us | 14.5229 us |      ? |             ? |
bug

Most helpful comment

@AndreyAkinshin done

All 7 comments

I'm seeing the same behavior as well sometimes (with manual build from current master).

Whenever this happens, the remaining printed us/op results are also often increasing (as they are in the OP excerpt)

@adamsitnik, it was your bug. Recently you moved List<Measurement> in EngineWarmupStage and EngineTargetStage from a local variable to a field. Unfortunately, we share these stages between idle and main iterations. As a result, both idle and main lists have references to the same instance of List<Measurement> which has all the measurements. It spoils calculation of the results. I have revered this fix (https://github.com/PerfDotNet/BenchmarkDotNet/commit/8647e737b3473528e96a22f83691d72023bf9daf). You have to find another way to reduce the amount of allocation here.

@AndreyAkinshin Sorry for that! I will find other way.

@adamsitnik, and another bug: you sorted the values in the MainStage. It's wrong because it produces sorted Result measurements (we need the original order in the summary). I pushed another hotfix; it looks ugly and produces another allocation, but it works properly. Please note it when you will refactor these stages in the next time.

@AndreyAkinshin Ok, I have synchronized the gc branch, and found a way to solve this problem:

  • start memory tracking after pilot & warmup stage, so these stages can allocate and this won't be included in final results. no need for hacks here anymore
  • target stage - in the ctor I create a stack with few (10) lists with max size. I use .Pop() to get list when executing so if 10 will ever be not enough we will get exception and we will find out about this without silent error. I know it's not perfect, but I believe it's the lesser evil ;)

@adamsitnik, I like the idea, but could you make the source code more pretty? Like create an additional generic class (e.g. ListPool<T>) with nice API? I want to keep the stage classes as simple as possible and hide all "not-about-measurements" code somewhere else.

@AndreyAkinshin done

Was this page helpful?
0 / 5 - 0 ratings