BenchmarkDotNet: How to share the same large row data between different benchmarks?

Created on 22 Dec 2017 · 8 Comments · Source: dotnet/BenchmarkDotNet

I have a scenario where about one million rows are loaded from a remote database into a list, and I need to benchmark different algorithms to find the best way to look up particular data. But each benchmark reloads the data, which takes a lot of time and bandwidth.

I tried loading the data once, using a singleton object to store the rows and [GlobalSetup] to fetch them from it, but during benchmarking a null reference exception is thrown when I call the singleton object. (I don't actually know why...)

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.NullReferenceException:Object reference not set to an instance of an object.

 --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor)
   at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] parameters, Object[] arguments)
   at BenchmarkDotNet.Autogenerated.Program.AfterAssemblyLoadingAttached(String[] args) in /Users/shockliang/Documents/Benchmark/bin/Release/netcoreapp2.0/379e7427-2fb4-4886-a69a-79a909e7d7d8/379e7427-2fb4-4886-a69a-79a909e7d7d8.notcs:line 56

Is there a good practice for handling this scenario?

Thanks

All 8 comments

Hello @shockliang

Your approach is correct: Load the data in the method marked with [GlobalSetup] and use the data in [Benchmark] methods.

Could you mark the type that defines the benchmarks with the [KeepBenchmarkFiles] attribute and paste the content of the .notcs file here? It's an auto-generated C# file which we use to generate a new type before running the benchmarks. You can find its location in the logs/console output (here it was /Users/shockliang/Documents/Benchmark/bin/Release/netcoreapp2.0/).

Btw. are you running on Mono or .NET Core?

Hi @adamsitnik

My environment:

BenchmarkDotNet=v0.10.11, OS=macOS 10.12.6 (16G1114) [Darwin 16.7.0]
Processor=Intel Core i7-4770HQ CPU 2.20GHz (Haswell), ProcessorCount=8
.NET Core SDK=2.0.3
  [Host]     : .NET Core 2.0.3 (Framework 4.6.0.0), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.3 (Framework 4.6.0.0), 64bit RyuJIT

My first approach was to initialize a singleton in the Main method and run the benchmarks after the data had finished loading.

static void Main(string[] args)
{
    var dbConn = new SqlConnection(ConnectionString);
    DataService.Instance.LoadingComplete += (s, e) =>
    {
        var summary = BenchmarkRunner.Run<SearchData>();
    };
    DataService.Instance.Initial(dbConn);
}

Then I tried to cache the data from DataService in the benchmark class with [GlobalSetup]:

        [GlobalSetup]
        public void InitialData()
        {
            originCardIds = new int[] { 0, 1, 2, 3, 13 };
            holdCardIds = new int[] { 0, 13 };
            unholdCardIds = originCardIds.Except(holdCardIds).ToArray();

            rankData = DataService.Instance.GetRankData();
            rankDataCount = rankData.Count;
        }

        [Benchmark]
        public void ParallelForLoopWithIntersect()
        {
            // using rank data for enumerated
        }

        [Benchmark]
        public void ForLoopWithIntersect()
        {
            // using rank data for enumerated
        }

Is this approach valid for BenchmarkDotNet?
It seems each benchmark is built again in a separate folder, which means the singleton object can't share the same memory location across the benchmark builds. Is this understanding correct?

By the way, after so much trial and error my code is a mess. I will upload the .notcs file once I have cleaned it up and reverted to the first approach.

Thanks for your reply.

@shockliang now I see it.

To provide the best possible precision, we generate, build and execute a new process for every benchmark. Here you set the value of the singleton in the host process and used it in the benchmark process, where it is null.

You should move your initialization logic to the [GlobalSetup] method. It's going to be executed in the separate process. No matter how long it takes or how much memory it allocates, this method does not affect the results.

[GlobalSetup]
public void InitialData()
{
    var dbConn = new SqlConnection(ConnectionString);
    DataService.Instance.Initial(dbConn);

    originCardIds = new int[] { 0, 1, 2, 3, 13 };
    holdCardIds = new int[] { 0, 13 };
    unholdCardIds = originCardIds.Except(holdCardIds).ToArray();

    rankData = DataService.Instance.GetRankData();
    rankDataCount = rankData.Count;
}

static void Main(string[] args)
{
    var summary = BenchmarkRunner.Run<SearchData>();
}

@shockliang I am closing this issue. Please feel free to reopen it if my answer does not help.

@adamsitnik
In my first version I used the [GlobalSetup] attribute to initialize the data inside the SearchData benchmark class. It works, but it takes a lot of time because every benchmark run reloads the data from the remote database.
After that I tried using a singleton object to store the data, as you mentioned above. But each benchmark still loads the data again. The same singleton object is not shared between the benchmarks.

As you said

build and execute a new process for every benchmark

Does the singleton pattern not fit BenchmarkDotNet?
Can I share data from the host process with the others?

@shockliang [GlobalSetup] is executed only once per benchmark method. [IterationSetup] is executed before every iteration. See the docs.

If you run the benchmarks often, I think the best thing would be to serialize the DB results to a file and load them from that file in [GlobalSetup].
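A minimal sketch of that file-cache idea, under stated assumptions: the rows are modeled as a List<int[]> (the real type of rankData is not shown in this thread), and RankDataCache, LoadOrCreate, and the loadFromDatabase delegate are hypothetical names, not part of BenchmarkDotNet. The first call pays the database cost and writes a local binary file; later calls, including those made from other benchmark processes, read the file instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not a BenchmarkDotNet API): caches rows in a local
// binary file so each benchmark process can skip the remote database.
public static class RankDataCache
{
    // Reads rows from cachePath if the file exists; otherwise calls
    // loadFromDatabase once, writes the result to cachePath, and returns it.
    public static List<int[]> LoadOrCreate(string cachePath, Func<List<int[]>> loadFromDatabase)
    {
        if (File.Exists(cachePath))
        {
            return Read(cachePath);
        }

        List<int[]> rows = loadFromDatabase();
        Write(cachePath, rows);
        return rows;
    }

    // Layout: row count, then for each row its length followed by its values.
    private static void Write(string path, List<int[]> rows)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(rows.Count);
            foreach (int[] row in rows)
            {
                writer.Write(row.Length);
                foreach (int value in row)
                {
                    writer.Write(value);
                }
            }
        }
    }

    private static List<int[]> Read(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int rowCount = reader.ReadInt32();
            var rows = new List<int[]>(rowCount);
            for (int i = 0; i < rowCount; i++)
            {
                int[] row = new int[reader.ReadInt32()];
                for (int j = 0; j < row.Length; j++)
                {
                    row[j] = reader.ReadInt32();
                }
                rows.Add(row);
            }
            return rows;
        }
    }
}
```

In the [GlobalSetup] method this could be called as rankData = RankDataCache.LoadOrCreate(cachePath, LoadFromDb), where LoadFromDb stands in for the existing database code. An absolute cachePath is probably the safer choice, since the benchmark process may run from a different working directory than the host.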

Can I share data from host process to others?

We don't have any mechanism for that. We always want the child process to be "clean".

Hey @adamsitnik, I was experiencing the same problem as well. I have some long preparation work which generates some files. Now I want to pass at least the file names to the benchmark process and can't find a way to do that. Maybe it could be done via a string[] args parameter in BenchmarkRunner.Run()? The problem with GlobalSetup is that it implies the initialization logic is a simple static method. But in my case I need to start up a whole component container to generate the input data, and we already have a hierarchy of test classes with NUnit attributes to do exactly that.

I found a workaround for this issue: using environment variables. I set one before the BenchmarkRunner.Run call and read its value in the GlobalSetup method of the benchmark type.
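A rough sketch of how that workaround might look, assuming the benchmark process inherits the host's environment (which is what the workaround relies on). BenchmarkInput, Publish, Resolve, and the variable name BENCHMARK_INPUT_FILE are hypothetical names chosen for illustration; only the Environment calls are standard .NET.

```csharp
using System;

// Hypothetical helper (not a BenchmarkDotNet API): passes a file path from
// the host process to the benchmark process through an environment variable.
public static class BenchmarkInput
{
    // Illustrative variable name; any name unlikely to collide would do.
    public const string VariableName = "BENCHMARK_INPUT_FILE";

    // Host side: call this before BenchmarkRunner.Run so that processes
    // started afterwards can see the value.
    public static void Publish(string filePath)
    {
        Environment.SetEnvironmentVariable(VariableName, filePath);
    }

    // Benchmark side: call this from the [GlobalSetup] method.
    // Fails loudly if the host never published a path.
    public static string Resolve()
    {
        string filePath = Environment.GetEnvironmentVariable(VariableName);
        if (string.IsNullOrEmpty(filePath))
        {
            throw new InvalidOperationException(
                "Environment variable " + VariableName + " is not set.");
        }
        return filePath;
    }
}
```

Only small strings such as file names travel well this way; the large data itself would still be read from the file inside [GlobalSetup].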
