Lock-free hash table implementation
Microsoft.DotNet.XUnitConsoleRunner v2.5.0 (64-bit .NET Core 5.0.0)
Discovering: System.Collections.Concurrent.Tests (method display = ClassAndMethod, method display options = None)
Discovered: System.Collections.Concurrent.Tests (found 874 test cases)
Starting: System.Collections.Concurrent.Tests (parallel test collections = on, max threads = 12)
Finished: System.Collections.Concurrent.Tests
=== TEST EXECUTION SUMMARY ===
System.Collections.Concurrent.Tests Total: 2254, Errors: 0, Failed: 0, Skipped: 0, Time: 569.611s
API changes:
This class was written from scratch. The new design does not use concurrencyLevel for load balancing, so users no longer need to specify it. Accordingly, three new constructors without a concurrencyLevel parameter were added. The older constructors still work but are marked as obsolete. In addition, another overload of AddOrUpdate(TKey key, TValue addValue, TValue updateValue) was added.
Thus, the new class is fully compatible with existing code that uses it.
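For illustration, here is a small sketch (my own, not from the PR) of the semantics the new overload presumably provides, expressed via the delegate-based AddOrUpdate that already ships today, assuming the new overload is shorthand for a constant update value:

using System;
using System.Collections.Concurrent;

class AddOrUpdateDemo
{
    static void Main()
    {
        var map = new ConcurrentDictionary<string, int>();
        map.AddOrUpdate("hits", 1, (key, oldValue) => 2); // key absent: adds 1
        map.AddOrUpdate("hits", 1, (key, oldValue) => 2); // key present: stores 2
        Console.WriteLine(map["hits"]); // prints 2
    }
}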
@MBurtsev the Benchmarks link is giving me a 404? Is it a private repo?
@QueenCitySmitty yeah, sorry :)
Can see now, thanks.
@stephentoub
Thanks! This looks like cool work.
One way you could help evaluate this is to add more benchmarks from different angles, e.g. zero contention, low contention, high contention, mixes of operation types, collisions, key/value types/sizes, NUMA, and so on. You might also check the dotnet/performance repo to see what existing benchmarks are available.
@MaximLipnin the existing benchmarks are in https://github.com/dotnet/performance/tree/master/src/benchmarks/micro/libraries/System.Collections. There are very good markdown docs in there explaining how to run them against private changes.
@danmosemsft I looked at the tests you pointed to in that folder. Using Tasks to measure performance is not correct for multithreaded algorithms: Tasks are launched on the thread pool, which does not guarantee that a thread starts immediately. To obtain objective measurements, it is extremely important that the threads start as close to simultaneously as possible. For this purpose, I use a helper class that makes accurate tests possible with BenchmarkDotNet.
@stephentoub
I added a lot of unit tests that demonstrate the absence of bugs. I would like to ask you to review this code if you have time.
@jkotas As it turned out, my approach consumes three times less memory, because there is no linked list in the bucket: node objects and heap references take up additional memory beyond the useful data.
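One way to sanity-check such memory claims (a sketch of my own, not from the thread) is to measure managed heap growth per entry; the shipped ConcurrentDictionary allocates a node object per entry, each carrying an object header, a next reference, and a cached hash code on top of the key/value payload:

using System;
using System.Collections.Concurrent;

// Rough, unscientific per-entry overhead estimate; numbers vary by runtime
// and GC state. This is an illustration, not a benchmark.
class MemoryEstimate
{
    static void Main()
    {
        const int n = 1_000_000;
        long before = GC.GetTotalMemory(forceFullCollection: true);
        var d = new ConcurrentDictionary<int, int>();
        for (int i = 0; i < n; i++) d.TryAdd(i, i);
        long after = GC.GetTotalMemory(forceFullCollection: true);
        Console.WriteLine($"~{(after - before) / (double)n:F1} bytes per entry");
        GC.KeepAlive(d); // keep the dictionary alive through the second measurement
    }
}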
Using Tasks to measure performance is not correct for multithreaded algorithms.
@adamsitnik thoughts about this?
Tasks are launched on the thread pool, which does not guarantee that a thread starts immediately.
Not true here.
a) these are "LongRunning" tasks; each spins up its own dedicated thread.
b) even if that weren't the case, barriers are used to block all the workers until they all reach the point to be measured, and measurement doesn't start until every thread has arrived.
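A minimal sketch of the pattern being described (not the actual benchmark harness): LongRunning tasks each get a dedicated thread, and a Barrier holds them all until every worker has arrived, so the measured region starts simultaneously on all threads.

using System;
using System.Threading;
using System.Threading.Tasks;

class BarrierSketch
{
    static void Main()
    {
        const int workers = 4;
        using var barrier = new Barrier(workers);
        var tasks = new Task[workers];

        for (int i = 0; i < workers; i++)
        {
            tasks[i] = Task.Factory.StartNew(() =>
            {
                barrier.SignalAndWait(); // no worker proceeds until all have arrived
                // ... measured work goes here, started simultaneously on all threads ...
            }, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
        }

        Task.WaitAll(tasks);
        Console.WriteLine("done");
    }
}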
Not true here.
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApp4
{
    class Program
    {
        static void Main(string[] args)
        {
            var count = 10_000_000;
            var result = 0;
            var threads = 16;
            var ready = 0;
            double start_time = Stopwatch.GetTimestamp();

            for (var i = 0; i < threads; ++i)
            {
                Task.Factory.StartNew(() =>
                {
                    // Uncomment to make the workers wait for each other:
                    //Interlocked.Add(ref ready, 1);
                    //while (Volatile.Read(ref ready) < threads)
                    //{
                    //}

                    // How long after Main started did this task actually begin running?
                    var time = (Stopwatch.GetTimestamp() - start_time) * 1_000_000_000d / Stopwatch.Frequency;
                    Console.WriteLine($"My time {time}ns");

                    for (var n = 0; n < count; ++n)
                    {
                        Interlocked.Add(ref result, 1);
                    }

                    Console.WriteLine(result);
                }, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);
            }

            Console.ReadLine();
        }
    }
}
My time 83843700ns
My time 83788400ns
My time 83779000ns
My time 83805800ns
My time 83783400ns
My time 83797100ns
My time 83814100ns
My time 83823500ns
My time 83827300ns
My time 83845900ns
My time 83850400ns
My time 83807800ns
My time 86535300ns
My time 90237900ns
My time 92942700ns
My time 95103600ns
@stephentoub Yes, you are right, I can use Tasks if barriers are used; that is obvious. In my tests, I started threads manually and used barriers the same way, and I don't see a problem with this.
thoughts about this?
As @stephentoub has explained, our benchmarks are correct.
@MBurtsev could you please follow the instructions from this doc and run the ConcurrentDictionary benchmarks against your local build of the .NET runtime?
It should be a matter of running a single command:
dotnet run -c Release -f netcoreapp5.0 --filter '*ConcurrentDictionary*' --corerun $before $after --join
Once you share the results we should have more insight into the impact of your proposed changes.
I'm confused: I tried very simple code and it doesn't work with this implementation of ConcurrentDictionary.
Basically, if two keys have the same hash code, the attempt to add the second key fails.
Looking at the implementation, I see that the code does not support this case at all, and possibly cannot by design.
Am I missing anything?
Below is an example of the code. It is simplified on purpose, but you should expect hash-code conflicts in real life.
public struct Foo : IEquatable<Foo>
{
    int v;
    public Foo(int v) { this.v = v; }
    public bool Equals(Foo f) { return this.v == f.v; }
    public override bool Equals(object obj) { return obj is Foo && ((Foo)obj).v == this.v; }
    public override int GetHashCode() { return 1; }
}
...
var d = new ConcurrentDictionary<Foo, int>();
d.Add(new Foo(1), 1);
d.Add(new Foo(2), 2); // This throws
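For contrast, a minimal self-contained sketch (reusing the same constant-hash Foo) showing that the shipped ConcurrentDictionary accepts both keys, since colliding entries are chained within the bucket and disambiguated by Equals:

using System;
using System.Collections.Concurrent;

public struct Foo : IEquatable<Foo>
{
    private readonly int v;
    public Foo(int v) { this.v = v; }
    public bool Equals(Foo f) => v == f.v;
    public override bool Equals(object obj) => obj is Foo f && f.v == v;
    public override int GetHashCode() => 1; // every key collides on purpose
}

class CollisionDemo
{
    static void Main()
    {
        var d = new ConcurrentDictionary<Foo, int>();
        Console.WriteLine(d.TryAdd(new Foo(1), 1)); // True
        Console.WriteLine(d.TryAdd(new Foo(2), 2)); // True: same hash, distinct key
        Console.WriteLine(d.Count);                 // 2
    }
}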
@MBurtsev are you willing to continue working on this?
@SergeyLavr Did you run your code yourself? It doesn't compile.
And secondly:
public override int GetHashCode() { return 1; }
If you override GetHashCode(), make sure you do it correctly.
If you override GetHashCode(), make sure you do it correctly.
Always returning 1 is a valid GetHashCode (even if it's bad for perf).
Yes. Now read the paragraph after that.
Am I getting it right that this behavior is vital in the standard ConcurrentDictionary?
Yes, in all .NET collections.
@stephentoub OK. In this case, this topic can be closed.
Ok. Thanks for your efforts.
@En3Tho Sometimes, to achieve a goal, one needs to sacrifice something. .NET will most likely always be a low-performance platform ;)
@MBurtsev, I'm not sure how you draw that conclusion. Regardless, you should feel empowered to publish your collection as a nuget package for others to consume. If it yields improved performance for consumers and those consumers are aware of and can work with the limitations, everyone wins. That said, "sacrificing" correctness is a pretty big limitation.
@stephentoub That was just emotion. Actually, I have other ideas for how to achieve significantly higher performance, using an event-journal architecture. But that would require significantly changing the API signatures, making them incompatible with the standard library; for example, TryAdd(key, value, callback). Perhaps when I have time I will do it. Good luck to you.
@SergeyLavr Did you run your code yourself? It doesn't compile.
And secondly:
public override int GetHashCode() { return 1; }
If you override GetHashCode(), make sure you do it correctly.
Yes, I ran the code myself. I had to update the original to compile on .NET Framework 4.7.2 because I was interested in porting it to an older version; sorry for the confusion, I should have mentioned that from the beginning. I removed notnull and some interfaces just to quickly check functionality before making the bigger effort of porting.
However, this doesn't change the main point: based on my review of the code, I was especially suspicious of this functionality, and it failed.
As for the public override int GetHashCode() { return 1; } implementation: as was noted before, it is a simplified example to demonstrate the problem. Take string.GetHashCode, implemented correctly with all possible perf optimizations: you can still get the same hash code from two different strings. Hash collisions are not avoidable, so if a dictionary fails on them, it is simply not correct.
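To make the pigeonhole point concrete, here is a small sketch (my own, not from the thread) that brute-forces a string hash collision in the current process; since hash codes are 32-bit, the birthday paradox makes one appear within roughly 100k candidates:

using System;
using System.Collections.Generic;

class HashCollisionSearch
{
    static void Main()
    {
        // String hashing is randomized per process on .NET Core, so the
        // colliding pair differs from run to run, but one always exists.
        var seen = new Dictionary<int, string>();
        for (long i = 0; ; i++)
        {
            string s = i.ToString();
            int h = s.GetHashCode();
            if (seen.TryGetValue(h, out var other))
            {
                Console.WriteLine($"\"{other}\" and \"{s}\" share hash code {h}");
                break;
            }
            seen[h] = s;
        }
    }
}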
@SergeyLavr Yes, you are right. A linked-list bucket is probably the only way. If the hash code were of type long, then for reference types the address of the object in memory could act as a hash code, or some kind of descriptor table could guarantee unique hash codes for objects whose hash functions can cause collisions. But these are fantastic thoughts :)
the address of the object in memory could act as a hash code
The GC can move objects.
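For what it's worth, .NET already exposes a stable identity-based hash that survives GC compaction precisely because it is not derived from the object's (movable) address: RuntimeHelpers.GetHashCode. A quick sketch:

using System;
using System.Runtime.CompilerServices;

class IdentityHashDemo
{
    static void Main()
    {
        var a = new object();
        int before = RuntimeHelpers.GetHashCode(a);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        int after = RuntimeHelpers.GetHashCode(a);
        Console.WriteLine(before == after); // True: identity hash is stable for the object's lifetime
    }
}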
Java has a lock-free hashtable implementation. It is detailed in this talk: https://www.youtube.com/watch?v=HJ-719EGIts
The table is completely lock-free, even for growing. The author developed it for machines that have 768 CPU cores all using the same hashtable. Scalability appears to be quite good.
@GSPP: I saw this video a while back and it is very cool. I've always wondered whether anything like this was considered for .NET. However, 768-core machines are very different from the machines most .NET code runs on. I'd love to see an analysis of the pros and cons of this implementation (perf, memory, GC pressure, etc.).