The need for it comes up every now and then. To me it would make sense next to ConcurrentDictionary<TKey, TValue>.
What does this give you that ConcurrentDictionary doesn't?
Nothing but communicates intent better imo. I usually do stuff like this (not a proper set).
Set operations are perhaps tricky to get right with concurrency.
@ayende Apart from the "it's clearer what this code does" reason and the precedent of having HashSet<T>, there is a small, but measurable, difference in performance and memory consumption.
A very simple test: when I add a million values to Dictionary<int, int>, it takes 42 ms and consumes 26.6 MB of memory, while with HashSet<int>, it takes 38 ms and consumes 21.3 MB.
I assume the difference with concurrent collections would be similar.
If anything more so, since the concurrency consideration means there are a few spots that want to be particularly efficient, so not having to do anything with values would be an even greater gain there than with HashSet<T>.
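For anyone who wants to repeat the comparison, here is a rough sketch of how it could be measured. The count of one million comes from the comment above; the `Measure` helper and the class name are hypothetical, and this is not necessarily how the original numbers were obtained. Results depend on machine and runtime, so no output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class SetVsDictionaryMeasurement
{
    // Hypothetical helper: runs `fill` once and reports the elapsed time and
    // the approximate growth of the managed heap caused by the collection.
    public static (long Milliseconds, long Bytes) Measure(Func<object> fill)
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);
        var stopwatch = Stopwatch.StartNew();
        object collection = fill();
        stopwatch.Stop();
        long after = GC.GetTotalMemory(forceFullCollection: true);
        GC.KeepAlive(collection); // keep the collection reachable until memory is sampled
        return (stopwatch.ElapsedMilliseconds, after - before);
    }

    public static void Main()
    {
        const int count = 1_000_000;

        var dictionary = Measure(() =>
        {
            var d = new Dictionary<int, int>();
            for (int i = 0; i < count; i++) d.Add(i, i);
            return d;
        });

        var set = Measure(() =>
        {
            var s = new HashSet<int>();
            for (int i = 0; i < count; i++) s.Add(i);
            return s;
        });

        Console.WriteLine($"Dictionary<int, int>: {dictionary.Milliseconds} ms, {dictionary.Bytes / 1e6:F1} MB");
        Console.WriteLine($"HashSet<int>:         {set.Milliseconds} ms, {set.Bytes / 1e6:F1} MB");
    }
}
```

A single Stopwatch run like this is noisy; a benchmarking harness would give more trustworthy numbers.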
@JohanLarsson
Set operations are perhaps tricky to get right with concurrency.
I think a ConcurrentSet would be useful even without set operations.
I know of one C# concurrent hash set and one in Java, and neither offers concurrency on set operations, only on add, remove, and contains.
Is there a plan to add this? If there is and it's up for grabs, I can look into it.
@hermitdave You might want to have a look at how the API review process works.
Thanks @svick will keep an eye on this one
@svick
I think a ConcurrentSet would be useful even without set operations.
Agreed, perhaps it should not be called Set? IDK.
@hermitdave Do you have a proposal for this?
@priya91 oops forgot about this one.. I'll get started asap
Starting work on this
@hermitdave are you working on it?
@hermitdave do you still have interest in writing up the formal proposal here? If the review okays it promptly and someone is interested in implementing it, there may still be time to get this into 2.0.
Possibly the implementation could begin by wrapping ConcurrentDictionary, so the API is quickly available to use, then rewritten to have the space/time improvements of a custom implementation.
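To make the wrapping idea concrete, here is a minimal sketch of what such a first version could look like. The type name `ConcurrentSetSketch<T>` and its members are hypothetical and only illustrate the approach; this is not a proposed or approved API shape, and it offers only add, remove, and contains, not set operations.

```csharp
using System;
using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical type for illustration only.
public sealed class ConcurrentSetSketch<T> : IEnumerable<T>
{
    // The byte value is a placeholder; only the keys carry information.
    private readonly ConcurrentDictionary<T, byte> _items = new ConcurrentDictionary<T, byte>();

    // Returns true if the item was added, false if it was already present.
    public bool Add(T item) => _items.TryAdd(item, 0);

    // Returns true if the item was present and has been removed.
    public bool Remove(T item) => _items.TryRemove(item, out _);

    public bool Contains(T item) => _items.ContainsKey(item);

    public int Count => _items.Count;

    // Enumerates the dictionary directly instead of copying its Keys collection.
    public IEnumerator<T> GetEnumerator()
    {
        foreach (KeyValuePair<T, byte> pair in _items)
            yield return pair.Key;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class ConcurrentSetSketchDemo
{
    public static void Main()
    {
        var set = new ConcurrentSetSketch<int>();

        // Many threads add overlapping values; duplicates are rejected.
        Parallel.For(0, 10_000, i => set.Add(i % 100));

        Console.WriteLine(set.Count);        // 100
        Console.WriteLine(set.Remove(42));   // True
        Console.WriteLine(set.Contains(42)); // False
    }
}
```

The unused byte stored per entry is exactly the overhead a custom implementation could later remove.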
FWIW, @i3arnon has a really nice implementation at https://github.com/i3arnon/ConcurrentHashSet
@i3arnon maybe you should do a PR instead.
Just to be clear folks -- feel free to make a PR, but we can't take any change without an API review approval. That needs a proposal written up as above. @hermitdave were you going to do that?
@danmosemsft I had a look at what @i3arnon did and it is decent. If he isn't interested, then I will do a PR later this week.
@hermitdave we first need API proposal (see API review process) - i.e. review the API surface of the collection (with motivation and relation to 'classic' HashSet and other concurrent collections).
Just to set expectations: Based on recent discussions around other collections, we will likely have to find the right place for new collections (CoreFX repo might not be the desired destination). That may take even longer.
@hermitdave apologies, I wasn't clear in my comment. I should have said -- feel free to prototype in a fork, but a PR against CoreFX would be noise at this point without API approval
I see a limited use internal one exists. https://github.com/dotnet/corefx/blob/103639b6ff5aa6ab6097f70732530e411817f09b/src/Common/src/CoreLib/System/Diagnostics/Tracing/TraceLogging/ConcurrentSet.cs
Just FYI: the implementation Dan refers to above is very specialized (e.g. there is no remove), and is probably not representative.
The C# compiler community wrote a simple wrapper for themselves, that is probably a better representation of what would be done if we wanted to add this class.
http://source.roslyn.io/#microsoft.codeanalysis/InternalUtilities/ConcurrentSet.cs
I find myself wanting this very much on occasion. Seems like something that shouldn't be overlooked.
@vancem I can't agree more! But three years have passed, and there is still no ConcurrentHashSet.
I think most of us want a kind of collection which is similar to ConcurrentBag, but its elements are never duplicated and its 'Contains()' is an O(1) operation like HashSet.
We can make it using ConcurrentDictionary, but that implementation is too ugly.
Nothing but communicates intent better imo. I usually do stuff like this (not a proper set).
Set operations are perhaps tricky to get right with concurrency.
@JohanLarsson Your link is broken
@reggaeguitar fixed the link.
I think most of us want a kind of collection which is similar to ConcurrentBag, but its elements are never duplicated and its 'Contains()' is an O(1) operation like HashSet.
And which allows removing a specific element, instead of just a random one (I just needed this today, and see https://stackoverflow.com/questions/3029818/how-to-remove-a-single-specific-object-from-a-concurrentbag for another example).
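A small sketch of the difference being described, using a ConcurrentDictionary<string, byte> as a stand-in for the missing set type (the class name and sample values are made up for illustration):

```csharp
using System;
using System.Collections.Concurrent;

static class RemoveSpecificElementDemo
{
    public static void Main()
    {
        var bag = new ConcurrentBag<string> { "a", "b", "c" };

        // TryTake removes some element; the caller cannot choose which one.
        bag.TryTake(out string taken);
        Console.WriteLine($"Bag handed back '{taken}', {bag.Count} left");

        // Stand-in for a concurrent set: the keys are the elements, the values are unused.
        var set = new ConcurrentDictionary<string, byte>();
        set.TryAdd("a", 0);
        set.TryAdd("b", 0);
        set.TryAdd("c", 0);

        Console.WriteLine(set.TryAdd("b", 0));        // False: duplicates are rejected
        Console.WriteLine(set.TryRemove("b", out _)); // True: removes exactly "b"
        Console.WriteLine(set.ContainsKey("b"));      // False
    }
}
```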
So no one ever made an API proposal for this? Maybe you guys should add the up-for-grabs label.
@MgSam I can't stand it anymore, so I've spent some time on the API proposal.
For details, see #39919. Let's promote it and finally make it happen.
Duplicate of #39919, which has an actual API proposal.