BitArray
can be a useful class but being mutable means implementing operators on it does not make sense. I propose we add a new type that is immutable so it can have operators for easier usage.
Modelled based on BitArray
src/apisof.net and ImmutableList<T>
src/apisof.net
Open questions:
Set
, SetAll
, Get
methods as BitArray
has?IEnumerable<bool>
to BitArray
?Resolved questions:
public struct Enumerator GetEnumerator()
on BitArray
?public IEnumerable GetEnumerator()
, so we are stuck with boxing there.```c#
namespace System.Collections.Immutable
{
// NEW: BitArray has only ICollection, IEnumerable, not IEnumerable
public struct ImmutableBitArray : IEnumerable
{
public ImmutableBitArray(int length);
public ImmutableBitArray(byte[] bytes);
public bool this[int index] { get; }
public int Count { get; }
IEnumerator<bool> IEnumerable<bool>.GetEnumerator();
IEnumerator IEnumerable.GetEnumerator();
public ImmutableBitArray Not();
public ImmutableBitArray Or(ImmutableBitArray right);
public ImmutableBitArray Xor(ImmutableBitArray right);
public ImmutableBitArray And(ImmutableBitArray right);
public ImmutableBitArray RightShift(int count);
public ImmutableBitArray LeftShift(int count);
}
}
### 2nd Part
```diff
namespace System.Collections.Immutable
{
public struct ImmutableBitArray : IEnumerable<bool>
{
+ public ImmutableBitArray(byte[] bytes, int length);
+ public ImmutableBitArray(bool[] values);
+ public ImmutableBitArray(int[] values);
+ public ImmutableBitArray(BitArray bits);
+ public ImmutableBitArray(int length, bool defaultValue);
+ public bool Any(bool value); // returns true if any bits have the specified value
+ public bool All(bool value); // returns true if all bits have the specified value
+ public int Count(bool value); // returns the count of bits having the specified value
+ public int Find(bool value); // returns the index of the first bit having the specified value
+ public int FindLast(bool value); // returns the index of the last bit having the specified value
+ public static ImmutableBitArray operator !(ImmutableBitArray iba);
+ public static ImmutableBitArray operator ~(ImmutableBitArray iba);
+ public static ImmutableBitArray operator |(ImmutableBitArray left, ImmutableBitArray right);
+ public static ImmutableBitArray operator ^(ImmutableBitArray left, ImmutableBitArray right);
+ public static ImmutableBitArray operator &(ImmutableBitArray left, ImmutableBitArray right);
+ public static ImmutableBitArray operator <<(ImmutableBitArray iba, int count);
+ public static ImmutableBitArray operator >>(ImmutableBitArray iba, int count);
+ public static ImmutableBitArray Not(ImmutableBitArray iba);
+ public static ImmutableBitArray Or(ImmutableBitArray left, ImmutableBitArray right);
+ public static ImmutableBitArray Xor(ImmutableBitArray left, ImmutableBitArray right);
+ public static ImmutableBitArray And(ImmutableBitArray left, ImmutableBitArray right);
+ public static ImmutableBitArray RightShift(ImmutableBitArray iba, int count);
+ public static ImmutableBitArray LeftShift(ImmutableBitArray iba, int count);
+ public Enumerator GetEnumerator();
+ public struct Enumerator : IEnumerator<bool>;
}
}
Looks like a valuable addition. Can anyone please submit a speclet? See api review process.
/cc: @Priya91 @ianhays
I'd be happy to give a speclet a go, just need a new name 😄
I'd be happy to give a speclet a go, just need a new name
ImmutableBitArray?
I see you chose slightly different pattern than the one on BitArray. Can you provide motivation, pros/cons?
cc: @ianhays @Priya91
For the most part I copied the API from BitArray
, changes were made for:
@Priya91 @ianhays any update?
A struct to act more like the 'int' it is.
I don't think that's wise. Do we do that with any other Immutable types? Consistency with the mutable BitArray is more valuable than making the immutable version a value type.
Changed instance bit operator methods to operator overloads since that makes more sense with a read only type.
Same comment here about consistency with the mutable BitArray. We want the barrier to transitioning between mutable/immutable to be preferably as thin as possible, so IMO the APIs should be as close to identical as we can get them.
@ianhays's suggestion to keep API surface compatible with mutable BitArray sounds reasonable.
I wonder if we could/should add the operators on top of that for easier use ... @ianhays, do we have such precedence in ImmutableCollections?
I do get the idea for API compatibility. At the same time the very thing that prompted me to created this proposal was adding the operators... (Even without them though it would still be a good idea I think)
Do we do that with any other Immutable types? Consistency with the mutable BitArray is more valuable than making the immutable version a value type.
ImmutableArray
is a struct. But I don't think we can make ImmutableBitArray
a struct no matter how much we may want that. ImmutableBitArray
needs to store both the array reference and the bit count so it has 2 fields. That makes it problematic in multi-threaded scenarios, exactly where one might consider using an immutable data structure.
Some API observations:
struct
enumerator, like List<T>
doesSetBit
/ResetBit
methods since the indexer can't be used for thatthe indexer has a setter, that's probably a mistake
Ah, yes. Thanks for the catch! I'll correct the API
the class should have a struct enumerator, like List
does
👍
I do get the idea for API compatibility. At the same time the very thing that prompted me to created this proposal was adding the operators... (Even without them though it would still be a good idea I think)
Why not both?
Why not both?
Paradox of choice?
Paradox of choice?
I don't think that's a particularly large concern when comparing operator overloads to named methods.
I'd personally rather have both than just the operator overloads.
Works for me, I'll add them too then
@ianhays
Changed instance bit operator methods to operator overloads since that makes more sense with a read only type.
Same comment here about consistency with the mutable BitArray. We want the barrier to transitioning between mutable/immutable to be preferably as thin as possible, so IMO the APIs should be as close to identical as we can get them.
I think that the proposed type is a perfect match for operator overloads. They should be provided on it, to make code using it easier to read.
I can also think of a single precedent in existing .Net: string concatenation. Immutable string, string
, supports +
in C#, while mutable string, StringBuilder
, has only an instance method Append
. (Though string
also has static method Concat
, so the precedent is actually to provide both on the immutable version.)
Also, I think that making the transition too easy could be dangerous. Consider this code:
```c#
BitArray first = …, second = …;
first.Or(second);
This code works fine. But if transitioning to `ImmutableBitArray` was made "easy" by providing an instance method `Or`, the transition would lead to this code:
```c#
ImmutableBitArray first = …, second = …;
first.Or(second);
This code is now wrong, because the result of Or
is not used (the correct code is first = first.Or(second)
). If Or
was instead a static method, then that means I have to change every use of the operation:
```c#
ImmutableBitArray first = …, second = …;
first = ImmutableBitArray.Or(first, second);
// or, with using static:
first = Or(first, second);
The mistake of ignoring the result is still possible, but much less likely, because when I change the code, I'm likely to think about what I'm changing. But if I'm changing the code, why not change it to a more readable and succinct form?
```c#
ImmutableBitArray first = …, second = …;
first = first | second;
// or even:
first |= second;
This also has the advantage that the mistake is now impossible, since first | second;
is not a valid statement on its own.
Though I'm not opposed to having methods in addition to operators, as long as the "too easy to ignore the return type when transitioning to ImmutableBitArray
" issue is addressed, probably by making the methods static.
I don't think string
is as good of a comparison as the other classes in ImmutableCollections
that face the same concern on returns e.g. how immList.Add(newItem)
doesn't modify the existing reference but returns a new reference. I think it's reasonable to follow that pattern and have ImmutableBitArray
structured the same way.
Also, I think that making the transition too easy could be dangerous. Consider this code:
This code works fine. But if transitioning to ImmutableBitArray was made "easy" by providing an instance method Or, the transition would lead to this code:
This is the same as ImmutableList providing an Add
method. The question then becomes "why doesn't ImmutableList have the Add operator overloaded instead to prevent the issue"?
If we could get by only with operators, I would agree with @svick. But given that we need to add SetBit
and ResetBit
, I think that consistency with other mutable->immutable types relation (like ImmutableList
as @ianhays pointed out) is good to keep.
From that point of view I think we should keep all Or
, And
methods from mutable BitArray
. We can add also the operators for ease of use. We may get some push back from our "API review group". In the meantime, I suggest to include both sets of methods in the API proposal. Calling out motivation for each (consistency vs. ease of use).
BTW: It's just my personal opinion without deeper API review experience :)
@ianhays
the other classes in ImmutableCollections that face the same concern on returns e.g. how immList.Add(newItem) doesn't modify the existing reference but returns a new reference
That's a good point.
The question then becomes "why doesn't ImmutableList have the Add operator overloaded instead to prevent the issue"?
I think it's because adding an item to a collection is sufficiently different from adding two numbers to use the same operator for both. (Though a similar argument could be made for for +
for string, especially in the string + char
form.)
On the other hand, ORing two ImmutableBitArrays
works the same as ORing two integers.
I've updated the speclet with methods as both instance members and static members
We should add also methods for LeftShift and RightShift (and operators for them) -- see dotnet/runtime#17359 (currently in PR).
API proposal moved to top post (middle post deleted).
I fixed incorrect static
in the first set of And
/Or
/... methods.
Added more comments to explain where each set of methods comes from.
Added open questions.
Fixed namespace Generics -> Immutable
Should we add Set, SetAll, Get methods as BitArray has?
I think you should add at least Set
since there's no other way to change a single bit. ImmutableArray
does have SetItem
Though it would be rather expensive to have to allocate a new array only to change a single bit. We might consider the existing BitArray
as a "builder" than can then be converted to the immutable form.
Should we add
IEnumerable<bool>
to BitArray?
Should we add public struct Iterator GetEnumerator on BitArray?
You can't add public Iterator GetEnumerator
because it already has public IEnumerator GetEnumerator()
. The reality is that BitArray
kind of sucks and attempting to mirror it or improve it won't work too well. IEnumerable<bool>
could be added but since you can't change the existing public GetEnumerator you still end up with an allocation from boxing the struct enumerator. Oh well, at least you won't have more allocations from boxing the enumerated bools.
Others:
Enumerator
, not Iterator
.BitArray
has both Count
and Length
properties. It probably has Count
only because of ICollection
. We may want to have only Length
to reflect the array nomenclature.BitArray
's length is settable. That's unusual for collections and I'm not sure how useful it is. We may consider adding Resize
to ImmutableBitArray
but I don't think it is a common operation on bit arrays.CopyTo
methods for one to be able to get the raw bits out of the bit array.@mikedn thanks for insights
You can't add
public Iterator GetEnumerator
because it already haspublic IEnumerator GetEnumerator()
.
Thanks, proposal updated with answer.
IEnumerable<bool>
could be added but ...
Stupid question: What is the value of having it on ImmutableBitArray
? How is it going to be used? What's the value? Why IEnumerable
not sufficient?
The struct enumerator should be called
Enumerator
, notIterator
.
Updated.
@SamuelEnglard I leave reaction to the rest of the feedback on you :)
Why IEnumerable not sufficient?
Performance. The use of IEnumerable
results in enumerated value types being boxed. Now, in the particular case of bool
it would be pretty easy to have pre-boxed true
/false
and reuse those to avoid allocation but the overhead is still higher than when using IEnumerable<bool>
.
What is the value of having it on ImmutableBitArray?
Convenience & performance. Sometimes you need to enumerate the bits and do some action depending on each bit value. Of course, you could a for
loop for this but foreach
is the norm. Providing IEnumerable<bool>
also allows it to work with LINQ without needing an additional OfType<bool>
that would only hurt performance.
And speaking of LINQ, while it is trivial to use LINQ to implement some common bit queries the performance is bad compared to a built in implementation and that alone may steer people toward creating their own bit arrays. Couple of additional methods that may be worth considering:
C#
bool Any(bool value); // returns true if any bits have the specified value
bool All(bool value); // returns true if all bits have the specified value
int Count(bool value); // returns the count of bits having the specified value
int Find(bool value); // returns the index of the first bit having the specified value
Could ImmutableBitArray be used for native interop? In C we have this fancy syntax for bitfields. When I modelled a StructLayout class for a native structure containing bitfields, an ImmutableBitArray could have been useful...
Do we get casting operators to primitive integer types?
Could ImmutableBitArray be used for native interop? In C we have this fancy syntax for bitfields. When I modelled a StructLayout class for a native structure containing bitfields, an ImmutableBitArray could have been useful...
IMO it's highly unlikely that someone will do the work necessary to allow such a type to marshal as a bit field. Besides, it would be overkill to use a reference type for a bit field.
Do we get casting operators to primitive integer types?
What would those do?
What would those do?
I'm thinking:
public static explicit operator int(ImmutableBitArray i);
// And similar for byte, sbyte, short, ushort, uint, long and ulong
This could be useful if uses a bit array to create the value that is needed elsewhere in the form of a numeric value.
It would also be useful to be able to do the reverse without creating a single-element int array to construct a ImmutableBitArray. E.g.:
// As ctor
public ImmutableBitArray(int value);
// Or as explicit cast
public static explicit operator ImmutableBitArray(int value);
That way I could then easily do:
const int errorBitIdx = 5;
var bitArray = (ImmutableBitArray)0xFFFF;
int returnValue = SomeMagicFunctionThatTakesABitflagsValueAsArgument((int)bitArray);
var returnBitArray = (ImmutableBitArray)returnValue;
if (returnBitArray[errorBitIdx])
{
// Do some magic...
}
@couven92 I've been thinking about being able to cast to and from ImmutableBitArray
and primitive integer types.
The main issue with casting from is overflow. That could be solved with the classic TryX
pattern or with throwing an exception (which I hate).
It can't be done "to" as a constructor if we want to keep some API compatibility with BitArray
since it has a constructor that takes an int
as the length. Admittedly that doesn't stop a casting option.
@mikedn added the LINQ like methods you proposed and added a FindLast
too.
It can't be done "to" as a constructor if we want to keep some API compatibility with BitArray since it has a constructor that takes an int as the length.
What is the point of having a constructor with the length as an argument to an immutable type? length AND default value might make sense, but I do not see how length alone would be useful...
@SamuelEnglard could you update the API proposal with the new shift methods that were added in https://github.com/dotnet/corefx/issues/8719?
@ianhays done
@safern any comments/thoughts? If not, then this looks ready for review.
My only remaining thought is that the API surface area is quite large in its current state and may be more difficult to push through review. It may be better to implement this in stages, where the first stage has a minimal API similar to BitArray, the second stage adds e.g. operator overloads and other constructors, then the third stage adds Any/First/FirstLast/etc.
Just the same thought as you have @ianhays we should definitely break it down maybe in two steps to make it easier to review as you mention. In general I like this proposal.
I think it would be sufficient to just separate the methods clearly into 2 sections: 1st part: old APIs, 2nd part: New APIs.
Is that ok with you @ianhays @safern?
@karelz that's good enough! I will go ahead and do it :)
Updated the proposal!
We haven't invested in BitArray
because it has relatively low usage. We generally don't want to follow the path of other platforms (such as the STL) that has a large variety of collection types because it creates cognitive load when deciding which one to use and can make it harder to exchange types.
The proposed type seem reasonable, given the design of BitArray
. However, given BitArray
's low usage having an immutable version is likely finding even fewer customers. Thus, we don't think it makes the bar for the BCL collections.
I'm going to take that as a compliment 😃
Most helpful comment
I'm going to take that as a compliment 😃