Use case: efficiently creating an ImmutableArray
with a fixed (>4) number of elements.
Presently if you need to create an ImmutableArray<T>
with a known collection of elements, the most efficient way to do so is to create an ImmutableArray<T>.Builder
, set its Capacity
to allocate an array of the correct size, fill the Builder
with repeated calls to Add
, and then use MoveToImmutable
to create the ImmutableArray
. (This strategy can be marginally improved by storing a cached Builder
in a ThreadLocal
.)
For element counts smaller than 4, there are overloads of ImmutableArray.Create
which are implemented by creating an array and then immediately turning it into an ImmutableArray
, but for more than 4 Create
takes a params[]
and copies it into an ImmutableArray
. There is no way to create an array directly, fill it, and freeze it into an ImmutableArray
without copying while pinky-promising never to mutate the original array again.
I set up a microbenchmark (see below) which uses codegen to call ImmutableArray<T>
's private T[]
constructor, and it's around 4-5 times faster for an array of 1000 long
s. I found that a library of mine was spending a reasonable chunk of its time creating ImmutableArray
s, and implementing this hack led to a global ~10% speedup in my end-to-end benchmarks. I'm not entirely happy about depending upon private implementation details like that though.
I propose adding a officially-supported static method to freeze an array into an ImmutableArray
without copying it. I suggest calling it UnsafeFreeze
to emphasise the fact that this method is risky: to use it correctly you have to make sure never to mutate the array that you passed in. (This name may be confusing given that it doesn't have anything to do with C#'s unsafe
; I'm open to alternative suggestions.)
public static class ImmutableArray
{
[EditorBrowsable(EditorBrowsableState.Never)]
// might wanna add [MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ImmutableArray<T> UnsafeFreeze<T>(T[] array)
=> new ImmutableArray<T>(array);
}
There may be a use case for a similar UnsafeThaw
method which turns an ImmutableArray<T>
into a T[]
without copying, but I can't think of one right now.
NB. There is precedent in the BCL for APIs which ask you to relinquish ownership of the argument. ArrayPool.Return
requires you not to continue using the array you returned because it is liable to get handed out to a future caller of Rent
.
Here's the code for the microbenchmark I mentioned above (using BenchmarkDotNet):
public class Bench
{
static readonly ThreadLocal<ImmutableArray<long>.Builder> _cachedBuilder = new ThreadLocal<ImmutableArray<long>.Builder>(ImmutableArray.CreateBuilder<long>);
static readonly Func<long[], ImmutableArray<long>> _unsafeFreeze = GetUnsafeFreeze<long>();
[Benchmark(Baseline = true)]
public void CachedBuilder()
{
var builder = _cachedBuilder.Value;
builder.Capacity = 1000;
for (long i = 0; i < builder.Capacity; i++)
{
builder.Add(i);
}
builder.MoveToImmutable();
}
[Benchmark]
public void UnsafeFreeze()
{
var array = new long[1000];
for (long i = 0; i < array.Length; i++)
{
array[i] = i;
}
_unsafeFreeze(array);
}
static Func<T[], ImmutableArray<T>> GetUnsafeFreeze<T>()
{
var ctor = typeof(ImmutableArray<T>)
.GetConstructors(BindingFlags.NonPublic | BindingFlags.Instance)
.Single(c => c.GetParameters().Count() == 1 && c.GetParameters().Single().ParameterType.Equals(typeof(T[])));
var param = Expression.Parameter(typeof(T[]));
var body = Expression.New(ctor, param);
var func = Expression.Lambda<Func<T[], ImmutableArray<T>>>(body, param);
return func.Compile();
}
}
And the results:
Method | Mean | Error | StdDev | Scaled | ScaledSD |
-------------- |---------:|----------:|----------:|-------:|---------:|
CachedBuilder | 8.439 us | 0.1667 us | 0.4089 us | 1.00 | 0.00 |
UnsafeFreeze | 1.944 us | 0.0399 us | 0.1169 us | 0.23 | 0.02 |
We had methods like these, but they were removed: https://github.com/dotnet/corefx/pull/196/
The current "solution" is to use hack like what you have done. You can find another version of the hack in dotnet/corefx#196. Multiple versions of this hack are replicating in uncontrolled way. I have even seen one in F#. I think it is a much worse situation than having officially santioned unsafe interop methods. Maybe we should reconsider?
@jkotas Thanks, I didn't know about dotnet/corefx#196. I quite like the idea of using a ref
parameter and setting it to null
; it is a little more complex but it has some minor safety benefits in the relatively common case where the array has only one reference to it. I also quite like the ByteArrayUnion
idea in that PR as a way for me to implement my existing hack, might turn out to be marginally faster than calling a dynamic virtual method like I currently do.
I agree with you, the status quo of proliferating unsupported hacks and workarounds is bad for everyone: bad for users who have to write ugly and dangerous code, and bad for you guys who have to consider users who may be depending on private implementation details.
The leanest and fastest version of this hack is to use System.Runtime.CompilerServices.Unsafe. With that, the hack is one-linker like this:
byte[] a = ...
ImmutableArray<byte> ia = Unsafe.As<byte[], ImmutableArray<byte>>(ref a);
cc @VSadov @gafter @jaredpar If I remember correctly, you had strong opinions about not having an officially sanctioned unsafe API for this.
The whole point of ImmutableArray<T>
is to be immutable. The type does not offer any other value.
There is a number of patterns, such as argument validation without making private copies that rely on instances never changing values. If we could rely on a convention, we would not need the type and just use ordinary arrays.
I am a bit amused that we have to defend that immutableArray<T>
must be immutable, but no one proposes the same ideas for System.String
.
Perhaps there is a problem with documentation...
Using System.Runtime.CompilerServices.Unsafe
to get around immutability seems the right approach.
That is basically the purpose of System.Runtime.CompilerServices.Unsafe
- to get around the type system when so desired.
Another solution could be just using ordinary arrays on the hot internal paths.
I can see how ImmutableArray<T>
could be an unnecessary obstacle in a closed system which could implement "I will not mutate" convention and where performance is paramount.
We do use ordinary arrays in Roslyn when performance is critical and immutability stands in the way and when we do not need to expose the data across public APIs.
I agree that getting around immutability is fundamentally unsafe operation. The question is whether we should have a API to codify the unsafe pattern instead of telling everybody to depend on internal implementation details.
For example, we have introduced Memory<T> System.Runtime.InteropServices.MemoryMarshal.AsMemory<T>(ReadOnlyMemory<T> readOnlyMemory)
. We could have said that folks should use Unsafe to cast ReadOnlyMemory<T>
to Memory<T>
and take a fragile dependency on the internal implementation details, but we did not. We have API to codify the pattern that gives us control over it - our API telemetry can see how many places it used for, etc. The API is in System.Runtime.InteropServices
that is namespace reserved for unsafe code.
I looked into why the safe code is so much slower than the unsafe code, since a good optimizing compiler should be able to make the performance of using Builder
close to the performance of bypassing that abstraction.
My conclusion is that adding a new dangerous API is not necessary. What is necessary is:
The caller should be changed from setting Capacity
and using Add()
to setting Count
and using the indexer setter:
c#
public void CachedBuilderIndexer()
{
var builder = _cachedBuilder.Value;
builder.Count = 1000;
for (int i = 0; i < builder.Count; i++)
{
builder[i] = i;
}
builder.MoveToImmutable();
}
ImmutableArray<T>.Builder
indexer setter should be tweaked so that the CLR will inline it.With these two changes, by my measurements, the performance is only 1.24脳 slower than UnsafeFreeze
, which I think is acceptable, especially considering that the original code is 5脳 slower than UnsafeFreeze
.
If you want more details, here's what I did:
First I looked at the disassembly of CachedBuilder
(whose scaled performance relative to UnsafeFreeze
is 5脳) and noticed that the Add()
method was not being inlined. So I created my own stripped-down copy of ImmutableArray<T>
and tweaked its Add()
to make it shorter in IL. This didn't actually put it under the threshold, but it did trigger "Inline candidate is mostly loads and stores.", which meant the method was inlined. The result is that the performance of this method, MyInlinedCachedBuilder
, was 2.1脳.
When I looked at the disassembly of the previous method, I noticed the presence of (inlined) EnsureCapacity
. Since the array will not be resized in the loop, that is not necessary. So I switched to setting Count
and then using the indexer setter. The performance of this method, MyCachedBuilderIndexer
, was 3.0脳. This is worse than the previous method, but we're not done yet.
When I looked at the disassembly, I noticed that the indexer was not being inlined. So I tweaked its code by extracting throw into a separate method. This meant that the indexer was now being inlined, resulting in performance of this method, MyInlinedCachedBuilderIndexer
of 1.24脳, relative to UnsafeFreeze
.
I ran this on .NET Core 2.1.0-preview2-26131-06. The code is here, the raw results were:
| Method | Mean | Error | StdDev | Scaled | ScaledSD | Gen 0 | Allocated |
-------------------------------------- |---------:|----------:|----------:|-------:|---------:|-------:|----------:|
| CachedBuilder | 7.562 us | 0.0469 us | 0.0416 us | 4.93 | 0.41 | 2.5024 | 7.84 KB |
| CachedBuilder
Indexer | 4.502 us | 0.0122 us | 0.0108 us | 2.94 | 0.25 | 2.5024 | 7.84 KB |
| MyCachedBuilder | 7.894 us | 0.0195 us | 0.0183 us | 5.15 | 0.43 | 2.4414 | 7.84 KB |
| MyInlinedCached
Builder | 3.247 us | 0.0147 us | 0.0137 us | 2.12 | 0.18 | 2.5024 | 7.84 KB |
| MyCachedBuilder
Indexer | 4.591 us | 0.0691 us | 0.0612 us | 2.99 | 0.25 | 2.5024 | 7.84 KB |
| MyInlinedCached
BuilderIndexer | 1.909 us | 0.0352 us | 0.0312 us | 1.24 | 0.11 | 2.5330 | 7.84 KB |
| UnsafeFreeze | 1.545 us | 0.0448 us | 0.1322 us | 1.00 | 0.00 | 2.5330 | 7.84 KB |
More powerful builder helps in some cases, but it won't ever solve the interop case when you need to interop efficiently with the existing code that expects regular arrays and guarantees immutability via documentation. Here is an example of this use case from Roslyn: https://github.com/dotnet/roslyn/blob/944ae114140ba77cbd4be370bf13a7f758f740b3/src/Compilers/Core/Portable/NativePdbWriter/PdbWriter.cs#L593
Most helpful comment
I agree that getting around immutability is fundamentally unsafe operation. The question is whether we should have a API to codify the unsafe pattern instead of telling everybody to depend on internal implementation details.
For example, we have introduced
Memory<T> System.Runtime.InteropServices.MemoryMarshal.AsMemory<T>(ReadOnlyMemory<T> readOnlyMemory)
. We could have said that folks should use Unsafe to castReadOnlyMemory<T>
toMemory<T>
and take a fragile dependency on the internal implementation details, but we did not. We have API to codify the pattern that gives us control over it - our API telemetry can see how many places it used for, etc. The API is inSystem.Runtime.InteropServices
that is namespace reserved for unsafe code.