In the Framework, there are many types that can be converted into Span<T>, ReadOnlySpan, Memory<T>, and ReadOnlyMemory<T> (a.k.a. _slice types_).
The conversions are implemented as:
Span<T> ctor taking T[] parameterimplicit operator Memory<T> (T[] array)As<slice_type> extensions methods, e.g. static ReadOnlySpan<char> AsReadOnlySpan(this string text)Unfortunately, the set of these conversions is not very consistent. For example, AsReadOnlySpan method converting strings to ReadOnlySpan<char> has three overloads:
```c#
public static System.ReadOnlySpan
public static System.ReadOnlySpan
public static System.ReadOnlySpan
But logically equivalent ```T[]``` to ```Span<T>``` conversion method has only one overload:
```c#
public static System.Span<T> AsSpan<T>(this T[] array) { throw null; }
This means string can be sliced using the following code:
```c#
var slice = str.AsReadOnlySpan(5, 10);
... but to slice an array, developers have to write:
```c#
var slice = arr.AsSpan().Slice(5, 10);
This proposal outlines guidelines we could adopt to make the APIs more consistent and easier to use:
2. If a type can be converted to both heapable slices and by-ref slices, provide extension methods for both.
```c#
public static Span<T> AsSpan(this T[] value) ;
public static Span<T> AsSpan(this T[] value, int start);
public static Span<T> AsSpan(this T[] value, int start, int length);
public static Memory<T> AsMemory(this T[] value) ;
public static Memory<T> AsMemory(this T[] value, int start);
public static Memory<T> AsMemory(this T[] value, int start, int length);
Do not provide conversions to both read-only and read/write slice types. e.g T[] will have conversions to only ``SpanandMemory, not toReadOnlySpanorReadOnlyMemory
Do provide implicit casts between types if appropriate.
Not sure if we want, but we should discuss:
Limit constructors to the longest (most flexible) overload.
Provide `AsReadOnly() methods on r/w/ slice types.
Skip ReadOnly from conversion method names if the source type is already read-only. For example, string can be converted only to ReadOnlySpan<char> and so the As<slice_type> method should be called AsSpan, not AsReadOnlySpan.
The resulting changes are:
```c#
public static partial class MemoryExtensions
{
// * ADD *
// String
public static System.ReadOnlyMemory<char> AsMemory(this string text) { throw null; }
public static System.ReadOnlyMemory<char> AsMemory(this string text, int start) { throw null; }
public static System.ReadOnlyMemory<char> AsMemory(this string text, int start, int length) { throw null; }
public static System.ReadOnlySpan<char> AsSpan(this string text) { throw null; }
public static System.ReadOnlySpan<char> AsSpan(this string text, int start) { throw null; }
public static System.ReadOnlySpan<char> AsSpan(this string text, int start, int length) { throw null; }
// T[]
public static System.Memory<T> AsMemory(this T[] array) { throw null; }
public static System.Memory<T> AsMemory(this T[] array, int start) { throw null; }
public static System.Memory<T> AsMemory(tthis T[] array, int start, int length) { throw null; }
public static System.Span<T> AsSpan(this T[] array) { throw null; } // this already exist
public static System.Span<T> AsSpan(this T[] array, int start) { throw null; }
public static System.Span<T> AsSpan(this T[] array, int start, int length) { throw null; }
// ArraySegment
public static System.Memory<T> AsMemory(this ArraySegment<T> segment) { throw null; }
public static System.Memory<T> AsMemory(this ArraySegment<T> segment, int start) { throw null; }
public static System.Memory<T> AsMemory(this ArraySegment<T> segment, int start, int length) { throw null; }
public static System.Span<T> AsSpan(this ArraySegment<T> segment) { throw null; } // this already exist, just rename the parameter
public static System.Span<T> AsSpan(this ArraySegment<T> segment, int start) { throw null; }
public static System.Span<T> AsSpan(this ArraySegment<T> segment, int start, int length) { throw null; }
// * REMOVE *
// these actually just get renamed to AsMemory/Span (see above)
public static System.ReadOnlyMemory<char> AsReadOnlyMemory(this string text) { throw null; }
public static System.ReadOnlyMemory<char> AsReadOnlyMemory(this string text, int start) { throw null; }
public static System.ReadOnlyMemory<char> AsReadOnlyMemory(this string text, int start, int length) { throw null; }
public static System.ReadOnlySpan<char> AsReadOnlySpan(this string text) { throw null; }
public static System.ReadOnlySpan<char> AsReadOnlySpan(this string text, int start) { throw null; }
public static System.ReadOnlySpan<char> AsReadOnlySpan(this string text, int start, int length) { throw null; }
// remove and simply use the casts, e.g. from Memory<T> to ReadOnlyMemory<T> instead
public static System.ReadOnlyMemory<T> AsReadOnlyMemory<T>(this System.Memory<T> memory) { throw null; }
public static System.ReadOnlySpan<T> AsReadOnlySpan<T>(this System.Span<T> span) { throw null; }
public static System.ReadOnlySpan<T> AsReadOnlySpan<T>(this System.ArraySegment<T> arraySegment) { throw null; }
public static System.ReadOnlySpan<T> AsReadOnlySpan<T>(this T[] array) { throw null; }
}
```
cc: @ahsonkhan, @joshfree, @stephentoub
We've got the Span ctor, a cast to Span, and an AsSpan method:
```C#
T[] array = ...;
Span
Span
Span
And with this, we'll have the Memory ctor, a cast to Memory, and an AsMemory method:
```C#
T[] array = ...;
Memory<T> m1 = new Memory<T>(array);
Memory<T> m2 = array;
Memory<T> m3 = array.AsMemory();
For both Span and Memory, do we need all three? I don't have a strong opinion, I just want to make sure we're doing this thoughtfully. Do we need these AsXx methods?
ReadOnlySpan<char> AsReadOnlySpan(this string text)
ReadOnlySpan<char> AsReadOnlySpan(this string text, int start)
ReadOnlySpan<char> AsReadOnlySpan(this string text, int start, int length)
We should add AsMemory method to MemoryExtensions class.
This API was rejected in a previous API review: https://github.com/dotnet/corefx/issues/23862
FYI: The API review discussion was recorded - see https://youtu.be/bHwCPVNQLwo?t=333 (37 min duration)
If we are going to add AsMemory, we probably want to add AsReadOnlyMemory as well.
The reason we added AsSpan initially was to enable the builder pattern style of coding to easily and cleanly chain methods together. Otherwise, we would have code like this, which is awkward to read (and write):
https://github.com/dotnet/corefxlab/blob/010238eb489ccfb04a47774cc606653afb82464a/src/System.Azure.Experimental/System/Azure/Sha256.Unix.cs#L33
```C#
new Span
https://github.com/dotnet/corefxlab/blob/4bf5f6a527c109c888a78ff8b2593ee631f33126/samples/LowAllocationWebServer/LowAllocationWebServerLibrary/TcpConnectionFormatter.cs#L58
```C#
toSend = _buffer.AsSpan().Slice(ChunkPrefixSize, _written);
https://github.com/dotnet/corefxlab/blob/d197075cf865def8da82eabecab1c91f595c587f/src/System.Buffers.Experimental/System/Buffers/Sequences/ReadWriteBytes.cs#L248
```C#
array.AsSpan().Slice(_startIndex, length).CopyTo(buffer);
https://github.com/dotnet/corefxlab/blob/d197075cf865def8da82eabecab1c91f595c587f/tests/System.Text.Primitives.Tests/Encoding/Utf8Utf16ConversionTests.cs#L234
```C#
return codePoints.AsSpan().AsBytes().ToArray();
https://github.com/aspnet/SignalR/blob/a0c47c0c664a1e6c5b1bc669592d25bb4e68d186/test/Microsoft.AspNetCore.SignalR.Tests/EndToEndTests.cs#L93
```C#
Assert.Equal(bytes, buffer.Array.AsSpan().Slice(0, result.Count).ToArray());
**It also helps in ASP.NET (for type inference) where var is used in a lot of places**:
https://github.com/aspnet/SignalR/blob/a9d643a93e5b388984e208c3664366895baa4449/src/Microsoft.AspNetCore.SignalR.Common/Internal/Formatters/BinaryMessageFormatter.cs#L33-L34
```C#
var buffer = ArrayPool<byte>.Shared.Rent(lenNumBytes + payload.Length);
var bufferSpan = buffer.AsSpan(); // otherwise, figure out type T of buffer and then call new Span<T>(buffer);
cc @davidfowl, @pakrym
API review: We would like to see a complete proposal.
I updated the first comment in this issue with more complete proposal. We should discuss today.
Looks good as proposed. Regarding the open issues at the end:
Limit constructors to the longest (most flexible) overload.. Makes sense, we want the .ctors to be largely plumbing. However, we don't want to lose perf due to excessive argument validation. We'll add overloads as needed for perf, but not usability.
Provide AsReadOnly() methods on r/w/ slice types.. We agreed to not adding these methods.
Skip ReadOnly from conversion method names if the source type is already read-only. For example, string can be converted only to ReadOnlySpan<char> and so the AsAsSpan, not AsReadOnlySpan
- Skip
ReadOnlyfrom conversion method names if the source type is already read-only. For example, string can be converted only toReadOnlySpan<char>and so theAs<slice_type>method should be calledAsSpan, notAsReadOnlySpan.
I like this. If you are fussy about the return type you won't be using var anyway so will be writing; so its clear anyway
ReadOnlySpan<char> sp = "hello".AsSpan()
What do you gain with
ReadOnlySpan<char> sp = "hello".AsReadOnlySpan()
Not much really; and the terse crowd will be unhappy they can't do
var sp = "hello".AsSpan();
but have to do
var sp = "hello".AsReadOnlySpan();
Also it means all the .As... types will be consistent and in the same place; rather than being .AsR.. or .AsS depending on type.
Regarding the As<slice_type>(this <source_type> value, int start) overload (with a single int): when reading code like byte_array.AsSpan(5), my first instinct would be to think that it will create a span with the first 5 bytes of the buffer, while here it will _skip_ the first 5 bytes.
A typical scenario would be when slicing a pooled buffer that has been filled with data, and creating a span from it and passing it to some parsing method (in this case, the data typically starts at offset 0):
byte[] buffer = /* allocate from some pool */;
int read = ReadOrFillSomeBytes(buffer, ...); //stream, serializer, ...
Span<byte> span = buffer.AsSpan(read); //**BUG**: would actually _skip_ the section with the data!
I could see myself making this mistake a lot, and always needing to change it to buffer.AsSpan(0, read) (which is a different overload, and would waste some CPU cycles checking that offset 0 is non-negative and inside the array).
So what would be the scenario where you would call .AsSpan<byte>(this byte[], int start)?
I think that when people are starting from a byte[], they are probably slicing a pooled buffer, that has been filled with data starting at offset 0. They would probably look for a helper method that only take the size of the span.
_I also see that people already working on a Span<byte> are probably parsing or consuming it, and are looking for a way to drop the first N bytes. But then they would call Slice(N), not (not_a_span).AsSpan(N)?_
As I understand it, these overloads are intended to be shortcuts for As<slice_type>(this <source_type>).Slice(...), which makes at lot of sense. But since the As... prefix is less explicit than Substring() or Slice(), it may not be immediately evident to new comers?
@KrzysFR, it's possible that the single-int AsSpan overload will be used less often than others (or very rarely) , but I think there is value in consistency here. We are not going to remove the single int overloads in the other places (e.g. Slice methods), and so I think people will just have to learn what the single int overloads mean.
The C# is about to add first class slicing syntax: https://github.com/dotnet/csharplang/issues/185. Once that happens, I expect that there will be a strong preference to write either array.AsSpan()[..5] or array.AsSpan()[5..] instead of the multi-argument AsSpan APIs.
We could also add AsSpan(Range) overload and so people would write array.AsSpan(..5) and array.AsSpan(5..)
@terrajobst
Right. I think the question is whether all the extra overloads are worth adding everywhere (as a short term aid?) until Range comes online.
The current plan is that AsSpan(X..Y) does not mean the same as AsSpan(X,Y). It means AsSpan(X, Y-X). If we had nice syntax for index + length, I think it would be reasonable (after measuring perf impact, as Range will be heavier than one or two ints).
There are discussions about also having syntax for index+length, but IMO they are not converging on something that I would consider nice, e.g. [x.+y], [x.^y]
cc: @MadsTorgersen
I agree that both .AsSpan(..5) and .AsSpan(5..) would have the advantage of being very obvious in what they do, versus a single .AsSpan(5). Hopefully the perf impact will be minimal, compared to the gain in code readability.
@KrzysztofCwalina any plans to enable range support in existing APIs?
For example if you have a method Slice(int start, int length) parameters can be marked with attirbutes
Slice([RangeStart]int start, [RangeLenght]int length) that compiler would recognize and allow passing range instead of two arguments.
@KrzysztofCwalina - can you update the top comment with the final shape of the approved api. Thx.
I added "The resulting changes are:" section to the beginning of the issue.
@terrajobst, @weshaggard, @stephentoub, @davidfowl, @pakrym
// remove and simply use the casts, e.g. from Memory
to ReadOnlyMemory instead
Regarding (6), I think we should keep the existing AsReadOnlySpan/AsReadOnlyMemory methods and not rely solely on the implicit casts.
These APIs were added here https://github.com/dotnet/corefx/issues/24564#issue-264435027 since F# does not support implicit operators. Framework Design Guildelines say:
✓ CONSIDER providing methods with friendly names that correspond to each overloaded operator.
The other APIs can be removed, i.e.:
C#
public static System.ReadOnlySpan<T> AsReadOnlySpan<T>(this System.ArraySegment<T> arraySegment) { throw null; }
public static System.ReadOnlySpan<T> AsReadOnlySpan<T>(this T[] array) { throw null; }
How does F# convert from Span to ReadOnlySpan today?
They use the AsReadOnlySpan method.
Or, write something like this:
```F#
let memoryToReadOnlyMemory (memory : Memory<'a>) : ReadOnlyMemory<'a> =
Memory<'a>.op_Implicit memory
let spanToReadOnlySpan (span : Span<'a>) : ReadOnlySpan<'a> =
Span<'a>.op_Implicit span
```
cc @svick
In this case, I think we should do either of these two:
i.e. I don't think extension methods from all types to readonly versions of span and memory are the right approach.
I don't think extension methods from all types to readonly versions of span and memory are the right approach.
I agree. I prefer option 2. If we keep the existing Span -> ReadOnlySpan API, we can remove the readonly extension methods.
Add AsReadOnly() instance methods to types
Do these have to be instance methods? They are currently extension methods.
Do these have to be instance methods? They are currently extension methods.
I think they can be extensions.
@KrzysztofCwalina
public static System.Span<T> AsSpan(this T[] array, int start)
This overload (and ones similar to it) do not have a corresponding constructor which means they're going to have some unnecessary argument validation overhead. Do you want to lobby for adding these constructor overheads?
Span(T[] array, int start)
Memory(t[] array, int start)
which means they're going to have some unnecessary argument validation overhead
Aren't the extensions defined in the same assembly as span/memory? In which case couldn't they just use internal ctors?
This overload (and ones similar to it) do not have a corresponding constructor
That ctor was removed here due to this issue: https://github.com/dotnet/corefx/issues/23471
So implement this one separately for Fast and Portable? I guess we could do that. I see from ahson's comment, that constructor overload was removed for a reason.
Delete temporary AsReadOnlySpan forwarders from CoreLib once the changes propate through:
https://github.com/dotnet/coreclr/pull/16520/files#diff-34f82797d649dc710c6ff9268fa79c9bR691
The other APIs can be removed, i.e.:
public static System.ReadOnlySpanAsReadOnlySpan (this System.ArraySegment arraySegment) { throw null; }
public static System.ReadOnlySpanAsReadOnlySpan (this T[] array) { throw null; }
Why do we want to remove these? These require a number of ugly edits in existing code that turn nice uniformly flowing code:
ReadOnlySpan<byte> bytes = key.AsReadOnlySpan().Slice(start, len).AsBytes();
into ugly code like this:
ReadOnlySpan<byte> bytes = new ReadOnlySpan<byte>(key).Slice(start, len).AsBytes();
Can't you still do the following (be using implicit cast):
C#
ReadOnlySpan<byte> bytes = key.AsSpan().Slice(start, len).AsBytes();
That is not very appealing. It opens up accidental mutation opportunities in that code want to deal only with ReadOnly.
That is not very appealing. It opens up accidental mutation opportunities in that code want to deal only with ReadOnly.
If key is an array, it is already read/write though.
But if the code wants to treat it as read-only, seeing AsSpan() in the code sends the wrong signal to the reader. It'd be annoying if I was scanning code for places that mutated the array.
But if the code wants to treat it as read-only, seeing
AsSpan()in the code sends the wrong signal to the reader. It'd be annoying if I was scanning code for places that mutated the array.
That was brought up here as well:
https://github.com/dotnet/coreclr/pull/16521#discussion_r170261959
That was brought up here as well: dotnet/coreclr#16521 (comment)
And I still believe what I wrote :smile::
https://github.com/dotnet/coreclr/pull/16521#discussion_r170261959
"I prefer the old naming. It's possible I'm just used to it, but when I see the code in this PR, with AsSpan being passed to methods that accept a ReadOnlySpan, I immediately think "this is going through a cast and it's going to cost me something" (I've not looked to see if it does cost anything or not once the JIT is done with it). It also makes me think I'm getting back a mutable span even if I'm not, which makes reviewing code harder. And it'll lead to inconsistencies if we ever decide that we do want both AsSpan and As ReadOnlySpan exposed somewhere. I realize I may be in the minority, but I prefer the unabbreviated AsTargetType naming."
It'd be annoying if I was scanning code for places that mutated the array.
In the method that does the cast, you have r/w array, so you need to scan anyway. Once you pass r/w/ span to a ReadOnlySpan-taking method, no need to scan anymore.
@KrzysztofCwalina, where did the impetus to rename these come from? I've heard a fair number of people now express similar concerns (I believe at least @AtsushiKan, @weshaggard, @bartonjs, @stephentoub, @jkotas, and @danmosemsft).
It's one of the things that divides people. When we had radonly prefix people (other people hopefully) complained about the the APIs being too verbose and the prefix seeming unnecessary as it is often/always implied. Now that we made the change, different group does not like it.
Also, these APIs went through several changes. The changes were designed one by one by different groups (committees of people). We ended up in a very inconsistent state. This prompted me to write a holistic proposal that we reviewed and approved. I would be against going back and adjusting pieces of the proposal in PRs that are being worked on. If somebody feels super strongly about changing the current plan (worth reopening at this point?), the person should take it through the regular channels, i.e. prepare a holistic alternative proposal and push for API review discussion on it.
If somebody feels super strongly about the current plan (worth reopening at this point), the person should take it through the regular channels, i.e. prepare a holistic alternative proposal and push for API review discussion on it.
I think we made the wrong choice and should revert it, but I don't have the time to shepherd doing the right thing. It's unfortunate IMHO we're ending up here.
All api surface area changes are in. There are some internal bandaids that could be cleaned up but this is not critical for 2.1.
All api surface area changes are in. There are some internal bandaids that could be cleaned up but this is not critical for 2.1.
Most helpful comment
I like this. If you are fussy about the return type you won't be using
varanyway so will be writing; so its clear anywayWhat do you gain with
Not much really; and the terse crowd will be unhappy they can't do
but have to do
Also it means all the
.As...types will be consistent and in the same place; rather than being.AsR..or.AsSdepending on type.