Runtime: Add "slicing" APIs to ArraySegment<T>

Created on 2 Aug 2016 · 60Comments · Source: dotnet/runtime

ArraySegment could be much more useful if it had the following additional APIs. I propose that we add these members to ArraySegment:

namespace System
{
    partial struct ArraySegment<T>
    {
        public T this[int index] { get; set;  }

        public static implicit operator ArraySegment<T>(T[] array);

        public static explicit operator T[](ArraySegment<T> segment);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

api-approved area-System.Collections up-for-grabs wishlist

Source

KrzysztofCwalina

👍10

Most helpful comment

The nice about a conversion operatator is that the compiler will give you an error message like:

Cannot convert ArraySegment<Bananas> to @Bananas[]. An explicit conversion exists. Are you missing a cast?

Now, one could also argue that this is the problem if people just apply the cast without realizing that there is performance/memory penalty for doing so.

C# public T[] CreateArray();

I think we should call it ToArray(). That matches the name Linq has and thus shadows the Linq implementation with this one, which is likely being more performant.

Other than that, the proposals look good to me.

terrajobst on 2 Aug 2016

👍4

All 60 comments

public static explicit operator T[](ArraySegment<T> segment);

Does it make a copy?

omariom on 2 Aug 2016

@omariom, yes. this is why I made it an explicit cast. Possibly it should be removed completely.

KrzysztofCwalina on 2 Aug 2016

The nice about a conversion operatator is that the compiler will give you an error message like:

Cannot convert ArraySegment<Bananas> to @Bananas[]. An explicit conversion exists. Are you missing a cast?

Now, one could also argue that this is the problem if people just apply the cast without realizing that there is performance/memory penalty for doing so.

C# public T[] CreateArray();

I think we should call it ToArray(). That matches the name Linq has and thus shadows the Linq implementation with this one, which is likely being more performant.

Other than that, the proposals look good to me.

terrajobst on 2 Aug 2016

👍4

MemoryStream has ToArray as well.
Prefix To here is an indication that the method makes a copy of the underlying storage.

Good candidat for implicit operator is conversation to Span.
When Slices come to CoreCLR, of course.

omariom on 3 Aug 2016

Changed CreateArray to ToArray

KrzysztofCwalina on 3 Aug 2016

Here are all of List<T>'s CopyTo overloads:

``` c#
public void CopyTo(T[] array);

public void CopyTo(T[] array, int arrayIndex);

public void CopyTo(int index, T[] array, int arrayIndex, int count);

```

Should ArraySegment<T> match?

In other words:

Rename index in the proposal to arrayIndex (assuming index is the index of the destination array in which copying begins) to match the naming used by List<T>.CopyTo (and ICollection<T>.CopyTo).
Add a source index parameter to the largest overload, to match List<T>.

justinvp on 3 Aug 2016

Interestingly, ImmutableArray<T> uses different parameter names:

``` c#
public void CopyTo(T[] destination);

public void CopyTo(T[] destination, int destinationIndex);

public void CopyTo(int sourceIndex, T[] destination, int destinationIndex, int length);

```

Whereas ImmutableList<T> matches List<T>.

justinvp on 3 Aug 2016

I like ImmutableArray's overloads and naming convention.

mellinoe on 3 Aug 2016

updated CopyTo to use "destination"

KrzysztofCwalina on 3 Aug 2016

What is the point of adding Length, when Count already exists?

svick on 13 Aug 2016

Are we going to add array segment to more APIs as well. Nothing takes array segment except for the web socket APIs

davidfowl on 13 Aug 2016

@davidfowl, yes. I think we should/must.

KrzysztofCwalina on 23 Aug 2016

@KrzysztofCwalina is it ready for API review?

karelz on 29 Sep 2016

We should remove the explicit operator as it seems weird. And Empty should be static.

The indexer needs to perform a range check; it doesn't have to perform a null check because the range check can just use the fields (offset and count). The only case to observe a null reference is when the ArraySegment<T> was created by concurrent writes to a field, which we don't think warrants slowing down the indexer with additional checks.

Also, ToArray() should always copy, except when the array is empty or null, in these cases it should return Array.Empty<T>().

``` C#
namespace System
{
partial struct ArraySegment
{
public T this[int index] { get; set; }

    public static implicit operator ArraySegment<T>(T[] array);

    public T[] ToArray();

    public int Length { get; }

    public ArraySegment<T> Slice(int index, int count);

    public ArraySegment<T> Slice(int index);

    public void CopyTo(T[] destination);

    public void CopyTo(T[] destination, int destinationIndex);

    public void CopyTo(T[] destination, int destinationIndex, int count);

    public void CopyTo(ArraySegment<T> destination);

    public static ArraySegment<T> Empty { get; }
}

}
```

terrajobst on 4 Oct 2016

👍1

Where new API should be implemented in corefx or coreclr repository?

AlexRadch on 28 Nov 2016

The low-level APIs which already live in coreclr repo ('corlib') need to be done first there. All APIs should have tests and reference assemblies in corefx repo - it is tricky and not best documented process at this moment.
If you want to help with something simpler, I would suggest to start with corefx-only changes.

karelz on 28 Nov 2016

@karelz I made two simple PRs in corefx, but have issues with changing config files. Can you help me? https://github.com/dotnet/corefx/pull/13975 and https://github.com/dotnet/corefx/pull/13976

AlexRadch on 28 Nov 2016

@AlexRadch the more reason to not attack coreclr+corefx changes which are more tricky.

I hope that @Priya91 and @ianhays will be able to help you with the two above. I myself still lack the experience in making those changes to be able to provide useful advice.

karelz on 28 Nov 2016

@AlexRadch From your PR, seems like you're not adding the APIs proposed in this issue. I left feedback on one of your PRs. Please keep the conversation on related issues.

Priya91 on 28 Nov 2016

What difference between array slices, array spans and array segments ? I am confused :(

AlexRadch on 5 Dec 2016

@Priya91 @AlexRadch is there PR for this issue? If yes, it should be linked.

Slices and spans are part of the new Slice<T> API. ArraySegment is the old API.

karelz on 5 Dec 2016

is there PR for this issue?

I'm not aware of any..

Priya91 on 5 Dec 2016

@karelz Why add new methods to old API? May be better to mark ArraySegment as deprecated?

AlexRadch on 6 Dec 2016

@AlexRadch ArraySegment will not be deprecated; there is nothing wrong with it. Span is just a new way of doing things. Using ArraySegment may be necessary if you want to get back the underlying array.

@terrajobst Regarding the proposal you have above

namespace System
{
    partial struct ArraySegment<T>
    {
        public T this[int index] { get; set;  }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(T[] destination, int destinationIndex, int count);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

Should the indexer return a ref T { get; } instead? This would reduce copying for structs-- see https://github.com/dotnet/coreclr/issues/1133.
Can we replace all of the CopyTo overloads with CopyTo(Span<T>)? I thought one of the main reasons for having Span was to eliminate array/offset/count methods.
While we're changing the APIs here, can we also have the type implement IEquatable<ArraySegment<T>> because it is a value type that already overrides Equals?

jamesqo on 8 Dec 2016

New proposed is

namespace System
{
    public struct ArraySegment<T> : IList<T>, IReadOnlyList<T>, IEquatable<ArraySegment<T>>
    {
        public ref T this[int index] { get; }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(Span<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

jamesqo on 8 Dec 2016

@karelz Sorry, I just noticed this has already been marked api-approved... was the above spec the final decision, or can this be reviewed again?

jamesqo on 8 Dec 2016

There is always option to change ... changing APIs before we ship them in major release is "doable".

I don't have opinion on public ref T this[int index] { get; } - @terrajobst is it a pattern we want to consider?
I don't think we should replace CopyTo overloads with Span overload, because ArraySegment is about arrays and their segments -- so this is the place where it makes sense to add. cc: @KrzysztofCwalina @terrajobst any push back?
I don't have opinion on IEquatable<ArraySegment<T>>. Do you have usage in mind? Or are you just applying common patterns?

karelz on 8 Dec 2016

re # 1: Span's indexer is now by ref (which was decided at an API review meeting), so we already approved this new pattern. There are trade offs here, but long term the wins are large, so I think we should go with it.
re # 2: The CopyTo parameter makes sense. Spans are about arrays too. The only thing I worry about is layering.
re # 3: I am not a big fan of O(n) Equals methods. Also, Span.Equals just compares the fields, not the items in the span.

KrzysztofCwalina on 8 Dec 2016

👍1

@KrzystofCwalina

I am not a big fan of O(n) Equals methods. Also, Span.Equals just compares the fields, not the items in the span.

We are definitely not making this an O(N) Equals. ArraySegment already has a strongly-typed Equals here, which just compares the fields. I am just proposing we implement the interface. (@karelz, there are no new methods being added for the IEquatable implementation.)

jamesqo on 8 Dec 2016

@KrzysztofCwalina regarding # 2, do you mean to use rather Span overload, which means we should wait and block on Span to mature first? (and solve the layering problem of course)

@jamesqo I understand you are proposing just to implement the interface. My question is why? Anything we do should have a reason and value for developers. We should not add interfaces just because we can.
So, how will user code leverage the interface? What (real-world) code patterns is it going to enable?

karelz on 8 Dec 2016

@karelz I am working on this issue

AlexRadch on 12 Dec 2016

@karelz I have 3 issues with this API

I suggest to replace public void CopyTo(ArraySegment<T> destination) on public void CopyTo(Span<T> destination)

public void CopyTo(Span<T> destination) can be called with Span, Array and ArraySegment parameter. Because Array and ArraySegment have implicit conversation to Span.

public void CopyTo(ArraySegment<T> destination) can be called with Array and ArraySegment parameter but not with Span, because Span can not be converted to ArraySegment.

So to support new slicing API I suggest to replace public void CopyTo(ArraySegment<T> destination) on public void CopyTo(Span<T> destination)

If we replace ArraySegment<T> destination to Span<T> destination as suggested in previous issue, then I suggest do not add next three CopyTo overloads

        public void CopyTo(T[] destination);
        public void CopyTo(T[] destination, int destinationIndex);
        public void CopyTo(T[] destination, int destinationIndex, int count);

All of them can ease coded with new slicing API

segment.CopyTo(array);
segment.CopyTo(array); // no code to change

segment.CopyTo(array, destinationIndex);
segment.CopyTo(array.Slice(destinationIndex));

segment.CopyTo(index, array, destinationIndex, count);
segment.Slice(index, count).CopyTo(array.Slice(destinationIndex));

New slicing API allows us do not create many overloaded methods.

public void CopyTo(T[] destination, int destinationIndex, int count) is strange method. List and Span classes haven't such method. List has similar method public void CopyTo( int index, T[] array, int arrayIndex, int count) with additional parameter int index. Is it accidentally skipped?

AlexRadch on 12 Dec 2016

@karelz Span have public bool IsEmpty{ get; } property. Should we add such property too?
public static ArraySegment<T> Empty { get; } is added.

AlexRadch on 12 Dec 2016

I created PR https://github.com/dotnet/coreclr/pull/8593 with changes that I suggested above.

AlexRadch on 12 Dec 2016

@AlexRadch re you [1] -- you should wait on closure in the discussion first: https://github.com/dotnet/corefx/issues/10528#issuecomment-265810190

@KrzysztofCwalina can you please comment?

karelz on 12 Dec 2016

When I said layering, I meant: Span is in System.Memory.dll, which a) is not in DotNetStandard and is not in System.Runtime.dll. ArraySegement is both in NS and S.R.dll. I think we could make it work (by adding Span to these lower level libraries, but it's complicated and I think this cleanup/PR should not block on the much larger effort (to bring Span lower).

KrzysztofCwalina on 12 Dec 2016

Thanks @KrzysztofCwalina! That makes sense.

@AlexRadch your questions [1] and [2] are now answered -- we won't use Span<T>, we will go with the original proposal.
Regarding your [3], we should also stick to the original proposal. The original method makes perfect sense IMO -- you copy into array starting at index with specific count. The List.CopyTo overload tries to emulate segment of List IMO, which is not needed here - we already have 'segment' of the array.

karelz on 12 Dec 2016

@karelz ok about [1] and [2]
[3] segment have offset and count so if we remove index from List.CopyTo( int index, T[] array, int arrayIndex, int count) then why we do not remove count also.

But if we will remove offset and count than we already have such method public void CopyTo(T[] destination, int destinationIndex); I do not understand half removing.

So I suggest do not add public void CopyTo(T[] destination, int destinationIndex, int count); overload with half removing.

It can be easy coded as segment.Slice(newOffset, newCount).CopyTo(destination, destinationIndex).

@KrzysztofCwalina What do you think?

AlexRadch on 12 Dec 2016

I could live without that overload CopyTo(T[] destination, int destinationIndex, int count). @KrzysztofCwalina?

karelz on 12 Dec 2016

👍1

We just talked about it and concluded we should indeed remove this overload:

C# public void CopyTo(T[] destination, int destinationIndex, int count);

We should leave the other CopyTo methods in, though.

terrajobst on 13 Dec 2016

👍1

@AlexRadch thanks for pointing it out. The API proposal in the top post is now updated. Feel free to implement it as defined in the top post.

karelz on 13 Dec 2016

@karelz I found 2 issues about Empty

It should be static.
I think we should add IsEmpty property like Span has.

AlexRadch on 13 Dec 2016

Empty fixed to static (it was some copy-paste error, it was properly listed e.g. here: https://github.com/dotnet/corefx/issues/10528#issuecomment-251477256).

Let's track IsEmpty addition as separate issue, otherwise we will never converge on the API shape. Please file it, provide API shape and motivation (IMO "Span has it", is not good enough - I would like to see more details and usage patterns).

karelz on 13 Dec 2016

@karelz I created separate issue dotnet/corefx#14502

AlexRadch on 14 Dec 2016

No activity/response for 1.5 months, unassigning the issue - it is "up for grabs" again, available for anyone to pick it up and finish it.
Anyone interested?

karelz on 29 Jan 2017

@ianhays @safern I moved to Future as this API does not seem necessary for our 2.0 goals. Feel free to change if it is. You might do the same with other issues of this kind if any

danmosemsft on 2 Mar 2017

@KrzysztofCwalina actually wanted this one to happen in 2.0 - he said he will do it himself if no one picks it up ...

karelz on 2 Mar 2017

@karelz Curious, when does 2.0 come out?

jamesqo on 2 Mar 2017

@jamesqo check the milestone description: https://github.com/dotnet/corefx/milestones
There's link to roadmap: https://github.com/dotnet/core/blob/master/roadmap.md -- currently set to Q3 2017.

karelz on 2 Mar 2017

👍1

@jamesqo if you want to pick up this issue, it would be appreciated ;-)

karelz on 2 Mar 2017

@karelz Sure, one thing though. It seems like from the previous discussion @KrzysztofCwalina decided in favor of having a ref T indexer, but the proposal was never updated?

jamesqo on 3 Mar 2017

@karelz, @terrajobst made a post further down here that had the latest proposed API at the time, so I think the current proposal looks like

namespace System
{
    partial struct ArraySegment<T>
    {
        public ref T this[int index] { get; }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

Correct?

jamesqo on 3 Mar 2017

@KrzysztofCwalina By the way, why are we having ArraySegment<T>.Length when we already have ArraySegment<T>.Count? Is that required for the compiler to recognize ArraySegment as a type that can be sliced when spans get language support? It seems really weird/confusing to have two properties that do the same thing.

jamesqo on 3 Mar 2017

The point was to make it easier to switch from array, span, etc. to arraysegment and vice versa. But I agree there is a negative aspect of the addition.

KrzysztofCwalina on 3 Mar 2017

👍1

@KrzysztofCwalina OK. I have pushed it out of the proposal for now, then; it can be added later if you change your mind.

I have a question that came up while I was working on the implementation: should Slice accept negative indices? If you do new ArraySegment<T>(new T[1], 1, 0).Slice(-1) should you get an ArraySegment that contains the first element or should an exception be thrown?

jamesqo on 3 Mar 2017

@karelz, regarding whether ArraySegment<T> should implement IEquatable or not, here is an issue that would have been avoided entirely if it was IEquatable. To see if 2 ArraySegments are equal, xUnit's default comparer checks first if they are IEquatable. Since they aren't, eventually it falls back to treating them like collections and calling GetEnumerator, which throws for a default ArraySegment<T>. I think there is value in implementing the interface.

jamesqo on 17 Apr 2017

👍1

@jamesqo I don't have much experience with IEquatable interface. What you say makes sense, but I let btter experts to make the call. It might be easier to move it into separate issue as it needs API approval.

@KrzysztofCwalina you said you really want this to happen in .NET Core 2.0 - can you please help drive it (incl. answering @jamesqo's questions), if it is still true? Thanks! (sorry I don't have enough expertise on this type + not enough bandwidth lately :()

karelz on 17 Apr 2017

@karelz below are currently implemented in coreclr but not exposed in corefx. Ideally we would have tests and expose them. Some of this was implemented by @jamesqo, some by another contributor who has left.

c# public struct ArraySegment<T> : ICollection<T>, IEnumerable, IEnumerable<T>, IList<T>, IReadOnlyCollection<T>, IReadOnlyList<T> { public static ArraySegment<T> Empty { get; } public T this[int index] { get; set; } public void CopyTo(ArraySegment<T> destination); public void CopyTo(T[] destination); public void CopyTo(T[] destination, int destinationIndex); public ArraySegment<T>.Enumerator GetEnumerator(); public static implicit operator ArraySegment<T> (T[] array); public ArraySegment<T> Slice(int index); public ArraySegment<T> Slice(int index, int count); public T[] ToArray(); public struct Enumerator : IDisposable, IEnumerator, IEnumerator<T> { public T Current { get; } object System.Collections.IEnumerator.Current { get; } public void Dispose(); public bool MoveNext(); void System.Collections.IEnumerator.Reset(); } }

danmosemsft on 30 Apr 2017

/cc @stephentoub fyi

danmosemsft on 30 Apr 2017

They're being exposed here: https://github.com/dotnet/corefx/pull/17033