Runtime: Review System.Buffers.SequenceReader<T> proposals

Created on 10 Sep 2019 · 12Comments · Source: dotnet/runtime

This is a meta issue for dotnet/runtime#30779, dotnet/runtime#30778, dotnet/runtime#30771, dotnet/runtime#30770, and dotnet/runtime#29360 which add new APIs to SequenceReader<T>. This captures the overall change for the review. Detailed discussions are in the linked issues.

Proposed new surface area:

``` C#
namespace System.Buffers
{
public ref struct SequenceReader
{
// Optimized API to position the reader at the end of the sequence (much faster than what users can write)
public void AdvanceToEnd();

    // Pairs with existing Span<T> UnreadSpan;
    public ReadOnlySequence<T> UnreadSequence { get; }

    // Peeking out T, while skipping. This is more performant than users can write (avoids rewinding).
    public bool TryPeek(int skip, out T value);

    // Overloads for TryRead that allow reading out a given count rather than to some delimiter (as with existing
    // API span out will slice if it can or allocate and copy if it has to).
    bool TryRead(int count, out ReadOnlySpan<T> value);
    bool TryRead(int count, out ReadOnlySequence<T> value);

    // Equivalent "Peek" versions. They need a skip as peeking doesn't advance the reader and rewinding is super expensive.
    public bool TryPeek(int count, out ReadOnlySpan<T> value);
    public bool TryPeek(int skip, int count, out ReadOnlySpan<T> value);
    public bool TryPeek(int count, out ReadOnlySequence<T> value);
    public bool TryPeek(int skip, int count, out ReadOnlySequence<T> value);

    // Pairs with existing TryCopyTo(Span<T> destination), which does not advance the reader (neither does this)
    public bool TryCopyTo(int skip, Span<T> destination);

    // Also proposed, but having this can lead to writing slow code.
    public bool SetPosition(SequencePosition position);

}
}
```

area-System.Buffers

Source

JeremyKuhne

👍1

Most helpful comment

Video

We should rename the skip parameters to offset
- We don't use skip anywhere else and it kind of implies a side effect
- offset implies relative to something and for a reader it makes to be relative to current position
We should the count parameter to length
We should remove the overloads of TryPeek that take two arguments and whose first argument is count -- because it conflicts TryPeek(int skip, out T value)
Arguments referring to position and lengths should be typed as long
We should add readonly annotations
It's sad that TryPeek() that returns a span might have to allocate if the span crosses buffers but there is no way around that.
FDG: we need to consider what we do for out parameters where the out type is the same as the input type as that allows you modify the this while it's running (ask Levi for details).
public void SetPosition(SequencePosition position); needs more work as there are a lot of concerns (like O(n) complexity).

```C#
namespace System.Buffers
{
public ref struct SequenceReader
{
// Optimized API to position the reader at the end of the sequence (much faster than what users can write)
public void AdvanceToEnd();

    // Pairs with existing Span<T> UnreadSpan;
    public readonly ReadOnlySequence<T> UnreadSequence { get; }

    // Overloads for TryRead that allow reading out a given count rather than to some delimiter (as with existing
    // API span out will slice if it can or allocate and copy if it has to).
    public bool TryRead(int length, out ReadOnlySpan<T> value);
    public bool TryRead(long length, out ReadOnlySequence<T> value);

    // Peeking out T, while skipping. This is more performant than users can write (avoids rewinding).
    public readonly bool TryPeek(long offset, out T value);

    // Equivalent "Peek" versions. They need a offset as peeking doesn't advance the reader and rewinding is super expensive.
    public readonly bool TryPeek(long offset, int length, out ReadOnlySpan<T> value);
    public readonly bool TryPeek(long offset, long length, out ReadOnlySequence<T> value);

    // Pairs with existing TryCopyTo(Span<T> destination), which does not advance the reader (neither does this)
    public readonly bool TryCopyTo(long offset, Span<T> destination);

}
}
```

terrajobst on 10 Sep 2019

👍3 🎉1

All 12 comments

cc'ing interested parties based on the issues listed:
@davidfowl, @GoldenCrystal, @stevejgordon, @softworkz, @FiniteReality,

ahsonkhan on 10 Sep 2019

IMO, the SequenceReader should not get any API added involving a SequencePosition parameter. Right now, it doesn't have any other - except the Position property.

I think there should be a clear separation between ReadOnlySequence and SequenceReader APIs, where ReadOnlySequence is dealing with SequencePosition objects and SequenceReader using integer offsets.

softworkz on 10 Sep 2019

Video

We should rename the skip parameters to offset
- We don't use skip anywhere else and it kind of implies a side effect
- offset implies relative to something and for a reader it makes to be relative to current position
We should the count parameter to length
We should remove the overloads of TryPeek that take two arguments and whose first argument is count -- because it conflicts TryPeek(int skip, out T value)
Arguments referring to position and lengths should be typed as long
We should add readonly annotations
It's sad that TryPeek() that returns a span might have to allocate if the span crosses buffers but there is no way around that.
FDG: we need to consider what we do for out parameters where the out type is the same as the input type as that allows you modify the this while it's running (ask Levi for details).
public void SetPosition(SequencePosition position); needs more work as there are a lot of concerns (like O(n) complexity).

```C#
namespace System.Buffers
{
public ref struct SequenceReader
{
// Optimized API to position the reader at the end of the sequence (much faster than what users can write)
public void AdvanceToEnd();

    // Pairs with existing Span<T> UnreadSpan;
    public readonly ReadOnlySequence<T> UnreadSequence { get; }

    // Overloads for TryRead that allow reading out a given count rather than to some delimiter (as with existing
    // API span out will slice if it can or allocate and copy if it has to).
    public bool TryRead(int length, out ReadOnlySpan<T> value);
    public bool TryRead(long length, out ReadOnlySequence<T> value);

    // Peeking out T, while skipping. This is more performant than users can write (avoids rewinding).
    public readonly bool TryPeek(long offset, out T value);

    // Equivalent "Peek" versions. They need a offset as peeking doesn't advance the reader and rewinding is super expensive.
    public readonly bool TryPeek(long offset, int length, out ReadOnlySpan<T> value);
    public readonly bool TryPeek(long offset, long length, out ReadOnlySequence<T> value);

    // Pairs with existing TryCopyTo(Span<T> destination), which does not advance the reader (neither does this)
    public readonly bool TryCopyTo(long offset, Span<T> destination);

}
}
```

terrajobst on 10 Sep 2019

👍3 🎉1

Shouldn't the parameter names be span and sequence for methods with either overload? Existing methods that provide both that follow that pattern, and it is useful for writing code like reader.TryReadTo(span: out var span, '\n').

Also, the existing methods have the out parameter first & I think that these should stay congruent with those methods.

reflectronic on 4 Nov 2019

Shouldn't the parameter names be span and sequence for methods with either overload? Existing methods that provide both that follow that pattern, and it is useful for writing code like reader.TryReadTo(span: out var span, '\n').

Being consistent with existing APIs make sense to me (rename value to span, sequence).

Also, the existing methods have the out parameter first & I think that these should stay congruent with those methods.

Maybe. I can see changing the TryRead APIs with the length parameter to have the out paramter first. I am not sure about the TryPeek ones that have offset/length. But again, like you said, they would be more consistent.

I would be fine with both changes, but curious to hear what others have to say (@terrajobst, @JeremyKuhne, @bartonjs). The suggestion is:

    public ref struct SequenceReader<T>
    {
-        public bool TryRead(int length, out ReadOnlySpan<T> value);
+        public bool TryRead(out ReadOnlySpan<T> span, int length);
-        public bool TryRead(long length, out ReadOnlySequence<T> value);
+        public bool TryRead(out ReadOnlySequence<T> sequence, long length);

-        public readonly bool TryPeek(long offset, out T value);
+        public readonly bool TryPeek(out T value, long offset);

-        public readonly bool TryPeek(long offset, int length, out ReadOnlySpan<T> value);
+        public readonly bool TryPeek(out ReadOnlySpan<T> span, long offset, int length);
-        public readonly bool TryPeek(long offset, long length, out ReadOnlySequence<T> value);
+        public readonly bool TryPeek(out ReadOnlySequence<T> sequence, long offset, long length);
   }

Here are the relevant existing APIs that shipped in 3.0 (for reference):
C# public readonly bool TryPeek(out T value) { throw null; } public bool TryRead(out T value) { throw null; } public bool TryReadTo(out System.Buffers.ReadOnlySequence<T> sequence, System.ReadOnlySpan<T> delimiter, bool advancePastDelimiter = true) { throw null; } public bool TryReadTo(out System.Buffers.ReadOnlySequence<T> sequence, T delimiter, bool advancePastDelimiter = true) { throw null; } public bool TryReadTo(out System.Buffers.ReadOnlySequence<T> sequence, T delimiter, T delimiterEscape, bool advancePastDelimiter = true) { throw null; } public bool TryReadTo(out System.ReadOnlySpan<T> span, T delimiter, bool advancePastDelimiter = true) { throw null; } public bool TryReadTo(out System.ReadOnlySpan<T> span, T delimiter, T delimiterEscape, bool advancePastDelimiter = true) { throw null; } public bool TryReadToAny(out System.Buffers.ReadOnlySequence<T> sequence, System.ReadOnlySpan<T> delimiters, bool advancePastDelimiter = true) { throw null; } public bool TryReadToAny(out System.ReadOnlySpan<T> span, System.ReadOnlySpan<T> delimiters, bool advancePastDelimiter = true) { throw null; }

ahsonkhan on 13 Nov 2019

👍1

out values always go after all non-out values (except defaulted parameters)

bartonjs on 13 Nov 2019

👍1

out values always go after all non-out values (except defaulted parameters)

There are some existing SequenceReader APIs that don't follow that though (out parameter is the first one for existing TryReadTo APIs). So, is it better to be consistent within the type here?

ahsonkhan on 13 Nov 2019

@ahsonkhan this meta issue has got approved label but seem that a lot of related issue are ready to review...can you confirm?I'd like to work on some of those but I don't know if make sense.