Latest Update:
New proposal to re-review: Enumerator returns Range
instead of ReadOnlySpan<T>
```C#
partial class MemoryExtensions
{
public static SpanSplitEnumerator
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable
public ref struct SpanSplitEnumerator<T> where T : IEquatable<T>
{
public SpanSplitEnumerator<T> GetEnumerator() { return this; }
public bool MoveNext();
public Range Current { get; }
}
}
**Previously approved API Proposal**
```C#
public static SpanSplitEnumerator<T> Split(this ReadOnlySpan<T> span, T seperator,
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
public ref struct SpanSplitEnumerator<T> where T : IEquatable<T>
{
public SpanSplitEnumerator GetEnumerator() { return this; }
public bool MoveNext();
public ReadOnlySpan<T> Current { get; }
}
Split off from https://github.com/dotnet/corefx/issues/21395#issuecomment-359342832
_From @Joe4evr on January 21, 2018 23:16_
Can I throw in another suggestion? I'd really like to see some ability to split a
ReadOnlySpan<char>
. Obviously, you can't return a collection ofSpan
s directly, but isn't that whatMemory<T>
is for?
// equivalent to the overloads of 'String.Split()'
public static IReadOnlyList<ReadOnlyMemory<char>> Split(this ReadOnlySpan<char> span, char[] seperator);
public static IReadOnlyList<ReadOnlyMemory<char>> Split(this ReadOnlySpan<char> span, char[] seperator, int count);
public static IReadOnlyList<ReadOnlyMemory<char>> Split(this ReadOnlySpan<char> span, char[] seperator, int count, StringSplitOptions options);
public static IReadOnlyList<ReadOnlyMemory<char>> Split(this ReadOnlySpan<char> span, char[] seperator, StringSplitOptions options);
public static IReadOnlyList<ReadOnlyMemory<char>> Split(this ReadOnlySpan<char> span, string[] seperator, int count, StringSplitOptions options);
public static IReadOnlyList<ReadOnlyMemory<char>> Split(this ReadOnlySpan<char> span, string[] seperator, StringSplitOptions options);
IReadOnlyList<T>
is to make the resulting collection indexable, just like string[]
. It would be nice if the implementation is ImmutableArray<T>
, but I'm not sure if that's a concern for distribution and such._From @ahsonkhan on January 22, 2018 01:36_
@Joe4evr, out of curiosity, do you have a scenario atm where these APIs would be useful? If so, can you please show the code sample?
_From @Joe4evr on January 22, 2018 08:35_
My scenario is taking a relatively big string of user input and then parsing that to populate a more complex object. So rather than take the whole string at once, I'd like to parse it in pieces at a time. It'd be pretty nice if this can be facilitated by the
Span
/Memory<T>
APIs so that this code won't have to allocate an extra 30-40 tiny strings whenever it runs.
Admittedly, I only started on this particular case earlier today, mostly to experiment and find out how much I could get out of the Span APIs at this time.
_From @stephentoub on January 22, 2018 08:41_
Splitting support would be good, but I don't think it would look like the proposed methods; as @ahsonkhan points out, that would result in a lot of allocation (including needing to copy the whole input string to the heap, since you can't store the span into a returned interface implementation).
I would instead expect a design more like an iterator implemented as a ref struct, e.g.
C# public ref struct CharSpanSplitter { public CharSpanSplitter(ReadOnlySpan<char> value, char separator, StringSplitOptions options); public bool TryMoveNext(out ReadOnlySpan<char> result); }
cc @KrzysztofCwalina, @stephentoub, @Joe4evr
Presumed usage would be like this?
var splitter = span.Split(',', StringSplitOptions.RemoveEmptyEntries);
while (splitter.TryMoveNext(out var slice))
{
//....
}
Yeah, looks good.
Would it be possible to implement the enumerable pattern, even if it's not possible to implement IEnumerable<T>
?
I think the interface would look something like:
```c#
public static SpanSplitEnumerable Split(this ReadOnlySpan
public ref struct SpanSplitEnumerable
{
public SpanSplitEnumerator GetEnumerator();
}
public ref struct SpanSplitEnumerator
{
public bool MoveNext();
public ReadOnlySpan
}
That way, the usage could be:
```c#
foreach (var slice in span.Split(','))
{
…
}
If indexing is important, something like ref struct SpanSplitList
would probably work, but I think it couldn't guarantee zero allocations.
Sorry, didn't see this issue before posting https://github.com/dotnet/corefx/issues/21395#issuecomment-359802015.
As mentioned, I'd love to see something like Google Guava's CharMatcher
for this. It could be useful for other scenarios as well.
It would be great if the split worked also with Span
It would be great if the split worked also with Span
and returned ReadOnlyMemory >, ReadOnlySpan >
Clarifying the comment so the generic types show up:
It would be great if the split worked also with Span<byte>
and returned ReadOnlyMemory<ReadOnlyMemory<T>>
, ReadOnlySpan<ReadOnlyMemory<T>>
Simple implementation for SpanSplitEnumerable<T> Split<T>(this ReadOnlySpan<T> span, T separator)
interface where the return type is foreachable. Unit tests included.
https://gist.github.com/LordJZ/92b7decebe52178a445a0b82f63e585a
The sentinel is a "clever trick" to avoid having a boolean field, can be replaced with a backing field.
We needed features like this on top of ASP.NET's StringTokenizer
type in the aspnet/WebHooks repo. See that repo's TrimmingTokenizer
and have seen some interest (aspnet/WebHooks#324) in making that more broadly available.
It would be great to point customers to something in the BCL instead of (say) adding TrimmingTokenizer
features to StringTokenizer
. What is the expected timeframe for the "Future" milestone that would include a dotnet/corefx#26528 fix?
/cc @davidfowl
@ahsonkhan if this proposal makes sense to you, perhaps you could help shepherd it to review, as it seems there would be volunteers to implement it?
cc @JeremyKuhne who has been doing thinking about low allocation string operations.
StringBuilder added an enumerator for the next version that we should consider when deciding on a pattern for enumerating spans. https://github.com/dotnet/coreclr/blob/master/src/System.Private.CoreLib/shared/System/Text/StringBuilder.cs#L587
if this proposal makes sense to you, perhaps you could help shepherd it to review, as it seems there would be volunteers to implement it?
Let me make sure the API shape is clear. Incorporating the recent feedback (to get foreach support, make it generic), here is the API:
```C#
public static SpanSplitEnumerable
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable
public ref struct SpanSplitEnumerable
{
public SpanSplitEnumerator
}
public ref struct SpanSplitEnumerator
{
public bool MoveNext();
public ReadOnlySpan
}
```
@KrzysztofCwalina this is the enumerating spans issue we discussed yesterday.
Does StringSplitOptions
really make sense for a generic T
?
Does
StringSplitOptions
really make sense for a genericT
?
I suppose a SpanSplitOptions
which would mimic StringSplitOptions
makes more sense here.
Should the enumerator and the enumerable be combined? i.e. what's the reason to have two types?
@KrzysztofCwalina I think it makes the API cleaner and easier to understand. The foreach
pattern requires an enumerable and an enumerator, and that's exactly what the proposed API provides.
It there any advantage in combining them, apart from decreasing the size of the API surface?
We don't think we want these APIs:
String.Split
If you care about allocations, then you want a different API. And if you don't care about allocations, well, then you can just use String.Split
.
So what API would we like to see? It could be a struct-based enumerator that allows the consumer to foreach
the individual spans without allocations. Alternatively (or additionally) we could have a method that allows the consumer to pass in a buffer with the locations, which, for the most part, could be stackalloc
ated.
We don't think we want these APIs:
1. They allocate
I readily concur with this. If I don't care about allocations, I'll simply be using string.Split
. If I care enough to want to avoid allocating substrings, then obviously I would like to avoid _any_ allocations whatsoever.
Suggest ReadOnlySpan<T>.Split
returns ReadOnlySpan<ReadOnlySpan<T>>
on the stack so there is no more allocation.
If I want to use foreach
, which allocations, then I can still enumerate over the ReadOnlySpan<ReadOnlySpan<T>>
. Otherwise, I'll use a simple for
loop to have zero allocation.
Can you use ReadOnlySpan<T>
as a generic argument?
@schungx
Suggest
ReadOnlySpan<T>.Split
returnsReadOnlySpan<ReadOnlySpan<T>>
on the stack so there is no more allocation.
Split
can't return a Span
that was stack allocated by itself. It could take that Span
as an additional parameter, but then you have to somehow handle the case where that Span
is not big enough. I believe this is what @terrajobst meant by "we could have a method that allows the consumer to pass in a buffer with the locations".
@terrajobst, the API that @svick proposed above does not seem to allocate. Am I missing something? Or are you saying that we want both convenience and no allocations?
@svick, I wonder how important it is for this enumerator to be easy to understand. The classic enumerator pattern has two types for many reasons that are less and less applicable once you deal with by ref structs that cannot implement enumerator interfaces and get copied when passed around. I do agree we should discuss pros and cons and maybe indeed it's better to have two types, but I wanted to open the discussion as it seems such an overkill to add two by ref types just so we can split.
@KrzysztofCwalina I agree that my argument of "this version of the API is slightly easier to understand" is fairly weak in this case. But in my opinion, the argument of "this version of the API is slightly smaller" is even weaker, which is why I prefer to have two types.
But ultimately it doesn't make that much of a difference, as long as I'll be able to have foreach (var slice in span.Split(','))
without allocations, I'll be happy.
@svick I'm suspect a normal foreach
will always allocate because the current state must be kept somewhere, even if you somehow come up with value-based iterators...
And what is the _purpose_ of this again? That's to prevent allocations in parsing code in tight loops, where all we're doing is manipulating streams of text without ever changing them.
@schungx There's nothing really new to come up with. For example, foreach
over a List<T>
already does not allocate on the heap, because the current state is kept in the enumerator struct
on the stack. A very similar approach can be used here.
- They allocate
It could be a struct-based enumerator that allows the consumer to
foreach
the individual spans without allocations.
Should the enumerator and the enumerable be combined? i.e. what's the reason to have two types?
The APIs, as suggested here, don't allocate and are essentially what was recommended. Maybe it was missed since it wasn't at the top. Will copy it over to the top post.
Making it into a single type:
```C#
public static SpanSplitEnumerator
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable
public ref struct SpanSplitEnumerator
{
public SpanSplitEnumerator GetEnumerator() { return this; }
public bool MoveNext();
public ReadOnlySpan
}
```
char
, so leaving it as T
is fine.Span<T>
, we should make sure we canSpanSplitEnumerator
to ReadOnlySpanSplitEnumerator
to MemoryExtensions
StringSplitOptions
might feel odd, but (1) we've never extended it and (2) and the operations are about how the split is performed, which applies to this API tochar[]
(matches any) we'd add a new method called SplitAny
so that we can also have the equivalent of string
(matches the sequence)Are we going to have an overload that takes ReadOnlySpan
I tried implementing the APIs above, and additionally a two more split methods in my personal repository. The types and methods themselves can be found in ConsoleApp4
(lol) folder, and the tests can be found in SplitTests
folder.
public static ReadOnlySpanSplitEnumerator<T> Split<T>(this ReadOnlySpan<T> span, T delimiter,
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
public static ReadOnlySpanSplitBySequenceEnumerator<T> Split<T>(this ReadOnlySpan<T> span,
ReadOnlySpan<T> delimiter, StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
public static ReadOnlySpanSplitByAnyEnumerator<T> SplitByAny<T>(this ReadOnlySpan<T> span,
ReadOnlySpan<T> delimiters, StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
The first one replaces string.Split(char)
, the second one replaces string.Split(string)
, and the third replaces string.Split(char[])
. The last two are not really included in the original discussion (and would take a bit of time to go through approval process even if it does), but I personally think it is worth including to have the complementary alternatives as well.
It's not a great code perhaps, but I'd love to open a PR for this issue (with the code above). Although the last two methods are not approved, at least I can open one for the regular T
version.
A slightly different take on this. Considering that Range
is now available, and the non-zero cost of Slice
on potentially-many results, what if the enumerable
returned Range
s instead of sliced RoS
. In other words, a list of delineations. The consumer could then decide when/if to Slice(range)
foreach (var range in span.Split(','))
{
var x = span.Slice(range);
}
Perhaps that's too abstract for a public api.
Did some experimenting with this as well.
Not sure if it's an optimal solution though.
https://github.com/bbartels/coreclr/blob/master/src/System.Private.CoreLib/shared/System/MemoryExtensions.Split.cs
@grant-d I like that API
That seems reasonable to me.
Only problem is that it would conflict with the approved api, as only the return type differs.
Correct; it would need to go back into api triage
Returning ranges has some real advantages, namely we can operate on a ReadOnlySpan<T>
but then apply the results to other types, in particular Span<T>
, Memory<T>
, and ReadOnlyMemory<T>
, e.g.
C#
foreach (Range r in memory.Span.Split(':')) Process(memory[r]);
Correct; it would need to go back into api triage
To clarify, can you share the proposed range based API shape and then we can reconcile it/review it?
I can update the original post if you provide the precise shape, and we can re-review.
I suppose it's just substituting Range for ReadOnlySpan\
So something like:
public static SpanSplitEnumerator<T> Split(this ReadOnlySpan<T> span, T seperator,
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
public ref struct SpanSplitEnumerator<T> where T : IEquatable<T>
{
public SpanSplitEnumerator<T> GetEnumerator() { return this; }
public bool MoveNext();
public Range Current { get; }
}
I don't think overloading Slice would work as stephen suggested (Probably just a typo)
I didn't suggest overloading Slice. If a type has a Slice(int, int)
method, the C# 8 compiler now supports passing a Range
as an indexer on the type and the compiler expands it into a call to the Slice(int, int)
method.
I didn't suggest overloading Slice.
memory.Span.Slice(':')
I believe this is what's causing the confusion.
Returning ranges has some real advantages, namely we can operate on a
ReadOnlySpan<T>
but then apply the results to other types, in particularSpan<T>
,Memory<T>
, andReadOnlyMemory<T>
, e.g.foreach (Range r in memory.Span.Slice(':')) Process(memory[r]);
I meant in the foreach, you used memory.Span.Slice(':').
Ah, sorry, I meant Split, not Slice.
Maybe it would make sense to add another enumerator, which would allow for splitting by a sequence.
Just like string.Split("??") is possible.
public static SpanSplitEnumerator<T> Split(this ReadOnlySpan<T> span, T seperator,
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
public static SpanSplitSequenceEnumerator<T> Split(this ReadOnlySpan<T> span, ReadOnlySpan<T> seperators,
StringSplitOptions options = StringSplitOptions.None) where T : IEquatable<T> {}
public ref struct SpanSplitEnumerator<T> where T : IEquatable<T>
{
public SpanSplitEnumerator<T> GetEnumerator() { return this; }
public bool MoveNext();
public Range Current { get; }
}
public ref struct SpanSplitSequenceEnumerator<T> where T : IEquatable<T>
{
public SpanSplitSequenceEnumerator<T> GetEnumerator() { return this; }
public bool MoveNext();
public Range Current { get; }
}
Could also be merged into one enumerator that just holds an ReadOnlySpan
A split api that returns ranges is pure genius @grant-d. That would be a really nice api to aim for.
Sad to see we haven't made progress on this!
Can I suggest allowing to pass a delegate bool IsSeparatorPredicate(ReadOnlySpan<T> span)
?
This will enable splitting on surrogate pairs etc.
@scalablecory, how about Rune
? (and ReadOnlySpan of Rune)
Rune
doesn't solve grapheme clusters, and dealing with normalization would become unwieldy. A predicate would handle all cases.
Isn't one of the main points within the Span
Can I suggest allowing to pass a delegate bool IsSeparatorPredicate(ReadOnlySpan
span)?
And turn split into an O(N^2) algorithm?
I think this should be kept relatively simple, making the common case easy.
Can I suggest allowing to pass a delegate bool IsSeparatorPredicate(ReadOnlySpan span)?
And turn split into an O(N^2) algorithm?
I think this should be kept relatively simple, making the common case easy.
I don't see how this would be any different than String.Split(string[])
I don't see how this would be any different than String.Split(string[])
Because string.Split(string[]) takes an array of the exact strings to use, which means a) the implementation knows exactly how many characters need to be compared, and b) regardless string matching algorithms exist to find substrings in O(N+M) time rather than the naive O(N*M) time. Your proposed IsSeparatePredicate that takes a span would presumably need to be invoked at every character and for every length after that character until the end of the string to determine whether it represents a separator.
I don't see how this would be any different than String.Split(string[])
Because string.Split(string[]) takes an array of the exact strings to use, which means a) the implementation knows exactly how many characters need to be compared, and b) regardless string matching algorithms exist to find substrings in O(N+M) time rather than the naive O(N*M) time. Your proposed IsSeparatePredicate that takes a span would presumably need to be invoked at every character and for every length after that character until the end of the string to determine whether it represents a separator.
Oh, I see the confusion; my fault. I should not make suggestions right before going to sleep!
Imagine something more like: bool IsSeparatorPredicate(ReadOnlySpan span, out int length)
, which would be called only once per character. it gets passed the full remaining span, and returns the length of the separator to skip in the out param.
Imagine something more like: bool IsSeparatorPredicate(ReadOnlySpan span, out int length), which would be called only once per character. it gets passed the full remaining span, and returns the length of the separator to skip in the out param.
Don't think I would be opposed to this idea, but I definitely think there should be a separate (simpler) type that would just handle the cases of splitting on a single character or splitting on a sequence.
there should be a separate (simpler) type
And cheaper. Adding O(N) delegate invocations in an API whose purpose is to reduce overheads is not the right choice. We should start with the 99% case, and only add additional complexity/cost when and where there's a real need.
Agreed, not proposing to get rid of the simple overload :)
Based on the conversation, we believe we would prefer something more scoped:
```C#
partial class MemoryExtensions
{
public static SpanSplitEnumerator
public static SpanSplitEnumerator
public static SpanSplitEnumerator
// This API seems possible, but we can defer until we see scenarios
// public static SpanSplitEnumerator<char> SplitAny(this ReadOnlySpan<char> span, ReadOnlySpan<char> separators);
// This API seems possible, but we should defer until we see scenarios. It wouldn't be good make this work for UTF8, for example.
// public static SpanSplitEnumerator<T> Split<T>(this ReadOnlySpan<T> span, T separator);
}
```
The enumerator returns ranges so you cans split a memory via this API and use the span to slice the memory and to unify span and read-only spans. It's unclear whether we need different enumerator types for different Split
behaviors, but we don't expect people to exchange those, which is why they are currently proposed to be nested types too.
I should have time in the next couple of weeks to tackle this
Having this for bytes would be useful as well.....
Unless I am missing something, with the currently proposed SpanSplitEnumerator:
public static SpanSplitEnumerator<char> Split(this ReadOnlySpan<char> span, string separator);
would not work with the other overloads. At least not without an allocation.
If the SpanSplitEnumerator holds a single T as a separator, one could not split on string, and if the type holds a ReadOnlySpan\
One solution could be to have a separate type for splitting on sequences.
Threw together a quick implementation of this to showcase this issue:
https://gist.github.com/bbartels/87c7daae28d4905c60ae77724a401b20
@bbartels we also decided that the enumerator type may differ depending on overload as needed.
@davidfowl we're hoping to defer anything regarding UTF-8 until we iron out Char8
, so we don't have duplicate byte
and Char8
APIs.
If the SpanSplitEnumerator holds a single T as a separator, one could not split on string, and if the type holds a ReadOnlySpan
then you would somehow need to pass the single char separator to the type.
It could hold both and just use one of them, with a bool or enum value dictating which (or neither) to use.
Not really sure what the corefx team would value more, fewer types overall and less code duplication or a slimmer memory footprint on each individual type.
For now I went with two different types based on splitting on a single length separator and a multi length one.
But could change it to what @stephentoub suggested if deemed better!
Added some tests to verify correctness of my proposed implementation. (Tests were adapted from the official string.split tests)
https://gist.github.com/bbartels/87c7daae28d4905c60ae77724a401b20
May I suggest another overload for enumerating backwards through the sequence?
Just came across a scenario today where it would have required me to check equality of a splitted sequence starting at the last entry. This way I wouldn't have to store the result somewhere before I iterate through them in reverse.
Something like:
partial class MemoryExtensions
{
public static SpanSplitEnumerator<char> Split(this ReadOnlySpan<char> span, bool reverse = false);
public static SpanSplitEnumerator<char> Split(this ReadOnlySpan<char> span, char separator, bool reverse = false);
public static SpanSplitEnumerator<char> Split(this ReadOnlySpan<char> span, string separator, bool reverse = false);
}
Or you could add a Reverse
method to SpanSplitEnumerator<char>
that returns another enumerator that goes in reverse?
I like the idea, but I feel like it might be a little ambiguous. As in, does it return a completely new enumerator that has no previous state, or assuming you've already called MoveNext() a couple times does the returned enumerator preserve that state and just goes backwards from there.
@bbartels That could be solved by splitting SpanSplitEnumerator
into enumerable and enumerator: only the enumerable would have Reverse()
, so there is no confusion about what calling Reverse()
in the middle of enumerating does, because it's not possible. (Note that I've unsuccessfully argued for that split before, for a different reason.)
I'd actually like a Reverse() method on the SpanSplitEnumerator that doesn't return anything, just flips a switch to change direction. This way you could use it as a lookahead: If a certain condition is true for the current element, you could go back and do some extra processing on the previous.
Would 'require' the aforementioned overloads for the Split methods, as you'd otherwise have to manually get the enumerator and call Reverse before you could foreach over it.
@bbartels you can do lookahead and reset with current spec already; just save a copy of the enumerator and restore if you need to back up. granted, you can't easily go back a variable number of times...
@grant-d I assigned this to you. thanks!
_[Edited]_
Question about desired behavior.
Compare the existing string
idiom (assume we _keep_ empty entries):
// 012345
"|ab|c|".Split('|'); // string[4] { "", "ab", "c", --> "" <-- }
Note the final ""
on the _rhs_ boundary.
If we use a comparable Range
enumerator instead (per this issue), then the results would _ostensibly_ be:
Range(0, 0) // <empty> (ok)
Range(1, 3) // ab (ok)
Range(4, 5) // c (ok)
Range(6, 6) // <empty> (oops!)
The final value is questionable since start=6
would be IndexOutOfRange
.
So we have a different semantic to the existing idiom.
Range(5, 5)
which sorta kinda parallels Range(0, 0)
. Though it breaks the pattern of next.start = previous.end + 1
, eg (1, _3_) -> (_4_, 5)
@terrajobst?
I think Range(6,6)
is more appropriate. There shouldn't be any IndexOutOfRangeException
, because anything using the range shouldn't read >= the exclusive end index.
c#
"|ab|c|"[6..6].ToString() == ""
Though in the case of Range(6, 6), I think the first element should then be Range(-1, -1) to retain semantic equivalence.
I am busy implementing and noticed there's this overload:
Split(this ReadOnlySpan<char> span, string separator);
,
but no corresponding overload with:
Split(this ReadOnlySpan<char> span, string separator, StringComparison comparisonType);
Is that a spec miss, should I add it?
Or was it intentional, in which case is the behavior Ordinal
?
Please un-assign me, I am not going to get to this any time soon. Thanks
Done. Thanks.
I already have a working implementation that conforms with https://github.com/dotnet/corefx/issues/26528#issuecomment-523152939: https://gist.github.com/bbartels/87c7daae28d4905c60ae77724a401b20.
I could submit a PR once it is migrated to dotnet/runtime?
@bbartels please do :)
I would love to see these extensions for ReadOnlySpan
@bbartels , dotnet/runtime is open for business :)
@danmosemsft I really should have commented here, I already submitted a PR a couple weeks ago!
See https://github.com/dotnet/runtime/pull/295
Could use some feedback over there regarding the bullet points I noted!
Ran into a good use case for this just today. Would love to see this finished up.
One minor nitpick though: doesn't having a single type acting as both enumerable and enumerator create a less-than-obvious inconsistency with the rest of .NET?
var split = "a,b,c".Split(',');
foreach (var letter in split)
{
// Hit three times
}
foreach (var letter in split)
{
// Hit three times
}
var splitSpan = "a,b,c".AsSpan().Split(',');
foreach (var letter in splitSpan)
{
// Hit three times
}
foreach (var letter in splitSpan)
{
// Never hit because splitSpan was already consumed by first foreach
}
@jonathanmarston
One minor nitpick though: doesn't having a single type acting as both enumerable and enumerator create a less-than-obvious inconsistency with the rest of .NET?
The compiler creates a copy of the struct enumerator before enumerating (SharpLab Example).
The compiler creates a copy of the struct enumerator before enumerating
Ah yes. I see it now. Thanks!
is this already available? 🥺
@wstaelens Still waiting for the PR (https://github.com/dotnet/runtime/pull/295) to be reviewed/approved, but feel free to copy the implementation into your codebase if you need it early.
Reopening since we'll need to send this through review again.
Most helpful comment
A slightly different take on this. Considering that
Range
is now available, and the non-zero cost ofSlice
on potentially-many results, what if theenumerable
returnedRange
s instead of slicedRoS
. In other words, a list of delineations. The consumer could then decide when/if toSlice(range)
Perhaps that's too abstract for a public api.