Runtime: API Proposal: Helper to get ReadOnlySpan<char> from string with terminator

Created on 14 Mar 2018  ·  10 Comments  ·  Source: dotnet/runtime

In some cases we want null terminated spans of char (interop for example). There isn't a convenient helper to get a span from a string with the terminator included.

namespace System
{
    public static partial class MemoryExtensions
    {
        public static ReadOnlySpan<char> AsSpan(this string text);
+       public static ReadOnlySpan<char> AsSpanWithTerminator(this string text); 
    }
}

This is useful when you want to take a ReadOnlySpan as a parameter that you then need to pass to something (notably a native API) that expects a null terminated string. We can easily validate that we have a null and not have to copy.

public static void Foo(ReadOnlySpan<char> bar)
{
    if (bar.Length > 0 && bar[bar.Length - 1] == '\0')
    {
        // Already null terminated: pass the data through without copying.
        NativeFoo(ref MemoryMarshal.GetReference(bar));
    }
    else
    {
        // make a copy, add a null, and call NativeFoo()
    }
}
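The fallback branch of Foo could be filled in roughly as follows. This is only a sketch: NativeFoo here is a hypothetical managed stand-in for a native function that reads UTF-16 chars up to the first '\0', the class name TerminatorSketch is made up for illustration, and the code assumes the project is compiled with unsafe blocks enabled.

```csharp
using System;
using System.Buffers;

public static class TerminatorSketch
{
    // Hypothetical stand-in for a native API that expects a null-terminated
    // UTF-16 string. It returns the number of chars before the first '\0'.
    private static unsafe int NativeFoo(char* text)
    {
        int length = 0;
        while (text[length] != '\0')
        {
            length++;
        }
        return length;
    }

    public static unsafe int Foo(ReadOnlySpan<char> bar)
    {
        if (bar.Length > 0 && bar[bar.Length - 1] == '\0')
        {
            // Fast path: the terminator is inside the span, so no copy is needed.
            fixed (char* p = bar)
            {
                return NativeFoo(p);
            }
        }

        // Slow path: copy into a pooled buffer with room for one extra char.
        char[] rented = ArrayPool<char>.Shared.Rent(bar.Length + 1);
        try
        {
            bar.CopyTo(rented);
            rented[bar.Length] = '\0';
            fixed (char* p = rented)
            {
                return NativeFoo(p);
            }
        }
        finally
        {
            ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

Both `TerminatorSketch.Foo("abc\0".AsSpan())` and `TerminatorSketch.Foo("abc".AsSpan())` reach the stand-in with a terminated buffer; only the second one pays for a copy.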

Notes

  • Perhaps this should live in MemoryMarshal as it is a little more advanced? It isn't a terribly difficult concept though.
  • Should also return default if passed null for string.
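For illustration, the proposed helper might be implemented along these lines. This is a sketch and not the runtime's code: the class name is hypothetical, and it relies on the assumption that the runtime stores a '\0' immediately after the last char of every string (the same property that `fixed (char* p = str)` relies on).

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class TerminatorExtensionsSketch
{
    // Hypothetical implementation of the proposed AsSpanWithTerminator.
    // Assumption: every string is followed in memory by a '\0' char.
    public static ReadOnlySpan<char> AsSpanWithTerminator(this string text)
    {
        if (text is null)
        {
            // Mirror AsSpan(null): return default rather than throw.
            return default;
        }

        // GetPinnableReference returns a reference to the first char, or to
        // the terminator itself when the string is empty.
        ref char first = ref Unsafe.AsRef(in text.GetPinnableReference());

        // Extend the span by one so the terminator is included in Length.
        return MemoryMarshal.CreateReadOnlySpan(ref first, text.Length + 1);
    }
}
```

With this sketch, `"abc".AsSpanWithTerminator()` has Length 4 and its last element is '\0', which is exactly the Length behavior debated in the comments.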

cc: @KrzysztofCwalina, @terrajobst, @ahsonkhan, @jkotas

api-needs-work area-System.Memory

All 10 comments

We could also do:
public static partial class MemoryExtensions
{
    public static ReadOnlySpan<char> AsSpan(this string text);
    public static ReadOnlySpan<char> AsSpan(this string text, bool includeTerminator);
}

I think it is odd for the Length property to include the null terminator.

Instead, what I think we really need is a naming convention or a new type for zero-terminated spans. When I am looking at Span<char> variable, I need to be able to tell whether it is zero terminated or not.

if (bar.Length > 0 && bar[bar.Length] == '\0')

This is invalid code in the general case. When you get a Span<char> bar from somewhere, you cannot index at bar[bar.Length]. You cannot assume that the memory at bar[bar.Length] is there, or that it is immutable.
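A small sketch of the point, using a span sliced out of a longer string. The class and method names are hypothetical. The span indexer is bounds-checked, and the char that follows the slice in memory is ordinary data rather than a terminator.

```csharp
using System;

public static class SliceBoundarySketch
{
    // Returns true when indexing one element past the end of the span throws.
    public static bool IndexingPastEndThrows(ReadOnlySpan<char> bar)
    {
        try
        {
            _ = bar[bar.Length]; // one past the last valid index
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    // Returns true when the char following the first `length` chars of
    // `source` is a null terminator.
    public static bool CharAfterSliceIsTerminator(string source, int length)
    {
        return source[length] == '\0';
    }
}
```

For `"hello world".AsSpan(0, 5)`, IndexingPastEndThrows returns true, and CharAfterSliceIsTerminator("hello world", 5) returns false because the next char is a space.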

This is invalid code in the general case.

Whoops, mistype. Thanks!

I expect that similar off-by-one mistypes like this would happen a lot if the Length property of Span<char> included the zero terminator some of the time.

I think it is odd for the Length property to include the null terminator.

For types that are meant to represent strings only I agree, but spans are general purpose and roughly analogous to array and array.Length I think.

Instead, what I think we really need is a naming convention or a new type for zero-terminated spans. When I am looking at Span<char> variable, I need to be able to tell whether it is zero terminated or not.

How about extensions of IsZeroTerminated/ZeroTerminate for spans?

We have made the decision to use Span<char> as the equivalent of strings. Span<char> is analogous to System.String, not array.

I think if somebody needs to check whether the last char of a Span<char> is zero when they use Span<char> in a non-standard way, they can just write the code for it. I do not think there should be an extension method for it. If we really think there needs to be a method for it, it should be on MemoryMarshal because it is very interop-specific.

We have made the decision to use Span<char> as the equivalent of strings. Span<char> is analogous to System.String, not array.

And you can have embedded nulls in System.String but you can't implicitly terminate Span. I'm not sure why it is a problem to _explicitly_ include the 0. Implicitly I get.

I do not think there should be an extension method for it.

Actually I think we need a general extension of bool EndsWith<T>(this ReadOnlySpan<T> span, T value)
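A minimal sketch of such a general extension. It is named EndsWithValue here (a hypothetical name) so it does not collide with any existing EndsWith overloads, and it uses EqualityComparer<T>.Default so it also works for element types that may be null.

```csharp
using System;
using System.Collections.Generic;

public static class SpanEndsWithSketch
{
    // True when the span is non-empty and its last element equals value.
    public static bool EndsWithValue<T>(this ReadOnlySpan<T> span, T value)
    {
        return span.Length > 0
            && EqualityComparer<T>.Default.Equals(span[span.Length - 1], value);
    }
}
```

The terminator check from the proposal then reads `bar.EndsWithValue('\0')`.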

If we really think there needs to be a method for it, it should be on MemoryMarshal because it is very interop-specific.

I agree for ZeroTerminated verbiage.

Yes, creating non-standard zero-terminated spans explicitly is OK. But they should be kept as local micro-optimizations. They should not leak through the system. When they do leak into an internal implementation, it should be clear that it is a non-standard Span. I think that every method argument or return value that is a zero-terminated Span<char> should have a comment next to it that says it is a zero-terminated Span<char>.

dotnet/runtime uses Span a lot. There do not seem to be any cases that would benefit from this API in dotnet/runtime libraries. This API is unlikely to be useful.
