Runtime: Add AsString() extension method to ReadOnlySpan<char>

Created on 25 Jan 2018  路  20Comments  路  Source: dotnet/runtime

Rationale

We have the ability to create a string from a ReadOnlySpan ```C#
public String(System.ReadOnlySpan value) { }

However, someone using the portable System.Memory does not have the ability to convert a ReadOnlySpan\<char\> into a string.

We can resolve that by providing an extension method on ReadOnlySpan\<char\>. **Note:** This will clearly result in an allocation.

## API Proposal
**Alternative:** Change the behaviour of existing ToString method
```C#
namespace System
{
    public readonly ref partial struct ReadOnlySpan<T>
    {
// For char, it would do the equivalent of new string(char*,int) or new string(ReadOnlySpan<char>), 
// and for other Ts, we could either just output "System.Span<T>[Length]" or some concatenation of
// ToString of each element.
        public override string ToString() { throw null; }
    }
}

Original:
```C#
namespace System
{
public static partial class MemoryExtensions
{
// Alternative: something like CopyToString,
// possibly with additional args to control the amount to copy?
public static string AsString(this ReadOnlySpan source);
}
}

## Notes
The .NET Core implementation of this would just call the string constructor in corelib under the covers.

There is probably a way to get to the underlying string from the Pinnable\<char\> internal field for the portable span. Otherwise, we could copy the string data. Something like:
```C#
public static unsafe string AsString(this System.ReadOnlySpan<char> source)
{
    string result = new string(' ', source.Length);
    fixed (char* dest = result, src = &MemoryMarshal.GetReference(source))
    {
        for (int i = 0; i < source.Length; i++)
        {
            *(dest + i) = *(src + i);
        }
    }
    return result;
}

This could be useful, if you need to, let's say combine the Directory property on FindData with a filename string to generate a full path (from https://github.com/dotnet/corefx/issues/25873).

Sample Usage

```C#
string directory = FindData.Directory;
ReadOnlySpan fileNameSpan = FindData.FileName;
string fullPath = directory + fileNameSpan.AsString();

Assert.Equal("Hello", "Hello".AsReadOnlySpan().AsString());
```

cc @KrzysztofCwalina, @jkotas, @stephentoub, @JeremyKuhne, @dotnet/corefxlab-contrib

api-approved area-System.Memory

Most helpful comment

Note: This will clearly result in an allocation.

Not sure AsString sounds expensive enough to represent an api that results in an allocation from a Span (stack -> heap); it sounds more like a lightweight conversion of representation. i.e. its not clear

ReadOnlySpan<char> span = "Test value";
int idx = span.IndexOf(' ');
string s = ((idx < 0) ? span : span.Slice(0, idx)).AsString(); // Allocation might be unexpected

Span.ToString() will work fine once dotnet/corefx#26467 lands.

.ToString is an expected allocation. Would it cause confusion that on Span generally it would produce a diagnostic output and on char it produces a interop output? (e.g. additional method as well for ToString for {ReadOnly}Span<char>)

Looking at other potential allocating methods; reclaiming .ToString might be best? Would that work for portable span?

ToArray(this Span<T> source)
ToArray(this ReadOnlySpan<T> source)
ToList(this Span<T> source)
ToList(this ReadOnlySpan<T> source)

All 20 comments

Ahson and I talked about this earlier today. I think this is a very useful proposal. For API reviewers: is there concern about having a method very similarly named to ToString, which is known to lead to unhappiness when called on a ref struct?

One option I'd like to see is the ability to terminate at null. It is a pretty common pattern for strings to come back null terminated in fixed size buffer. https://github.com/dotnet/corefx/blob/40cbd5636851093309b841db4d53af369b7ef1c6/src/Common/src/System/Memory/FixedBufferExtensions.cs#L14

is there concern about having a method very similarly named to ToString, which is known to lead to unhappiness when called on a ref struct

Span.ToString() will work fine once https://github.com/dotnet/corefx/pull/26467 lands.

Ahson and I talked about this earlier today. I think this is a very useful proposal.

This method is redundant with String constructor that takes Span. I am wondering why we have both AsXXX and XXX constructors for a pretty much everything. Should be have just the AsXXX methods and drop the constructors? What's the value of redundant APIs?

let's say combine the Directory property on FindData with a filename string to generate a full path

This is not a very good example. For this, we should have Concat overload that takes Spans.

public static unsafe string AsString(this System.ReadOnlySpan<char> source)
{
    string result = new string(' ', source.Length);
    fixed (char* dest = result, src = &MemoryMarshal.GetReference(source))
    {
        for (int i = 0; i < source.Length; i++)
        {
            *(dest + i) = *(src + i);
        }
    }
    return result;
}

This is bad hacky way to implement this. The proper way to implement this is using String constructor that takes char* and length.

@JeremyKuhne Fear not! String instances are always null terminated regardless of how they're constructed.

Fear not! String instances are always null terminated regardless of how they're constructed.

@GrabYourPitchforks True, but what I'm looking for is constructing them without embedded nulls. I might have a char buffer with a possibly null terminated string in it. I want to construct it to either (1) the first null char '\0' or (2) .Length if no null is present. As strange as it might seem it is pretty common in native interop scenarios.

This method is redundant with String constructor that takes Span. I am wondering why we have both AsXXX and XXX constructors for a pretty much everything. Should be have just the AsXXX methods and drop the constructors? What's the value of redundant APIs?

In theory we could optimize this to unwrap strings wrapped in spans as the functionality gets implemented in the various runtimes. Thinking of MSBuild using Span on desktop in particular. Is there another way to accomplish this or am I missing something that makes that impossible?

@JeremyKuhne Ah, so something like this, but as a single helper extension method?

ReadOnlySpan<char> span = ...;
int idx = span.IndexOf('\0');
string s = ((idx < 0) ? span : span.Slice(0, idx)).AsString();

@GrabYourPitchforks Yes. I linked where I've already created an internal wrapper that does this in my original comment.

Note: This will clearly result in an allocation.

Not sure AsString sounds expensive enough to represent an api that results in an allocation from a Span (stack -> heap); it sounds more like a lightweight conversion of representation. i.e. its not clear

ReadOnlySpan<char> span = "Test value";
int idx = span.IndexOf(' ');
string s = ((idx < 0) ? span : span.Slice(0, idx)).AsString(); // Allocation might be unexpected

Span.ToString() will work fine once dotnet/corefx#26467 lands.

.ToString is an expected allocation. Would it cause confusion that on Span generally it would produce a diagnostic output and on char it produces a interop output? (e.g. additional method as well for ToString for {ReadOnly}Span<char>)

Looking at other potential allocating methods; reclaiming .ToString might be best? Would that work for portable span?

ToArray(this Span<T> source)
ToArray(this ReadOnlySpan<T> source)
ToList(this Span<T> source)
ToList(this ReadOnlySpan<T> source)

Not sure AsString sounds expensive enough [...] it sounds more like a lightweight conversion of representation

I've made my own extension method for ReadOnlySpan<char> and named it Materialize(), I think something like that might work.

I like the API, but I don't think it should be called AsString. I think "As" implies no allocation. If we could simply override ToString (for span of chars), it would be great.

If we could simply override ToString (for span of chars), it would be great.

We could change the {ReadOnly}Span.ToString implementation to just do the equivalent of ToString on each element (rather than what Ian's checking in, which I believe just outputs Span[Length]), with a specialized implementation for char to do it with the string ctor and thus more efficiently, but that won't address the core of this issue: supporting this out-of-band, where an override can't be used.

We could also just have a MemoryExtensions.ToString non-extension static method.

I think MemoryExtensions.ToString worse than calling the string constructor and passing in the span.

that won't address the core of this issue: supporting this out-of-band, where an override can't be used.

Is this really about OOB or rather fast/slow span?

(Nevermind what I wrote about the override; I wasn't thinking clearly. Any Span implementation we ship could override it.)

As for ME.ToString being worse than string ctor, the point of this issue is being able to use that functionality without needing .net core 2.1.

Anyway, maybe just changing the ToString override makes the most sense; for char it could just do the equivalent of new string(char*,int) or new string(ReadOnlySpan<char>), and for other Ts we could either just do the Span[Length] thing or some concatenation of ToString of each element.

Anyway, maybe just changing the ToString override makes the most sense; for char it could just do the equivalent of new string(char*,int) or new string(ReadOnlySpan<char>), and for other Ts we could either just do the Span[Length] thing or some concatenation of ToString of each element.

Are we OK with different behaviours of ToString() on Span

C# Span<char> span = new char[10]; ReadOnlySpan<char> roSpan = "abcde".AsReadOnlySpan(); span.ToString(); //"System.Span<char>[10];" roSpan.ToString(); //"abcde";

Span<char> and ReadOnlySpan<char> should have the same behavior here.

We should probably do both.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

noahfalk picture noahfalk  路  3Comments

Timovzl picture Timovzl  路  3Comments

jzabroski picture jzabroski  路  3Comments

EgorBo picture EgorBo  路  3Comments

chunseoklee picture chunseoklee  路  3Comments