We have the ability to create a string from a ReadOnlySpan
public String(System.ReadOnlySpan
However, someone using the portable System.Memory does not have the ability to convert a ReadOnlySpan\<char\> into a string.
We can resolve that by providing an extension method on ReadOnlySpan\<char\>. **Note:** This will clearly result in an allocation.
## API Proposal
**Alternative:** Change the behaviour of existing ToString method
```C#
namespace System
{
public readonly ref partial struct ReadOnlySpan<T>
{
// For char, it would do the equivalent of new string(char*,int) or new string(ReadOnlySpan<char>),
// and for other Ts, we could either just output "System.Span<T>[Length]" or some concatenation of
// ToString of each element.
public override string ToString() { throw null; }
}
}
Original:
```C#
namespace System
{
public static partial class MemoryExtensions
{
// Alternative: something like CopyToString,
// possibly with additional args to control the amount to copy?
public static string AsString(this ReadOnlySpan
}
}
## Notes
The .NET Core implementation of this would just call the string constructor in corelib under the covers.
There is probably a way to get to the underlying string from the Pinnable\<char\> internal field for the portable span. Otherwise, we could copy the string data. Something like:
```C#
public static unsafe string AsString(this System.ReadOnlySpan<char> source)
{
string result = new string(' ', source.Length);
fixed (char* dest = result, src = &MemoryMarshal.GetReference(source))
{
for (int i = 0; i < source.Length; i++)
{
*(dest + i) = *(src + i);
}
}
return result;
}
This could be useful, if you need to, let's say combine the Directory property on FindData with a filename string to generate a full path (from https://github.com/dotnet/corefx/issues/25873).
```C#
string directory = FindData
ReadOnlySpan
string fullPath = directory + fileNameSpan.AsString();
Assert.Equal("Hello", "Hello".AsReadOnlySpan().AsString());
```
cc @KrzysztofCwalina, @jkotas, @stephentoub, @JeremyKuhne, @dotnet/corefxlab-contrib
Ahson and I talked about this earlier today. I think this is a very useful proposal. For API reviewers: is there concern about having a method very similarly named to ToString, which is known to lead to unhappiness when called on a ref struct?
One option I'd like to see is the ability to terminate at null. It is a pretty common pattern for strings to come back null terminated in fixed size buffer. https://github.com/dotnet/corefx/blob/40cbd5636851093309b841db4d53af369b7ef1c6/src/Common/src/System/Memory/FixedBufferExtensions.cs#L14
is there concern about having a method very similarly named to ToString, which is known to lead to unhappiness when called on a ref struct
Span.ToString() will work fine once https://github.com/dotnet/corefx/pull/26467 lands.
Ahson and I talked about this earlier today. I think this is a very useful proposal.
This method is redundant with String constructor that takes Span. I am wondering why we have both AsXXX
and XXX
constructors for a pretty much everything. Should be have just the AsXXX
methods and drop the constructors? What's the value of redundant APIs?
let's say combine the Directory property on FindData with a filename string to generate a full path
This is not a very good example. For this, we should have Concat overload that takes Spans.
public static unsafe string AsString(this System.ReadOnlySpan<char> source)
{
string result = new string(' ', source.Length);
fixed (char* dest = result, src = &MemoryMarshal.GetReference(source))
{
for (int i = 0; i < source.Length; i++)
{
*(dest + i) = *(src + i);
}
}
return result;
}
This is bad hacky way to implement this. The proper way to implement this is using String constructor that takes char*
and length
.
@JeremyKuhne Fear not! String instances are always null terminated regardless of how they're constructed.
Fear not! String instances are always null terminated regardless of how they're constructed.
@GrabYourPitchforks True, but what I'm looking for is constructing them without embedded nulls. I might have a char buffer with a possibly null terminated string in it. I want to construct it to either (1) the first null char '\0'
or (2) .Length
if no null is present. As strange as it might seem it is pretty common in native interop scenarios.
This method is redundant with String constructor that takes Span. I am wondering why we have both AsXXX and XXX constructors for a pretty much everything. Should be have just the AsXXX methods and drop the constructors? What's the value of redundant APIs?
In theory we could optimize this to unwrap strings wrapped in spans as the functionality gets implemented in the various runtimes. Thinking of MSBuild using Span on desktop in particular. Is there another way to accomplish this or am I missing something that makes that impossible?
@JeremyKuhne Ah, so something like this, but as a single helper extension method?
ReadOnlySpan<char> span = ...;
int idx = span.IndexOf('\0');
string s = ((idx < 0) ? span : span.Slice(0, idx)).AsString();
@GrabYourPitchforks Yes. I linked where I've already created an internal wrapper that does this in my original comment.
Note: This will clearly result in an allocation.
Not sure AsString
sounds expensive enough to represent an api that results in an allocation from a Span (stack -> heap); it sounds more like a lightweight conversion of representation. i.e. its not clear
ReadOnlySpan<char> span = "Test value";
int idx = span.IndexOf(' ');
string s = ((idx < 0) ? span : span.Slice(0, idx)).AsString(); // Allocation might be unexpected
Span.ToString() will work fine once dotnet/corefx#26467 lands.
.ToString
is an expected allocation. Would it cause confusion that on Span generally it would produce a diagnostic output and on char
it produces a interop output? (e.g. additional method as well for ToString for {ReadOnly}Span<char>
)
Looking at other potential allocating methods; reclaiming .ToString
might be best? Would that work for portable span?
ToArray(this Span<T> source)
ToArray(this ReadOnlySpan<T> source)
ToList(this Span<T> source)
ToList(this ReadOnlySpan<T> source)
Not sure
AsString
sounds expensive enough [...] it sounds more like a lightweight conversion of representation
I've made my own extension method for ReadOnlySpan<char>
and named it Materialize()
, I think something like that might work.
I like the API, but I don't think it should be called AsString. I think "As" implies no allocation. If we could simply override ToString (for span of chars), it would be great.
If we could simply override ToString (for span of chars), it would be great.
We could change the {ReadOnly}Span.ToString implementation to just do the equivalent of ToString on each element (rather than what Ian's checking in, which I believe just outputs Span[Length]), with a specialized implementation for char to do it with the string ctor and thus more efficiently, but that won't address the core of this issue: supporting this out-of-band, where an override can't be used.
We could also just have a MemoryExtensions.ToString non-extension static method.
I think MemoryExtensions.ToString worse than calling the string constructor and passing in the span.
that won't address the core of this issue: supporting this out-of-band, where an override can't be used.
Is this really about OOB or rather fast/slow span?
(Nevermind what I wrote about the override; I wasn't thinking clearly. Any Span implementation we ship could override it.)
As for ME.ToString being worse than string ctor, the point of this issue is being able to use that functionality without needing .net core 2.1.
Anyway, maybe just changing the ToString override makes the most sense; for char it could just do the equivalent of new string(char*,int)
or new string(ReadOnlySpan<char>)
, and for other Ts we could either just do the Span[Length] thing or some concatenation of ToString of each element.
Anyway, maybe just changing the ToString override makes the most sense; for char it could just do the equivalent of
new string(char*,int)
ornew string(ReadOnlySpan<char>)
, and for other Ts we could either just do the Span[Length] thing or some concatenation of ToString of each element.
Are we OK with different behaviours of ToString() on SpanC#
Span<char> span = new char[10];
ReadOnlySpan<char> roSpan = "abcde".AsReadOnlySpan();
span.ToString(); //"System.Span<char>[10];"
roSpan.ToString(); //"abcde";
Span<char>
and ReadOnlySpan<char>
should have the same behavior here.
We should probably do both.
Most helpful comment
Not sure
AsString
sounds expensive enough to represent an api that results in an allocation from a Span (stack -> heap); it sounds more like a lightweight conversion of representation. i.e. its not clear.ToString
is an expected allocation. Would it cause confusion that on Span generally it would produce a diagnostic output and onchar
it produces a interop output? (e.g. additional method as well for ToString for{ReadOnly}Span<char>
)Looking at other potential allocating methods; reclaiming
.ToString
might be best? Would that work for portable span?