For example next code:
Console.WriteLine("Hello World!".IndexOfAny(new char[]{ }));
Console.WriteLine("Hello World!".AsSpan().IndexOfAny(new char[]{ }));
return -1 for string and 0 for string.AsSpan(). Is it bug?
It may be a bug - @ahsonkhan?
It may be a bug
The Span.IndexOfAny returns 0, as expected. The string.IndexOfAny should also return 0, at least per the docs.
If anyOf is an empty array, the method finds a match at the beginning of the string (that is, at index zero).
I believe this is the offending code path:
https://github.com/dotnet/coreclr/blob/4b9a9d71b06fbce71c7d000e8b2ed3623976b30c/src/System.Private.CoreLib/shared/System/String.Searching.cs#L121
Previously, when IndexOfAny was implemented in native code:
https://github.com/dotnet/coreclr/blob/ef1e2ab328087c61a6878c1e84f4fc5d710aebce/src/classlibnative/bcltype/stringnative.cpp#L442
What is the correct behavior here? Do we need to update the docs?
cc @jkotas, @benaadams
"Hello World!".IndexOf("") returns 0 and says it should return 0 in the docs (rather than -1)
Which would also suggest the Span version of .IndexOfAny is the correct one.
However, it looks like "Hello World!".IndexOfAny(new char[] {}) returns -1 on Full Framework also; is this a compat concern to fix?
is this a compat concern to fix?
It would be, yes.
"Hello World!".IndexOf("")returns 0 and says it should return 0 in the docs (rather than -1)
FYI, the IndexOf behavior is different for string vs array:
C#
static void Main(string[] args)
{
Console.WriteLine(String.Empty.IndexOf(String.Empty)); // 0
Console.WriteLine(Array.IndexOf(Array.Empty<char>(), Array.Empty<char>())); // -1
}
cc @danmosemsft, @GrabYourPitchforks
I talked with Ahson about this earlier. My philosophy on this is:
String.Empty.IndexOfAny(Array.Empty<char>()) should return -1. If it returns 0, that semantically means "there's a character of interest to you at index 0." But dereferencing theString[0] will result in an exception, which is bad. So -1 is the most sane return value.
String.Empty.IndexOf(String.Empty) should return 0. The reason for this is that given a targetString you're searching _within_ and a searchString you're searching _for_, you should be able to do something like assert(targetString.Substring(targetString.IndexOf(searchString), searchString.Length) == searchString). Since it's valid to call String.Empty.Substring(0, 0), a return value of 0 is appropriate.
(Ahson, I realize this last part is different than what you and I discussed earlier. But running through a few mental exercises makes it seem ok.)
Edit: the corollary to this is that Span<T>.Empty.IndexOfAny(...) should also return -1, not 0, since you cannot dereference theSpan[0] without incurring an exception.
In your first example, we are using the index returned from IndexOfAny to deference the string at that index (which you state is bad). However, in the second example, you use the index returned from IndexOf within substring (and deem it OK). If, however, we used that index to deference the original string from the second example, we end up with the same issue as the first example. Shouldn't we be consistent across the board to avoid the risk of potential confusion? Also, should we adhere to the documentation or the implementation here when it comes to the next steps for the fix?
@ahsonkhan Console.WriteLine(Array.IndexOf(Array.Empty<char>(), Array.Empty<char>())); // -1 here you call public static int IndexOf (Array array, object value); method. So it should return -1.
@benaadams IndexOfAny and IndexOf methods have different logic. You can not compare one with other.
@AlexRadch the implementations are different, but are the expectations different?
Should "Hello".IndexOfAny(Array.Empty<char>()) return a different value to "Hello".IndexOf(string.Empty)?
They are both looking for the start of "no chars" in the string
Should
"Hello".IndexOfAny(Array.Empty<char>())return a different value to"Hello".IndexOf(string.Empty)?They are both looking for the start of "no
chars" in the string
"Hello".IndexOf(string.Empty) looking for empty string inside "Hello" string. It looking string.
"Hello".IndexOfAny(Array.Empty<char>()) looking for the first character that is contained within Array.Empty<char>(). It looking char.
Array.Empty<char>() does not contains any char so IndexOfAny should return -1.
-1 also seems like the right answer (to me) for span.IndexOfAny(default).
Marking as 3.0 because we need to make a conscious decision to change the return value on the major version bump, or not.
@ahsonkhan do you expect to be able to do this for P7? And is the risk acceptable? If not please move milestone now to the next major release.
Most helpful comment
I talked with Ahson about this earlier. My philosophy on this is:
String.Empty.IndexOfAny(Array.Empty<char>())should return -1. If it returns 0, that semantically means "there's a character of interest to you at index 0." But dereferencingtheString[0]will result in an exception, which is bad. So -1 is the most sane return value.String.Empty.IndexOf(String.Empty)should return 0. The reason for this is that given atargetStringyou're searching _within_ and asearchStringyou're searching _for_, you should be able to do something likeassert(targetString.Substring(targetString.IndexOf(searchString), searchString.Length) == searchString). Since it's valid to callString.Empty.Substring(0, 0), a return value of 0 is appropriate.(Ahson, I realize this last part is different than what you and I discussed earlier. But running through a few mental exercises makes it seem ok.)
Edit: the corollary to this is that
Span<T>.Empty.IndexOfAny(...)should also return -1, not 0, since you cannot dereferencetheSpan[0]without incurring an exception.