Runtime: Add/Expose String.EndsWith(char)

Created on 3 Dec 2015  路  32Comments  路  Source: dotnet/runtime

Moved bug from https://github.com/dotnet/coreclr/issues/1463 to proper repo

davidfowl commented on Aug 31

So we don't have to rewrite it over and over :smile:. Seems like its already there just made internal for > some reason:

https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/String.cs#L2210

People end up writing code like this:

https://github.com/aspnet/dnx/blob/dev/ext/compiler/preprocess/Internalization.cs#L100

and this

https://github.com/aspnet/dnx/blob/dev/src/Microsoft.Dnx.Runtime/NuGet/Utility/PathUtility.cs#L45

bbowyersmyth commented on Sep 2

I had the same thought a while back and discovered it would be a breaking change for an edge case in VB.

Against the current framework the following is valid VB:
Dim find As Char = "a"c
Dim result = "abc".StartsWith(find)

The compiler wraps find with a call to a string conversion and all is good. If EndsWith(char) was > made public and the code recompiled it would switch from using the CurrentCulture based EndsWith(string) to the Ordinal based EndsWith(char) .

Whether any of the offending characters or the above scenario is deemed important I would love to hear. Especially a StartsWith(char) with direct access to String.m_firstChar

AlexGhiondea commented on Sep 8

It seems like the VB compiler is trying to help by picking an overload that will work.

@VSadov -- do you know what the VB compile would do if we were to introduce the bool EndsWith(char value) overload on string?

@davidfowl can you create an issue in the CoreFx repo (following the API review process)?

api-approved area-System.Runtime up-for-grabs

Most helpful comment

The semantics of the overloads follow the ones that take string, for example, EndsWith('c') should be identical to calling EndsWith("c"). That means the simple one must be culture-aware instead of the current (internal) implementation that is ordinal.

Note that IndexOf("c") and LastIndexOf("c") behave differently than IndexOf('c') and LastIndexOf('c'). All of the methods on string that take char do an ordinal comparison by default. IndexOf(char) and LastIndexOf(char) do an ordinal comparison, whereas IndexOf(string) and LastIndexOf(string) do a culture-sensitive comparison.

How do we reconcile this for StartsWith(char) and EndsWith(char)? Making them culture sensitive by default makes them match the behavior of the string overloads, but makes them inconsistent with the rest of string's methods that operate on single chars.

Also, I could not find any existing methods on string that operates on a char and accepts a StringComparison or CultureInfo. Do we really need it for StartsWith(char) and EndsWith(char)?

The following do an ordinal comparison:

  • IndexOf(char)
  • IndexOf(char, int)
  • IndexOf(char, int, int)
  • IndexOfAny(char[])
  • IndexOfAny(char[], int)
  • IndexOfAny(char[], int, int)
  • LastIndexOf(char)
  • LastIndexOf(char, int)
  • LastIndexOf(char, int, int)
  • LastIndexOfAny(char[])
  • LastIndexOfAny(char[], int)
  • LastIndexOfAny(char[], int, int)

The following do a culture-senstive comparison:

  • IndexOf(string)
  • IndexOf(string, int)
  • IndexOf(string, int, int)
  • LastIndexOf(string)
  • LastIndexOf(string, int)
  • LastIndexOf(string, int, int)

The following take a StringComparison option (Note: there are no StringComparison options for the char overloads, only for the string overloads):

  • IndexOf(string, StringComparison)
  • IndexOf(string, int, StringComparison)
  • IndexOf(string, int, int, StringComparison)
  • LastIndexOf(string, StringComparison)
  • LastIndexOf(string, int, StringComparison)
  • LastIndexOf(string, int, int, StringComparison)

All 32 comments

@VSadov -- any thoughts about how would this impact VB?

in VB char widens to string. The exact match will be preferred, but char can cast to string implicitly if needed.

I am surprised that StartsWith("A") and StartsWith('A') could produce different results though. Even in C# it would feel strange that default behavior of these two overloads WRT single char string is different.

This looks good provided that:

  • We add overloads for both, StartsWith and EndsWith
  • We add all three versions (char, char + bool + culture, char + comparison)
  • The semantics of the overloads follow the ones that take string, for example, EndsWith('c') should be identical to calling EndsWith("c"). That means the simple one must be culture-aware instead of the current (internal) implementation that is ordinal.
  • We suggest renaming the current one to EndsWithOrdinal and leaving it internal

I plan to work on it

@karelz could you assign me? Thanks

@brfalcon will fix this issue. Thanks

Invite sent ...

Invite accepted! Thanks @karelz

Assigned to you ...

Thanks @karelz

Hello @terrajobst! I've added the missing parts as suggested however I am being unsuccessful while trying to consume those new public API's. Which documentation could guide me through the appropriate process?

@brfalcon where did you add those APIs? If they have been added to CoreCLR then we have a 3 step process:

  1. Make the changes in CoreCLR
  2. Consume the CoreCLR from CoreFX
  3. Expose the APIs in CoreFx

@AlexGhiondea, I've added them to CodeCLR. Would you have any documentation that provides more details on how to perform those steps?

I don't think we have great docs on that yet. It's on my todo list.

@brfalcon we are working on that.

If you want to try locally, you can, more or less, replace system.private.corelib.dll from the CoreCLR build into the Tools folder in your CoreFx build. But the official way is still the one I outlined above.

I will try to run locally following your suggestion. Thanks @karelz and @AlexGhiondea !

Hi Alex, I can't find System.Private.Corelib.dll in Tools folder in my CoreFx.

We are following this doc: https://github.com/dotnet/coreclr/blob/master/Documentation/project-docs/adding_new_public_apis.md

We've already did the changes in CoreClr, adding the new methods in the string class and model.xml to expose the new methods. After building it, I opened the System.Private.CoreLib.dll file in ILSPY and could find the new methods. So I guess the CoreClr part is ok.

I believe we need to expose those new methods in System.Runtime, so I added the "stub" methods in src/System.Runtime/ref/System.Runtime.cs. When I try to build I get this error:

MembersMustExist : Member 'System.String.TrimEnd
()' does not exist in the implementation but it does exist in the contract.

I tried to change src/System.Runtime/src/project.json to point to my version of Microsoft.TargetingPack.Private.CoreCLR: 1.2.0-beta-24902-4
And updated CoreClrExpectedPrerelease in dependencies.props to beta-24902-4

So I got another error:

Dependencies invalid: In 'C:\Users\user\S
ource\corefx\src\System.Runtime\src\project.json', 'Microsoft.TargetingPack.Private.CoreCLR 1.2.0-beta-24902-4' must be
'1.2.0-beta-24903-02' (CoreClr)

To bypass the error I removed the element VerifyDependencies in VersionTools.targets.

Am I on the right track?

I got it. To be able to consume my CoreClr I had to add a reference to my CoreClr packages path at element DnuRestoreSource in dir.props.

And set the Env. Var.:
set SkipVerifyPackageVersions=true

@fujiy pointing to your package location that you build in CoreCLR is another option!

Glad to see you got it working!

Why did you have to set that envvar?

@AlexGhiondea, instead of removing the element VerifyDependencies from the VersionTools.targets file, we set that envvar in order to bypass the dependencies verification.

The semantics of the overloads follow the ones that take string, for example, EndsWith('c') should be identical to calling EndsWith("c"). That means the simple one must be culture-aware instead of the current (internal) implementation that is ordinal.

Note that IndexOf("c") and LastIndexOf("c") behave differently than IndexOf('c') and LastIndexOf('c'). All of the methods on string that take char do an ordinal comparison by default. IndexOf(char) and LastIndexOf(char) do an ordinal comparison, whereas IndexOf(string) and LastIndexOf(string) do a culture-sensitive comparison.

How do we reconcile this for StartsWith(char) and EndsWith(char)? Making them culture sensitive by default makes them match the behavior of the string overloads, but makes them inconsistent with the rest of string's methods that operate on single chars.

Also, I could not find any existing methods on string that operates on a char and accepts a StringComparison or CultureInfo. Do we really need it for StartsWith(char) and EndsWith(char)?

The following do an ordinal comparison:

  • IndexOf(char)
  • IndexOf(char, int)
  • IndexOf(char, int, int)
  • IndexOfAny(char[])
  • IndexOfAny(char[], int)
  • IndexOfAny(char[], int, int)
  • LastIndexOf(char)
  • LastIndexOf(char, int)
  • LastIndexOf(char, int, int)
  • LastIndexOfAny(char[])
  • LastIndexOfAny(char[], int)
  • LastIndexOfAny(char[], int, int)

The following do a culture-senstive comparison:

  • IndexOf(string)
  • IndexOf(string, int)
  • IndexOf(string, int, int)
  • LastIndexOf(string)
  • LastIndexOf(string, int)
  • LastIndexOf(string, int, int)

The following take a StringComparison option (Note: there are no StringComparison options for the char overloads, only for the string overloads):

  • IndexOf(string, StringComparison)
  • IndexOf(string, int, StringComparison)
  • IndexOf(string, int, int, StringComparison)
  • LastIndexOf(string, StringComparison)
  • LastIndexOf(string, int, StringComparison)
  • LastIndexOf(string, int, int, StringComparison)

@justinvp that's a very good point. @terrajobst do we want to bring it back for API review?

If we come to the conclusion that the culture aware implementation of EndsWith(char) and StartsWith(char) isn't needed, I could reuse the implementation from the internal method EndsWithOrdinal to build both EndsWith(char) and StartsWith(char). Does it makes sense?

Hi @terrajobst, did you have a chance to evaluate this item? Thanks!

Let's discuss it in next API review.

That's unfortunate but our goal is to make things consistent with existing APIs, especially overloads. It seems all char based overloads are using ordinal semantics while the string ones are culture-sensitive. In that case, I think we should follow that.

So the new plan should be:

  • We add overloads for both, StartsWith and EndsWith
  • We add only the char version (no other overloads operating on chars take comparisons, cultures, or booleans indicating casing)
  • The semantics of the overloads follow the other ones that take char, for example, EndsWith('c') should be identical to calling s.LastIndexOf('c') == s.Length - 1.

If there is a scenario for using culture-sensitive char-operations we can add additional overloads. However, we should do this across all existing ones and thus it should be a separate issue.

[@VSadov] I am surprised that StartsWith("A") and StartsWith('A') have different results

Yes. Unfortunately we cannot change that behavior anymore.

BTW: This is another example, why we need API proposal in the "formal form" (ideally in the top-most post). It makes it much, much easier to track and understand what we approved and what is the diff in next API proposal revision, etc. I will be insisting it more in future :)

Thanks, we'll take care of it

The API needs to be exposed & tests added.

Ooops, I somehow missed that :(

Was this page helpful?
0 / 5 - 0 ratings