Runtime: Add a MutableString class that contains the missing StringBuilder functionality

Created on 4 May 2018 · 3 Comments · Source: dotnet/runtime

This proposal follows from several issues and suggestions about the StringBuilder class discussed in these topics:

  • Add search methods to the StringBuilder dotnet/runtime#26052
  • StringBuilder should have all functionality of String dotnet/runtime#18417
  • Allow Regex to match and Replace patterns in StringBuilder dotnet/runtime#26070
  • Iterating over a string builder by index becomes ~exponentially slow for large builders dotnet/runtime#24395

I suggest leaving StringBuilder as it is, a lightweight type for building large strings, and creating a new class named MutableString with these aspects:
1- Have all the functionality of StringBuilder.
2- Have the missing methods of the String class, like IndexOf.
3- Have methods that use Regex to find and replace patterns.

To achieve this, the MutableString chunks must be doubly linked via the m_chunkPrevious and m_chunkNext fields, and these methods are used to traverse the chunks:

private MutableString MoveToChunk(int chunkIndex)
private MutableString MoveToFirstChunk()
private MutableString MoveToLastChunk()
private MutableString MoveToNextChunk()
private MutableString MoveToPrevChunk()

and a field m_currentChunk that holds the chunk we last moved to.
These methods solve the problems of iterating the linked list, so they can be used to implement the search and replace methods instead of going through the indexer.
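The traversal idea above can be sketched roughly as follows. This is only an illustration, not the actual StringBuilder internals: the class name ChunkList, the public bool/void signatures, and the small configurable chunk size are assumptions made so the example stays self-contained, while the Previous/Next fields stand in for m_chunkPrevious and m_chunkNext.

```csharp
using System;

// Hypothetical stand-in for the proposed MutableString chunk list.
public sealed class ChunkList
{
    private sealed class Chunk
    {
        public readonly char[] Chars;
        public int Length;
        public Chunk Previous; // plays the role of m_chunkPrevious
        public Chunk Next;     // plays the role of m_chunkNext
        public Chunk(int capacity) { Chars = new char[capacity]; }
    }

    private readonly int _chunkSize;
    private Chunk _first, _last, _current; // _current plays the role of m_currentChunk
    private int _currentIndex;

    public ChunkList(int chunkSize = 8000)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        _chunkSize = chunkSize;
        _first = _last = _current = new Chunk(chunkSize);
    }

    public int ChunkCount { get; private set; } = 1;
    public int CurrentChunkIndex => _currentIndex;
    public string CurrentChunkText => new string(_current.Chars, 0, _current.Length);

    public void Append(string value)
    {
        foreach (char c in value)
        {
            // Fill the last chunk completely before linking a new one.
            if (_last.Length == _chunkSize)
            {
                var chunk = new Chunk(_chunkSize) { Previous = _last };
                _last.Next = chunk;
                _last = chunk;
                ChunkCount++;
            }
            _last.Chars[_last.Length++] = c;
        }
    }

    public void MoveToFirstChunk() { _current = _first; _currentIndex = 0; }
    public void MoveToLastChunk() { _current = _last; _currentIndex = ChunkCount - 1; }

    public bool MoveToNextChunk()
    {
        if (_current.Next == null) return false;
        _current = _current.Next;
        _currentIndex++;
        return true;
    }

    public bool MoveToPrevChunk()
    {
        if (_current.Previous == null) return false;
        _current = _current.Previous;
        _currentIndex--;
        return true;
    }

    public void MoveToChunk(int chunkIndex)
    {
        if (chunkIndex < 0 || chunkIndex >= ChunkCount)
            throw new ArgumentOutOfRangeException(nameof(chunkIndex));
        // Walk from the current chunk in whichever direction is needed.
        while (_currentIndex < chunkIndex) MoveToNextChunk();
        while (_currentIndex > chunkIndex) MoveToPrevChunk();
    }
}
```

Because the current chunk is remembered between calls, a search that scans forward only ever takes one step per chunk instead of re-walking the list from one end.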
They can also be used to optimize the indexer, like this:
1- All chunks (except the last one) are full with 8000 chars, so the index of the chunk that contains the wanted index can be computed directly (integer division already truncates):
int chunkId = index / 8000;
It is important to ensure that each chunk is full to the last char before creating a new chunk.
2- Use MoveToChunk(chunkId) to get to the chunk.
3- Read the char at
var i = index - chunkId * 8000;
Of course there would be a const for the 8000 length. This is a quick outline.
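A minimal sketch of the three indexer steps, under the stated assumption that every chunk except the last is completely full. The class name FixedChunkText and its members are hypothetical and only serve to make the arithmetic concrete; this is not how StringBuilder is actually implemented.

```csharp
using System;

// Hypothetical sketch of an indexer over fixed-size, always-full chunks.
public sealed class FixedChunkText
{
    public const int ChunkSize = 8000; // the constant mentioned in the outline

    private sealed class Chunk
    {
        public readonly char[] Chars = new char[ChunkSize];
        public Chunk Previous, Next;
    }

    private readonly Chunk _first = new Chunk();
    private Chunk _last, _current;
    private int _currentId; // index of the chunk we last moved to

    public FixedChunkText() { _last = _current = _first; }

    public int Length { get; private set; }

    public void Append(char c)
    {
        // A new chunk is linked only when the last one is full to the last char.
        if (Length > 0 && Length % ChunkSize == 0)
        {
            var chunk = new Chunk { Previous = _last };
            _last.Next = chunk;
            _last = chunk;
        }
        _last.Chars[Length % ChunkSize] = c;
        Length++;
    }

    public char this[int index]
    {
        get
        {
            if (index < 0 || index >= Length)
                throw new ArgumentOutOfRangeException(nameof(index));

            int chunkId = index / ChunkSize;          // step 1: which chunk
            int offset = index - chunkId * ChunkSize; // step 3: position inside it

            // Step 2: move from the remembered chunk to the target chunk.
            while (_currentId < chunkId) { _current = _current.Next; _currentId++; }
            while (_currentId > chunkId) { _current = _current.Previous; _currentId--; }

            return _current.Chars[offset];
        }
    }
}
```

Sequential reads stay inside the remembered chunk most of the time, but a random access far from the current chunk still has to walk the links, which is why the text recommends foreach for iteration.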

Again: this indexer is not suitable for iteration, and foreach must be used instead.

MutableString should also implement IEnumerable<char> so that foreach can loop through all the chars, avoiding the overhead of using the indexer.
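As a rough illustration of the enumeration idea, the sketch below walks the chunks once from first to last and yields each char. EnumerableChunks and its members are hypothetical names, and the chunk layout is a simplified assumption rather than the real StringBuilder structure.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch: a chunked buffer that supports foreach.
public sealed class EnumerableChunks : IEnumerable<char>
{
    private sealed class Chunk
    {
        public char[] Chars;
        public int Length;
        public Chunk Next;
    }

    private readonly int _chunkSize;
    private readonly Chunk _first;
    private Chunk _last;

    public EnumerableChunks(int chunkSize = 8000)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        _chunkSize = chunkSize;
        _first = _last = new Chunk { Chars = new char[chunkSize] };
    }

    public void Append(string value)
    {
        foreach (char c in value)
        {
            if (_last.Length == _chunkSize)
            {
                var chunk = new Chunk { Chars = new char[_chunkSize] };
                _last.Next = chunk;
                _last = chunk;
            }
            _last.Chars[_last.Length++] = c;
        }
    }

    // Visits every chunk exactly once, so a full pass is linear in the text length.
    public IEnumerator<char> GetEnumerator()
    {
        for (Chunk chunk = _first; chunk != null; chunk = chunk.Next)
        {
            for (int i = 0; i < chunk.Length; i++)
            {
                yield return chunk.Chars[i];
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

With this in place, `foreach (char c in buffer)` never needs to locate a chunk by index, which is the overhead the indexer-based loop pays on every access.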

I hope CoreFX approves this proposal, or uses it to improve StringBuilder. If not, I hope someone can help me implement MutableString as a NuGet package. I failed to compile StringBuilder on my PC because it depends on other protected members in CoreFX.
Feel free to modify and implement these ideas.
I implemented something similar for LinkedList in the past; the full code is here:
https://github.com/dotnet/corefx/issues/25804#issuecomment-386586551

All 3 comments

I do not think we should do this. We already have a plethora of string representations, e.g.

  • String
  • StringBuilder
  • ReadOnlySpan<char>
  • Memory<char>

plus efforts underway to add more:

  • ValueStringBuilder (https://github.com/dotnet/corefx/issues/28379)
  • Utf8String (https://github.com/dotnet/corefxlab/issues/2165)

I do not want to see us add yet another one.

Since we won't create a new string-like type, I'm going to close this. There are other issues tracking adding more string-like APIs to StringBuilder.

@stephentoub
@danmosemsft
Adding a new class is one part of the proposal; the other part is to use the same ideas to improve StringBuilder. @vancem is already improving the indexer, but unless StringBuilder implements an enumerator so that foreach can be used, a for loop is still a disaster. To implement the enumerator, the MoveXX methods I proposed are still needed, and they can make use of the ForeachChunk that @vancem is adding.
