Very often you have to remove whitespace from a string, especially when working with user input, e.g. telephone numbers, postal codes, credit card or bank account numbers.
For example, you have “ab1234 5678 9999” and you want just “ab123456789999”.
There are trim methods, but they only consider the start and the end (TrimStart(), TrimEnd(), or Trim() for both). It would be very handy if there were a TrimAll() method that removes all whitespace (at the start, at the end, and in between). This would be easier to read and write than the alternative with Replace(), and it would complete the set of trim methods.
namespace System
{
public sealed class String : IEnumerable<char>, IEnumerable, ICloneable, IComparable, IComparable<String?>, IConvertible, IEquatable<String>
{
+ public String TrimAll() {}
}
}
``` C#
var testString = "+49(0) 1234 5678 999 10 ";
// Expected: "+49(0)1234567899910"
// Current way
var newString = testString.Replace(" ", System.String.Empty);
// Suggestion
var newString2 = testString.TrimAll();
```
No alternative.
I don't know any risks regarding my proposal.
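For illustration, the proposed behaviour can be approximated today with an extension method. This is only a sketch: `TrimAll` and the class `StringWhitespaceExtensions` are hypothetical names, not part of `System.String`. It assumes "whitespace" means whatever `char.IsWhiteSpace` reports, which is the same definition the parameterless `Trim()` uses, so it also removes tabs and line breaks that `Replace(" ", ...)` would leave in place.

``` C#
using System;
using System.Text;

// Hypothetical helper, not an existing .NET API.
public static class StringWhitespaceExtensions
{
    // Returns a copy of the string with every whitespace character removed,
    // at the start, at the end and in between.
    public static string TrimAll(this string value)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));

        var builder = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            // Keep only characters that are not whitespace.
            if (!char.IsWhiteSpace(c))
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}
```

With this helper in scope, `"+49(0) 1234 5678 999 10 ".TrimAll()` yields `"+49(0)1234567899910"`.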
This seems similar to the proposal in #41287, just with a narrower focus.
That being said, »trimming« usually means removing stuff from either _end_ and I haven't seen that many octopus strings that have more than two ends from which to trim stuff, so the name might be misleading as it is right now.
I don't think that _trim_ is limited to the start or the end; it just means shortening something, so where it happens doesn't matter.
In the scenarios provided (postal codes, telephone numbers), wouldn't you also want to trim out all punctuation? So things like '+', '(', '-', ')' also get trimmed along with whitespace?
My proposal that ygra linked does exactly what you want, but is not limited to single characters. Why not just upvote it to show your interest and post your scenario as a comment, so the .NET team will have an easier time reviewing?
> In the scenarios provided (postal codes, telephone numbers), wouldn't you also want to trim out all punctuation? So things like '+', '(', '-', ')' also get trimmed along with whitespace?

For my current use cases I actually just want a method to trim whitespace.
> My proposal that ygra linked does exactly what you want, but is not limited to single characters. Why not just upvote it to show your interest and post your scenario as a comment, so the .NET team will have an easier time reviewing?

When I created my proposal I hadn't seen yours. They are similar, but I see mine as a complement to the existing trim methods (start, end, both): it would add the in-between.
> For my current use cases I actually just want a method to trim whitespace.

Sure, that makes sense. But would a "remove all whitespace everywhere" routine be generally applicable to a large developer audience? As I mentioned earlier, when processing postal codes or phone numbers or dates or whatever, the desire is generally to remove whitespace _and_ punctuation. I'm trying to validate that this API proposal would have benefit for the wider ecosystem and that it's not confined to just a small handful of use cases.
How about `string.RemoveAll(string original, ReadOnlySpan<char> removedChars)`?
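A rough sketch of how that suggested shape could behave, written as a standalone static helper because no such method exists on `string`. The class name `StringRemoval` is made up for this example, and treating `removedChars` as a set of individual characters to drop is an assumption about the intent.

``` C#
using System;
using System.Text;

// Hypothetical helper, not an existing .NET API.
public static class StringRemoval
{
    // Returns a copy of 'original' without any character that occurs in 'removedChars'.
    public static string RemoveAll(string original, ReadOnlySpan<char> removedChars)
    {
        if (original is null) throw new ArgumentNullException(nameof(original));

        var builder = new StringBuilder(original.Length);
        foreach (char c in original)
        {
            // IndexOf returns -1 when the character is not in the removal set.
            if (removedChars.IndexOf(c) < 0)
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}
```

Because a `string` converts implicitly to `ReadOnlySpan<char>`, a call such as `StringRemoval.RemoveAll("1111 - 2222 - 3333", " ")` removes only the spaces, while passing `" -"` would remove the dashes as well.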
> As I mentioned earlier, when processing postal codes or phone numbers or dates or whatever, the desire is generally to remove whitespace and punctuation

If you remove punctuation from the given phone number, you will get a completely different phone number. The goal is therefore not a digits-only string but a compact string (without whitespace) that you can, for example, save in a database.
Another example could be a product code with a format like xxxx-xxxx-xxxx. Here you don't want to remove punctuation, just whitespace. A user could write: 1111 - 2222 - 3333 and with TrimAll() you would get the desired string 1111-2222-3333.
I'm in the same camp as @GrabYourPitchforks; I've not seen enough demand for something like this to make it worth adding, especially with the variations someone may need. Seems like exactly the kind of thing Regex is good at, e.g.
`text = Regex.Replace(text, @"\s", "");`
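To make the difference from `Replace(" ", ...)` concrete, here is a small sketch of the regex approach wrapped in a helper. The names `RegexWhitespace` and `RemoveWhitespace` are invented for the example; the pattern uses `\s+` rather than `\s` only so that each run of whitespace is replaced in one match, and the result is the same.

``` C#
using System.Text.RegularExpressions;

// Hypothetical wrapper around the Regex call shown in the comment above.
public static class RegexWhitespace
{
    // Removes every run of whitespace (spaces, tabs, line breaks) from the text.
    public static string RemoveWhitespace(string text) =>
        Regex.Replace(text, @"\s+", string.Empty);
}
```

Unlike replacing the literal `" "`, this also strips tabs and line breaks, so `RegexWhitespace.RemoveWhitespace("1111 -\t2222 - 3333\n")` gives `"1111-2222-3333"`.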
> If you remove punctuation from the given phone number, you will get a completely different phone number.

Other than digits, you can only dial (send tones for) # and *. "Punctuation" here is going to be non-dialable characters commonly used in formatting - parentheses, dashes, slashes, etc. You might end up with different formatting, but it's still the same number (and in fact, you normally want to canonicalize the formatting anyways, which this would help with).
> Another example could be a product code with a format like xxxx-xxxx-xxxx. Here you don't want to remove punctuation, just whitespace.

This is iffy. Many product codes don't use dashes to actually differentiate different codes, the dashes are there to break them up for the benefit of the reader; they're there to help you keep track of place. Good UI design for entering them will auto-insert dashes in the correct places to prevent users needing to (and to not have to worry about the "correct" dash, since you might get different ones on different keyboards).
> Seems like exactly the kind of thing Regex is good at
> `text = Regex.Replace(text, @"\s", "");`

Well, you are right, but with this argument you could also remove the current methods TrimStart(), TrimEnd() and Trim(), because you could simply use Regex for them. The same goes for LINQ: you can always use raw SQL.
You created such methods to reduce the boilerplate, make code more readable, follow conventions etc. and this is the intention for my proposal.
> Other than digits, you can only dial (send tones for) # and *

No, I can dial "+" too, which is needed for international calls.

> "Punctuation" here is going to be non-dialable characters commonly used in formatting - parentheses, dashes, slashes, etc. You might end up with different formatting, but it's still the same number

If you remove the parentheses from the given phone number, together with the plus sign, you will get "4901234567899910", which is not the same number as "+49(0)1234567899910".
Also I did not say that it should be a dialable number (for a computer).
You're right that you should canonicalize phone numbers, and there might be other things to consider. This was just meant as a basic example of a use case where you want whitespace removed not only from the start and the end but also from in between.
> Good UI design for entering them will auto-insert dashes in the correct places to prevent users needing to

I totally agree that a good UI should do that, but that doesn't mean every product manufacturer does it. I know of cases in the past where I had to type the dashes to actually be able to activate a digital product.
Also, this was just an additional example. I did not start a study for this proposal to find the best use cases out there. I just think, and I know friends of mine who agree with me, that a simple method to remove all whitespace from a string can be useful.
I know that you can use Replace / Regex, as I wrote in my example, but there are trim methods whose only function is to remove whitespace from a string. I just want to extend that behaviour to also consider the middle part of a string, not only the start and the end.
When we consider an API we ask ourselves a bunch of scenario-based questions. Loosely, these questions would be:
Of course, there are always exceptions to this. But it should give a good overview of the initial process. :)
The questions I've been posing on this thread are an effort to weigh this feature request against these criteria. That's why we're digging in deep to what the actual scenarios are and whether the dev audience at large would be well-served by this proposal.
This is trivial to do currently, and a rare use case, so I don't see a need for it.
Something I've seen much more frequently (and which exists in other libraries in other languages) is Squish, which both trims whitespace from the ends of a string and collapses contiguous whitespace inside the string.
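A minimal sketch of what such a Squish could look like in C#, assuming that "collapsing" means replacing each internal run of whitespace with a single space. `Squish` and `StringSquishExtensions` are hypothetical names; nothing like this exists on `System.String`.

``` C#
using System;
using System.Text;

// Hypothetical helper, not an existing .NET API.
public static class StringSquishExtensions
{
    // Trims whitespace at both ends and collapses each internal run of
    // whitespace into a single space character.
    public static string Squish(this string value)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));

        var builder = new StringBuilder(value.Length);
        bool pendingSpace = false;
        foreach (char c in value)
        {
            if (char.IsWhiteSpace(c))
            {
                // Only remember a gap once real content has been written,
                // which drops leading whitespace.
                pendingSpace = builder.Length > 0;
            }
            else
            {
                // Emit the remembered gap as one space before the next
                // non-whitespace character; trailing gaps are never emitted.
                if (pendingSpace)
                {
                    builder.Append(' ');
                    pendingSpace = false;
                }
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}
```

For instance, `"  hello \t\n world  ".Squish()` returns `"hello world"`.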
Another variant of this would be:
string RemoveAll(Func<char, bool> predicate)
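The predicate variant can be sketched as an extension method as well. Again, `RemoveAll` and the class `StringPredicateExtensions` are hypothetical, and the sketch assumes the predicate returns `true` for characters that should be dropped.

``` C#
using System;
using System.Text;

// Hypothetical helper, not an existing .NET API.
public static class StringPredicateExtensions
{
    // Returns a copy of the string without the characters for which
    // 'predicate' returns true.
    public static string RemoveAll(this string value, Func<char, bool> predicate)
    {
        if (value is null) throw new ArgumentNullException(nameof(value));
        if (predicate is null) throw new ArgumentNullException(nameof(predicate));

        var builder = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (!predicate(c))
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}
```

This shape covers the whitespace-only case via `text.RemoveAll(char.IsWhiteSpace)` and leaves it to the caller to widen the rule, for example `text.RemoveAll(c => char.IsWhiteSpace(c) || c == '-')`.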