From UserVoice https://windowsserver.uservoice.com/forums/301869-powershell/suggestions/18580849-bug-sort-is-incorrect-for-strings-containing-the
"somefile1","somefile2","s-abc","s-little","s-foo","s-poo","s-wtf" | sort
Expected behavior
s-abc
s-foo
s-little
s-poo
s-wtf
somefile1
somefile2
Actual behavior
s-abc
s-foo
s-little
somefile1
somefile2
s-poo
s-wtf
> $PSVersionTable
Name Value
---- -----
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0...}
PSVersion 6.0.0-alpha
PSEdition Core
BuildVersion 3.0.0.0
SerializationVersion 1.1.0.1
PSRemotingProtocolVersion 2.3
CLRVersion
WSManStackVersion 3.0
GitCommitId v6.0.0-alpha.17
It looks like a .NET issue (tested on Windows PowerShell and PowerShell Core):
PS C:\WINDOWS\system32> using namespace System.Collections.Generic
PS C:\WINDOWS\system32> $a = New-Object List[string]
PS C:\WINDOWS\system32> "somefile1","somefile2","s-abc","s-little","s-foo","s-poo","s-wtf" | % {$a.Add($_)}
PS C:\WINDOWS\system32> $a
somefile1
somefile2
s-abc
s-little
s-foo
s-poo
s-wtf
PS C:\WINDOWS\system32> $a.Sort()
PS C:\WINDOWS\system32> $a
s-abc
s-foo
s-little
somefile1
somefile2
s-poo
s-wtf
So it is really a Windows issue, because Windows sorts file names in the same order for display purposes.
PS C:\WINDOWS\system32> [string]::Compare("som","s-l")
1
PS C:\WINDOWS\system32> [string]::Compare("som","s-m")
1
PS C:\WINDOWS\system32> [string]::Compare("som","s-p")
-1
From MSDN String.Compare Method:
Character sets include ignorable characters. The Compare(String, String, Boolean) method does not consider such characters when it performs a culture-sensitive comparison. For example, if the following code is run on the .NET Framework 4 or later, a culture-sensitive, case-insensitive comparison of "animal" with "Ani-mal" (using a soft hyphen, or U+00AD) indicates that the two strings are equivalent.
Unicode Default_Ignorable_Code_Point
Best Practices for Using Strings in the .NET Framework
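To make the difference concrete, here is a minimal sketch that runs the same pair from above through both a culture-sensitive and an ordinal comparison. The culture-sensitive result depends on the platform's collation data (it is the -1 shown above on Windows PowerShell), so the sketch only prints it. The ordinal result is determined purely by code units: '-' is U+002D and 'o' is U+006F.

```powershell
# Culture-sensitive comparison, the default for [string]::Compare.
# The result depends on the platform's collation rules, so it is only printed.
$cultureResult = [string]::Compare("som", "s-p")

# Ordinal comparison looks only at UTF-16 code units:
# '-' (0x2D) is smaller than 'o' (0x6F), so "som" sorts after "s-p".
$ordinalResult = [string]::CompareOrdinal("som", "s-p")

"culture-sensitive: $cultureResult"
"ordinal sign:      $([Math]::Sign($ordinalResult))"   # 1
```

CompareOrdinal only guarantees the sign of its result, not a specific magnitude, hence the [Math]::Sign call.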
Based on this we should re-label the problem as internal.
@joeyaiello @stephentoub The Issue is internal. It seems we need PowerShell-Committee review.
Agree that this is not a bug, but a design choice.
However it is confusing behaviour to many who don't expect this, and is not the desirable behaviour in a number of use cases.
Amending this default behaviour would be a breaking change.
However, adding a parameter to allow users to define the sort behaviour, or adding a key to the Property parameter's hash table, would resolve this limitation without negatively affecting existing behaviour, and would help people realise that the current behaviour is the designed behaviour.
Proposal
The Property parameter accepts a collection of hash tables, where each hash table accepts the keys Expression, Ascending and Descending.
Adding another key, SortOrderComparer, which takes a value of type IComparer, would allow custom sort behaviour to be specified for each property. Thus, to get the behaviour most people would expect, they could do something like this:
[string[]]$list = @("somefile1","somefile2","s-abc","s-little","s-foo","s-poo","s-wtf")
$list | sort -Property @{Expression={$_}; SortOrderComparer=[System.StringComparer]::Ordinal}
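Note that SortOrderComparer is only the proposed key and will not work in current versions. As a stopgap, a minimal sketch of getting the ordinal order today is to bypass Sort-Object and call Array.Sort with StringComparer.Ordinal directly. The $sorted variable name is just for illustration.

```powershell
# Same input as in the report above.
[string[]]$list = @("somefile1","somefile2","s-abc","s-little","s-foo","s-poo","s-wtf")

# Array.Sort sorts in place, so work on a copy to leave $list untouched.
[string[]]$sorted = $list.Clone()
[Array]::Sort($sorted, [System.StringComparer]::Ordinal)

# Ordinal order: every "s-..." entry comes before "somefile...",
# because '-' (0x2D) is smaller than 'o' (0x6F).
$sorted
```

This yields the order listed as expected at the top of the report, but it only works for plain strings; sorting objects by a property this way needs extra plumbing, which is what the proposed key would avoid.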
I agree that we could resolve the issue by adding a new parameter to set comparer options. It seems we should be more general than StringComparer.
Great idea, @JohnLBevan.
I suggest also supporting Comparison<T> delegates as the SortOrderComparer value (polymorphically), so you can pass script blocks directly (e.g., { param([string]$x, [string]$y) <# return -1, 0, or 1 #> }), and perhaps shortening SortOrderComparer to Comparer.
With a Comparer key present, Expression should be optional and default to the whole input object.
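For what it's worth, the script-block-as-Comparison<T> idea can already be tried against List<T>.Sort, which has an overload taking a Comparison<T>. The sketch below is only an illustration of that conversion, not of the proposed Comparer key, which does not exist yet; $names and $ordinal are hypothetical names.

```powershell
# Build the same list as in the repro above.
$names = New-Object 'System.Collections.Generic.List[string]'
"somefile1","somefile2","s-abc","s-little","s-foo","s-poo","s-wtf" |
    ForEach-Object { $names.Add($_) }

# PowerShell converts the script block to a Comparison[string] delegate.
$ordinal = [System.Comparison[string]]{
    param([string]$x, [string]$y)
    [string]::CompareOrdinal($x, $y)   # negative, zero, or positive
}

# List<T>.Sort(Comparison<T>) sorts the list in place.
$names.Sort($ordinal)
$names
```

Any script block that returns a negative, zero, or positive integer would work here, which is the flexibility the Comparer key is meant to expose through Sort-Object.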
For reference https://github.com/PowerShell/PowerShell-RFC/pull/167