It looks like our file sorting in the explorer does not match platform behaviour in some cases.
Windows:
foo.ts is sorted before foo_test.ts but we sort it the other way around
Linux:
foo.ts is sorted before foo_test.ts but we sort it the other way around
[out, outb, outd, Outa, Outc] are showing up as [out, OutA, outb, Outc, outd]
macOS:
We use a JavaScript Collator for the comparing here.
Unfortunately I am not able to tweak the Collator options to bring me the desired result...
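To make the mismatch concrete, here is a minimal sketch that sorts the two file names from the list above in two ways: by plain UTF-16 code units and with an Intl.Collator. The collator options (numeric: true, sensitivity: 'base') are assumptions for illustration; this comment only confirms that a JavaScript Collator is used, not how it is configured.

```typescript
// Plain code-unit comparison: '.' (0x2E) sorts before '_' (0x5F).
function compareCodeUnits(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Assumed options, chosen only for illustration.
const explorerCollator = new Intl.Collator(undefined, {
  numeric: true,
  sensitivity: 'base',
});

const fooNames = ['foo_test.ts', 'foo.ts'];

// Code-unit order puts foo.ts first, like the platform behaviour described above.
const byCodeUnits = fooNames.slice().sort(compareCodeUnits);
// → ['foo.ts', 'foo_test.ts']

// The collator's result depends on the runtime's collation data, where
// punctuation such as '_' and '.' is not ordered by code unit value.
const byCollator = fooNames.slice().sort(explorerCollator.compare);
```

Comparing byCodeUnits with byCollator on a given runtime shows whether that runtime reproduces the reversed order reported here.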
I want sort order like GitHub repo. Please!

I have some YML files whose names are based on GUIDs. They aren't even close to being alpha sorted.
vscode 1.15.1, macOS Sierra 10.12.6
Linux: a lowercase file seems to be sorted before an upper case file
Umm... I think it is the other way around, upper case first. Basically, just sorting using the ASCII value of each character.
@dlech on Linux, at least, it depends on the chosen 'locale', and the environment variable LC_COLLATE can be used to influence this behaviour (this influences, for example, the ls command).
If you consider implementing or supporting platform-specific behaviour, you may want to evaluate the locale setting on Linux, specifically LC_COLLATE (if set, otherwise fall back to the configured locale).
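A rough sketch of what evaluating the locale could look like. Both helper names are hypothetical. The lookup order follows the POSIX rule that LC_ALL overrides LC_COLLATE, which in turn overrides LANG; the environment is passed in as a plain object so the sketch does not depend on any particular runtime.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the POSIX locale name that governs collation.
// Precedence: LC_ALL, then LC_COLLATE, then LANG; "C" if none is set.
function resolveCollationLocale(env: Env): string {
  return env['LC_ALL'] || env['LC_COLLATE'] || env['LANG'] || 'C';
}

// Hypothetical helper: converts a POSIX name such as "en_GB.UTF-8" into a
// BCP 47 tag ("en-GB") usable with Intl.Collator. Returns undefined for
// "C" / "POSIX", where plain code-unit ordering would apply instead.
function toBcp47(posixLocale: string): string | undefined {
  // Strip the codeset (".UTF-8") and modifier ("@euro") suffixes.
  const base = posixLocale.replace(/[.@].*$/, '');
  if (base === '' || base === 'C' || base === 'POSIX') {
    return undefined;
  }
  return base.replace(/_/g, '-');
}

const exampleEnv: Env = { LC_COLLATE: 'en_GB.UTF-8', LANG: 'de_DE.UTF-8' };
const collationTag = toBcp47(resolveCollationLocale(exampleEnv));
// → 'en-GB'
```

The resulting tag could then be handed to new Intl.Collator(tag), with a code-unit comparison as the fallback when the tag is undefined.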
user@host -- ~/tmp/casetest $ LC_COLLATE='en_GB.UTF-8' ls -1
2
5
a
A
Aa
aB
AZ
C
user@host -- ~/tmp/casetest $ LC_COLLATE='en_EN.UTF-8' ls -1
2
5
A
AZ
Aa
C
a
aB
user@host -- ~/tmp/casetest $ LC_COLLATE='C' ls -1
2
5
A
AZ
Aa
C
a
aB
It's called shortlex, or lexicographic sort order. The only thing you need to tweak is the string length. If A is shorter than B then A is smaller than B. This is not going to be covered by any collation. An alternative is to introduce padding (padding the sort keys so that the shorter one compares as less), but I don't think it's reasonable to do that due to the extra garbage generated.
To be more specific.
For two strings a and b of unequal length, take the first Math.min(a.length, b.length) characters of each and compare those prefixes using whatever compare function you like. If they are equal, i.e. c = 0, use the string length to finalize the sort order: if a and b share a common prefix but a is shorter, then a is smaller, and so on.
@bpasero something like this:
const a = one || "";
const b = other || "";
const minLen = Math.min(a.length, b.length);
const result = intlFileNameCollator
  .getValue()
  .collator.compare(a.substr(0, minLen), b.substr(0, minLen));
if (result === 0) {
  if (a.length < b.length) {
    return -1;
  }
  if (b.length < a.length) {
    return +1;
  }
  return 0;
}
return result;
I dropped the collatorIsNumeric stuff because it just adds confusion.
Shortlex orders primarily by length; the code presented implements a different ordering. The example:
aggregate.go
aggregate_registry.go
event.go
event_registry.go
README.md
rehydration.go
as shortlex order would be:
event.go
README.md
aggregate.go
rehydration.go
event_registry.go
aggregate_registry.go
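For reference, a minimal sketch of a true shortlex comparator, which reproduces the listing above: compare by length first and fall back to a character comparison only for equal lengths. The function name is made up for illustration, and the tie-break uses plain code-unit order as an assumption.

```typescript
// Shortlex: shorter strings always come first; ties on length are broken
// lexicographically (here by UTF-16 code units).
function compareShortlex(a: string, b: string): number {
  if (a.length !== b.length) {
    return a.length - b.length;
  }
  return a < b ? -1 : a > b ? 1 : 0;
}

const goFiles = [
  'aggregate.go',
  'aggregate_registry.go',
  'event.go',
  'event_registry.go',
  'README.md',
  'rehydration.go',
];

const shortlexOrder = goFiles.slice().sort(compareShortlex);
// → ['event.go', 'README.md', 'aggregate.go', 'rehydration.go',
//    'event_registry.go', 'aggregate_registry.go']
```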
Windows Explorer uses Natural Sort, because length is not sufficient for good order, as an example:
action_5_example.txt
action_10_ex.txt
@egonelbre then I misunderstood the meaning of shortlex; I should have just said lexicographic. But the code does what it is supposed to do: first sort up to X characters, then use the length as a discriminator.
Why complicate this, though? Lexicographic (not shortlex, as I incorrectly called it at first) is easy to implement and understand. This is not going to get done if we insist on extra work to align with something which is highly Windows Explorer specific. That should not be the high-water mark here.
It's not really Explorer specific, you can read more about it in https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/. I mentioned it because your other issue showed that as an example.
The reason you don't want to use lexicographic first is due to the last example.
action_5_example.txt
action_10_ex.txt
Sorted lexicographically is:
action_10_ex.txt
action_5_example.txt
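The difference can be reproduced with a few lines. This is only a sketch: it uses Intl.Collator with numeric: true as one readily available way to get natural ordering of digit runs, which is not necessarily what Windows Explorer does internally.

```typescript
const actionFiles = ['action_5_example.txt', 'action_10_ex.txt'];

// Array.prototype.sort without a comparator compares UTF-16 code units,
// so '1' (0x31) sorts before '5' (0x35).
const lexicographicOrder = actionFiles.slice().sort();
// → ['action_10_ex.txt', 'action_5_example.txt']

// With numeric collation, runs of digits are compared by numeric value: 5 < 10.
const naturalCollator = new Intl.Collator('en', { numeric: true });
const naturalOrder = actionFiles.slice().sort(naturalCollator.compare);
// → ['action_5_example.txt', 'action_10_ex.txt']
```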
@egonelbre as a programmer, I don't care. But as a user of Windows Explorer, I could see why adhering to the natural sort order would seem more natural.
I tried making this: https://github.com/microsoft/vscode/issues/75415 but it was closed. There is currently no option to make the explorer sort native.
And no one writes 5, 10; we write 05, 10. This is a fundamental fact.
vs code sorts like this,
log-10
I'd rather it sort like this,
log-2
It is very jarring for me when vs code does not sort lexicographically.
Almost every other tool I use sorts lexicographically. When vs code tries to be different, it just confuses me for a short moment. Repeatedly. And it adds up.
Like @fenchu said, if I wanted to sort by numeric values, I'd zero-pad those numbers to the desired length.
@bpasero I'm hoping you can advise and save me some time if this doesn't make sense or is unlikely to be accepted as a pull request.
I'm considering creating a pull request that would add a new setting - explorer.sortCaseSensitive with a default value of false.
I considered adding one or more options to the existing explorer.sortOrder setting, but case sensitivity seems to be orthogonal to those options - and (nearly) doubling the number of options to add case sensitive versions doesn't seem like the best idea:
'explorer.sortOrder': {
  'type': 'string',
  'enum': [SortOrder.Default, SortOrder.Mixed, SortOrder.FilesFirst, SortOrder.Type, SortOrder.Modified],
  'default': SortOrder.Default,
  'enumDescriptions': [
    nls.localize('sortOrder.default', 'Files and folders are sorted by their names, in alphabetical order. Folders are displayed before files.'),
    nls.localize('sortOrder.mixed', 'Files and folders are sorted by their names, in alphabetical order. Files are interwoven with folders.'),
    nls.localize('sortOrder.filesFirst', 'Files and folders are sorted by their names, in alphabetical order. Files are displayed before folders.'),
    nls.localize('sortOrder.type', 'Files and folders are sorted by their extensions, in alphabetical order. Folders are displayed before files.'),
    nls.localize('sortOrder.modified', 'Files and folders are sorted by last modified date, in descending order. Folders are displayed before files.')
  ],
  'description': nls.localize('sortOrder', "Controls sorting order of files and folders in the explorer.")
},
This change would only partially address this open issue, since it:
I think it would probably satisfy a lot of people though, and TBH I'm not sure that aligning with platform case sensitivity is the best option. For example, I noticed a number of comments on this and related issues were asking to align the file sort order with github, which always does a case sensitive sort.
Anyway, by providing a setting, people can choose, and by defaulting to the current behavior nobody will be affected by the change unless they want to be.
Also, if the future default behavior is changed, this setting will still be useful for people who want to override that default behavior.
What do you think? Should I go ahead?
If yes, is this the right issue to reference in the PR or should I create a separate issue that just links to this one?
Thanks in advance!
@leilapearson thanks for the offer.
Ideally we would just align explorer sorting with platform sorting without any option. @bpasero already provided a code pointer where he is doing the comparing.
If that is not possible only then we can look into adding more settings.
An alternative is to look to open this up to extensions and then extensions could control this and satisfy the 20 different sorting styles that users want.
@isidorn thanks for the reply. That's why I asked before doing anything other than taking a look at the code.
Opening this to extensions is an interesting option. At the same time, I would still think that offering control over whether the sort is case sensitive or not should be a core option and not require an extension.
It isn't easy on some platforms to adjust the sort order - and having to figure out how to get your whole platform to sort case sensitive in order for VS Code to sort case sensitive seems a bit awkward? Especially if you primarily develop on one platform and only spend a bit of time developing on other platforms.
Also, I find that programming on a platform is a different context than using a platform for office work. Having different sort orders apply to the different contexts often makes sense.
For example, I tend to sort things by most recently modified when I'm working on documents and the like - so this is my default in file explorer and google docs. On the other hand, I don't want my code sorted by modification date and I'm happy with how things are sorted in my terminal - but unfortunately not so happy with how they are sorted in VS Code.
I do agree that too many settings can be a bad thing, but I'm curious if you agree or not that a setting to control case sensitivity would make sense regardless?
P.S. An example of how hard it can be to change the sort (collate) order on a platform is OSX - which doesn't expose any nice way to do that it seems:
https://apple.stackexchange.com/questions/34054/case-insensitive-ls-sorting-in-mac-osx
Ok, makes sense. I would be open to a lean and nice PR that controls if sorting is case sensitive or not.
Thanks
Thanks @isidorn. Before I go ahead, a bit of extra context and one more question...
There are actually a total of 4 options for defining what "alphabetically" means. Per the ECMA 402 standard:
The sensitivity of collator is interpreted as follows:
base: Only strings that differ in base letters compare as unequal. Examples: a ≠ b, a = á, a = A.
accent: Only strings that differ in base letters or accents and other diacritic marks compare as unequal. Examples: a ≠ b, a ≠ á, a = A.
case: Only strings that differ in base letters or case compare as unequal. Examples: a ≠ b, a = á, a ≠ A.
variant: Strings that differ in base letters, accents and other diacritic marks, or case compare as unequal. Other differences may also be taken into consideration. Examples: a ≠ b, a ≠ á, a ≠ A.
NOTE In some languages, certain letters with diacritic marks are considered base letters. For example, in Swedish, "ö" is a base letter that's different from "o".
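The four sensitivity levels can be checked directly with Intl.Collator. The helper below is a small illustrative wrapper (its name is made up); the 'en' locale is an assumption chosen so the Swedish-style exception in the note does not apply.

```typescript
type Sensitivity = 'base' | 'accent' | 'case' | 'variant';

// Returns true when the two strings compare as equal under the given sensitivity.
function equalUnder(sensitivity: Sensitivity, a: string, b: string): boolean {
  return new Intl.Collator('en', { sensitivity }).compare(a, b) === 0;
}

const aAcute = '\u00e1'; // 'á'

// One row per sensitivity: [a = á ?, a = A ?]
const sensitivityTable = {
  base: [equalUnder('base', 'a', aAcute), equalUnder('base', 'a', 'A')],          // [true, true]
  accent: [equalUnder('accent', 'a', aAcute), equalUnder('accent', 'a', 'A')],    // [false, true]
  case: [equalUnder('case', 'a', aAcute), equalUnder('case', 'a', 'A')],          // [true, false]
  variant: [equalUnder('variant', 'a', aAcute), equalUnder('variant', 'a', 'A')], // [false, false]
};
```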
Instead of just exposing case sensitivity as a true or false option, I'm thinking it would be best to allow any of the 4 options. base would be the default value since that's what's hardcoded into the current code.
Any objection to offering all 4 options?
@leilapearson this makes sense. However to simplify this a bit I suggest the following:
explorer.sortCaseSensitive setting with string values "on" and "off"
string setting this should not be a problem)
The reason why I prefer this solution is simplicity and I think it covers the 99% use case.
Let me know what you think.
Sounds good. Thanks @isidorn .
Well @isidorn, that was a bit trickier than expected, but I think I have something that is almost ready to submit. I'll take one last look tomorrow to make sure I haven't missed anything.
Unfortunately the solution I was originally picturing didn't work. It turns out that grouping by case is a very different thing than comparing by case! :-)
There were also some special cases that I needed to adjust for - including the one that @leidegre pointed out. I can describe them next to the relevant code when I submit the PR.
Since the solution was different than I was imagining, the new setting is different than we discussed, but I think it's still simple to understand and use. Let me know if you have any concerns or comments.
Here's what it looks like now:
'explorer.sortOption': {
  'type': 'string',
  'enum': [SortOption.Numeric, SortOption.Upper, SortOption.Lower, SortOption.Mixed],
  'default': SortOption.Numeric,
  'enumDescriptions': [
    nls.localize('sortOption.numeric', 'Mixes uppercase and lowercase names together. Numbers are sorted numerically, not alphabetically.'),
    nls.localize('sortOption.upper', 'Groups uppercase names before lowercase names. Numbers are sorted alphabetically.'),
    nls.localize('sortOption.lower', 'Groups lowercase names before uppercase names. Numbers are sorted alphabetically.'),
    nls.localize('sortOption.mixed', 'Mixes uppercase and lowercase names together. Numbers are sorted alphabetically.')
  ],
  'description': nls.localize('SortOption', "Further specifies the file and directory sort order.")
}
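To illustrate what "grouping by case" could mean, as opposed to comparing by case, here is a rough sketch. It is not the code from the PR: the names are hypothetical, and it assumes a name belongs to the uppercase group when its first character is an uppercase letter, with everything else (including names starting with digits) falling into the other group.

```typescript
type CaseGrouping = 'upper' | 'lower' | 'mixed';

// Case-insensitive comparison used inside each group (assumed options).
const groupCollator = new Intl.Collator('en', { sensitivity: 'base' });

function startsWithUpper(name: string): boolean {
  const first = name.charAt(0);
  // True only for characters that have a distinct lowercase form.
  return first !== first.toLowerCase();
}

function compareGrouped(a: string, b: string, grouping: CaseGrouping): number {
  if (grouping !== 'mixed') {
    const aUpper = startsWithUpper(a);
    const bUpper = startsWithUpper(b);
    if (aUpper !== bUpper) {
      // The names fall into different groups, so the group decides the order.
      const upperFirst = aUpper ? -1 : 1;
      return grouping === 'upper' ? upperFirst : -upperFirst;
    }
  }
  // Same group (or no grouping): fall back to the collator.
  return groupCollator.compare(a, b);
}

const outNames = ['out', 'outb', 'outd', 'Outa', 'Outc'];
const sortWith = (g: CaseGrouping) =>
  outNames.slice().sort((a, b) => compareGrouped(a, b, g));

const upperGrouped = sortWith('upper'); // → ['Outa', 'Outc', 'out', 'outb', 'outd']
const lowerGrouped = sortWith('lower'); // → ['out', 'outb', 'outd', 'Outa', 'Outc']
const mixedOrder = sortWith('mixed');   // → ['out', 'Outa', 'outb', 'Outc', 'outd']
```

The group check runs before the collator, which is why a collator option alone cannot produce the grouped orders.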
I'm very happy to say that with this new setting and some small tweaks in the code I believe that the whole problem might be solved - or at least solved enough to satisfy most people.
Namely:
aggregate.go and aggregate_repo.go sort as expected
SortOrder and the new SortOption setting, users should be able to emulate the most popular filename grouping options - including the one that github uses.
Whew!
Actually I just realized one more option would be good to add - namely a simple unicode sort.
It seems that the terminal on a Mac uses a simple unicode sort. The file explorer on a Mac uses a localized sort though.
I think a lot of people might want to match their sort order to their terminal, and the other options don't give you that.
It would be trivial to add a SortOption.Unicode to the list.
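As a sketch of what such a simple sort amounts to: comparing strings with the built-in < and > operators orders them by UTF-16 code units, with no locale involved. Applied to the file names from the ls example earlier in this thread, it reproduces the LC_COLLATE='C' listing. The function name is illustrative, and code-unit order is an assumption about what "unicode sort" means here; it differs from code point order only for characters outside the Basic Multilingual Plane.

```typescript
// Locale-independent comparison by UTF-16 code units.
function compareByCodeUnit(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

const caseTestNames = ['2', '5', 'a', 'A', 'Aa', 'aB', 'AZ', 'C'];

const codeUnitOrder = caseTestNames.slice().sort(compareByCodeUnit);
// → ['2', '5', 'A', 'AZ', 'Aa', 'C', 'a', 'aB']
```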
Sounds good. Once you submit a PR feel free to ping me @isidorn on it and we can continue the discussion there. Thanks
Perfect @isidorn . Thanks!
Any progress?
@Sytten some sort order edge cases were addressed in #97200 and that change is available in vscode 1.46.0. See the PR for a detailed description.
I also have an open PR #97272 - old now and sure to need an update - to add some additional lexicographic options to allow sorting in unicode order, locale order with uppercase first, or locale order with lowercase first.
Which specific functionality were you hoping to see addressed?
On macOS the files are sorted case-insensitively and I didn't find a way to sort them case-sensitively (without affecting the order of files/folders).
PR #97272 adds the option to group files and folders by case, but that PR was submitted at a time when the reviewers weren't available and is out of date now. I'll take a look at resurrecting it.
Just wanted to provide a couple of quick updates for anyone watching this issue.
aggregate.go and aggregate_repo.go are sorted. See Issue #99955 if you want more details.