Currently we aggressively normalize paths by replacing '/' with '\' on Windows and '\' with '/' on Unix; see NormalizePath().
This is acceptable on Windows but has side effects on Unix, because '\' is a valid character in directory and file names.
.NET Core does the same on Windows but does not normalize on Unix.
IsPathRooted on Unix (IsDirectorySeparator)
IsPathRooted on Windows (IsDirectorySeparator)
NormalizeDirectorySeparators on Unix
NormalizeDirectorySeparators on Windows
As you can see, Unix takes only '/' into account, while Windows accepts both '\' and '/'.
Should we follow .NET Core in the path normalization?
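To make the asymmetry concrete, here is a small sketch. It uses Python's standard `ntpath` and `posixpath` modules purely as a stand-in for the Windows and Unix rules; it is not the .NET Core or PowerShell code. `aggressive_normalize` is a hypothetical name that models the current behaviour described above.

```python
import ntpath
import posixpath

def aggressive_normalize(path, on_windows):
    """Hypothetical model of the current behaviour: always rewrite the
    'foreign' separator into the platform's own separator."""
    sep, other = ("\\", "/") if on_windows else ("/", "\\")
    return path.replace(other, sep)

# Windows rules: '/' is treated as a separator and rewritten to '\'.
windows_result = ntpath.normpath("dir/sub")        # 'dir\\sub'

# Unix rules: '\' is an ordinary name character and is left alone.
unix_result = posixpath.normpath("dir\\sub")       # 'dir\\sub' (unchanged)

# The aggressive model changes the Unix path, which is the side effect at issue.
aggressive_result = aggressive_normalize("dir\\sub", on_windows=False)  # 'dir/sub'
```

Under the Unix rules the name stays intact; only the aggressive model turns one name into a two-level path.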
Related issues:
A simple repro of the side effect on Unix:
mkdir /\
cd \
$pwd
\
/
My vote is to follow .NET Core's lead.
In general we should avoid hiding native capabilities, unless absolutely necessary.
Just to add another example:
On Unix, New-Item -ItemType Directory -Path a\b, instead of creating a single dir. literally named a\b - which is what the native mkdir a\b does - creates _two_ directories, subdir a with a subdir b; again a consequence of automatically translating a\b into a/b.
Conversely, Set-Location won't let you change to a directory literally named a\b.
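The effect on path components can be sketched with Python's `pathlib.PurePosixPath`, which applies Unix rules without touching the file system. This only illustrates the component count; it is not how New-Item is implemented.

```python
from pathlib import PurePosixPath

name = "a\\b"  # the four characters a, \, b as typed on Unix

# Under Unix rules the backslash is part of the name: one component.
literal_parts = PurePosixPath(name).parts

# After translating '\' to '/', the same input becomes two components.
translated_parts = PurePosixPath(name.replace("\\", "/")).parts

print(len(literal_parts), len(translated_parts))
```

One component corresponds to a single directory named a\b; two components correspond to directory a containing directory b.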
I found the same problem in the registry provider. #5536
It seems we have to move the normalization of paths and possibly (partially) globbing into providers.
Worth noting that changing the behaviour so that only forward slashes work on UNIX(-like) would potentially break a lot of otherwise cross-platform scripts, and would mean that PowerShell 6 on Windows supports UNIX-style paths, but PowerShell 6 on UNIX does not support Windows-style paths (which seems sort of the wrong way round).
> cross-platform scripts
They did not exist before 6.0, and neither did cross-platform paths. We can't break what does not yet exist. Currently we can get cross-platform paths only by means of Join-Path. A script must be written this way to be cross-platform (if we ignore invalid characters in paths). But this is not a complete solution, because some paths are masked by the normalization.
In addition to Join-Path, we could consider an accelerator like [portablepathinfo]@($Home, "etc", "app.cfg") to get $Home/etc/app.cfg on Unix and $Home\etc\app.cfg on Windows. Both should take invalid characters in paths into account. To allow non-portable characters we could use a special parameter in Join-Path and maybe a [pathinfo] accelerator. With this in mind, we could make the normalization smarter.
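A rough sketch of what such a portable join might do, written in Python for neutrality. The function name `portable_join`, the `allow_non_portable` switch and the `NON_PORTABLE` character set are all assumptions for illustration; none of them exist in PowerShell.

```python
import ntpath
import posixpath

# Assumed, deliberately small set of characters treated as non-portable.
NON_PORTABLE = set('/\\:*?"<>|')

def portable_join(flavour, base, *parts, allow_non_portable=False):
    """Join `parts` onto `base` using the rules of `flavour`
    (the ntpath or posixpath module). Components after `base` are
    rejected if they contain non-portable characters, unless the
    caller opts out."""
    if not allow_non_portable:
        for part in parts:
            bad = NON_PORTABLE.intersection(part)
            if bad:
                raise ValueError(
                    f"non-portable characters {sorted(bad)} in {part!r}")
    return flavour.join(base, *parts)

unix_path = portable_join(posixpath, "/home/u", "etc", "app.cfg")
windows_path = portable_join(ntpath, "C:\\Users\\u", "etc", "app.cfg")
print(unix_path)     # /home/u/etc/app.cfg
print(windows_path)  # C:\Users\u\etc\app.cfg
```

The opt-out switch plays the role of the special parameter suggested above: a Unix-only script could still pass a component such as a\b deliberately.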
Note that you've always been able to use \ and / interchangeably on Windows - while there may be individual external utilities that don't support / (also, support in cmd.exe is patchy), all the major APIs support it (WinAPI, COM, .NET).
Thus - aside from ruling out illegal-in-a-filename characters - literal use of / works as a cross-platform filesystem-path separator.
Also note that we currently have few abstractions for well-known locations: it's currently just $HOME and $PSHOME; $env:PSModulePath _contains_ well-known locations, but not in an individually identifiable manner.
Even a platform-abstracted temporary-files location is currently not implemented - remember #4216?
> all the major APIs support it (WinAPI, COM, .NET).
> literal use of / works as a cross-platform filesystem-path separator.
I wonder: PowerShell normalizes, .NET normalizes, and Win32 does too. Why do we need this on three levels? And after that someone says that PowerShell (Windows) is slow.
Also, we repeatedly parse and rebuild paths. To be more resource-efficient and faster, we need to delegate common methods to providers, .NET and kernel APIs.
> I wonder: PowerShell normalizes, .NET normalizes, and Win32 does too. Why do we need this on three levels? And after that someone says that PowerShell (Windows) is slow.
> Also, we repeatedly parse and rebuild paths. To be more resource-efficient and faster, we need to delegate common methods to providers, .NET and kernel APIs.
Even in PowerShell, I think we're doing it multiple times.
The other problem, like I mentioned in https://github.com/PowerShell/PowerShell/issues/5536#issuecomment-387476149, is that we already promise to support multiple container-enabled providers that use conflicting path separators and legal name characters. So PowerShell has to do some abstract, provider-based path handling I imagine. But once we know it's a filesystem path, I agree that we should do as little as possible on top of .NET.
Also, by "breaking change" above, I mean that PSCore 6 has already shipped as GA and people are already writing scripts with backslashes in their paths to be used cross-platform.
I fervently agree that just using / would be better, especially since all the major APIs support it, it was designed into Windows from the start, and backslashes are just the bad legacy of CMD/DOS.
But I think there are scripts already being written as cross-platform that would break, and scripts from older PowerShell versions that should work cross-platform with PS6 if not for the path-separator changing.
@rjmholt
> Also, by "breaking change" above, I mean that PSCore 6 has already shipped as GA and people are already writing scripts with backslashes in their paths to be used cross-platform.
That's definitely possible (and perhaps you have already found examples), but do note that the current lack of platform-abstracted locations limits the scenarios in which \-based path literals are potentially useful in cross-platform code:
$HOME
$env:TEMP (Windows), $env:TMPDIR (macOS), /tmp (Linux).
I personally have no sense of how much that has happened already.
P.S.: My _guess_ is that someone savvy enough to have implemented their own platform abstractions, which requires some knowledge of the Unix world, is likely to have used / as the path separator.
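For illustration, this is the kind of hand-rolled abstraction a script author currently needs for the temporary-files location. It is sketched in Python; `temp_location` is a hypothetical helper, and the "/tmp" fallback on macOS is an assumption, not something stated above. The platform name and environment are passed in explicitly so the logic can be exercised on any machine.

```python
def temp_location(platform, env):
    """Pick the temp directory using the per-platform conventions
    listed above: TEMP on Windows, TMPDIR on macOS, /tmp on Linux.
    `platform` uses sys.platform-style names; `env` is a mapping."""
    if platform.startswith("win"):
        return env.get("TEMP")
    if platform == "darwin":
        return env.get("TMPDIR", "/tmp")  # fallback value is an assumption
    return "/tmp"

windows_tmp = temp_location("win32", {"TEMP": "C:\\Temp"})
macos_tmp = temp_location("darwin", {"TMPDIR": "/var/folders/x/T/"})
linux_tmp = temp_location("linux", {})
```

Note that every branch already yields a path in the platform's native form, so no separator translation is involved once such an abstraction exists.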
It seems we haven't mentioned that when we send paths through the pipeline, we parse and normalize them all over again.
Add #3441,#1817 in PR description.
@PowerShell/powershell-committee reviewed this. We agree that the utility of supporting both forward and backslashes as a directory separator is valuable for cross platform scripts as well as being existing behavior. The fundamental issue seems to be that escaped characters in paths are not propagating through providers and this is a bug that should be fixed.
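One way to read this conclusion is that both separators keep working, while an escaped separator stays part of the name. The sketch below shows that idea in Python. The function `split_components` is hypothetical, and using the backtick (PowerShell's escape character) as the escape marker here is an assumption about what "escaped characters" refers to; this is not the provider code.

```python
def split_components(path, escape="`"):
    """Split on '/' and '\\', except where the separator is preceded
    by the escape character; an escaped character is kept literally."""
    parts, current, i = [], [], 0
    while i < len(path):
        ch = path[i]
        if ch == escape and i + 1 < len(path):
            current.append(path[i + 1])  # keep the escaped char as-is
            i += 2
            continue
        if ch in "/\\":
            parts.append("".join(current))  # separator ends a component
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts

unescaped = split_components("a\\b/c")   # ['a', 'b', 'c']
escaped = split_components("a`\\b/c")    # ['a\\b', 'c']
```

For the escape to be useful, the unescaped result would have to reach the provider unchanged instead of being normalized again further down.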
Seems I don't understand the conclusion. :confused:
The fundamental issue is that currently PowerShell _encourages_ the creation of non-portable scripts and thus produces a huge number of auxiliary operations consuming a lot of resources.
Portable scripts shouldn't contain literal paths. We should use Join-Path and something like [portablepath]@().
PowerShell should work with native paths in dir | copy-item without reparsing if we want to get anywhere near the performance of the cmd/bash shells.
PowerShell should work with native paths both internally and at the top level (-Path/-LiteralPath parameters), especially in interactive mode. We have some issues here and should address them. Why should we have to escape characters in a Unix path on Unix? I suppose that is very annoying. Why don't we support '\' in Unix paths?
I'm pretty sure that work in this direction can keep backward compatibility with Windows PowerShell.
Maybe @mklement0 could do a more in-depth review of the provider path issues.