When the path to the app.config file is an absolute path that contains a double backslash (\\), RAR doesn't find the app.config file.
Here RAR tries to open the app.config file without normalizing the path first:
https://github.com/Microsoft/msbuild/blob/64a5b6be6d4c1a45c02537a08dca8dd1db09f7f2/src/Tasks/AppConfig/AppConfig.cs#L26
If XmlReader.Create is passed an absolute path containing \\.., the path is not normalized correctly:
var reader = XmlReader.Create(@"C:\a\b\\..\c"); // will try to read from C:\a\b\c instead of C:\a\c
This is because XmlUrlResolver doesn't normalize the path correctly:
var resolver = new XmlUrlResolver();
var uri = resolver.ResolveUri(null, @"C:\a\b\\..\c"); // C:/a/b/c, instead of C:/a/c
The attached .csproj:
RarAppConfigPath.zip
Fails to build with:
C:\Temp\RarAppConfigPath\\A\B\\..\app.config : error MSB3249: Application Configuration file "C:\Temp\RarAppConfigPath\\A\B\\..\app.config" is invalid. Could not find file 'C:\Temp\RarAppConfigPath\A\B\app.config'.
Note that it should find A\app.config, but instead it looks for A\B\app.config, which doesn't exist.
Is the discrepancy between Path.GetFullPath and XmlUrlResolver a BCL bug in the latter?
Yup, certainly a bug in XmlUrlResolver. It should ignore the second backslash.
^ @krwq
XML only does the following:
https://github.com/dotnet/corefx/blob/master/src/System.Private.Xml/src/System/Xml/XmlResolver.cs#L40-L45
This is possibly related to https://github.com/dotnet/corefx/commit/f2df042b1b1772920286e0d3a87830315d6c2562 (cc: @tmds)
Does anyone have rights to move this issue to corefx?
I'm wondering whether Uri should even normalize double slashes. IMO this is MSBuild-specific.
@krwq what does the latest corefx output here?
var resolver = new XmlUrlResolver();
var uri = resolver.ResolveUri(null, @"C:\a\b\\..\c"); // C:/a/b/c (which is incorrect, should be C:/a/c)
Path.GetFullPath(@"C:\a\b\\..\c") returns C:\a\c which is correct.
@KirillOsenkov note that the parameter passed to the resolver is a URI represented as a string, not a file path:
new Uri(@"C:\a\b\\..\c", UriKind.RelativeOrAbsolute).ToString() => file:///C:/a/b/c
Oh, are you saying that it's the Uri constructor, and not the XmlUrlResolver?
Yes, indeed, so it's a bug in Uri then...
Yes - cc: @wtgodbe
http uris show the same behavior:
new Uri("http://www.google.com/a/b//../c").ToString() returns "http://www.google.com/a/b/c"
Looking
I believe this behavior is by design. As per RFC 3986 Section 3.3, a path in a URI is a sequence of path segments separated by the / character. The empty string is a valid path segment, so when we see // in a path, we treat the empty string between the two slashes as a segment. The .. that follows // therefore refers to that empty segment and removes it, which correctly converts /a/b//../c to /a/b/c.
This occurs in code here: https://github.com/dotnet/corefx/blob/be8feef62c5a3f14c13b35e67a523765b84a770e/src/System.Private.Uri/src/System/Uri.cs#L5061-L5072. This loop iterates backwards from the end of the URI string to the beginning. At the moment the loop is examining the first of the two slashes, it knows it's just seen a /.. sequence, and should therefore eliminate the next path segment it sees. As it is currently examining a /, it knows that the string in between the current / and the subsequent / is the segment to be eliminated (in this case, the empty string), so it copies the path segment that followed the /.. (in this case, c) to the index after the current /.
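To make the segment handling concrete, here is a simplified sketch. It is not the actual Uri.cs code (which walks the string backwards in place); the DotSegments.Remove helper is hypothetical and assumes an absolute path that starts with /. It only illustrates that an empty segment counts as a segment that .. can remove.

```csharp
using System;
using System.Collections.Generic;

static class DotSegments
{
    // Hypothetical helper, for illustration only.
    // Assumes an absolute path beginning with "/".
    public static string Remove(string path)
    {
        var output = new List<string>();
        foreach (string segment in path.Split('/'))
        {
            if (segment == ".")
            {
                continue; // "." refers to the current segment: drop it
            }
            if (segment == "..")
            {
                // ".." removes the previous segment, even if that segment is
                // the empty string between two slashes. Index 0 is the empty
                // entry in front of the leading "/", so it is never removed.
                if (output.Count > 1)
                {
                    output.RemoveAt(output.Count - 1);
                }
                continue;
            }
            output.Add(segment); // ordinary (possibly empty) segment
        }
        return string.Join("/", output);
    }
}
```

With this sketch, DotSegments.Remove("/a/b//../c") yields /a/b/c, because .. consumes the empty segment, while DotSegments.Remove("/a/b/../c") yields /a/c.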
CC @wfurt @davidsh @karelz, in case they disagree with my assessment.
There is a long history of issues caused by components such as XML trying to use System.Uri with platform filepaths. The RFCs for Uri such as 3986 don't have crisp definitions for filepaths in the absence of a scheme designator. Some behaviors that XML needs would be better suited to using the System.IO.Path methods instead of System.Uri.
For example, "file:///etc/hosts" is a URI that explicitly uses the "file" scheme. It uses forward slashes. But things like "C:\a\b\..\c" have no URI scheme at all. They are implicit "file" scheme paths. The RFC doesn't talk much about that. We invented several behaviors in the .NET System.Uri class to handle those as best as possible. For example, there are different conversions we do from backslash to forward slash.
There are also different behaviors that are scheme specific. Some of those may map to handling path components. So, you can't expect "http" and "file" URIs to behave exactly the same all the time.
In the latest .NET Core, we made more changes to the handling of file paths in order to make things work better on Linux.
In summary, I'm not sure yet whether to consider this particular problem a bug or a by-design behavior in Uri. I think it needs further studying before considering any change to Uri.
I also recommend gathering more information about what other Uri implementations do in this case. For example, what does .NET Framework do? What about WinRT Windows.Foundation.Uri? That one is based on URLMON, which is used on Windows for IE/Edge/WinInet. On .NET, Windows.Foundation.Uri is a hidden class and is exposed only as System.Uri to callers. Using it directly requires C++/CX or JavaScript.
Looking at what browsers do with similar URIs would also be useful. All of these things will help inform the best decision on whether we can/should make a change to Uri for this scenario.
.NET Framework has the same behavior (new Uri("http://www.google.com/a/b//../c").ToString() -> "http://www.google.com/a/b/c"). I can look into what other implementations/browsers do.
Chrome, Edge & Safari have the same behavior as well.
new Uri("http://www.google.com/a/b//../c").ToString()
I recommend you always use the .AbsoluteUri property for comparison.
var uri = new Uri("http://www.google.com/a/b//../c");
Console.WriteLine(uri.AbsoluteUri); // AbsoluteUri is already a string
Looks like the AbsoluteUri gives the same result as well.
Based on an offline conversation with @dthaler, it looks like this is indeed by design. This is also called out in the blog post below by Dave Risney:
https://davescoolblog.blogspot.com/2011/11/uri-empty-path-segments-matter.html
OK then it looks like MSBuild should just call Path.GetFullPath() to sanitize the file path to app.config before it enters the XML world.
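A minimal sketch of that idea, assuming a hypothetical AppConfigLoader.OpenNormalized helper (this is not the actual MSBuild code or fix): normalize the path with Path.GetFullPath, which collapses the doubled separator and resolves .., and only then hand it to XmlReader.Create.

```csharp
using System.IO;
using System.Xml;

static class AppConfigLoader
{
    // Hypothetical helper, for illustration only.
    public static XmlReader OpenNormalized(string appConfigPath)
    {
        // Resolve the path with file-system rules first, so that
        // a path like C:\a\b\\..\c becomes C:\a\c before Uri sees it.
        string fullPath = Path.GetFullPath(appConfigPath);

        // XmlReader.Create(string) treats its argument as a URI,
        // so it should only ever receive the already-normalized path.
        return XmlReader.Create(fullPath);
    }
}
```

The caller is responsible for disposing the returned reader.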
It was very informative, thanks to all!