Runtime: [Uri] Inconsistent treatment of %20 in file Uri with/without unicode chars

Created on 14 Sep 2018  路  14Comments  路  Source: dotnet/runtime

Based on .NET Framework bug report: https://developercommunity.visualstudio.com/content/problem/89895/uri-constructor-not-encode-character-after-net-45.html
Reproduced on .NET Core 2.1 (and .NET Framework 4.7.2 which is out of scope here - just for completeness)

```c#
static void Main()
{
Console.WriteLine(new Uri(@"C:##\%20").AbsoluteUri); // Prints: file:///C:/%23%23/%2520
Console.WriteLine(new Uri(@"C:\脙\%20").AbsoluteUri); // Prints: file:///C:/%C3%83/%20

// Works fine for other sequences, e.g. %23
Console.WriteLine(new Uri(@"C:\脙\%23").AbsoluteUri);  // Prints: file:///C:/%C3%83/%2523
// It is consistent for http URIs:
Console.WriteLine(new Uri(@"http://my.com/  /%20").AbsoluteUri); // Prints: http://my.com/%20%20/%20
Console.WriteLine(new Uri(@"http://my.com/脙/%20").AbsoluteUri);  // Prints: http://my.com/%C3%83/%20

}
```

Note that treating of %20 is inconsistent in the first 2 URIs - first one gets % encoded (%2520), second one gets % kept as is (%20). The difference seems to be just usage of multi-byte Unicode char in 2nd case.
As the additional 3 cases above show, http URIs are fine and so are file URIs with other sequences like %23.

area-System.Net bug

All 14 comments

I think that this should be a pretty simple fix, though like all URI changes it will require some extra testing. It can likely be tracked down by looking for references to the URI reserved character set in the implicit file path parsing section. It is likely that in some case we are checking for reserved characters, but do not consider an encoded space "%20".

Would like to take this 鈽濓笍

Awesome 馃槃

I'll assign you the issue. Feel free to reach out if you have any questions.

How would you lay out the Unit tests for this @rmkerr?
Seems like they should be located in UriEscapingTest.cs within the Functional tests

UriEscapingTests.cs looks like an appropriate location. I would add the tests to the file URI escaping section, and would try to stay consistent with the existing tests there. https://github.com/dotnet/corefx/blob/99211937b4f1735b7912f0a064b91ba60c8e9ca9/src/System.Private.Uri/tests/FunctionalTests/UriEscapingTest.cs#L443-L445

Seems like the UriHelper is trying to escape "%" characters, see https://github.com/dotnet/corefx/blob/99211937b4f1735b7912f0a064b91ba60c8e9ca9/src/System.Private.Uri/src/System/UriHelper.cs#L183

When changing this test to else if (ch == '%') the result of Uri(@"C:\##\%20").AbsoluteUri is file:///C:/%23%23/%20.

Two Unit tests fails with this change,
System.PrivateUri.Tests.IriRelativeFileResolutionTest.IriRelativeResolution_CompareImplcitAndExplicitFileWithUnicodeAndReservedCharIriOn_AllPropertiesTheSame

System.PrivateUri.Tests.IriRelativeFileResolutionTest.IriRelativeResolution_CompareImplcitAndExplicitFileWithReservedChar_AllPropertiesTheSame

If you have any tips or hints @rmkerr it would be much appreciated 馃憤

Those tests are verifying the behavior of explicit (starting with "file://") and implicit (starting with a drive name, ie "C:\"). They're encoding the same path explicitly and implicitly, then testing each property to see if it matches between the two.

Interestingly, the test actually expects some properties to be different! I think it's worth manually checking the failed test cases to see what changed. It's possible that those tests are encoding some undesirable behavior that we actually do want to change. On the other hand, it may be that this change has unexpected side effects, and in that case we may not be able to make a fix for compat reasons.

What should be the expected output here @karelz?

Console.WriteLine(new Uri(@"C:\##\%20").AbsoluteUri); // Prints: file:///C:/%23%23/%2520
Console.WriteLine(new Uri(@"C:\脙\%20").AbsoluteUri);  // Prints: file:///C:/%C3%83/%20

Should both of them produce file://C:/.../%20, are Dos file paths supposed to be double encoded and not http/urls?

Both of them should produce %2520 at the end IMO
But I let @rmkerr to make final judgement - he's the expert on Uris ;)

@fredeil do you have any update / energy to move it forward? (I unassigned you for now)

Hi @karelz! I did try to solve this a few times.

The findings was a bit odd, often the results would print out just fine when debugging the code (walking through literally every step) and when running it without a debugger it would yield the wrong results.

Whenever I found a solution that produced the right results on every run for this particular problem something else would break.

I would still really like to solve this but I got stuck. So if you have any tips to offer me it would be greatly appreciated!

Btw loved your talk on NDC Oslo 馃憤

Glad you liked my talk :)
Sorry that the issue caused you such troubles, we will have someone taking a look hopefully sooner. Thanks!

We had troubles to reproduce the same symptoms. @fredeil can you please ping me directly on gitter, Twitter, or via email? (see my GH profile for contacts) Thanks!

To wrap up, @fredeil confirmed offline that he cannot reproduce the problem anymore. cc @ManickaP

Was this page helpful?
0 / 5 - 0 ratings