Rocket: `Uri` issues, summarized

Created on 9 May 2019 · 6 Comments · Source: SergioBenitez/Rocket

Background

Rocket 0.4 introduced Typed URIs and a new Uri type. The uri! macro and strict Uri validation are valuable additions to Rocket's API and a great example of Rocket's promise of safety and correctness, but they have a few problems due to deficiencies and/or bugs in Rocket and in browsers.

This is a summary of the following issues and PRs:

  • [ ] #842 - URI References / Uri type does not support anchors.
  • [ ] #853 - Support for absolute Uris / generate absolute URIs from uri!().
  • [ ] #880/#882 - matching Url-Escaped Routes/match percent decoded urls
  • [x] #924/#941 - Un-encoded curly braces fail to parse/Allow more unencoded characters in query strings
  • [x] https://github.com/SergioBenitez/Rocket/issues/998#issuecomment-544169526
  • #995 - (needs some confirmation/testing) Path segments must be valid UTF-8

Bugs

  • Uri does not appear to support fragments in any URI form

    • Workaround: Servers should only deal with fragments in URIs they generate, and most of those can be done via format!.

    • Possible solution: add fragment support, but disallow it or treat it as part of the path/query in incoming URIs

  • Location uses Uri and is therefore too strict about which kinds of URIs it accepts (no relative URI, no fragment)

    • Workaround: Manually set the header as a string.

    • Possible solution: Add support for relative URIs and URIs with fragments to Location and/or Uri

  • Uri is strict about characters such as { needing to be percent-encoded

    • Several browsers send { and other characters unencoded, even when they ought to be encoded.

    • There is currently no workaround. If a user enters a { or certain other relatively innocent characters in a plain HTML form submitted with the GET method, Rocket may refuse to process the request entirely.

    • Possible solution: Allow { and some other commonly unencoded characters. Clearly document how this adheres or purposefully does not adhere to the relevant standard and why.

  • (unconfirmed) > isn't even legal in the HTTP header, according to hyper 0.10? Worth double-checking what's going on here.
  • Route matching to the request URI is sensitive to percent encoding (e.g. $ does not match %24).

    • There is currently no workaround

    • Possible solution: Follow an algorithm such as the one outlined in RFC 3986 §6 to do normalization when doing route comparison.
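To make the route-matching problem concrete, here is a minimal sketch of an encoding-insensitive path comparison. It is not Rocket's implementation, and the function names are hypothetical. It is also an assumption that full decoding is wanted: RFC 3986 §6.2.2.2 only normalizes escapes of unreserved characters, so it would not by itself make $ match %24. This sketch splits on / first and then decodes every valid escape in each segment, so an encoded slash (%2F) still stays distinct from a path separator.

```rust
/// Value of one ASCII hex digit, or `None` if `b` is not a hex digit.
fn hex_value(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Decode every valid `%XX` escape in a single path segment into its raw byte.
/// Malformed escapes (for example `%G1` or a trailing `%`) are kept verbatim.
fn decode_segment(segment: &str) -> Vec<u8> {
    let bytes = segment.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_value(bytes[i + 1]), hex_value(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    out
}

/// Compare two paths segment by segment after decoding each segment.
/// Splitting happens before decoding, so `%2F` never turns into a separator.
fn paths_equivalent(a: &str, b: &str) -> bool {
    a.split('/')
        .map(decode_segment)
        .eq(b.split('/').map(decode_segment))
}
```

With this sketch, paths_equivalent("/price/$5", "/price/%245") holds, while "/a%2Fb" and "/a/b" remain different paths. Whether decoding should happen once when the request comes in or at comparison time is exactly the open decision listed under "What next?".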

Requests

  • uri! only takes a string literal that must parse as an Origin as its mount point part.

    • Workaround: Manually paste URI together via format!.

    • Feature requested: Support any runtime-provided Uri as the mount point

    • Possible solution: Accept any Into<Uri> as a mount point. Maybe with a special case - string literals can be checked via TryInto<Uri> at compile time and raise an error if they are invalid.

  • uri! with Option query parameters behaves surprisingly (e.g. https://github.com/SergioBenitez/Rocket/issues/827#issuecomment-521585546)
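The format! workaround for a runtime mount point might look roughly like the following sketch. The helper name join_mount is hypothetical, and it assumes the path part is an already generated origin-form string. Note that nothing here validates the mount point, which is the safety that uri! would otherwise provide.

```rust
/// Join a runtime-provided mount point and an origin-form path so that
/// exactly one `/` separates them and the result starts with `/`.
/// No validation is performed on either part.
fn join_mount(mount: &str, path: &str) -> String {
    let mount = mount.trim_matches('/');
    let path = path.trim_start_matches('/');
    if mount.is_empty() {
        // A mount point of "/" (or "") contributes nothing to the result.
        format!("/{}", path)
    } else {
        format!("/{}/{}", mount, path)
    }
}
```

For example, join_mount("/api/v1", "/hello?input=a") gives "/api/v1/hello?input=a", and a mount point of "/" leaves the path unchanged.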

What next?

Recapitulating what I wrote in https://github.com/SergioBenitez/Rocket/issues/924#issuecomment-470386375:

Rocket states that it adheres to RFC 7230, which relies on RFC 3986 for many of its URI-related definitions. But strictly adhering to RFC 3986 means that some URLs sent by browsers will be rejected as invalid. WHATWG's URL living standard is intended to replace previous URL standards, but it is still a moving target.

I think the first step to take is to decide if WHATWG's URL living standard is an acceptable specification for Rocket to follow at this time. If it is not suitable, we should make careful, documented deviations from the currently followed specification(s) in the few places where browsers do something different from our expectations.

There are approximately three decisions to be made as I see it:

  • What URI/URL standard to follow - RFC 7230 or WHATWG are the only realistic standards I know of at this time
  • When and how to decode and normalize; either as URIs come in or as they are compared to routes, and which encoding tables to use
  • Extending Uri with support for relative URIs and fragments, or loosening the requirements to use Uri in places such as Location

At the end of the day, I don't want to review or merge a bunch of PRs that modify URI handling without a clear roadmap for doing so in a holistic way.


I will follow up with my own opinions on some of these choices, but I would like to get a feel for what other users of and contributors to Rocket expect as well.

feedback wanted

All 6 comments

A lot of places get this wrong. I made a comment when Uri was introduced: https://github.com/SergioBenitez/Rocket/issues/443#issuecomment-410438647.

Request Target, e.g. what appears in GET /somepath, is:

     request-target = origin-form
                    / absolute-form
                    / authority-form
                    / asterisk-form

(/ means alternation)

It is not a superset of URI, nor a subset: some Request Targets are not URIs (e.g. *), and some (http) URIs are not Request Targets (e.g. some relative URIs or anything with a fragment).
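To illustrate how the four forms differ, here is a rough classification sketch. The enum and function names are made up for this example, and the checks are deliberately simplified heuristics rather than a full RFC 7230 parser: in particular, "contains ://" stands in for proper absolute-URI parsing, and anything left over is assumed to be authority-form.

```rust
/// The four request-target forms from the grammar above.
#[derive(Debug, PartialEq)]
enum RequestTargetForm {
    Origin,
    Absolute,
    Authority,
    Asterisk,
}

/// Roughly classify a request-target string.
/// Returns `None` for inputs that cannot be a request target at all:
/// the empty string, or anything carrying a fragment.
fn classify_request_target(target: &str) -> Option<RequestTargetForm> {
    if target.is_empty() || target.contains('#') {
        return None;
    }
    Some(if target == "*" {
        RequestTargetForm::Asterisk
    } else if target.starts_with('/') {
        RequestTargetForm::Origin
    } else if target.contains("://") {
        // Simplification: a real parser would validate the scheme and hier-part.
        RequestTargetForm::Absolute
    } else {
        // Simplification: assumed to be `host:port`, as used with CONNECT.
        RequestTargetForm::Authority
    })
}
```

The point of the sketch is that * classifies as a request target even though it is not a URI, while a URI such as /page#section is rejected because of its fragment.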

For Location, the type is:

     Location = URI-reference

and URI-reference is:

     URI-reference = URI / relative-ref

     relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

     relative-part = "//" authority path-abempty
                   / path-absolute
                   / path-noscheme
                   / path-empty

Depending on how far you want to go with static safety, the types will multiply.

The OAuth v2.0 implicit grant can return the access token and other key-value pairs in the fragment of the URI. The fragment is appended to the redirect URI and the user is redirected after logging in. However, URI fragment parsing is typically done on the user agent or client side.

A more valid use case is that a server may need to send a redirect using the Location response header field with a URI that includes a fragment. The OAuth spec notes that some servers do not support fragments in the URI and that a workaround could be to return an HTML page with a button linked to the redirect URI. I am not necessarily advocating for adding URI fragment support, but I do believe there are some valid use cases like this.
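As a sketch of what building such a redirect target by hand could look like (the function names and the example URL are assumptions, not anything from Rocket or the OAuth spec), the fragment can be assembled with format! and the result set as a raw Location header string:

```rust
/// Percent-encode every byte except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &b in input.as_bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Append `key=value` pairs to `base` as a fragment, joined with `&`.
/// `base` is assumed to be a valid URI without a fragment of its own.
fn redirect_with_fragment(base: &str, params: &[(&str, &str)]) -> String {
    let fragment: Vec<String> = params
        .iter()
        .map(|(k, v)| format!("{}={}", percent_encode(k), percent_encode(v)))
        .collect();
    format!("{}#{}", base, fragment.join("&"))
}
```

Calling redirect_with_fragment("https://client.example/cb", &[("access_token", "a b"), ("token_type", "Bearer")]) yields "https://client.example/cb#access_token=a%20b&token_type=Bearer". Since the result is a plain string, it sidesteps Uri entirely, which is the cost of the workaround.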

FWIW, the issues with { and } also appear to cause problems for the pipe character |. I run a production (!) server based on Rocket that many people link to from wikis, where they copy-paste-edit URLs and often modify them with plain-text chars...
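One lenient approach, sketched here under assumptions, is to percent-encode the commonly unencoded characters before the target reaches a strict parser. The function name and the exact character set are illustrative guesses, not a list taken from Rocket or from any standard, and whether Rocket offers a place to hook this in is not covered here.

```rust
/// Percent-encode a small, assumed set of characters that browsers and
/// hand-edited links often leave raw. Everything else, including existing
/// `%XX` escapes, passes through unchanged.
fn encode_lenient(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for c in raw.chars() {
        match c {
            '{' | '}' | '|' | '^' | '`' | '"' | '<' | '>' | '\\' => {
                // All of these are ASCII, so one escape per character suffices.
                out.push_str(&format!("%{:02X}", c as u32));
            }
            _ => out.push(c),
        }
    }
    out
}
```

For instance, encode_lenient("/search?q={a|b}") produces "/search?q=%7Ba%7Cb%7D", which a strict RFC 3986 parser should accept.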

I'm attempting to integrate a web service with AWS Cognito, a hosted authn/authz provider. I redirect my clients to Cognito, the client logs in, then Cognito redirects the client with a JSON Web Token to me with the format

https://www.example.com/#id_token=123456789tokens123456789&expires_in=3600&token_type=Bearer

As you can see, Cognito uses the # fragment identifier to delineate the JSON Web Token from the path. It doesn't appear that Rocket supports this, judging from the documentation and from the discussion above.

I'm posting this comment so there is more data on the various use cases that the current URI implementation does not support. Thanks!


Cognito documentation: https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-user-pools-app-integration.html (scroll near bottom)

UPDATE

It appears that my understanding of fragments was misguided, and I should access the data from the fragment another way, perhaps through JavaScript.

See https://github.com/hyperium/hyper/issues/1621

Here's another issue; moving it here from https://github.com/SergioBenitez/Rocket/issues/827#issuecomment-521585546:

The uri! macro with Option is a bit unintuitive:

#![feature(proc_macro_hygiene)]

use rocket::{get, routes, uri};

#[get("/?<input>")]
fn hello(input: Option<String>) -> String {
//    let a = uri!(hello: None);
//    let b = uri!(hello: Some("a".to_string()));
    let c = uri!(hello: _);
    let d = uri!(hello: "a");
//    let e = uri!(hello);

    input.unwrap_or_else(|| "Hello, world!".to_string())
}

fn main() {
    rocket::ignite().mount("/", routes![hello]).launch();
}

None of the commented-out uri! invocations work.

This was discussed in IRC (https://mozilla.logbot.info/rocket/20190816) but I don't remember coming to any conclusion.

I hope this is the correct place to add these thoughts. I can open a separate issue if that would be more helpful.

As discussed in #1021 (and probably elsewhere), it's relatively common for server responses to need to include full URIs to the server's own endpoints. While the current uri! is nice, it's missing some very large parts of this puzzle: scheme, host, and port in particular. I understand that these things are not necessarily simple to provide (e.g. proxies can prevent Rocket from being able to infer them accurately at all), but it would be nice for Rocket to provide an ergonomic way of solving this problem.

I'm entirely new to Rocket and this initial spitballing is likely to be off-base, so I don't want to suggest a solution that will likely be very wrong. If it is decided that the current set-up is actually ideal and the problem is fundamentally a bit messy, then perhaps providing an example/docs for how to typically do this would be a good substitute.
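Until something like that exists, one option is to take scheme, host, and port from configuration instead of inferring them from the request. The sketch below is only an illustration of that idea: PublicBase and its fields are hypothetical names, not part of Rocket, and the origin argument is assumed to be an already generated origin-form URI string.

```rust
/// Externally visible base of the service, assumed to come from configuration
/// because a proxy may hide the real values from the server.
struct PublicBase {
    scheme: String,
    host: String,
    port: Option<u16>,
}

impl PublicBase {
    /// Prefix an origin-form URI such as `/hello?input=a` with scheme and authority.
    /// The port is omitted when unset or when it is the scheme's default.
    fn absolute(&self, origin: &str) -> String {
        let authority = match (self.scheme.as_str(), self.port) {
            (_, None) | ("http", Some(80)) | ("https", Some(443)) => self.host.clone(),
            (_, Some(port)) => format!("{}:{}", self.host, port),
        };
        format!(
            "{}://{}/{}",
            self.scheme,
            authority,
            origin.trim_start_matches('/')
        )
    }
}
```

With scheme "https", host "api.example.com", and port 8443, absolute("/hello?input=a") gives "https://api.example.com:8443/hello?input=a".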
