Kestrelhttpserver: Handle requests that use an absolute URI as the request path

Created on 2 Mar 2016  ·  18Comments  ·  Source: aspnet/KestrelHttpServer

Kestrel returns HTTP/1.1 500 Internal Server Error when the request has a complete request URI.
Sample app - HelloWorldMvc

Program.cs

using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

namespace KestrelFullRequestUriIssue
{
    public class Program
    {
        public static void Main(string[] args)
        {
            var endpoint = new DnsEndPoint("localhost", 5000);
            Socket s = new Socket(SocketType.Stream, ProtocolType.Tcp);
            s.Connect(endpoint);
            var raw = "GET http://localhost:5000/ HTTP/1.1\r\n" +
                            "Host: localhost:5000\r\n" +
                            "Content-Length: 0\r\n" +
                            "Connection: close\r\n" +
                            "\r\n";

            var request = Encoding.ASCII.GetBytes(raw);
            Console.WriteLine(raw);
            s.Send(Encoding.ASCII.GetBytes(raw));
            byte[] buffer = new byte[1024];
            int count = 0;
            while ((count = s.Receive(buffer)) != 0)
                Console.WriteLine(Encoding.ASCII.GetChars(buffer, 0, count));
            s.Dispose();
        }
    }
}

project.json

{
  "version": "1.0.0-*",
  "description": "KestrelFullRequestUriIssue Console Application",
  "authors": [ "sajaya" ],
  "tags": [ "" ],
  "projectUrl": "",
  "licenseUrl": "",

  "compilationOptions": {
    "emitEntryPoint": true
  },

  "dependencies": {
  },

  "commands": {
    "KestrelFullRequestUriIssue": "KestrelFullRequestUriIssue"
  },

  "frameworks": {
    "dnx451": { },
    "dnxcore50": {
      "dependencies": {
        "Microsoft.CSharp": "4.0.1-beta-23516",
        "System.Collections": "4.0.11-beta-23516",
        "System.Console": "4.0.0-beta-23516",
        "System.Linq": "4.0.1-beta-23516",
        "System.Net.NetworkInformation": "4.0.0-beta-23019",
        "System.Net.Primitives": "4.0.11-beta-23516",
        "System.Private.Networking": "4.0.1-beta-23225",
        "System.Net.Sockets": "4.1.0-beta-23516",
        "System.Threading": "4.0.11-beta-23516"
      }
    }
  }
}

3 - Done

Most helpful comment

Terms:
request-target = the second field in request line, e.g. METHOD request-target HTTP/1.1
absolute-form = a request-target with the full URI, e.g. GET http://something/abc HTTP/1.1.

From RFC 7230, Section 5.3.2 "absolute-form"

To allow for transition to the absolute-form for all requests in some future version of HTTP, a server MUST accept the absolute-form in requests, even though HTTP/1.1 clients will only send them in requests to proxies.

From RFC 7230, Section 5.4 "Host"

A client MUST send a Host header field in an HTTP/1.1 request even if the request-target is in the absolute-form...

From RFC 7230, Section 5.5 "Effective request URI"

If the request-target is in absolute-form, the effective request URI is the same as the request-target.

Otherwise, a server constructs the "effective request URI" from a combination of server configuration, request-target, and headers.

What is Host and request-target differ? RFC 7230 warns that this may occur -- accidentally or maliciously -- but does not provide guidance on whether this should be an error or not. All section 5.5 says is

Once the effective request URI has been constructed, an origin server needs to decide whether or not to provide service for that URI

All 18 comments

It looks like the first request is valid (https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2 outdated RFC, but I assume this hasn't changed), but Kestrel is putting the full URL (e.g. “http://localhost:5000/” instead of “/”) in IHttpRequestFeature.Path.

This causes the following exception to be thrown from PathString when some middleware (presumably Routing) tries to access HttpContext.Request.Path:

System.ArgumentException: The path in 'value' must start with '/'.
Parameter name: value
at Microsoft.AspNetCore.Http.PathString..ctor(String value)
at Microsoft.AspNetCore.Http.Internal.DefaultHttpRequest.get_Path()

The uncaught exception then results in a 500 response.

Yeah, that format is only used with proxies which is why nobody has run into it yet.
https://tools.ietf.org/html/rfc7230#section-5.3.2. Kestrel should still allow it and the other forms mentioned in 5.3, though I don't know if we'll be supporting Connect any time soon.

So for a request for a full URL we should check if it matches one of the addresses Kestrel is listening on, otherwise we should fail. Would a 400 be appropriate for this situation?

I don't think we're ready for that level of validation. We don't check the host header either. Just ignore the scheme and host.

We've decided this should be handled after RC2, since it's an edge case. We can more adequately handle this scenario, as well as #709, once we implement #594.

Moving to Backlog. cc @muratg

Assigning @natemcmaster for further investigation. Please sync up with @cesarbs for his latest findings.

I couldn't reproduce the error with Kestrel 1.0.0 or 1.1.0. @cesarbs did your work for https://github.com/aspnet/KestrelHttpServer/issues/594 fix this?

@natemcmaster The partial fix was to set Path, PathBase and QueryString to string.Empty if the first character of the raw target wasn't '/'. In the case of this request, the first character of raw target is 'h'.

If you change the raw request in Program.cs to the following and check Path and QueryString, you'll see both are empty, so it isn't fully fixed.

var raw = "GET /test?query=string&more=stuff HTTP/1.1\r\n" +
                "Host: localhost:5000\r\n" +
                "Content-Length: 0\r\n" +
                "Connection: close\r\n" +
                "\r\n"

Ah, I see. So we fixed the server doesn't crash, but we still want to parse the request URI into the http context

I'm not sure we do. The absolute URI format is only valid in specific scenarios like proxies.

I would test what other servers like IIS, nginx and maybe node do. If they all have consistent behavior, we should match it.

Given:

GET http://localhost:5000/abc HTTP/1.1
Host: localhost:5000
Content-Length: 0
Connection: close

IIS logs: cs-uri-stem = /abc, cs-host = localhost
IIS with classic ASP.NET. Request["URL"] == "/abc"
nginx: $request = "https://localhost:5000/abc"
node: req.url = "https://localhost:5000/abc"

Given:

GET /abc HTTP/1.1
Host: localhost:5000
Content-Length: 0
Connection: close

IIS logs: cs-uri-stem = /abc, cs-host = localhost
IIS with classic ASP.NET. Request["URL"] == "/abc"
nginx: $request = "/abc"
node: req.url = "/abc"

What if the request line and host header (or scheme) disagree?

Sending this request to http://localhost:3000

GET https://wronghost:4000/abc HTTP/1.1
Host: differenthost:5000

IIS:
Request.Url = http://differenthost/abc
Logging: cs-uri-stem = /abc, cs-host = wronghost:4000

nginx:
$request = https://wronghost:4000/abc

node:
message.url = https://wronghost:4000/abc. This will always match the request URI. See https://nodejs.org/api/http.html#http_message_url

Ok. Can you summarize the relevant sections of the HTTP spec?

Terms:
request-target = the second field in request line, e.g. METHOD request-target HTTP/1.1
absolute-form = a request-target with the full URI, e.g. GET http://something/abc HTTP/1.1.

From RFC 7230, Section 5.3.2 "absolute-form"

To allow for transition to the absolute-form for all requests in some future version of HTTP, a server MUST accept the absolute-form in requests, even though HTTP/1.1 clients will only send them in requests to proxies.

From RFC 7230, Section 5.4 "Host"

A client MUST send a Host header field in an HTTP/1.1 request even if the request-target is in the absolute-form...

From RFC 7230, Section 5.5 "Effective request URI"

If the request-target is in absolute-form, the effective request URI is the same as the request-target.

Otherwise, a server constructs the "effective request URI" from a combination of server configuration, request-target, and headers.

What is Host and request-target differ? RFC 7230 warns that this may occur -- accidentally or maliciously -- but does not provide guidance on whether this should be an error or not. All section 5.5 says is

Once the effective request URI has been constructed, an origin server needs to decide whether or not to provide service for that URI

Given this, I think Kestrel should continue to accept these requests, but we should start pulling the path out of the absolute URI instead of falling back to Path = string.Empty. This implies discarding the scheme and authority sections from the absolute URI, and preferring the value in the "host" header as we do now.

Was this page helpful?
0 / 5 - 0 ratings