It looks like the error page content is currently hardcoded in util.write_error as a simple HTML page. For some scenarios this needs to be changed, e.g. an API service that should always return JSON.
Yes, custom error pages should be generated by the user app, but if an error happens before control is passed to the user app (e.g. a malformed HTTP header), the page is always generated by gunicorn.
There's one specific, probably common case that triggers error page generation by gunicorn: when gunicorn is behind nginx and a request contains a space in the GET parameters, like:
$ curl 'http://site/?a=1 &'
<html>
<head>
<title>Bad Request</title>
</head>
<body>
<h1><p>Bad Request</p></h1>
Invalid HTTP Version 'Invalid HTTP Version: '& HTTP/1.0''
</body>
</html>
So nginx passes it to gunicorn without triggering an error, and gunicorn generates an error without calling the WSGI app. Note also that the error message contains "Invalid HTTP Version" twice.
Reading the issue and the comment, I am guessing we have two different issues. The first is using a different error page when needed, and I would be OK with having a setting for it. The second is about parsing the request line. Though if I read the spec:
5.1 Request-Line
The Request-Line begins with a method token, followed by the Request-URI and the protocol version,
and ending with CRLF. The elements are separated by SP characters. No CR or LF is allowed except
in the final CRLF sequence.
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
Gunicorn is correct in its behaviour. The URI shouldn't contain any spaces. If I were to change one thing there, it would be making the message more explicit, telling the user that it failed to parse the request line. Thoughts?
For the initial problem, I think it would be good to have it in the next release.
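To illustrate how the tail of the URI ends up reported as the HTTP version, here is a minimal sketch of a request-line parser. It assumes the line is split on whitespace into at most three parts; it is an illustration only, not gunicorn's actual parser, and parse_request_line and VERSION_RE are hypothetical names.

```python
import re

# Accept only versions of the form HTTP/<digits>.<digits>.
VERSION_RE = re.compile(r"HTTP/(\d+)\.(\d+)")


def parse_request_line(line):
    """Split 'Method SP Request-URI SP HTTP-Version' into its three parts."""
    bits = line.split(None, 2)
    if len(bits) != 3:
        raise ValueError("Invalid request line: %r" % line)
    method, uri, version = bits
    if VERSION_RE.fullmatch(version) is None:
        # An unencoded space in the URI pushes the rest of it into `version`.
        raise ValueError("Invalid HTTP Version: %r" % version)
    return method, uri, version
```

With this sketch, "GET /?a=1 HTTP/1.0" parses cleanly, while "GET /?a=1 & HTTP/1.0" fails because the third part becomes '& HTTP/1.0'.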
I only gave the example with a space in a URL to show that there are not-that-impossible situations in which gunicorn generates an error page, even if the WSGI app has implemented full error handling and nginx's error pages are also set up. And yes, gunicorn's behaviour is correct (and nginx's behaviour is incorrect if the spec says that an invalid request must not be passed), and a better error message would be helpful.
For my purposes I monkey-patched util.write_error, but a supported solution would be nice.
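A rough sketch of what such a monkey-patch could look like is below. It assumes util.write_error takes (sock, status_int, reason, mesg) and that gunicorn looks the function up on the util module at call time; both are assumptions to verify against the installed version. write_json_error is a hypothetical name, and the JSON layout is only an example.

```python
import json


def write_json_error(sock, status_int, reason, mesg):
    """Write a minimal HTTP error response with a JSON body to `sock`."""
    body = json.dumps({"error": reason, "message": mesg}).encode("utf-8")
    head = (
        "HTTP/1.1 %d %s\r\n"
        "Connection: close\r\n"
        "Content-Type: application/json\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (status_int, reason, len(body))
    ).encode("latin-1")
    sock.sendall(head + body)


# Assumption: replacing the module attribute (for example from the gunicorn
# config file) is enough for the workers to pick it up:
#
#     from gunicorn import util
#     util.write_error = write_json_error
```

This remains a workaround that relies on gunicorn internals, so it may break between releases.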
+1
Any news on this? I'm also facing this issue with Django. My error pages show up when I run django manage.py runserver, but if I run my app with gunicorn, it ignores the error pages I set up and displays its own error page.
This is the only place on the internet that talks about this behaviour. It's like the rest of the world serves their error pages directly with their web server (nginx, Apache, etc.) 😃
@danidee10 Did you find a workaround for this?
I would be hesitant to change much here. The error pages that Gunicorn generates are really for protocol level failures where we can't necessarily call the WSGI app.
I wonder how Django is doing it. Is the server being more tolerant than Gunicorn but passing an invalid request to the WSGI layer of Django? Or is the server integrated in such a way as to call the custom error pages (but with what request object)?
As far as JSON responses go, the same issue occurs with nginx or really any proxy. If a client makes an invalid HTTP request across the Internet, any number of CDNs, reverse proxies, forward proxies, or other intermediaries might be involved. If one of these fails to parse and forward the request or response, it's not uncommon for the error to come back as HTML rather than JSON.
Does anyone have a good idea for a concrete change?
I'm not sure we need to do anything there. One way to allow custom pages would be to handle them with some kind of templates, where the directory can be configured so that an app can use custom templates.
My feeling is that a web server shouldn't use templates as they are meant for higher-level code (WSGI apps). What would work for me is making this function a part of the config (maybe with a different API): https://github.com/benoitc/gunicorn/blob/4d3ec28046f4d5d0b9fb8b24c1235c6e369b8837/gunicorn/util.py#L301
Generally it looks like what this comment https://github.com/benoitc/gunicorn/issues/993#issuecomment-284754742 talks about is caused by a wrong WSGI app setup. And my original issue is about very special conditions (buggy proxy) under which gunicorn generates an error page.
My feeling is that a web server shouldn't use templates as they are meant for higher-level code
👍
I am curious if Gunicorn overrides the Django error pages for all errors (I suspect not) or only for certain kinds of protocol errors handled by Gunicorn.
I think using Gunicorn as an API server that only ever serves JSON is probably common, but I don't think I want to commit specific code for that use case.
Providing a way to customize the error response somehow in a config sounds right. If there can be any proposal for this, I would review.
Providing a way to customize the error response somehow in a config sounds right. If there can be any proposal for this, I would review.
Maybe the simplest would be the ability to specify in the config a path of the WSGI app that would handle internal errors? So a config variable like:
internal_errors_rendering_path = '/render_error'
And then gunicorn, on encountering an internal error, would call the WSGI app:
/render_error?status_code=403&exc=InvalidRequestLine
This exception name was taken from https://github.com/benoitc/gunicorn/blob/f9ade3af34d2ced37cf95f749c414a995e0118aa/gunicorn/workers/base.py#L203
Calling the WSGI app this way looks like the simplest option, because otherwise, to be able to fully customize the error response, I think we'd need to abstract all the details like HTTP headers, response content, and status code. So it's sort of reinventing WSGI.
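As a rough sketch of the idea (not existing gunicorn behaviour), the server side could build a synthetic GET request for the configured path and hand it to the WSGI app. render_internal_error is a hypothetical helper, and the status_code and exc query parameters follow the example above; wsgiref.util.setup_testing_defaults from the standard library fills in the remaining environ keys.

```python
from urllib.parse import urlencode
from wsgiref.util import setup_testing_defaults


def render_internal_error(app, path, status_code, exc_name):
    """Ask the WSGI app to render an error; return (status, headers, body)."""
    environ = {
        "REQUEST_METHOD": "GET",
        "SCRIPT_NAME": "",
        "PATH_INFO": path,
        "QUERY_STRING": urlencode({"status_code": status_code, "exc": exc_name}),
    }
    # Fill in the other keys a WSGI app expects (wsgi.input, SERVER_NAME, ...).
    setup_testing_defaults(environ)

    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        # WSGI requires calling close() on the result if it is provided.
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], body
```

The server would then write the returned status, headers, and body to the client socket. A real implementation would also need a fallback for the case where the app itself fails while rendering the error.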
That's a very interesting idea!