Here's an example:
from flask import Flask
app = Flask(__name__)
@app.route('/<urlvar>')
def hello_world(urlvar):
    return 'Hello World! ' + repr(urlvar)
if __name__ == '__main__':
    app.run(debug=True)
When running:
curl http://localhost:5000/aaa%2Fbbb
I would expect to see:
Hello World! u'aaa/bbb'
Instead I get:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>
I tested this against Flask 0.9 and Flask 0.10.1. Both had the same error.
While searching for existing bug reports, I came across this: https://github.com/mitsuhiko/werkzeug/issues/21
Is this related? If yes, what's the status?
In any case, WSGI spec problem or not, this behaviour is incorrect!
Ehm. I just realised that using the path type annotation would solve this. I still believe that a %2F should not be considered a path separator. The question still stands, though: is there any chance that this will be improved in the future? Or is it an issue with the WSGI spec?
You can try:
pprint.pprint(request.environ)
There you can see that 'PATH_INFO' is '/aa/a', so the path is already decoded before Flask looks at it.
I think WSGI does this.
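To illustrate why the route never sees the %2F, here is a minimal stdlib-only sketch of the decoding step a WSGI server typically performs before the application runs. The helper name build_path_info is hypothetical and only mimics that step; it is not Flask or Werkzeug code.

```python
from urllib.parse import unquote


def build_path_info(raw_target):
    """Mimic a WSGI server: drop the query string, then percent-decode the path."""
    path = raw_target.split('?', 1)[0]
    return unquote(path)


raw_target = '/aaa%2Fbbb'
path_info = build_path_info(raw_target)

# After decoding, the encoded slash is indistinguishable from a real separator,
# so a router that splits PATH_INFO on '/' sees two segments instead of one.
segments = [s for s in path_info.split('/') if s]
print(path_info)
print(segments)
```

With a rule like '/<urlvar>', which matches exactly one segment, the two segments above would explain the 404.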
I have just come across another weird thing: when I use @app.route('/<path:urlvar>') it works with the builtin server. Once deployed behind mod_wsgi it stops working, but in an unexpected way. Behind mod_wsgi this works:
curl -i "http://routing-manager/route/1.2.3.4/22"
But this does not:
curl -i "http://routing-manager/route/1.2.3.4%2F22"
Note, it's a simple REST service, retrieving information about IP routes. They are written in CIDR notation, which contains a slash. The client sends requests which are properly encoded (with %2F), so I should get this working. Worst case, I could investigate URL rewriting in Apache? But that would be an extremely ugly hack for something the server side should do properly :(
I tracked this down to the Apache directive AllowEncodedSlashes. Setting it to On worked. Setting it to NoDecode leaves the decoding to the application developer.
So, in the end, this problem seems to be twofold... On the one hand, Flask/Werkzeug handles %2f exactly like it does /. You can work around this by using the path converter. Maybe it can be solved by writing a custom converter (which I just found). I have not tested that yet, though. Using path might lead to ambiguities if you have two elements in the path which contain slashes. I don't expect this to happen often, though.
On the other hand, Apache httpd _refuses_ encoded slashes by default, returning a 404 error. This is very misleading and hard to track down.
For now, I have solved my problem with the path workaround. But I will leave this open for now, as I am curious to hear what the stance of the Flask devs is on the fact that Flask treats a %2f as a /. ;)
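The ambiguity mentioned above for the path converter is easy to reproduce without Flask. A small sketch with made-up paths, where unquote stands in for the decoding that happens before routing:

```python
from urllib.parse import unquote

# Two different requests: the encoded slash sits in a different element each time.
first = unquote('/route/a%2Fb/c')
second = unquote('/route/a/b%2Fc')

# Once decoded they are identical, so a path converter cannot tell them apart.
print(first)
print(first == second)
```

With only one slash-containing element per URL, as in the CIDR case, this does not come up.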
@exhuma The WSGI layer is on the Apache side, so the problem is caused by Apache.
I'd really like to know if it works on Nginx/uWSGI.
@giio, I just tried uWSGI + nginx. From uWSGI's log I can see that the path passed from nginx to uWSGI is GET /aaa%2Fffff, but by the time it reaches Flask, the environ contains 'PATH_INFO': '/aaa/ffff', while 'REQUEST_URI': '/aaa%2Fffff' is still correct. So I think this path decoding is done in the WSGI layer.
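Since REQUEST_URI still carries the undecoded path in that setup, one possible workaround is to split the raw value yourself. A rough sketch, assuming the server sets REQUEST_URI (it is not part of the WSGI spec, so this is server-specific); raw_segments is a hypothetical helper name:

```python
from urllib.parse import unquote


def raw_segments(environ):
    """Split the undecoded request path on literal slashes, then decode each part."""
    raw = environ.get('REQUEST_URI')
    if raw is None:
        # No raw URI available: PATH_INFO is already decoded, so just split it.
        return [s for s in environ.get('PATH_INFO', '').split('/') if s]
    path = raw.split('?', 1)[0]
    return [unquote(s) for s in path.split('/') if s]


# Values taken from the uWSGI + nginx observation above.
environ = {'PATH_INFO': '/aaa/ffff', 'REQUEST_URI': '/aaa%2Fffff'}
print(raw_segments(environ))
```

Inside a Flask view the same mapping is available as request.environ.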
I opened a pull request to werkzeug that addresses this issue.
Regardless of whether it gets merged, you can check it out to see what you need to patch:
https://github.com/mitsuhiko/werkzeug/pull/478
This is a limitation in WSGI and there is nothing I can do about that.
To get past this for now, I double URL encoded the URL.
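As a sketch of what double encoding looks like end to end, assuming the server decodes exactly one layer before the application sees the value (the function name double_encode is made up for this example):

```python
from urllib.parse import quote, unquote


def double_encode(segment):
    """Percent-encode twice; safe='' makes sure '/' is encoded as well."""
    return quote(quote(segment, safe=''), safe='')


sent = double_encode('1.2.3.4/22')
seen_by_app = unquote(sent)    # the single decode done by the server
value = unquote(seen_by_app)   # the extra decode done in the view
print(sent)
print(seen_by_app)
print(value)
```

The cost is that the view has to know about the extra layer and decode it explicitly.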
There's a guide with a few hacks here: http://www.leakon.com/archives/865
Not to add more to a long-closed issue, but for what it's worth, Passenger's WSGI implementation does not unescape URL-encoded characters before forwarding it along in PATH_INFO. This would seem to be a bug in Passenger (assuming that WSGI says that PATH_INFO should be pre-urldecoded), but I wrote up a workaround on StackOverflow to make it behave the same in that context as in Flask's local server.
If you encode once, / becomes %2F, but the parameter you get in Flask is already decoded and appears as a /, so it looks like there is an extra segment in the URL. If you encode three times in JavaScript and decode twice in Python, it works (it is a workaround, but it fixes this issue):
In JavaScript:
export const multipleEncode = (name) => {
  // The path is decoded once before Flask sees it, so a single encoding turns %2F back into /.
  // Encode three times: one layer is removed on the way in, two are removed by multiple_decode in Python.
  return encodeURIComponent(encodeURIComponent(encodeURIComponent(name)));
};
You then send multipleEncode(nameWithSlashes).
In Python:
from urllib import parse as urllib_parse
def multiple_decode(parameter, info):
    # One layer is already decoded by the time Flask passes the value in;
    # undo the remaining two here. The info argument is not used.
    if parameter is None:
        return parameter
    return urllib_parse.unquote(urllib_parse.unquote(parameter))