Sanic: Routing is incorrect with some special characters (e.g. dot)

Created on 21 Mar 2019  路  16Comments  路  Source: sanic-org/sanic

I want to use something like this:

@app.route('/path/to/file.<ext>')
async def generate_some_file(request, ext):
    ...

It works, but the dot is interpreted as a special regex character that represents ANY character, so there urls are working:

  • /path/to/file.txt
  • /path/to/fileQtxt
  • /path/to/file txt
  • /path/to/file%3Ftxt etc.

When I try to escape the dot:

@app.route(r'/path/to/file\.<ext>')
async def generate_some_file(request, ext):
    ...

then this route works as expected, but app.url_for('generate_some_file', ext=ext) returns broken url:

/path/to/file\.txt

Is also affects other regex characters e.g. ^, $, [, ] etc. For example,

@app.route('/path/to/fi[abcl]e.<ext>')
async def generate_some_file(request, ext):
    return response.text(app.url_for('generate_some_file', ext=ext))

When I open GET /path/to/file.txt it returns /path/to/fi[abcl]e.txt that is totally incorrect.

Note that Flask does not interpret these characters as special and it works as expected (werkzeug/routing.py uses re.escape everywhere).

I use Sanic 18.12 (because 19.3 is not yet uploaded to PyPI)

necessary

Most helpful comment

@harshanarayana it works but is too complicated for me :) /path/to/file.<ext> looks nicer

Well, we can make it much simpler by turning it into @app.route(uri="/path/to/<file:file(.*)\.(.*)>")
馃槅

All 16 comments

Thanks, @andreymal ! That is a known bug on the Router - it does not escapes special characters given by the developer prior to compiling into a regular expression. For now, it's good to have this pointed out in an issue - we are kind of discussing the future of our Router, as it started in another issue (#1318), became even a test parameter by @harshanarayana for us and we are open to discussion about the topic, in case you want to join us in our community forums :wink:

@andreymal Did you try using a custom regex instead of the default string format for the path?
@app.route(uri="/path/to/<file:file[abc]\.(txt|jpg|gif)>")?

The above will match /path/to/filea.txt /path/to/fileb.txt /path/to/filea.gif but not /path/to/filea.jfg or /path/to/files.txt or /path/to/fileabtxt

from sanic import Sanic
from sanic.response import json

app = Sanic(__name__)

@app.route(uri="/path/to/<file:file[abc]\.(txt|jpg|gif)>")
def test(request, file):
    return json({
        "test": "message",
        "file": file
    })
app.run(port="5032")

@harshanarayana it works but is too complicated for me :) /path/to/file.<ext> looks nicer

@harshanarayana it works but is too complicated for me :) /path/to/file.<ext> looks nicer

Well, we can make it much simpler by turning it into @app.route(uri="/path/to/<file:file(.*)\.(.*)>")
馃槅

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is incorrect, please respond with an update. Thank you for your contributions.

@鈥峴tale nope, <file:file[abc]\.(txt|jpg|gif)> is still too complicated :)

Yeah, well, the Router _per se_ doesn't escape special RegEx chars (like .) when building the routes, you need to do that by yourself (or else the dot will match all). We do have ongoing conversations regarding the Router (see #1318 and comments), but no decision made so far.

Once #1475 is merged, and v19.6 is out, I am turning back my attention to the Router. But, even with my hyperscan implementation, it still would likely have a problem with that since it will also be regex based. Perhaps we can think about turning off matching for a route, but I am hesitant to say this would be viable, or desirable. Now, Sanic does in theory allow you to swap out its router. I have never tried it, but it should be possible. For the 99% of use cases, I do not want to break or slow it down for this. But, I think maybe someway to opt out should be the direction.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is incorrect, please respond with an update. Thank you for your contributions.

@鈥峴tale nope

Why not simply change Sanic router so that paths are re.escaped prior to injecting dynamic elements' regex? Like @andreymal pointed out, this is how other frameworks handle it. Escaping is done during application startup and thus has no runtime performance cost. This way also the unescaped path fragments may be retained for efficient url_for implementation.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is incorrect, please respond with an update. Thank you for your contributions.

@鈥峴tale I think nope (although I haven't tested it in newer versions yet)

That should get rid of @stale.

The new version of the router continues support for regex matching:

@app.route("/path/to/<file:file\.txt>")

To handle a case like this example where you only want to capture part of the match, it will allow you to declare a single group in your regex.

@app.route("/path/to/<ext:file\.(txt)>")

That group can be named, as long as the name of the matching group is the same as the name of the parameter

# OK
@app.route("/path/to/<ext:file\.(?P<ext>txt)>")

# NOT OK
@app.route("/path/to/<ext:file\.(?P<bad>txt)>")

See also #2031.

Was this page helpful?
0 / 5 - 0 ratings