Kong: Comma Character in URI Regex Incorrectly Parsed

Created on 5 Aug 2017 · 5 Comments · Source: Kong/kong

Summary

Comma characters in URI regular expressions are not parsed correctly; Kong treats the comma as a separator between array elements rather than as part of the expression.

Steps To Reproduce

  1. Add an API object with a regular expression containing a comma:
$ curl localhost:8001/apis -d name=test -d upstream_url=http://httpbin.org -d uris="/foo/\d{1,3}"
{"uris":"uri with value '3}' is invalid: must be prefixed with slash"}
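
The '3}' in the error message is what one would expect if the uris value were split on every comma before each piece is validated. The following is only a minimal shell sketch of that behaviour, not Kong's actual code; naive_split is a hypothetical name used for illustration.

```shell
# Hypothetical illustration: split a uris value on every comma,
# the way a plain string split would. The function body runs in a
# subshell, so IFS and the noglob setting do not leak to the caller.
naive_split() (
  IFS=','
  set -f                      # disable globbing while word-splitting
  for part in $1; do          # unquoted on purpose: split on IFS
    printf '%s\n' "$part"
  done
)

naive_split '/foo/\d{1,3}'
# prints:
# /foo/\d{1
# 3}
```

The second piece, 3}, does not start with a slash, which matches the validation error above.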

Additional Details & Logs

  • Kong version ($ kong version)

0.11.0rc2

  • Kong debug-level startup logs ($ kong start --vv)

N/A

  • Kong error logs (<KONG_PREFIX>/logs/error.log)

N/A

  • Kong configuration (registered APIs/Plugins & configuration file)

No entities, no configuration changes.

  • Operating System

Trusty

All 5 comments

Interesting. I gave this a bit of thought when I saw it during the weekend. This is trickier than it seems at first, for both UI and implementation reasons. For one, when one tries to do the obvious thing and quote the value in the curl call as in the example above:

$ curl localhost:8001/apis -d name=test -d upstream_url=http://httpbin.org -d uris="/foo/\d{1,3}"

...the quotes are consumed by the shell and the process receives the value unquoted anyway:

$ function foo() { echo $1; }
$ foo uris=hello
uris=hello
$ foo uris="hello"
uris=hello

so even if we added support for quoted array arguments to protect the comma, the user would have to write it like this, which is awkward:

$ foo uris='"hello","world"'
uris="hello","world"

...and then we'd get to the question of how exactly to implement this, because of all of the corner cases in the grammar (e.g. how should uris="hello"hello,world parse?)

An alternative to quoting would be to require some form of escaping when using commas in regexps, and then the user would have to write something like uris=/foo/\d{1\,3} (and the implementation would still have to be smarter than Penlight's split), and we'd have to document that corner-case accordingly.
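
To make the escaping idea concrete, here is a rough shell sketch of a split that treats \, as a literal comma and any other comma as a separator. It is an assumption-laden illustration, not the fix that was adopted: split_escaped is a hypothetical name, and the sketch does not try to resolve ambiguous input such as a literal backslash directly before a separator.

```shell
# Hypothetical sketch: split on commas, but keep "\," as a literal comma.
# Other backslash sequences (such as \d) pass through unchanged.
split_escaped() (
  rest=$1
  current=
  while [ -n "$rest" ]; do
    case $rest in
      '\,'*)                          # escaped comma: keep it, drop the backslash
        current="$current,"
        rest=${rest#??}
        ;;
      ','*)                           # bare comma: end of the current element
        printf '%s\n' "$current"
        current=
        rest=${rest#?}
        ;;
      *)                              # any other character: copy it over
        ch=${rest%"${rest#?}"}
        current="$current$ch"
        rest=${rest#?}
        ;;
    esac
  done
  printf '%s\n' "$current"
)

split_escaped '/foo/\d{1\,3},/bar'
# prints:
# /foo/\d{1,3}
# /bar
```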

Usability-wise, if we only care about commas in the {n,m} case (since actual commas are percent-escaped as %2c), an alternative would be to make the parsing a bit more complicated and treat , inside {} as not an array separator for URI arrays. It's a little special-casey, but from the user's perspective, it's the approach where things "just work". With whatever solution we go with, the parsing will have to be smarter than split anyway, so that sounds reasonable to me.
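
A brace-aware split along these lines could look roughly like the shell sketch below. This only illustrates the idea (track the {} nesting depth and split on commas at depth zero); it is not the code from the eventual fix, and split_uris is a hypothetical name.

```shell
# Hypothetical sketch: split on commas, except for commas inside {...},
# so that regex quantifiers such as {1,3} stay intact.
split_uris() (
  rest=$1
  depth=0
  current=
  while [ -n "$rest" ]; do
    ch=${rest%"${rest#?}"}            # first character of the remaining input
    rest=${rest#?}
    case $ch in
      '{') depth=$((depth + 1)) ;;
      '}') if [ "$depth" -gt 0 ]; then depth=$((depth - 1)); fi ;;
    esac
    if [ "$ch" = ',' ] && [ "$depth" -eq 0 ]; then
      printf '%s\n' "$current"        # separator outside braces: emit element
      current=
    else
      current="$current$ch"
    fi
  done
  printf '%s\n' "$current"
)

split_uris '/foo/\d{1,3},/bar'
# prints:
# /foo/\d{1,3}
# /bar
```

With this approach the user writes the regex as-is and no escaping needs to be documented, at the cost of a special case in the array parser.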

Is it possible to change the way uris is accepted and do it in an RFC-compliant way? For example, in JSON accept an array like uris: ['regexed uri1', 'regexed uri2'], and in form-encoded requests do it the way arrays are done in GET params: -d uris="hello" -d uris="world".

@argentum47 Yes, that is our long-term plan, but we'll have to find an interim solution until then. We won't be changing the entire syntax of the Admin API POST arguments just a few days before a major release :)

@thibaultcha okies. that makes sense.

This is now fixed thanks to #2794! @argentum47 #1391 is apparently the first issue tracking this non-compliant way of parsing arrays. We will eventually address that!
