NOTE: GitHub issues are reserved for bug reports only. For anything else,
please join the conversation in Kong Nation https://discuss.konghq.com/.
Please read the CONTRIBUTING.md guidelines to learn on which channels you can
seek for help and ask general questions:
https://github.com/Kong/kong/blob/master/CONTRIBUTING.md#where-to-seek-for-help
SUMMARY_GOES_HERE
i configured some routes,but the true matching rule is different from the docs tells. i think docs says the matching rule is:
1) full uri match
2) prefix match
3) regex match
for example, two rules:
a: {path:'/api/account'}
b: {path:'/api/.*'}
when requesting '/api/account/getAll',it matched b.
that is to say the truth is:
1) full uri match
2) regex match
3) prefix match
i use the default kong config and db type is postgres. hope for your answer.
$ kong 0.14.0)$ kong start --vv)<KONG_PREFIX>/logs/error.log)The matching can definitely be complex and counter intuitive and will depend on the exact examples you are using. Please review the following discussions and documentation and reply with the exact routes and requests you are using.
https://discuss.konghq.com/t/a-few-words-on-regex-uris-usage-and-common-pitfalls/506
and
https://docs.konghq.com/1.0.x/proxy/#matching-priorities
Regards,
Rob
I think it is a bug. I will send a PR soon.
This is not a bug, but a known missing feature. The feature would be the equivalent of the nginx location directive's ^~ modifier, which means "don't evaluate this regex if a previous prefix was already matched. See ngx_http_core_module's location directive docs.
The order of evaluation _is_:
But in the case of multiple matches, the _priority_ order is:
Of course, the actual order of evaluation is determined by the number of matching properties. So a Route with a host _and_ a path will _always_ be matched, even if it has a prefix path while another route has a plain matching path (but no matching host, that is).
According to this:
https://docs.konghq.com/1.0.x/proxy/#evaluation-order
Prefix paths are always evaluated before regex paths.
So I guess we need to update the docs? In a meanwhile I corrected this and some other issues in path matching in #4153 (though, now I am not sure should we change it a bit?)
@thibaultcha, I don't fully understand the:
But in the case of multiple matches, the priority order is
You mean if single route has multiple paths, then we pick regex first to get the captures or something totally different?
Just a note, this very same issue has previously been discussed in #3111.
@thibaultcha you described it as:
How is it different from:
Prefix paths are always evaluated before regex paths.
This piece is correct. But regex paths are also evaluated after prefix paths _and take precedence_ if there is a match (over the previous prefix match). That is what the ^~ modifier prevents (as discussed in #4153 - thanks for digging there btw). Yes, the docs could be clarified, definitely (as stated by my year old comment in this issue).
You mean if single route has multiple paths, then we pick regex first to get the captures or something totally different?
Given a Route with a prefix path, and another Route with a regex path (and both match the incoming request path), the latter will override the former, even though the former was _evaluated_ first, and was a match. This is what the documentation describes, the order of evaluation is correctly mentioned, but we fail to mention that a regex match will override the prefix match. This is identical to nginx's behavior except that they have the ^~ modifier to avoid this, which we haven't implemented yet, and fell down in the priority list.
How is it different from:
If the order was what you wrote (regexes then prefixes), then it would never be possible to override a regex with a prefix match (i.e. shadow the regex). But with the current evaluation order (as implemented and correctly documented), it is possible to do so, given we implement the ^~ modifier. In practice for a user, the evaluation order _appears_ to be the one you described, but in the implementation, it isn't. And given we implement the modifier, it will become obvious to the user as well. In short, the evaluation order is justified by the existence of the ^~ modifier, which, since it doesn't exist yet (in Kong), makes the current evaluation order seem incorrect. But it isn't.
Now on to the next steps, and implement the ^~ modifier itself: the only difficulty I see is to translate what is an nginx directive modifier to a Kong concept: is it a Route attribute? Then is it shared by all regex paths of a Route? Is it a prefix that we somehow inject before a regex path? Does it look unintuitive, or cumbersome to specify? Etc... While nginx is configured via a text file, we configure Kong via a dynamic RESTful API, and we must port such concepts intelligently.
@thibaultcha Thank you for the full explanation on the history of how the present status came about.
But I'll play Devil's (or User's?) Advocate a bit. Given that the difference between evaluation order and matching precedence is fully opaque to a Kong user (as it should be; only the second one matters), the users are fully justified in thinking that the "evaluation order" the documentation mentions are in fact the precedence rules (especially since the precedence rules themselves are not distinguished from that), and that either the docs or code are wrong. I'm sure that if I show our proxy docs to 10 people and ask them "given route A with /foo and route B with /f.*, which one will /foobar match?", all of them will answer "A, because prefixes are evaluated first".
If the code is not changing to match their expectations, then the docs should change.
As for the prefix-not-regex functionality, I think the Nginx evaluation order has IMO a weird design that I think we shouldn't expose or replicate (I didn't dig the full history of past Nginx releases, but the ^~ thing looks very tacked-on-after-the-fact). I think the great design questions you raised in your last paragraph on how to present this cleanly in Kong also reflect how awkward the semantics of Nginx are.
In light of all this, moving forward, I believe we should just generalize the regex_priority field into a general priority field — I think this idea has already been floated around. Ideally, it should have some very simple tie-breaking rule when priority is not set explicitly, such as "regexes matched in some stable order first, then prefixes matched longest to shortest next" (which would match our current behavior). This way, a matching rule (prefix or regex) with higher priority would stop evaluation of any rules with lower priority.
Does it look unintuitive, or cumbersome to specify?
I think the explicit priority approach would be the most intuitive, but updating priorities could get cumbersome indeed if somone is ordering each route individually with 1, 2, 3... In practice though, I could see someone using priorities in groups (e.g. setting a bunch of prefix-paths all with priority 100, then some regexes with priority 200, then a fallback / path with priority 999).
This sounds flexible enough. The Nginx docs contain this example to illustrate their semantics:
Let’s illustrate the above by an example:
location = / { [ configuration A ] } location / { [ configuration B ] } location /documents/ { [ configuration C ] } location ^~ /images/ { [ configuration D ] } location ~* \.(gif|jpg|jpeg)$ { [ configuration E ] }The “/” request will match configuration A, the “/index.html” request will match configuration B, the “/documents/document.html” request will match configuration C, the “/images/1.gif” request will match configuration D, and the “/documents/1.jpg” request will match configuration E.
in our case, that would be (modulo regex case-insensitivity)
[ { "paths": ["/$"], "priority": 100, "service": A },
{ "paths": ["/images/"], "priority": 100, "service": D },
{ "paths": ["\\.(gif|jpg|jpeg)$"], "priority": 200, "service": E },
{ "paths": ["/documents/"], "priority": 200, "service": C },
{ "paths": ["/"], "priority": 200, "service": B } ]
Presentation is central for understandability, so ideally we should list routes sorted by matching priority (including the tie-breaking logic). The example above could use 5 different priority numbers, I just used 2 to illustrate the tie-breaking.
(Side note: we could even optimize any <non-regex-chars>$ regex to translate into a exact-string-match like nginx's = modifier when we build the router, without adding an extra user-visible concept.)
If the code is not changing to match their expectations, then the docs should change.
Yes, I said so in my above comment: the docs could be clarified around this use-case, definitely. The code should also change, is what I was hoping was transpiring from my above comments as well: the router is still incomplete in that regard (whichever tie-breaker attribute we end up implementing).
In light of all this, moving forward, I believe we should just generalize the regex_priority field into a general priority field — I think this idea has already been floated around.
Indeed. In December, I advocated for this field to be updated to priority prior to the 1.0 release. The bigger rationale for doing so is the recent addition of the sources and destinations matching attributes, which also need a tie-breaking attribute. Therefore, regex_priority isn't only about regexes anymore. With the release of 1.0 behind us, we have no choice but to carry backwards-compatibility code with us for a long time now.
I like the idea of relying solely on the priority attribute for tie-breaking. We have the luxury of having this option, given Kong's configuration interface, vs. that of nginx's configuration files :tada: However, doing so now may introduce breaking changes in existing routers setup in the wild. First of all, our extensive test suite may catch some of these regressions (and we'll then now to maybe not move forward). If it doesn't, I'd still consider this change risky in a 1.0 world.
I like the idea of relying solely on the priority attribute for tie-breaking
We always need _some_ criteria for tie-breaking (even if it's created_at) if we want to have a stable order (in the sense of sorting algorithms).
My risk-mitigation suggestion is to have the current behavior (regexes first, prefixes long-to-short next) to be the tie-breaker for priority so we can introduce it in Kong 1.1 with a migration and no breakage. regex_priority can be deprecated and removed in 2.0.)
By “solely”, I meant without ^~, and not by changing the evaluation order of the Route’s properties.
closing for lack of activity (and probably outdated)
Don't think this is outdated, but certainly still a missing feature. Shall we reopen but rename it maybe? There's valuable explanations in here.