Describe the bug
The default port for a Sanic app is 8000, which by HTTP standards is a custom port. This port can also be set during calls to app.run().
When using app.url_for('handler.name', _external=True), the resultant URL does not contain the port without manual intervention.
This line is the final step before the application returns the URL. The URL is build using urlunparse which takes a tuple of values as would be producted by urlparse. The snippet below shows that only the first 6 arguments are provided, which does not include the port (unless included as part of netloc).
https://github.com/huge-success/sanic/blob/905c51bef03818c629173938a1183b38dd0ebc65/sanic/app.py#L695
In order to get port, the value of app.confg.SERVER_NAME must be set to the hostname and port
app.config.SERVER_NAME = '{}:{}'.format(host, port)
app.run(host=host, port=port)
Thus, the default behavior for creating the URL is incorrect as it requires manual attribute setting for every single application.
Expected behavior
Calls to app.url_for() with the _external flag should build the correct URL, including the port number, such as http://hostname:8000 without forcing every application to specify the SERVER_NAME config attribute.
Environment (please complete the following information):
Thinking on this more last night, I can see the use case for things like Cloud Foundry deploys where the internal port being served would be obfuscated by the cloud instance, so the application should report :80 or :443, even if its running at :8000.
Perhaps a flag then to enable port inclusion?
Port 80 is the default port for HTTP and 443 is default for https, if the external URL is serving something like 'http://www.example.com/blah' then there's no real reason for port inclusion - however in every other case where the port is non-standard, I agree with your assessment that it should be included in the url. Possibly just inclusion of the port for the external uri should be enough. I'm sure one of the better devs will weigh in on it.
Sorry, when I said "should report :80" I meant that the application would need to generate URLs as if it were on that port (despite running on 8000), meaning the behavior as it stands right now. But this is also just for the case where there is an obfuscation layer or load balancer between callers and the application.
@eric-spitler I think I am understanding now ... You are asking for a setting for a "base URL" that would be used to create an absolute URL that is different than what is being exposed.
For example, I have a sanic instance running as http://auth-service:7777/, but there is a load balancer ahead of it that proxies http://mydomain.com/auth to http://auth-service:7777/. Is this the use case?
Consider the default use case, a simple app serving an endpoint that returns the result of app.url_for:
from sanic.app import Sanic
from sanic.response import text
app = Sanic()
app.config.SERVER_NAME = 'localhost'
@app.route('/test', name='test', methods=['GET'])
def make_url(request):
return text(request.app.url_for('test', _external=True))
app.run()
The application starts up on 127.0.0.1 (default host) and port 8000 (also default). Hitting that simple endpoint returns:
> curl http://127.0.0.1:8000/test
http://localhost/test
Butting hitting the URL returned by app.url_for results in:
> curl http://127.0.0.1/test
curl: (7) Failed connect to 127.0.0.1:80; Connection refused
My point in this issue is that the URL generated by app.url_for does not by its nature return the operating port - it requires setting the following in order to return the actual URL:
app.SERVER_NAME = 'localhost:8000'
The more complex use case is where the application is running behind another interface, such as Cloud Foundry or AWS load balancers, in which case the developer must set app.SERVER_NAME anyway to return the outward-visible hostname. In these cases, the internal port doesn't matter, so calls to app.url_for will commonly need to produce a URL without a port - the load balance should be serving it at 80/443. Considering that, the default behavior in Sanic right now seems as though it is intended to be behind another interface since it omits the port by default. However, many uses of Sanic will not be done this way, they will serve on a specific port and the generated URL should reflect that by default.
As it stands, the developer must always set the following, but the documentation does not make that immediately apparent.
app.SERVER_NAME = '{}:{}'.format(hostname, port)
app.run(host=hostname, port=port)
I propose 2 possible updates:
app.url_for with _external=True requires app.SERVER_NAME to include the host and portapp.url_for to have a include_port=True flag that will add the operating port to the generated external URLsThe second bullet will support both the simple and complex use cases I spoke of. Default the value to True such that the simple example I provided works as-is. Setting it to False would be for the complex case where it is generating a URL that should never have the port because it is being another interface.
Hopefully that makes more sense? Sorry if I was unclear before.
Thanks for the explanation @eric-spitler.
I certainly feel a documentation update is in order here. That makes sense.
As for the flag, I am not opposed to having an include_port=True parameter. But, I would default it to False so that the behavior does not suddenly change for users. Yes, that means the "more complex" use case would be default, which seems sort of backwards, but I feel keeping backwards compatibility on this is a bigger priority. Otherwise unsuspecting developers would all of a sudden start getting ports unexpectedly on their URLs if they upgraded without reading the changelog.
I suggest not use include_port as parameter name due to current special parameters all starts with _ (may use _port for it, and default to None, and also can get the value from configuration like SERVER_NAME does).
Then in url_for function, check _port, if has value, show it, else don't show port in the url.
@lixxu I don't understand your argument; _var indicates semiprivate which sanic honors - for more info.
@lixxu I think the point @eric-spitler was making is that it should be configurable, and transparent.
@lixxu If it pulls from configuration, would that config be set implicitly during the call to app.run()?
Having to specify _port as a number as part of the url_for call defeats the transparency idea that @ahopkins mentioned.
SERVER_NAME is from configuration which is _server in url_for parameters, I mean that (e.g.) use _port in url_for parameters, then SERVER_PORT in configuration file.
In that case, I think that having a SERVER_PORT in the configuration, like @lixxu have mentioned, would be the best approach since we have this approach for the SERVER_NAME:
One problem I see is that urllib.parse.urlunparse doesn't have a "port" value to be set, so we'll have to append it to the netloc parameter, knowing that this function does not ignore default port for protocols鹿 (scheme):
>>> from urllib.parse import urlunparse
>>> urlunparse(("https", "localhost", "/hello/world", "", "", ""))
'https://localhost/hello/world'
>>> urlunparse(("https", "localhost:443", "/hello/world", "", "", ""))
'https://localhost:443/hello/world'
>>> urlunparse(("http", "localhost:80", "/hello/world", "", "", ""))
'http://localhost:80/hello/world'
>>> urlunparse(("http", "localhost", "/hello/world", "", "", ""))
'http://localhost/hello/world'
So, the thing is ...
SERVER_NAME already?80 for http and 443 for https鹿?[1] I don't think that any URL builder would strip the port value if one is given.
I created a PR (#1406) for it.
This is sorted out by #1638, which gets hostname and port always from the same location (proxy headers or if not found, Host header), and thus proper URLs may be generated via request.url_for which, contrary to app.url_for, has access to this data. No fallback to host/port specified at run, or SSL server name identifier / socket port, is needed because Host header is available in all normal circumstances, and it contains this information.
app.url_for, which might be used at layout generation time or otherwise while the app is not yet running, can only rely on config.SERVER_NAME, which should contain the full URL to application root including any non-standard port number.
@eric-spitler Do you believe that the current approach is sufficient?
One shortcoming of this is that in order to use app.url_for, one must manually set app.config.SERVER_NAME = "http://localhost:8000/" to get external URLs with the default development setup, but we cannot give a default value for SERVER_NAME either.
Since app.url_for is likely used before app.run is called, any host/port passed to run is not available to it, and I cannot readily see any easy solution to this.
That PR resolves this case for situations where the URL is being assembled from with a request context. However, it does not resolve it in other cases, such as another thread that is processing ongoing tasks unrelated to incoming requests. The latter use case is likely uncommon so this PR resolves a large percentage of these situations.
So long as the request.url_for usage properly retains the outward facing hostname/IP (such as when behind an AWS VPC), then that will always be a better way to generate the URL anyway.
I personally am okay with ensuring that the hostname:port combination is set (and already apply this automatically in my service template I use), but the documentation update would make this clear.