It seems that aiohttp doesn't work properly when requesting URLs whose host is a fully qualified domain name (FQDN) with a trailing dot. This was fixed in urllib3 (see https://github.com/urllib3/urllib3/pull/1255) and should probably be fixed in aiohttp as well. It may be related to https://github.com/aio-libs/aiohttp/issues/3171
Expected: aiohttp works when requesting URLs with an FQDN.
Actual: aiohttp raises an SSL error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pavel/.pyenv/versions/3.7.1/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "<stdin>", line 3, in main
  File "<stdin>", line 2, in fetch
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/client.py", line 1005, in __aenter__
    self._resp = await self._coro
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/client.py", line 476, in _request
    timeout=real_timeout
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/connector.py", line 522, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/connector.py", line 854, in _create_connection
    req, traces, timeout)
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/connector.py", line 992, in _create_direct_connection
    raise last_exc
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/connector.py", line 974, in _create_direct_connection
    req=req, client_error=client_error)
  File "/home/pavel/Projects/_lab/aiohttp/lib/python3.7/site-packages/aiohttp/connector.py", line 927, in _wrap_create_connection
    req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorCertificateError: Cannot connect to host github.com.:443 ssl:True [SSLCertVerificationError: (1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'github.com.'. (_ssl.c:1051)")]
Run the following code:
import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        # Note the trailing dot: the host is written as an FQDN.
        html = await fetch(session, 'https://github.com.')
        print(html)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Python 3.7.1
Ubuntu 18.04
pip freeze
aiohttp==3.5.4
async-timeout==3.0.1
attrs==19.1.0
chardet==3.0.4
idna==2.8
multidict==4.5.2
yarl==1.3.0
Hostname mismatch, certificate is not valid for 'github.com.'
Your trusted CA chain is probably broken/invalid/misconfigured. It's not aiohttp's fault.
@webknjaz Thank you for taking the time to look at this.
If you are correct, why does the following code work correctly (with the same environment)?
import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        # Same code as before, but the host has no trailing dot.
        html = await fetch(session, 'https://github.com')
        print(html)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
Note that the only difference is https://github.com vs https://github.com. (the dot at the end of the URL).
And Python-Requests with urllib3==1.24.1 works fine too, while it doesn't work for older versions of urllib3 where this wasn't fixed, see https://github.com/urllib3/urllib3/pull/1255.
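Until this is handled in the library, one caller-side workaround is to normalize the URL before handing it to session.get(). The following is a minimal sketch using only the standard library. strip_trailing_dot is a hypothetical helper name, not part of aiohttp, and it assumes that dropping the dot for DNS resolution as well is acceptable in your environment, which is not the same as resolving the fully qualified name.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_trailing_dot(url: str) -> str:
    """Return url with a single trailing dot removed from its host, if any.

    Hypothetical helper for illustration; userinfo and port are preserved.
    """
    parts = urlsplit(url)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if hostport.startswith("["):
        # IPv6 literal such as [::1]:8080, nothing to strip.
        return url
    host, colon, port = hostport.partition(":")
    if not host.endswith("."):
        return url
    netloc = f"{userinfo}{at}{host[:-1]}{colon}{port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

With this, fetch(session, strip_trailing_dot('https://github.com.')) would request https://github.com and the certificate check would see the dot-less name.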
Ah, ok. But strictly speaking, the certificate has CN=github.com, which doesn't match github.com. (with the trailing dot).
So from the TLS point of view, everything works as expected.
Yes, it doesn't match. Where do you think this should be fixed if not in aiohttp then? BTW, check this https://github.com/haikuginger/urllib3/blob/68f3475b421f81d0e78eb0c2271d27d8b75bea05/urllib3/connection.py#L128-L144 and the discussion here https://bugs.python.org/issue31997
Yea, I saw that. So I decided to do some research with what I have on my machine.
Google Chrome:
https://github.com./, observe browser sending a request to https://github.com./
https://github.com.../, observe browser sending a request to https://github.com./
Location headers
curl:
HEAD / HTTP/1.1
Host: github.com
User-Agent: curl/7.63.0
Accept: */*
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: GitHub.com
Server: GitHub.com
< Date: Tue, 05 Mar 2019 11:00:12 GMT
Date: Tue, 05 Mar 2019 11:00:12 GMT
< Content-Type: text/html; charset=utf-8
Content-Type: text/html; charset=utf-8
< Status: 200 OK
Status: 200 OK
< Vary: X-PJAX
Vary: X-PJAX
< ETag: W/"86a3fca1a8a21cf08b73b0a956a47d89"
ETag: W/"86a3fca1a8a21cf08b73b0a956a47d89"
< Cache-Control: max-age=0, private, must-revalidate
Cache-Control: max-age=0, private, must-revalidate
< Set-Cookie: has_recent_activity=1; path=/; expires=Tue, 05 Mar 2019 12:00:12 -0000
Set-Cookie: has_recent_activity=1; path=/; expires=Tue, 05 Mar 2019 12:00:12 -0000
< Set-Cookie: _octo=GH1.1.320186990.1551783612; domain=.github.com; path=/; expires=Fri, 05 Mar 2021 11:00:12 -0000
Set-Cookie: _octo=GH1.1.320186990.1551783612; domain=.github.com; path=/; expires=Fri, 05 Mar 2021 11:00:12 -0000
< Set-Cookie: logged_in=no; domain=.github.com; path=/; expires=Sat, 05 Mar 2039 11:00:12 -0000; secure; HttpOnly
Set-Cookie: logged_in=no; domain=.github.com; path=/; expires=Sat, 05 Mar 2039 11:00:12 -0000; secure; HttpOnly
< Set-Cookie: _gh_sess=aVhCRytZY3VmRHdvWFV2aCtYZUNGUFRkL3dkTEIwRThWV2lUcC8xdlRUeW5sd3NDNTgyK3pUb1JDeGtRalVoU29TUUtsRjcwQldaVnBmcWNmWGs5TDZ3bjFGUXVDUGpESmJ0MVJYenE4L3ExejhjTVByY08xK01pU1hTRE40dExLVStBRjlWVWJYZ3RIMG9PTnJPNnhuSjQ1S1NNTzMrbmJZWkQxc3E2cU5tUml5b2psc1NlOVpBK3plQ01weTV5UTNTVU93a3oxS0V0bkQ0L2ZHRXNyUT09LS0xUHdYUjFuRDNmQ2kzSUo5dnlBV0VBPT0%3D--f23d58ffe1fe58e363c6ec7dfa00adc64287e53b; path=/; secure; HttpOnly
Set-Cookie: _gh_sess=aVhCRytZY3VmRHdvWFV2aCtYZUNGUFRkL3dkTEIwRThWV2lUcC8xdlRUeW5sd3NDNTgyK3pUb1JDeGtRalVoU29TUUtsRjcwQldaVnBmcWNmWGs5TDZ3bjFGUXVDUGpESmJ0MVJYenE4L3ExejhjTVByY08xK01pU1hTRE40dExLVStBRjlWVWJYZ3RIMG9PTnJPNnhuSjQ1S1NNTzMrbmJZWkQxc3E2cU5tUml5b2psc1NlOVpBK3plQ01weTV5UTNTVU93a3oxS0V0bkQ0L2ZHRXNyUT09LS0xUHdYUjFuRDNmQ2kzSUo5dnlBV0VBPT0%3D--f23d58ffe1fe58e363c6ec7dfa00adc64287e53b; path=/; secure; HttpOnly
< X-Request-Id: 34295258-8eb5-4678-992f-79dbab314bc0
X-Request-Id: 34295258-8eb5-4678-992f-79dbab314bc0
< Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
< X-Frame-Options: deny
X-Frame-Options: deny
< X-Content-Type-Options: nosniff
X-Content-Type-Options: nosniff
< X-XSS-Protection: 1; mode=block
X-XSS-Protection: 1; mode=block
< Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
< Expect-CT: max-age=2592000, report-uri="https://api.github.com/_private/browser/errors"
Expect-CT: max-age=2592000, report-uri="https://api.github.com/_private/browser/errors"
< Content-Security-Policy: default-src 'none'; base-uri 'self'; block-all-mixed-content; connect-src 'self' uploads.github.com www.githubstatus.com collector.githubapp.com api.github.com www.google-analytics.com github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com github-production-upload-manifest-file-7fdce7.s3.amazonaws.com github-production-user-asset-6210df.s3.amazonaws.com wss://live.github.com; font-src github.githubassets.com; form-action 'self' github.com gist.github.com; frame-ancestors 'none'; frame-src render.githubusercontent.com; img-src 'self' data: github.githubassets.com identicons.github.com collector.githubapp.com github-cloud.s3.amazonaws.com *.githubusercontent.com; manifest-src 'self'; media-src 'none'; script-src github.githubassets.com; style-src 'unsafe-inline' github.githubassets.com
Content-Security-Policy: default-src 'none'; base-uri 'self'; block-all-mixed-content; connect-src 'self' uploads.github.com www.githubstatus.com collector.githubapp.com api.github.com www.google-analytics.com github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com github-production-upload-manifest-file-7fdce7.s3.amazonaws.com github-production-user-asset-6210df.s3.amazonaws.com wss://live.github.com; font-src github.githubassets.com; form-action 'self' github.com gist.github.com; frame-ancestors 'none'; frame-src render.githubusercontent.com; img-src 'self' data: github.githubassets.com identicons.github.com collector.githubapp.com github-cloud.s3.amazonaws.com *.githubusercontent.com; manifest-src 'self'; media-src 'none'; script-src github.githubassets.com; style-src 'unsafe-inline' github.githubassets.com
< X-GitHub-Request-Id: 71BE:48F2:1117C18:1FE14AE:5C7E56BC
X-GitHub-Request-Id: 71BE:48F2:1117C18:1FE14AE:5C7E56BC
<
So it looks like there's no agreement on what clients should do, but the single-dot case is handled gracefully.
SNI note: https://tools.ietf.org/html/rfc6066#section-3
[...] The hostname is represented as a byte
string using ASCII encoding without a trailing dot. [...]
the discussion here https://bugs.python.org/issue31997
Right, this clears up who should handle the trailing dot: the application layer, according to @tiran. This seems fair.
Another excerpt:
IMO the problem should be handled in high level libraries such as urllib. urllib should use the FQDN with trailing dot for DNS resolution, then strip off the trailing dot and use the FQDN for HTTP Host header and server_hostname.
The aiohttp client should probably do manual certificate verification against the name with the trailing dot stripped. But it should still use whatever the user provided to perform the DNS resolution.
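To illustrate the idea, here is a rough sketch with plain asyncio. It is not aiohttp's connector code; tls_hostname and open_tls are hypothetical names, and stripping exactly one trailing dot is an assumption on my part. The host as given by the user is used for the connection (and therefore for DNS resolution), while the dot-less name is passed as server_hostname, which asyncio uses for SNI and for certificate hostname matching.

```python
import asyncio
import ssl


def tls_hostname(host: str) -> str:
    """Name to use for SNI and certificate matching (hypothetical helper).

    Strips a single trailing dot; anything else is returned unchanged.
    """
    return host[:-1] if host.endswith(".") else host


async def open_tls(host: str, port: int = 443):
    """Connect to the host as given, but verify TLS against the dot-less name."""
    ctx = ssl.create_default_context()
    # `host` (possibly with a trailing dot) is what gets resolved;
    # `server_hostname` is what the certificate is checked against.
    return await asyncio.open_connection(
        host, port, ssl=ctx, server_hostname=tls_hostname(host)
    )
```

Calling open_tls('github.com.') from a coroutine would then resolve the fully qualified name but validate the certificate for github.com. The Host header would need the same treatment at the HTTP layer, which this sketch does not cover.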