I'm running the app with:
gunicorn -w 2 -b 'localhost:8585' --timeout=200 --certfile=crt.crt --keyfile=key.key service:app
I get the following error, but not on every request: most requests are handled correctly, and the error only occurs occasionally.
[2018-05-08 14:53:36 +0500] [11227] [ERROR] Socket error processing request.
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/gunicorn/workers/sync.py", line 134, in handle
req = six.next(parser)
File "/usr/lib/python3/dist-packages/gunicorn/http/parser.py", line 41, in __next__
self.mesg = self.mesg_class(self.cfg, self.unreader, self.req_count)
File "/usr/lib/python3/dist-packages/gunicorn/http/message.py", line 153, in __init__
super(Request, self).__init__(cfg, unreader)
File "/usr/lib/python3/dist-packages/gunicorn/http/message.py", line 53, in __init__
unused = self.parse(self.unreader)
File "/usr/lib/python3/dist-packages/gunicorn/http/message.py", line 165, in parse
self.get_data(unreader, buf, stop=True)
File "/usr/lib/python3/dist-packages/gunicorn/http/message.py", line 156, in get_data
data = unreader.read()
File "/usr/lib/python3/dist-packages/gunicorn/http/unreader.py", line 38, in read
d = self.chunk()
File "/usr/lib/python3/dist-packages/gunicorn/http/unreader.py", line 65, in chunk
return self.sock.recv(self.mxchunk)
File "/usr/lib/python3.5/ssl.py", line 922, in recv
return self.read(buflen)
File "/usr/lib/python3.5/ssl.py", line 799, in read
return self._sslobj.read(len, buffer)
File "/usr/lib/python3.5/ssl.py", line 585, in read
v = self._sslobj.read(len)
OSError: [Errno 0] Error
From my memory, this error happens when a client tries to connect without SSL. Could that be the case for you?
I see your post on the other issue that I closed. My apologies if my comment is not the cause.
Is there a pattern to which requests fail this way?
@usmetanina what kind of clients connect to Gunicorn? Do you use any explicit SSL options to connect to it?
Has this been solved already, @usmetanina? I have exactly the same issue.
@benoitc I see @usmetanina's exact error frequently using python3.6 and gunicorn 19.9.0.
I use the following command to start gunicorn with a Flask app running within a Docker container.
gunicorn --workers=3 --bind=0.0.0.0:8000 --config=gunicorn_config.py --preload main
The config file looks like this (domain-with-cert.com of course is a placeholder for the actual domain name):
workers = 3
bind = '0.0.0.0:443'
certfile = '/etc/letsencrypt/live/domain-with-cert.com/fullchain.pem'
keyfile = '/etc/letsencrypt/live/domain-with-cert.com/privkey.pem'
Any thoughts on debugging this would be helpful. If you need further info, just let me know.
@willpatera, see my comment:
From my memory, this error happens when a client tries to connect without SSL. Could that be the case for you?
@tilgovi I saw the above comment. I am pretty sure that the client is connecting over SSL. Any debugging suggestions?
@willpatera I would say, turn on the access logs and see if you can determine which request causes the issue. If you have a reverse proxy in front of gunicorn make sure it has access logs so you can maybe see which request causes an error with gunicorn even if gunicorn never logs it.
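For anyone following along, here is a minimal sketch of what turning on access logs could look like in a Gunicorn config file such as the `gunicorn_config.py` mentioned above. `accesslog`, `errorlog`, `loglevel` and `access_log_format` are standard Gunicorn settings; the particular format string is only an illustrative choice, not something taken from this thread.

```python
# gunicorn_config.py (sketch; merge with your existing settings)

# "-" sends the access log to stdout, which is convenient inside Docker.
accesslog = "-"

# "-" sends the error log (where the OSError traceback shows up) to stderr.
errorlog = "-"
loglevel = "debug"

# %(h)s remote address, %(t)s request time stamp, %(r)s request line,
# %(s)s status, %(b)s response length, %(L)s duration in decimal seconds,
# %(a)s user agent.
access_log_format = '%(h)s %(t)s "%(r)s" %(s)s %(b)s %(L)s "%(a)s"'
```

Since the error is raised while the request is still being read, a failing connection may never produce an access log line at all, so it is probably more useful to compare error timestamps against the proxy's logs and against the successful requests around them.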
@tilgovi I am having the same issues. Had to edit the following information a bit as it was incorrect:
The request that is being made to gunicorn is always the exact same request (but with a different body). So there is no doubt that it is https and not http.
What I do notice is that it always happens when the number of requests is going up. When the server is busy it seems to have trouble handling the requests properly.
Maybe this has to do with the workers or something like that? If you have any configuration suggestions I would gladly test them.
Hi guys, I am still looking for a way to solve this. Currently the only option we have is to downgrade to plain HTTP, which is not feasible at all.
I've witnessed the same thing. Had a production server running Gunicorn + Flask (behind a load balancer) that worked fine for months, then suddenly every request yielded this error until I restarted Gunicorn:
[2019-11-21 07:27:36 +0000] [24245] [ERROR] Socket error processing request.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/sync.py", line 134, in handle
req = six.next(parser)
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/parser.py", line 41, in __next__
self.mesg = self.mesg_class(self.cfg, self.unreader, self.req_count)
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/message.py", line 181, in __init__
super(Request, self).__init__(cfg, unreader)
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/message.py", line 54, in __init__
unused = self.parse(self.unreader)
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/message.py", line 193, in parse
self.get_data(unreader, buf, stop=True)
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/message.py", line 184, in get_data
data = unreader.read()
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/unreader.py", line 38, in read
d = self.chunk()
File "/usr/local/lib/python3.6/dist-packages/gunicorn/http/unreader.py", line 65, in chunk
return self.sock.recv(self.mxchunk)
File "/usr/lib/python3.6/ssl.py", line 997, in recv
return self.read(buflen)
File "/usr/lib/python3.6/ssl.py", line 874, in read
return self._sslobj.read(len, buffer)
File "/usr/lib/python3.6/ssl.py", line 633, in read
v = self._sslobj.read(len)
OSError: [Errno 0] Error
Nothing in the logs preceding these errors hints at what the trigger could've been.
This was with Gunicorn 19.9.0 running with 3 workers on a single-cored server.
Since this is the first time I've seen this issue, I can't promise I'll ever reproduce it. However, if there's any kind of logging or other diagnostic code anyone would like me to add in on our server that might provide some useful information in the event that this happens again, I'm all ears.
Does your LB call a specific endpoint? How does it answer to the LB request?
When I said "Load Balancer", I really ought to have said CDN or caching layer. Specifically: it's Amazon Cloudfront. It just forwards on requests to our Gunicorn server (running on an EC2 instance) and caches the results for a while.
Hmm, shouldn't Amazon CloudFront terminate the SSL request for you, @ExplodingCabbage? Why does gunicorn have to listen on SSL behind it?
@benoitc So, there's two layers with SSL involved in the architecture. Members of the public connect to our website via our CloudFront domain over HTTPS, and then CloudFront makes a request to our backend node running Gunicorn, also using HTTPS (with a different domain name and cert), caches the result, and serves it to the public.
I guess maybe you're wondering what the point of using SSL for that second, internal request is? It's certainly arguable it's pointless (although possibly not - it stops Amazon snooping on our comms in their internal network, and there are also regulatory reasons I won't go into why, given my company's industry, we might need to ensure we've got encryption all the way along the pipeline). Whether pointless or not, we do it. ¯\_(ツ)_/¯
Could it be that CloudFront is sending a plain HTTP request to your endpoint, though? If you have access to the CloudFront logs you should be able to see it.
@benoitc I don't think CloudFront exposes any logs that would be useful, but I'm sure that it wasn't trying to connect over HTTP, since:
@ExplodingCabbage ok, I will have a look at it after 20.0.1 is out. One last thing: which version of Python are you using?
3.6.8
I realise I left out a detail from my story above: before restarting Gunicorn, I also updated the SSL certificate Gunicorn uses with LetsEncrypt. I hadn't thought to mention this because I had wrongly concluded yesterday that there was no way that a certificate would've expired on the day that the errors began and that the certificate update had not in fact been relevant to fixing the problem.
However, from checking some logs, I now realise that the errors in fact began on the day that a previous certificate was due to expire.
There's still some mystery here, and some potential room for improvement (what exactly does this error signify, and why can't Gunicorn give a more useful message?), but the narrative I gave before - in which this error started out of the blue with no apparent cause - isn't right. I'd guess that CloudFront was terminating the connection in response to seeing an expired certificate from the Gunicorn server, and that Gunicorn, rather than being able to understand that and report it meaningfully, lets a messageless OSError bubble up.
I apologise for not having my ducks in a row before reporting. On the other hand, perhaps this'll make it easier to reproduce this exception at will if you want to try and handle the scenario more elegantly.
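If an expired certificate is indeed the trigger, a small check run from cron could catch it before clients start dropping connections. The sketch below is an assumption-laden illustration, not something used in this thread: `days_until_expiry` and `check_backend_cert` are hypothetical helper names, and the check connects the way a verifying client would, so an already expired certificate surfaces as a handshake failure rather than a negative number.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Return the days left before a certificate's notAfter time.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT". A negative result means expired.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400.0


def check_backend_cert(host, port=443, timeout=5.0):
    """Connect like a verifying client and report the days remaining.

    If the certificate has already expired, the handshake itself fails
    with ssl.SSLError (ssl.SSLCertVerificationError on Python 3.7+),
    which is left to propagate to the caller.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return days_until_expiry(tls.getpeercert()["notAfter"])
```

Calling `check_backend_cert("domain-with-cert.com")` (placeholder domain) and alerting when the result drops below some threshold would be one way to use it.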
@ExplodingCabbage oh, that's quite interesting; it should be reproducible at some point then. Thanks for the additional details!
I have just run reproducibly into the same problem and I'm somewhat confident it's the consequence of some kind of resource exhaustion.
For me it was triggered by forgetting a timeout on a blocking call and requests piling up.
HTH
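The comment above does not say which blocking call was missing its timeout, so the following is only a generic sketch of the idea using a plain socket. `fetch_with_timeout` is a hypothetical helper; the point is simply that a worker gives up after a bounded time instead of holding its slot while requests pile up.

```python
import socket


def fetch_with_timeout(host, port, payload, timeout=5.0):
    """Send `payload` and read one reply, giving up after `timeout` seconds.

    Returns the reply bytes, or None if connecting, sending or receiving
    did not complete in time.
    """
    try:
        # The timeout given here stays set on the socket, so it also
        # bounds the sendall() and recv() calls below.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(payload)
            return sock.recv(4096)
    except socket.timeout:
        return None
```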
Hello! I'm experiencing this exact issue. I have a gunicorn/flask service running on an ECS cluster behind a network load balancer. Some version specifics:
python - 3.7.4
gunicorn - 19.9.0
flask - 1.0.4
The service is able to respond to requests coming from a client using TLS without issue; however, my logs are flooded with OSErrors. As far as I can tell, these result from the health check requests coming from the load balancer (TCP).
I was able to reproduce the error locally by opening and closing a TCP connection manually on the listening port (8000 in this case):
$ nc -vz 127.0.0.1 8000
localhost [127.0.0.1] 8000 (irdmi) open
Which resulted in the following error being thrown:
Traceback (most recent call last):
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/workers/sync.py" line 134 in handle
req = six.next(parser)
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/parser.py" line 41 in __next__
self.mesg = self.mesg_class(self.cfg, self.unreader, self.req_count)
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/message.py" line 181 in __init__
super(Request, self).__init__(cfg, unreader)
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/message.py" line 54 in __init__
unused = self.parse(self.unreader)
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/message.py" line 193 in parse
self.get_data(unreader, buf, stop=True)
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/message.py" line 184 in get_data
data = unreader.read()
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/unreader.py" line 38 in read
d = self.chunk()
File "/nix/store/nh3v0c2nipihwblkdn0mh2kqyv3jq9nz-python3-3.7.4-env/lib/python3.7/site-packages/gunicorn/http/unreader.py" line 65 in chunk
return self.sock.recv(self.mxchunk)
File "/nix/store/azwzsm1pkbzjxpkiq88w68p4jdghgasl-python3-3.7.4/lib/python3.7/ssl.py" line 1056 in recv
return self.read(buflen)
File "/nix/store/azwzsm1pkbzjxpkiq88w68p4jdghgasl-python3-3.7.4/lib/python3.7/ssl.py" line 931 in read
return self._sslobj.read(len)
OSError: [Errno 0] Error
Hope this helps!
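For environments without `nc`, roughly the same probe can be sketched in Python: open a TCP connection to the TLS port and close it again without ever starting a handshake, which is what a plain TCP health check does. `open_and_close` is a hypothetical helper name, and the default port 8000 just mirrors the example above.

```python
import socket


def open_and_close(host="127.0.0.1", port=8000, timeout=2.0):
    """Open a TCP connection and close it without sending any bytes.

    Returns True if the port accepted the connection, False otherwise.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    # Closing right away means the server never sees a TLS ClientHello.
    sock.close()
    return True
```

Running `open_and_close()` against a Gunicorn instance started with `--certfile`/`--keyfile` should, if the analysis above holds, produce the same traceback in the error log.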
Hmm, well after some additional research it seems that this may actually be a bug in the way the python ssl library handles ragged EOFs on linux: https://bugs.python.org/issue31122
As mentioned by @shevisjohnson, this error appears if you execute "nc -vz hostname port_no".
We can suppress this error in the log file with the logging configuration below.
$ cat logging_config.yml
version: 1
formatters:
  simple:
    format: " %(asctime)s || %(name)s || %(levelname)s || %(message)s"
  test_api:
    format: "[%(asctime)s] [%(process)s] [%(levelname)s] %(message)s"
handlers:
  console:
    class: logging.StreamHandler
    level: DEBUG
    formatter: simple
    stream: ext://sys.stdout
  test_api_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: DEBUG
    formatter: test_api
    filename: logs/test.log
    maxBytes: 2000000000
    backupCount: 1
    encoding: utf8
loggers:
  test_api:
    level: DEBUG
    handlers: [test_api_file_handler]
    propagate: 0
root:
  level: DEBUG
  handlers: [console]
Here is the python file.
import logging
import logging.config  # dictConfig lives in this submodule; import it explicitly

import yaml
from flask import Flask

app = Flask(__name__)


def logSetter(logger_name: str) -> logging.Logger:
    # Load the YAML logging configuration and apply it.
    with open("logging_config.yml", "r") as f:
        config = yaml.safe_load(f)
    logging.config.dictConfig(config)
    logger = logging.getLogger(logger_name)
    return logger


logger = logSetter(logger_name="test_api")


@app.route("/api/test")
def hello():
    app.logger.info("hey from api")
    return "Hello from Python!"
Hope it helps.
It only took a moment to come up with a reliable reproduction: using hey to send 100 concurrent requests to latest Gunicorn (20.0.4) using the gthread worker:
$ hey -n 100 -c 100 https://127.0.0.1:8000
```
$ gunicorn app:app -k gthread --certfile=... --keyfile=...
...
[2020-07-11 19:10:58 +0000] [3628247] [ERROR] Socket error processing request.
Traceback (most recent call last):
return self._sslobj.read(len)
OSError: [Errno 0] Error
```
Using a Debian 9 / Linux 4.14.67 based environment.
The WSGI app to reproduce need not be anything beyond:
```python
# app.py
def app(environ, start_response):
    start_response("200 OK", [])
    return ""
```
In case this helps too!
If the root cause is in fact https://bugs.python.org/issue31122:
This is affecting my organization in prod as well.
I noticed that the bugfix landed in 3.8 and 3.9 branches, but they're considering <= 3.7 EOL and we're still kinda stuck on 3.6 for the time being. Is there a known workaround to this issue at this time in gunicorn itself? Is there anything planned?
We're looking into what could be calling the service so much to trigger this, but I'm just trying to figure out what could be done, as this results in huge resource spikes on the affected nodes.
In addition to jriddy's comment regarding no intention to backport prior to 3.8, if anyone else is having this issue, also note that the fix is set to be included in CPython 3.8.6.
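To decide at runtime whether a workaround is still needed, a version check along these lines might do. This is a sketch only: `has_ragged_eof_fix` is a hypothetical name, and the 3.8.6 threshold is taken from the comments above rather than verified independently.

```python
import sys


def has_ragged_eof_fix(version_info=sys.version_info):
    """Return True if the interpreter is at or past CPython 3.8.6.

    The threshold comes from the discussion in this thread; older
    interpreters (3.6, 3.7) are assumed not to carry the fix.
    """
    return tuple(version_info[:3]) >= (3, 8, 6)
```

For example, a deployment could install a log filter for the noisy traceback only when `has_ragged_eof_fix()` returns False.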
Having trouble telling exactly where this traceback emanates from - in my case, using gevent as WSGI app server directly, so assuming it's a logging call somewhere within gevent/greenlet, but can't find it as of yet. For Gunicorn, it happens here, for synchronous workers:
In the Gunicorn case, if you're just concerned about noise in logs, might be able to do something such as:
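import logging

class HandshakeFilter(logging.Filter):
    # example: https://docs.python.org/3/howto/logging-cookbook.html
    # I have not tested this
    def filter(self, record):
        # Returning False drops the record, so keep everything except the noisy message.
        return "socket error processing request" not in record.getMessage().casefold()

# Gunicorn logs errors on the "gunicorn.error" logger; a filter attached to the
# parent "gunicorn" logger is not applied to records logged on the child.
logging.getLogger("gunicorn.error").addFilter(HandshakeFilter())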
Related gevent issue: https://github.com/gevent/gevent/issues/1671