Hello,
I have a simple question about how logging works with gunicorn, and I can't find an explicit answer documented anywhere online (nor is it obvious to me reading through gunicorn the source code).
How does gunicorn ensure that the output from its worker processes is not intermingled? How does it guarantee that log entries generated from its workers are written to gunicorn's access/error log stream atomically? Python's standard library logging module is thread safe but not process safe, so how can we have a guarantee that gunicorn's workers' log entries will not be mangled (e.g. worker process A starts writing a log entry to stdout, OS switches to worker process B halfway through and worker process B starts writing a log entry to stdout).
Taking this question a bit further, if I log to stdout from my WSGI app using a StreamHandler, is this completely disjoint from any logging that gunicorn does? Will this have the consequence of potentially clashing with gunicorn's logging, if I'm sending that to stdout as well? If so, what is the correct way to do "application logging" from the WSGI app (several instances of which may be running at once from the gunicorn workers)?
Please forgive me if I am missing something obvious or fundamental here; this question may not be gunicorn-specific but these questions arose when I was considering how to properly set up logging with my Flask app deployed using gunicorn.
Gunicorn master(arbiter) create workers use os.fork, os.fork will share one file write offset, so this is work ok.
In the application, may be can use gunicorn on_starting hook to create logging, then application import it, but i'm not sure it work ok. As me, one process create one log with pid name, use tail -F to see all files.
I believe there is no guarantee it won't be mangled. It's not likely to be a problem on most platform unless the log message is smaller than a block, I think, if my memory of kernel I/O serves me correctly. So, likely log messages < 4KB in length can never me mangled.
If you have a need for something more and need these guarantees, you can use syslog or other logging support.
Please open any PR if you have a good way to explain more in the documentation, otherwise I think we can close this issue if my understanding is correct.
@bal2ag what said @Skycrab & @tilgovi (Thanks!) . The fd is shared between all workers, and until it isn't over the limit (depending on your system) alls logs will go over it.
Closing the issue now.
Sorry, I have some follow-up questions:
@mingzhaodotname I don't know if WathedFileHandler
is threadsafe and can work with different workers; Using USR1 provides at least a consistent behaviour across platforms.
Maybe you can test it and/or provide a branch that enable it? If you do please open a new ticket/PR so anyone can track it :)
Two interesting boundary conditions:
The emission of a message into a StreamHandler is indeed protected by a lock, so when the same StreamHandler is shared (e.g., across threads) then log messages are emitted atomically: https://github.com/python/cpython/blob/972458a4245afb4e934fbcb9c0224b105fb5d7ba/Lib/logging/__init__.py#L862
The log message and the terminator (newline byte!) are emitted in independent write()
calls: https://github.com/python/cpython/blob/972458a4245afb4e934fbcb9c0224b105fb5d7ba/Lib/logging/__init__.py#L996
We have today seen that when the web application in a single gunicorn worker process uses a normal StreamHandler for emitting log messages then those can interleave with the gunicorn's worker process log message emission, at precisely the interface between two write() call invocations. That is, for a web application log message it can happen that it's not actually newline-terminated in stdout, but directly followed by a log message emitted by the gunicorn worker process.
In this situation where there is just a single worker process and gunicorn and the web application use the same logging facility, seemingly even both use a StreamHandler -- there should be a straightforward way to prevent this from happening.
How can we make both use the same StreamHandler object?
@jgehrcke Did you solve the issue? I saw the same problem with my web application.
Sometimes the log was output like this:
2018-11-15 12:15:01,786 INFO: [x] GET 200 OK /api/threads/457456/votes 0.02s
2018-11-15 12:15:02,782 INFO: [x] GET 200 OK /api/rate/5041901/ratelogs 0.00s 2018-11-15 12:15:02,784 INFO: [x] GET 404 NOT FOUND /api/posts/5041901/votes 0.00s
2018-11-15 12:15:02,786 INFO: [x] GET 200 OK /api/threads/457486/votes 0.01s
2018-11-15 12:15:02,787 INFO: [x] GET 200 OK /api/threads/457486/votes 0.02s
I am using a rotating file handler (python 2) like the code below, in a flask app:
logger = logging.getLogger('app')
logger.setLevel(logging.DEBUG)
fh = logging.handlers.RotatingFileHandler('/home/ubuntu/app.log', maxBytes=1048576, backupCount=10)
logger.addHandler(fh)
If I run this flask app with a gunicorn configuration (reload on code update, 2 gevent workers):
gunicorn --reload --bind 0.0.0.0:8080 -k gevent --timeout 300 --graceful-timeout 300 --workers=2 wsgi:flaskapp
Then logging will work for some time (some days), after that log file will stop getting updated. This problem doesn't occur if use only one gevent worker. What is causing this error? I know that this problem will not happen with streamhandler.
@aloknayak29 you shouldn't do that inside your application. To rotate logs you should send a USR1 signal to the arbiter. As for the log level it's advised to set it for now in the configuration and reload it when needed. If you want to discuss more about this topic please open a new ticket anyway as this one has been closed.
@aloknayak29 that may be separate from the question for this thread. You may want to open another issue to discuss.
@jgehrcke very good catch with the separate write()
calls. I did not know that!
I am not sure there is a solution today other than to configure a logging handler other than StreamHandler
.
I don't know whether Gunicorn should have its own network logging to the master process. That would be a big change, but if it can be done very simply it could be okay.
Most helpful comment
Two interesting boundary conditions:
The emission of a message into a StreamHandler is indeed protected by a lock, so when the same StreamHandler is shared (e.g., across threads) then log messages are emitted atomically: https://github.com/python/cpython/blob/972458a4245afb4e934fbcb9c0224b105fb5d7ba/Lib/logging/__init__.py#L862
The log message and the terminator (newline byte!) are emitted in independent
write()
calls: https://github.com/python/cpython/blob/972458a4245afb4e934fbcb9c0224b105fb5d7ba/Lib/logging/__init__.py#L996We have today seen that when the web application in a single gunicorn worker process uses a normal StreamHandler for emitting log messages then those can interleave with the gunicorn's worker process log message emission, at precisely the interface between two write() call invocations. That is, for a web application log message it can happen that it's not actually newline-terminated in stdout, but directly followed by a log message emitted by the gunicorn worker process.
In this situation where there is just a single worker process and gunicorn and the web application use the same logging facility, seemingly even both use a StreamHandler -- there should be a straightforward way to prevent this from happening.
How can we make both use the same StreamHandler object?