Description
Using peewee with fastapi - everything works as expected initially but after a while I get connection errors:
peewee.InterfaceError: connection already closed
I thought I had handled this with my middleware - which is inline with how other frameworks manage peewee:
@app.middleware("http")
async def manage_db(request: Request, call_next):
try:
db = get_db()
db.connect()
response = await call_next(request)
finally:
if not db.is_closed():
db.close()
return response
I'm unsure why I'm still getting these errors - does anyone have experiences with peewee or any insight on why this is happening?
@eliehamouche It could be due to how the get_db function is implemented -- if connection pooling is being used behind the scenes, you need to make sure that you aren't reusing a connection across multiple requests that will be closed.
Typically, though, you wouldn't want to be opening and closing a database connection on every request -- you want to open a database connection (pool) on app startup, and then associate a connection to each request in the middleware, and only disconnect on app shutdown.
When using sqlalchemy, this more-or-less amounts to creating a session for each request. With encode/databases, the association of a connection to each request is handled transparently using async context variables. I'm not familiar enough with peewee to know how it should look with that library, but there might (should?) be a way to "commit" changes without disconnecting; hopefully there is a way to access different "sessions" for different requests as well. (The logic there is likely to be specific to peewee, so I can't comment on what it would look like.)
Either way, you should make sure that there aren't issues where concurrent requests are making use of the same connection (possibly causing transaction weirdness, etc.).
That's the general approach I had before implementing the middleware above to try and fix the issue, connect on startup, teardown on exit, but I'd get the not connected error every once in a while.
Peewee recommends opening and closing the connection on every request. Their example from the docs for flask:
from flask import Flask
from peewee import *
database = SqliteDatabase('my_app.db')
app = Flask(__name__)
# This hook ensures that a connection is opened to handle any queries
# generated by the request.
@app.before_request
def _db_connect():
database.connect()
# This hook ensures that the connection is closed when we've finished
# processing the request.
@app.teardown_request
def _db_close(exc):
if not database.is_closed():
database.close()
The get_db() method is just instantiating a database object - variables come from the environment:
def get_db():
db = PostgresqlExtDatabase(db_name, user=user, password=password,
host=host, port=port, autocommit=True,
autorollback=True)
return db
Quite baffled by this - it's tricky to look into because when I restart everything works fine
Okay, if that’s the intended usage I suppose it makes sense.
Do you know if peewee is doing connection pooling behind the scenes? If so, I could still imagine async being the source of problems in such a way that it might work if everything was synchronous and only one request was handled at a time.
Do you know if the behavior could be due to a failed database request resulting in an automated disconnect?
One way to try to diagnose it would be to implement a locking mechanism in the middleware that prevents multiple requests from being handled concurrently. This obviously wouldn’t be good for production use, but if you can’t reproduce the error with it on it would be a good indication that it was due to async; in the other case it would indicate the problem was something else.
Ah ha! I found the following text in the peewee docs:
Thread Safety
Peewee keeps track of the connection state using thread-local storage, making the Peewee Database object safe to use with multiple threads. Each thread will have it’s own connection, and as a result any given thread will only have a single connection open at a given time.
The fact that it has one connection per thread is the problem — as I suspected, the disconnect is happening while another (async) request is in progress and is causing problems. Because of this, peewee probably won’t work well with any async server. (Unless there is a way to create multiple connections in the same thread; I don’t know enough about peewee to know if that is possible.)
There is a peewee-async package that may be a drop-in replacement that works with fastapi, but I don’t know how mature it is.
Also relevant: https://github.com/coleifer/flask-peewee/blob/master/docs/gevent.rst
This shows what you have to do to make peewee play nice with greenlets. Given it requires monkeypatching the threading module, to get a similar effect with asyncio I suspect you’d need to modify peewee/threading(?) to somehow get peewee to use asyncio context-local variables rather than thread-local variables.
I’m not an expert at this stuff, but I’d guess it’s not for the faint of heart. Peewee-async is probably your best bet if you want to stick with peewee.
Beat me to it, I was just looking at peewee-async
Might as well just migrate to sqlalchemy
Thanks a lot for your help! I will close this for now - hopefully if anyone else comes across this issue they'll be able to refer to this thread
Thanks for the help @dmontagu ! :taco: :tada:
@eliehamouche this seems quite strange. I'm using Peewee at work without problems.
I just checked a bit about all that just to make sure...
Indeed, Peewee docs say that:
Peewee keeps track of the connection state using thread-local storage [...]
So, it seems it would not be possible to use it with async IO frameworks. But that info is inconsistent with other parts of the docs:
The __init__() method of a Database has a threadlocals=False (defaults to False) that determines that behavior: https://peewee.readthedocs.io/en/2.0.2/peewee/database.html#Database.__init__
So, by default, it would not store the connection in threadlocals, although they recommend using it (assuming you are using a threaded framework).
What we are doing is pretty much the same as you are doing. And in fact, is the same thing that is suggested in the docs for AsyncIO frameworks (the example is Sanic): http://docs.peewee-orm.com/en/latest/peewee/database.html#sanic
As a side note, Peewee creator seems to hate Async IO with his guts :joy: http://docs.peewee-orm.com/en/latest/peewee/database.html#async-with-gevent but still, it's documented there how to use it, and it's being used... :man_shrugging:
I'm still not sure why could you be getting that error, maybe the way you create your Database instance?
...anyway, just wanted to add the extra info in case it's useful, as I'm currently using Peewee, and in theory, it should work. But thanks for reporting back and closing it. Still, feel free to add more comments or issues. :smile: :cake:
@tiangolo Is your work use of peewee exposed to substantial loads? I’m wondering if something might be going on where @eliehamouche was triggering multiple requests simultaneously, or was load testing, or similar, and that’s what was causing the issue. It seems like the sort of thing that might not be a problem under normal usage, or at least under normal usage for certain API designs.
If you’ve tested the implementation against a high volume of concurrent requests that would kill this theory though 😄.
I think a very easy way to test my diagnosis above would be to create an app that looks like this:
@app.get("/trigger-db-middleware")
async def trigger_db_middleware(request: Request):
pass
@app.get("/sleep-then-access-db")
async def sleep_then_access_db(request: Request):
await asyncio.sleep(10)
somehow_access_the_database(request.db)
Then hit /sleep-then-access-db once, then while that is running, make a request (or maybe many) to /trigger-db-middleware and see if the db access in the first call (once the sleep finishes) causes an error.
@tiangolo I'm wondering if this might also be due to the use of async def endpoints vs def endpoints. Given peewee plays nice with threads, maybe there is a way to get this to work by using def endpoints since those use starlette.run_in_threadpool. I wouldn't think this would solve the database connection issues if it happens in middleware, but maybe the connection is set up lazily, and if it gets set up inside a thread it makes use of the thread context?
Separately, let's say you keep the same middleware @eliehamouche has, but remove the db.connect call. Then, have a dependency (let's call it get_database to avoid conflict with the get_db function already present), and have that dependency perform the db.connect call (but keep the disconnect in the middleware). @tiangolo would the dependency function be executed inside the thread (assuming the use of a def endpoint)? If so, maybe that would ensure peewee could orchestrate this all properly.
@tiangolo in your work usage, are the database accesses happening in def endpoints, rather than async def endpoints? @eliehamouche were you using def endpoints or async def endpoints in your server when you were running into this issue?
@tiangolo If the above described approach doesn't work, I'm wondering if this an instance of the potential value of context-manager dependencies that can run entirely inside the threadpool thread. It seems to me that if that could be done, peewee should integrate with FastAPI just as well as sqlalchemy.
I've been investigating this. :male_detective:
I created some example tests to reproduce the error systematically.
And I implemented a potential fix in a PR to Peewee: https://github.com/coleifer/peewee/pull/2072 :heavy_check_mark:
In short, Peewee uses threading.local, but that's not compatible with async frameworks.
I added support for contextvars in a backwards-compatible way. And that solved the error from the example.
There's a lot more details in the PR. :memo:
So, the PR to Peewee was rejected. But then I went and documented everything about using Peewee with FastAPI. It requires a bit more work, but can be done and works well.
The new docs for Peewee are here: https://fastapi.tiangolo.com/tutorial/sql-databases-peewee/
Assuming the original issue was solved, it will be automatically closed now. But feel free to add more comments or create new issues.
Now docs for Peewee are here: https://fastapi.tiangolo.com/advanced/sql-databases-peewee/
Most helpful comment
Now docs for Peewee are here: https://fastapi.tiangolo.com/advanced/sql-databases-peewee/