I am playing around with FastAPI a bit and want to connect it to a MongoDB database. However, I am confused about which ODM to choose between motor, which is async, and mongoengine. Also, in the NoSQL example they create a new bucket and call the code that connects to the DB every time it is used. However, both motor and mongoengine seem to prefer a global connection. So what would be a good way to connect to MongoDB?
@Ayush1325 I think the example for sql DBs applies here https://fastapi.tiangolo.com/tutorial/async-sql-databases/#connect-and-disconnect
Basically, you can create and destroy the global connection using the @app.on_event("startup") and @app.on_event("shutdown") events on your app.
@jaddison OK. But should I go with motor or mongoengine? Also, the NoSQL example mentions that a new bucket is retrieved every time because a single bucket won't work with multithreading in the Docker image. So is it okay to have a global connection, given that at least the motor docs mention that it does not support multithreading?
@Ayush1325 You can use FastAPI as an async OR a sync web framework, so you should make that decision before thinking about integration with the DB. If you want to build an async app, you need to do all the I/O asynchronously too, DB operations included. The most popular and (probably) most stable async package for interacting with MongoDB is motor (which is based on the no less stable pymongo package, which you'd want to use in a sync app). We use it in our apps with FastAPI and it's been great so far.
As far as multithreading goes, you don't need to care about it that much, because an async FastAPI app runs its handlers in a single thread (and on a single core) per worker process. Every request is handled by a separate task in the event loop (uvloop in this case), so you can just create your MongoDB client class, based on the one bundled with motor, for doing all the calls to MongoDB (for example, check out how it is done in our package (WIP) in client.py and the setup_mongodb function in utils.py).
Thanks @jaddison and @levchik for your help here! Great responses!
And thanks @Ayush1325 for reporting back and closing the issue.
Some extra comments:
I would choose the package based on what makes you more efficient and productive, unless you absolutely need the extreme, maximum performance. ...The same way you would choose FastAPI for your app instead of writing it in C by hand.
If you need to use WebSockets, you will need async functions, which could alter your decision.
You could actually also use sync I/O code inside async functions if you call it through Starlette's run_in_threadpool and iterate_in_threadpool. But that's a rather advanced use case and is not properly documented yet.
@Ayush1325 This is my work, I hope it helps.
https://github.com/markqiu/fastapi-mongodb-realworld-example-app
Hey @markqiu, I had a look at your repo, which I found full of hints on how to deal with a MongoDB database. However, I'm finding it a total nightmare to deal with ObjectId serialization/deserialization, and I saw that in the repo you avoid using ObjectIds as well. Do you have any ideas on how to deal with that? When I jsonable_encode an object containing an ObjectId, I'm fine serializing it to a string, and this can be done with the trick suggested by @tiangolo here. But then when I need to, say, insert the document into a collection and store the field as an ObjectId, I'd have to serialize it and then remember to convert each field every time. I don't know if a dict_encoders option would be a possibility for @samuelcolvin, so that a str is encoded to an ObjectId when calling the dict method.
Example:
    from pydantic import BaseModel, Field
    from bson import ObjectId

    class ObjectIdStr(str):
        @classmethod
        def __get_validators__(cls):
            yield cls.validate

        @classmethod
        def validate(cls, v):
            if not isinstance(v, ObjectId):
                raise ValueError("Not a valid ObjectId")
            return str(v)

    class MyModel(BaseModel):
        id: ObjectIdStr = Field(..., alias="_id")

        class Config:
            allow_population_by_field_name = True
            # dict_encoders is the proposed option; it does not exist in pydantic
            dict_encoders = {str: lambda x: ObjectId(x) if ObjectId.is_valid(x) else x}

    mymodel = MyModel(id=ObjectId("5df3c81e99256b2fe60b5f8d"))
    mymodel.json()  # {"_id": "5df3c81e99256b2fe60b5f8d"}
    mymodel.dict()  # {"_id": ObjectId("5df3c81e99256b2fe60b5f8d")}
What do you think?
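I created a mixin like this:
    from typing import Any, Optional
    from pydantic import BaseModel, Field, validator

    class DBModelMixin(BaseModel):
        # Field is the pydantic 1.x name for what used to be called Schema
        id: Optional[Any] = Field(..., alias="_id")

        @validator("id")
        def validate_id(cls, id):
            # Always expose the id as a plain string
            return str(id)
Hope it works for you.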
Thank you @markqiu! Even if that wasn't exactly what I was looking for, as I would still have a string in place of an ObjectId when casting the model to a dictionary, it somehow gave me the simple, obvious idea I was missing: creating the ObjectId itself as a custom data type. I don't know why that didn't come to mind before, sorry. I'll leave an example in case someone else needs it!
    from typing import Optional

    from bson import ObjectId
    from fastapi.encoders import jsonable_encoder
    from pydantic import BaseModel, Field

    class ObjectIdStr(str):
        @classmethod
        def __get_validators__(cls):
            yield cls.validate

        @classmethod
        def validate(cls, v):
            # Accept an ObjectId or its string form; always store an ObjectId
            if not ObjectId.is_valid(str(v)):
                raise ValueError(f"Not a valid ObjectId: {v}")
            return ObjectId(str(v))

    class DBModelMixin(BaseModel):
        id: Optional[ObjectIdStr] = Field(..., alias="_id")

        class Config:
            allow_population_by_field_name = True
            # ObjectId values become strings only when encoding to JSON
            json_encoders = {ObjectId: lambda x: str(x)}

    class Foo(DBModelMixin):
        some_other_id: ObjectIdStr = ObjectId()

    foo = Foo(_id=ObjectId())
    print(foo)
    print(foo.dict(by_alias=True))
    print(foo.json())
    print(jsonable_encoder(foo, by_alias=True))

    # Now I'm allowed to:
    # some_mongo_collection.insert_one(foo.dict(by_alias=True))
    #
    # id=ObjectId('5df4c40d7281cab2b8cd4a58') some_other_id=ObjectId('5df4c40d7281cab2b8cd4a57')
    # {'_id': ObjectId('5df4c40d7281cab2b8cd4a58'), 'some_other_id': ObjectId('5df4c40d7281cab2b8cd4a57')}
    # {"id": "5df4c40d7281cab2b8cd4a58", "some_other_id": "5df4c40d7281cab2b8cd4a57"}
    # {'_id': '5df4c40d7281cab2b8cd4a58', 'some_other_id': '5df4c40d7281cab2b8cd4a57'}
@stefanondisponibile Thanks for that snippet. That was very helpful.
Is that still the best way to handle serializing / deserializing custom types? It seems a bit unintuitive that the JSON serialization and deserialization live in two different places, especially coming from marshmallow, where this is as simple as implementing _serialize() and _deserialize() methods on the custom type:
https://marshmallow.readthedocs.io/en/stable/custom_fields.html