Fastapi: [QUESTION] Recommended way to use mongodb with FastAPI

Created on 21 Aug 2019 · 10 Comments · Source: tiangolo/fastapi

I am playing around with FastAPI a bit and wanted to connect it to a MongoDB database. However, I am unsure which ODM to choose: motor, which is async, or mongoengine. Also, the NoSQL example creates a new bucket and calls the code to connect to the DB every time it is used, whereas both motor and mongoengine seem to prefer a global connection. So what would be a good way to connect to MongoDB?

question

All 10 comments

@Ayush1325 I think the example for SQL DBs applies here: https://fastapi.tiangolo.com/tutorial/async-sql-databases/#connect-and-disconnect

Basically, you can create/destroy the global connection using the @app.on_event("startup") and @app.on_event("shutdown") events on your app.

@jaddison OK. But should I go with motor or mongoengine? Also, the NoSQL example mentions that a new bucket is retrieved every time because a single bucket won't work with multithreading in the Docker image. So is it okay to have a global connection, given that the motor docs, at least, say it does not support multithreading?

@Ayush1325 You can use FastAPI as an async OR sync web framework, so you should decide on that before thinking about DB integration. If you want to build an async app, you need to do all the I/O asynchronously too, DB operations included. The most popular and (probably) most stable async package for interacting with MongoDB is motor (which is based on the no less stable pymongo package, which you'd want to use in a sync app). We use it in our apps with FastAPI and it's been great so far.

As far as multithreading goes, you don't need to worry about it much, because FastAPI is single-threaded (and single-cored as well). Every request is handled by a separate task in the event loop (uvloop in this case), so you can just create your MongoDB client class, based on the one bundled with motor, for making all the calls to MongoDB (for example, check out how it is done in our package (WIP) in client.py and the setup_mongodb function in utils.py).
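
To make the "separate task, same thread" point concrete, here is a small standard-library sketch. `FakeMongoClient` is a hypothetical stand-in, not a real driver; it only records which OS thread serves each call while several concurrent "requests" share it.

```python
import asyncio
import threading


class FakeMongoClient:
    """Hypothetical stand-in for a shared async client (not a real driver)."""

    def __init__(self):
        self.calls = []

    async def find_one(self, key):
        # Record the serving thread, then yield to the event loop the way
        # a real network round trip would.
        self.calls.append((key, threading.get_ident()))
        await asyncio.sleep(0.01)
        return {"_id": key}


client = FakeMongoClient()  # created once, shared by every "request"


async def handle_request(request_id):
    # Each request runs as its own task but reuses the same client.
    return await client.find_one(request_id)


async def main():
    return await asyncio.gather(*(handle_request(i) for i in range(5)))


results = asyncio.run(main())
thread_ids = {tid for _, tid in client.calls}
print(len(results), "requests served on", len(thread_ids), "thread(s)")
# prints: 5 requests served on 1 thread(s)
```

The five tasks interleave on one thread, which is why a single shared async client is safe for `async def` handlers in this sketch.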

Thanks @jaddison and @levchik for your help here! Great responses! 🚀

And thanks @Ayush1325 for reporting back and closing the issue.

Some extra comments:

  1. I would choose the package based on what makes you more efficient and productive, unless you absolutely need the extreme, maximum performance. It's the same way you would choose FastAPI for your app instead of writing it in C by hand.

  2. If you need to use WebSockets, you will need async functions; that could alter your decision.

  3. You could actually also use sync I/O code inside of async functions, if you call it through Starlette's run_in_threadpool and iterate_in_threadpool. But that's a rather advanced use case and is not properly documented yet.
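
A rough sketch of the idea in point 3, using only the standard library: `asyncio.to_thread` (Python 3.9+) is used here as a stand-in for Starlette's `run_in_threadpool`, which follows the same pattern of awaiting a blocking function on a worker thread. `blocking_query` is a hypothetical placeholder for a synchronous driver call.

```python
import asyncio
import threading
import time


def blocking_query(delay):
    # Stand-in for a synchronous driver call, e.g. a blocking database read.
    time.sleep(delay)
    return threading.get_ident()


async def handler():
    # Off-load the blocking call to a worker thread so the event loop
    # stays free to serve other requests in the meantime.
    worker_thread = await asyncio.to_thread(blocking_query, 0.05)
    return worker_thread != threading.get_ident()


ran_in_worker = asyncio.run(handler())
print(ran_in_worker)  # prints: True
```

Calling `blocking_query` directly inside `handler` would also work, but it would stall every other task on the loop for the duration of the call.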

@Ayush1325 This is my work; I hope it helps.
https://github.com/markqiu/fastapi-mongodb-realworld-example-app

Hey @markqiu, I had a look at your repo, which I found rich in cues on how to deal with a Mongo DB. However, I'm finding it a total nightmare to deal with ObjectId serialization/deserialization, and I saw that in the repo you avoid using ObjectIds as well. Do you have any ideas on how to deal with that? When I jsonable_encode an object containing an ObjectId, serializing it to a string is fine, and this can be done with the trick suggested by @tiangolo here. But when I need to, say, insert the document in a collection with the field as an ObjectId, I'd have to serialize it and then remember to convert each field every time. I don't know if a dict_encoders method would be an option for @samuelcolvin, so that a str is encoded to an ObjectId when calling the dict method.

Example:

from pydantic import BaseModel, Field
from bson import ObjectId

class ObjectIdStr(str):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if not isinstance(v, ObjectId):
            raise ValueError("Not a valid ObjectId")
        return str(v)

class MyModel(BaseModel):
    id: ObjectIdStr = Field(..., alias="_id")

    class Config:
        allow_population_by_field_name = True
        # Hypothetical option: pydantic has json_encoders but no dict_encoders.
        dict_encoders = {str: lambda x: ObjectId(x) if ObjectId.is_valid(x) else x}

mymodel = MyModel(id=ObjectId("5df3c81e99256b2fe60b5f8d"))
mymodel.json(by_alias=True)  # {"_id": "5df3c81e99256b2fe60b5f8d"}
mymodel.dict(by_alias=True)  # wanted: {"_id": ObjectId("5df3c81e99256b2fe60b5f8d")}

What do you think?

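I created a mixin like this:

from typing import Any, Optional

from pydantic import BaseModel, Field, validator

class DBModelMixin(BaseModel):
    # Field replaces the older Schema name, which pydantic 1.0 deprecated.
    id: Optional[Any] = Field(..., alias="_id")

    @validator("id")
    def validate_id(cls, id):
        # Store the ObjectId as its string form.
        return str(id)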

Hope it works for you.

Thank you @markqiu! Even if that wasn't exactly what I was looking for, as I would still have a string in place of an ObjectId when casting the model to a dictionary, it somehow gave me the simple, obvious idea I was missing: creating the ObjectId itself as a custom data type. I don't know why that didn't come to mind before, sorry 🙈 I'll leave an example in case someone else needs it!

from typing import Optional

from bson import ObjectId
from fastapi.encoders import jsonable_encoder
from pydantic import BaseModel, Field

# Written against the pydantic v1 API (__get_validators__, Config, .dict()).

class ObjectIdStr(str):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        # Accept an ObjectId or its 24-character hex string.
        if not ObjectId.is_valid(str(v)):
            raise ValueError(f"Not a valid ObjectId: {v}")
        return ObjectId(str(v))

class DBModelMixin(BaseModel):
    id: Optional[ObjectIdStr] = Field(..., alias="_id")

    class Config:
        allow_population_by_field_name = True
        json_encoders = {ObjectId: lambda x: str(x)}

class Foo(DBModelMixin):
    # Note: this default is created once, when the class is defined.
    some_other_id: ObjectIdStr = ObjectId()

foo = Foo(_id=ObjectId())
print(foo)
print(foo.dict(by_alias=True))
print(foo.json())
print(jsonable_encoder(foo, by_alias=True))

# Output of the four prints above:
# id=ObjectId('5df4c40d7281cab2b8cd4a58') some_other_id=ObjectId('5df4c40d7281cab2b8cd4a57')
# {'_id': ObjectId('5df4c40d7281cab2b8cd4a58'), 'some_other_id': ObjectId('5df4c40d7281cab2b8cd4a57')}
# {"id": "5df4c40d7281cab2b8cd4a58", "some_other_id": "5df4c40d7281cab2b8cd4a57"}
# {'_id': '5df4c40d7281cab2b8cd4a58', 'some_other_id': '5df4c40d7281cab2b8cd4a57'}
#
# Now I'm allowed to:
# some_mongo_collection.insert_one(foo.dict(by_alias=True))

@stefanondisponibile Thanks for that snippet; that was very helpful.

Is that still the best way to handle serializing / deserializing custom types? It seems a bit unintuitive that the JSON serialization and deserialization live in two different places, especially coming from marshmallow, where this is as simple as implementing _serialize() and _deserialize() methods on the custom type:
https://marshmallow.readthedocs.io/en/stable/custom_fields.html
