Description
Up until database interaction is introduced, the tutorial for FastAPI uses pydantic models for everything, such as this example in the sextion on Extra Models :
class UserOut(BaseModel):
username: str
email: EmailStr
full_name: str = None
class UserInDB(BaseModel):
username: str
hashed_password: str
email: EmailStr
full_name: str = None
This alows for the "database model" to have private data which will not be exposed through the API. Later, in the section on security, a similar trick is used, but this time using inheritance to stack the two models (which I find makes the return-casting used by some functions better encoded in the type system).
class User(BaseModel):
username: str
email: str = None
full_name: str = None
disabled: bool = None
class UserInDB(User):
hashed_password: str
def get_db_user() -> UserInDB:
return UserInDB(
username="johndoe", full_name="John Doe",
email="[email protected]",
hashed_password="fakehashedsecret",
disabled=False
)
def get_user() -> User:
return get_db_user()
However, when proper databases are introduced, those Pydantic models are dropped in favor of a single SQLAlchemy ORM model, with no effort to bridge the two parts. And while one could see this as the SQLAlchemy models completely superseding the Pydantic models, the fullstack demo app appears to actually use both, so there appears to be value in attempting to use them together, something which the documentation doesn't seem to address.
So can/should Pydantic and SQLAlchemy models be used together? If they are, how is one meant to connect the two together and can this still be done while maintaining some kind of type hierarchy?
From what I can tell, there is no problem using both pydantic models and SQLAlchemy models, and if you want to use sqlalchemy as your ORM you will need to use both.
I've found this a little frustrating because, at least in my experience, it leads to lots of similarly-named objects and repetition of definitions, and frequent translation between the types depending on how you are using them. @tiangolo I would be interested if you have any suggestions for design patterns that would reduce the amount of "translation" code associated with having separate pydantic and sqlalchemy models.
More generally, I would be interested to hear if anyone has found a clean approach to bridging the gap between pydantic and a database, especially if it integrates nicely with an asynchronous database driver (I've been using (encode/databases), which only supports SQLAlchemy core, not the ORM). In particular, it would be nice if I only had to create a single class for each conceptual model (I'd be okay with separate classes for pydantic and the orm as long as they could be derived from a common root).
My current approach has been to create a container class that holds references to the database table and the appropriate creation/in-db pydantic models (similar to UserCreate and UserInDB from the docs), and has a variety of generic classmethods for CRUD that make use of the creation/in-db types for input / return, but it still feels like a hack.
Hi @dmontagu ,
I am interested in your 'hack' 馃槃 Could you share some code sample ?
On my side, I find it useful to split the models used for frontend interaction (Pydantic) from those used for the DB interaction and business logic (SQLAlchemy).
I have therefore followed the scaffold from https://github.com/tiangolo/full-stack-fastapi-postgresql, with some repetition between models and db_models. But again, I am fine with it since it adresses different purpose. The point where repetition becomes to cumbersome is the crud layer, where you basically have to copy-paste again and again the same pieces of code. I have made a PR to try to reduce this friction: https://github.com/tiangolo/full-stack-fastapi-postgresql/pull/23
I am curious to see how this goes along with @dmontagu approach...
By the way, fastapi takes care to translate SQLAlchemy models to pydantic ones on the api endpoint layer, with two limitations I have found so far on JSON fields (see https://github.com/tiangolo/fastapi/issues/211 for details) and Enum fields in a specific use case (see https://github.com/tiangolo/fastapi/issues/196)
So, Pydantic and SQLAlchemy are separated. Pydantic is used for documentation, validation, and data serialization. SQLAlchemy for SQL ORM stuff.
FastAPI is not coupled with any DB, so, SQLAlchemy is optional.
If you don't care about having documentation, validation, and serialization, you don't need to use Pydantic, you could return SQLAlchemy models directly. They would be converted to JSON in a "best-effort" way.
If you want both, for now, you have to write both.
I agree I don't like the duplication of properties in classes. But there's still no way to automatically generate one from the other.
That's a good candidate for a third-party package, that generates Pydantic models from SQLAlchemy models (or other ORMs) automatically. But it doesn't exist yet. There are some attempts at doing something similar but I don't know a complete solution yet. Hopefully, someone will build it (or I'll do it later).
For now, I'll take it as a request to update the docs clarifying the use of both Pydantic and SQLAlchemy at the same time.
If you want another inspiration there's also https://github.com/thomaxxl/safrs
Thanks @euri10 .
If you want another inspiration there's also https://github.com/thomaxxl/safrs
Thanks for the link @euri10 . I will stick with fastapi and the separation for now :two_hearts:
For now, I'll take it as a request to update the docs clarifying the use of both Pydantic and SQLAlchemy at the same time.
Yeah, that was my intention, the doc jumps from one to the other with little to no transition and it's really glaring considering how of the rest of the doc tends to flow together.
Great! I'll update it.
+1 to clarifying the docs about how the pydantic models and sqlalchemy models should be used together, I also found this point a little confusing at first.
@ebreton I've been iterating on my approach, and due to weaknesses in some combination of 1) the python typing system, 2) pycharm's implementation of type checking, and 3) my understanding of python's typing system, I've modified my approach to make heavier use of instances for the sake of better-working auto-completion. In particular, it looks like generic class variables aren't supported in the python type hinting system, so I modified my approach so that the "container class" table is now actually just a separate instance for each separate model.
Here's a simplified version of my implementation (removing some complexity due to things like automatically generated uniqueness constraints)
import uuid
from typing import Generic, List, Tuple, Type, TypeVar
import sqlalchemy as sa
from databases import Database
from pydantic import BaseModel
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.sql import Insert
def get_metadata() -> sa.MetaData:
... # returns my configured sqlalchemy.MetaData object
def get_database() -> Database:
... # returns my configured databases.Database object
class BaseColumns:
def values(self) -> List[sa.Column]:
"""
Mirrors the `values` function on the table.c object from sqlalchemy
"""
return list(self.__dict__.values())
T = TypeVar("T", bound=BaseColumns)
class TypeHintedTable(sa.Table, Generic[T]):
""" Only intended for use as a typehint, NOT to be actually instantiated """
@property
def c(self) -> T:
return NotImplemented
@property
def columns(self) -> T:
return NotImplemented
ColumnsT = TypeVar("ColumnsT", bound=BaseColumns)
CreationModelT = TypeVar("CreationModelT", bound=BaseModel)
DatabaseModelT = TypeVar("DatabaseModelT", bound=BaseModel)
class PydanticTable(Generic[ColumnsT, CreationModelT, DatabaseModelT]):
def __init__(self,
*,
table_name: str,
columns: ColumnsT,
creation_model: Type[CreationModelT],
db_model: Type[DatabaseModelT],
):
assert issubclass(creation_model, BaseModel) # needed by pycharm, despite bound
assert issubclass(db_model, BaseModel) # needed by pycharm, despite bound
self.columns: ColumnsT = columns
self.creation_model: Type[CreationModelT] = creation_model
self.db_model: Type[DatabaseModelT] = db_model
self.table_name = table_name
self.table: TypeHintedTable[ColumnsT] = self._get_table()
def _get_table(self) -> TypeHintedTable[ColumnsT]:
# In my actual implementation, I also generate uniqueness constraints
table_args = self.columns.values()
return sa.Table(self.table_name, get_metadata(), *table_args)
# Some example CRUD methods (actual implementation has more)
async def read_by_columns_values(
self, *, columns: List[sa.Column], values: List[Tuple]
) -> List[DatabaseModelT]:
assert all(len(value) == len(columns) for value in values)
query = self.table.select().where(sa.tuple_(*columns).in_(values))
result = await get_database().fetch_all(query)
return [self.db_model(**x) for x in result]
async def create(
self, creation_requests: List[CreationModelT]
) -> List[DatabaseModelT]:
# In my actual implementation, I also handle uniqueness conflicts
insertion_values = [item.dict() for item in creation_requests]
insertion_query = Insert(
self.table,
values=insertion_values,
returning=self.table.c.values()
)
result = await get_database().fetch_all(insertion_query)
return [self.db_model(**x) for x in result]
An example of instantiating the relevant classes for a particular conceptual model would then look like this:
class Product(BaseModel):
brand: str
name: str
description: str
class ProductCreate(Product):
pass
class ProductInDB(ProductCreate):
product_id: uuid.UUID
class ProductColumns(BaseColumns):
def __init__(self):
self.product_id = sa.Column(
"product_id", UUID, primary_key=True,
server_default=sa.text("uuid_generate_v4()")
)
self.brand = sa.Column("brand", sa.String, nullable=False)
self.name = sa.Column("name", sa.String, nullable=False)
self.description = sa.Column("description", sa.String, nullable=False)
columns = ProductColumns()
ProductORM = PydanticTable(
table_name="products",
columns=ProductColumns(),
creation_model=ProductCreate,
db_model=ProductInDB,
)
ProductORM
or the equivalent would then be imported anywhere you wanted to access the associated sqlalchemy table or the various crud methods. For example, if you have a fastapi endpoint that accepts, let's say, a list of ProductCreate
as a body argument, the endpoint body can just be return await ProductORM.create(product_create_list)
and it will return a list of the created ProductInDB
.
In my case, I also wanted the ability to easily build more complex sqlalchemy core queries involving multiple models, and the reason for the BaseColumns
+ TypeHintedTable
approach is that I can autocomplete to the columns in the table when I write ProductORM.table.c.
(I haven't seen a way to get sqlalchemy to do this that doesn't involve heavily depending on the sqlalchemy ORM). I also have an unbound version of the read_by_columns_values
method that lets me take the result of a more complex query and convert the result into pydantic models.
This approach works fine for my use case because my models are all essentially immutable, but I think most people will just find it easier to stick with sqlalchemy's ORM. In my case, I don't need to handle a very large number of distinct model-querying patterns and I find it a little easier to reason through the database accesses and associated performance implications when I'm closer to the raw sql, so I chose not to depend on sqlalchemy ORM. (But I would probably still use it though if it had improved async compatibility.)
A quick note, I'm working on a PR in Pydantic here: https://github.com/samuelcolvin/pydantic/pull/520
To improve the way we can interact with SQLAlchemy models and Pydantic, to solve several lazy-loading issues, loading relationships (when needed), handling lists of results, etc.
I'm also updating the SQLAlchemy tutorial with all those changes, to show how to use both SQLAlchemy and their role. And the same documentation scripts are being used for the tests.
But this work on the tutorial is on hold for a bit until the PR is merged.
I read the conversation in your PR, but I'm not formalir with pydantic. does it mean that if I implement a __iter__
on my orm class, returning a instance in handler should work fine even without your PR merged?
I'm wrong
Looks like https://github.com/samuelcolvin/pydantic/pull/562 has been merged.
Any updates on this topic? FastAPI combined with nested models and sqlalchemy database functionality would be a great plus! Thanks for the good work so far, very impressing!
I just finished integrating Pydantic ORM mode into FastAPI, it is released as version 0.30.0
:tada:
The new docs include how to use Pydantic with SQLAlchemy, how relationships work, etc: https://fastapi.tiangolo.com/tutorial/sql-databases/
Stellar! Thanks @tiangolo.
That pretty much solves the issue as far as I'm concerned.
Great! Thanks for reporting back and closing the issue @sm-Fifteen .
Is it possible to create some sort of derived BaseModel that has orm_mode
always set to true? Adding the Config
subclass for every one of my models is not very DRY.
yes it is, you just create your base class that has all what you want (in your case in particular orm_mode) then in all subsequent classes you inherit from that base class
Yes @acnebs I do that myself, along with some other config settings and utility methods.
I've developed a library that meticulously converts an SqlAlchemy model into Pydantic. It even supports relationships :)
Since I was tired of reinventing the wheel and needed something to tackle exactly the same problem I created ormar
package - _an async mini orm with fastapi in mind and pydantic validation_. Ormar
was inspired by encode/orm
package which seems either stale or abandoned (and uses typesystem
anyway).
Ormar
bases its validation on pydantic so it can be used directly with fastapi as response and request models.
Feel free to check it out: https://github.com/collerek/ormar
@collerek why "mini"?
@Kludex Cause it does not keep in memory any information regarding already instantiated objects so if you do something like this:
record1 = await Model.objects.get(pk=1)
record2 = await Model.objects.get(pk=1)
assert record1==record2
# True - but in reality they are 2 separate python objects that knows nothing about each other
# so if you do:
record1.parameter = 'something else'
await record1.save()
# the record2 does not update - you won't have errors like in sqlalchemy orm modifing two times the same objects etc.
So it behaves pretty much like i.e. gino orm.
Also starting in August the scope was much smaller (no many2many relations etc.), this has changed but the "mini" in name prevails.
When I compare to full fledged ORMs like django, sqlalchemy etc. it looks and feels tiny 馃槃
Most helpful comment
I just finished integrating Pydantic ORM mode into FastAPI, it is released as version
0.30.0
:tada:The new docs include how to use Pydantic with SQLAlchemy, how relationships work, etc: https://fastapi.tiangolo.com/tutorial/sql-databases/