First of all, thank you so much for your awesome work! Pydantic is a very good library and I really like using it in combination with FastAPI. :rocket:
So my question is: does pydantic.dataclasses.dataclass support classical mapping in SQLAlchemy? I am working on a project that I hope to build with a clean architecture, and I would therefore like to use pydantic's dataclass for my domain model and have SQLAlchemy depend on it.
I have a proof-of-concept code snippet below, which works if I use the dataclass from Python's standard library.
```python
from dataclasses import field
from datetime import datetime
from typing import List

from pydantic import constr
from pydantic.dataclasses import dataclass
from sqlalchemy import (ARRAY, TIMESTAMP, Column, Integer, MetaData, String,
                        Table, create_engine)
from sqlalchemy.orm import mapper, sessionmaker

metadata = MetaData()

person_table = Table(
    'people', metadata,
    Column('id', Integer, primary_key=True, autoincrement=True),
    Column('name', String),
    Column('age', Integer),
    Column('hobbies', ARRAY(String)),
    Column('birthday', TIMESTAMP),
)

@dataclass
class Person:
    name: str = constr(max_length=50)
    age: int = field(default_factory=int)
    hobbies: List[str] = field(default_factory=list)
    birthday: datetime = field(default_factory=datetime.now)

mapper(Person, person_table)

engine = create_engine(
    'postgresql://postgres:[email protected]:5432/test_db', echo=True)
metadata.create_all(engine)
session = sessionmaker(bind=engine)()

person = Person(hobbies=['golf', 'hiking'], birthday=datetime(1985, 7, 25))
session.add(person)
session.commit()
```
Error Output:
```
> python main.py
2019-12-08 22:11:07,663 INFO sqlalchemy.engine.base.Engine select version()
2019-12-08 22:11:07,663 INFO sqlalchemy.engine.base.Engine {}
2019-12-08 22:11:07,664 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-12-08 22:11:07,664 INFO sqlalchemy.engine.base.Engine {}
2019-12-08 22:11:07,665 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-12-08 22:11:07,665 INFO sqlalchemy.engine.base.Engine {}
2019-12-08 22:11:07,665 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-12-08 22:11:07,665 INFO sqlalchemy.engine.base.Engine {}
2019-12-08 22:11:07,665 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-12-08 22:11:07,665 INFO sqlalchemy.engine.base.Engine {}
2019-12-08 22:11:07,666 INFO sqlalchemy.engine.base.Engine select relname from pg_class c join pg_namespace n on n.oid=c.relnamespace where pg_catalog.pg_table_is_visible(c.oid) and relname=%(name)s
2019-12-08 22:11:07,666 INFO sqlalchemy.engine.base.Engine {'name': 'people'}
2019-12-08 22:11:07,667 INFO sqlalchemy.engine.base.Engine
CREATE TABLE people (
    id SERIAL NOT NULL,
    name VARCHAR,
    age INTEGER,
    hobbies VARCHAR[],
    birthday TIMESTAMP WITHOUT TIME ZONE,
    PRIMARY KEY (id)
)

2019-12-08 22:11:07,667 INFO sqlalchemy.engine.base.Engine {}
2019-12-08 22:11:07,683 INFO sqlalchemy.engine.base.Engine COMMIT
Traceback (most recent call last):
  File "/home/xzhan/Development/Projects/sqlalchemy_mapping/.venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1955, in add
    state = attributes.instance_state(instance)
AttributeError: 'Person' object has no attribute '_sa_instance_state'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 37, in <module>
    session.add(person)
  File "/home/xzhan/Development/Projects/sqlalchemy_mapping/.venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1957, in add
    raise exc.UnmappedInstanceError(instance)
sqlalchemy.orm.exc.UnmappedInstanceError: Class '__main__.Person' is mapped, but this instance lacks instrumentation. This occurs when the instance is created before sqlalchemy.orm.mapper(__main__.Person) was called.
```
Thanks in advance!
Best to use ORM mode.
I have no personal interest in ORMs or supporting them; long term they're a mistake, and you'd do much better to just write SQL.
However, if there's some easy way to support mapper that requires a relatively small change to pydantic, I'd be happy to review a PR.
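For context, here is a minimal sketch of what ORM mode looks like in pydantic v1 (the model and attribute names here are made up for illustration):
```python
from pydantic import BaseModel

class PersonSchema(BaseModel):
    name: str
    age: int

    class Config:
        orm_mode = True  # enables reading data from arbitrary objects via from_orm

# person_row would be an instance loaded through the SQLAlchemy ORM:
# schema = PersonSchema.from_orm(person_row)
```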
Thank you for the quick response!
Wish I had the SQL skill level to say that, lol. :man_facepalming: I have worked on a Django project where most database operations were written in raw SQL, and it was not a pleasant experience. I found it to be a great maintenance burden, not only for me but for newer team members.
I will take a look and see what I can do when I am more settled with my current project. I shall experiment with BaseModel and ORM mode as well. Thanks again!
Django's ORM is kind of a special case.
Good luck with `from_orm`; that's the approach most people take.
From my personal experience, the major difference between SQLAlchemy and the Django ORM is really the difference between Active Record and Data Mapper. (Or maybe I am just not experienced enough to spot/appreciate the other differences, lol. :man_facepalming:)
I really disliked SQLAlchemy when I first started Python web dev with Flask & Flask-SQLAlchemy, and came to like the Django ORM quite a lot when my new job used Django. But using the Django ORM means tight data coupling. With some background in .NET Core and EF Core, and having recently learned more about software architecture, I have slowly come to appreciate many of the design choices SQLAlchemy makes. The syntax is not the best, for sure, but many things just make more sense now.
Edit: SQLAlchemy Core also has some very useful tools for building queries, including helpers for using textual SQL.
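As an illustration of those textual-SQL helpers, a minimal sketch (the table and parameter names are made up):
```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///")
stmt = text("SELECT name, age FROM people WHERE age > :min_age")
with engine.connect() as conn:
    # parameters are bound safely instead of being interpolated into the string
    rows = conn.execute(stmt, {"min_age": 30}).fetchall()
```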
I have closed my other issue (https://github.com/samuelcolvin/pydantic/issues/1403) and would like to add my request here.
It would be great if there were some way to support SQLAlchemy's mapper mode. That would let us avoid defining models twice, once for SQLAlchemy and once for pydantic, and would be a huge help when using FastAPI.
A good compromise would be if pydantic could take a standard Python dataclass and convert it into a new pydantic dataclass.
Right now, I'm doing this:
```python
from dataclasses import dataclass

from pydantic.dataclasses import dataclass as py_dataclass
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine
from sqlalchemy.orm import mapper, sessionmaker

metadata = MetaData()

audit_mixin = Table('audit_mixin', metadata,
                    Column('id', Integer, primary_key=True),
                    Column('performed_by', Integer, nullable=True))

user = Table('user', metadata,
             Column('user_id', Integer, primary_key=True),
             Column('name', String(50)),
             Column('email', String(100)),
             Column('fullname', String(50)),
             Column('nickname', String(12)))

@dataclass
class AuditMixin:
    id: int
    performed_by: int

@py_dataclass
class AuditMixinPy(AuditMixin):
    pass

@dataclass
class User(AuditMixin):
    user_id: int
    identity: str
    identity_type: str
    row_status: str
    comments: str

@py_dataclass
class UserPy(User):
    pass

engine = create_engine("sqlite:///", echo=False)
metadata.create_all(engine)
session = sessionmaker(engine)()

mapper(AuditMixin, audit_mixin)
mapper(User, user)

u = User(100, 123, 1, "a", "dfd", "dfdf", "dfdd")
u.id
u.performed_by
session.add(u)
session.commit()

u = User("sss", 123, 1, "a", "dfd", "dfdf", 1)    # stdlib dataclass: wrong types pass silently
u = UserPy("sss", 123, 1, "a", "dfd", "dfdf", 1)  # pydantic dataclass: raises a ValidationError
```
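The idea in the snippet above: the stdlib dataclasses (AuditMixin, User) are the ones handed to SQLAlchemy's mapper, while the parallel pydantic subclasses (AuditMixinPy, UserPy) add validation on top. As the last two lines show, User silently accepts wrong types while UserPy raises a ValidationError.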
Hi!
This is a really interesting feature, and I would like to give it some thought. I was playing around with the SQLAlchemy mapper and pydantic, and the main problem was here:
```python
@no_type_check
def __setattr__(self, name, value):
    ...
    if not hasattr(self, '__fields_set__'):
        object.__setattr__(self, '__fields_set__', set())
```
This will cause some problems with `__eq__`, because `__eq__` checks all fields, but SQLAlchemy loads some private attributes. @samuelcolvin, what do you think about this approach? As far as I can see, no tests failed.
Yes, SQLAlchemy users would have to rewrite `__eq__` on their pydantic BaseModels, but I think it's much easier for developers.
That's an interesting idea.
Could you illustrate with a complete example?
Some dirty hacks, but I hope you understand the idea.
In pydantic/main.py:
```python
class BaseModel(...):
    ...
    @no_type_check
    def __setattr__(self, name, value):
        if self.__config__.extra is not Extra.allow and name not in self.__fields__:
            raise ValueError(f'"{self.__class__.__name__}" object has no field "{name}"')
        elif not self.__config__.allow_mutation:
            raise TypeError(f'"{self.__class__.__name__}" is immutable and does not support item assignment')
        elif self.__config__.validate_assignment:
            known_field = self.__fields__.get(name, None)
            if known_field:
                value, error_ = known_field.validate(value, self.dict(exclude={name}), loc=name, cls=self.__class__)
                if error_:
                    raise ValidationError([error_], self.__class__)
        self.__dict__[name] = value
        # these lines were added
        if not hasattr(self, '__fields_set__'):
            object.__setattr__(self, '__fields_set__', set())
        # end
        self.__fields_set__.add(name)
```
And the example model plus a test:
```python
import json
import typing as t

from pydantic import BaseModel, Extra, Field
from sqlalchemy import JSON, Column, Integer, MetaData, Table
from sqlalchemy.orm import mapper

metadata = MetaData()

agents = Table(
    'agents', metadata,
    Column('someid', Integer, primary_key=True, index=True),
    Column('raw_json', JSON)
)

def start_mappers():
    mapper(Agent, agents)  # originally agent_models.Agent; same model as below

class Agent(BaseModel):
    someid: int = Field(...)
    raw_json: t.Optional[t.Dict[str, t.Any]] = Field(None)

    def __eq__(self, other):
        if isinstance(other, Agent):
            return self.someid == other.someid
        else:
            return False

    class Config:
        extra = Extra.allow
        allow_mutation = True
        orm_mode = True

def test_agent_mapper_can_load_lines(session):
    some_dict = {"key": "value"}
    raw_json = json.dumps(some_dict)
    session.execute(
        'INSERT INTO agents (someid, raw_json) VALUES '
        '(1, NULL),'
        '(2, :raw_json)', dict(raw_json=raw_json)
    )
    session.commit()

    expected = [
        Agent(someid=1, raw_json={}),
        Agent(someid=2, raw_json=some_dict),
    ]
    assert session.query(Agent).all() == expected
```
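A note on the config above: with the patched `__setattr__`, `extra = Extra.allow` appears necessary because SQLAlchemy's instrumentation sets attributes like `_sa_instance_state` that are not declared pydantic fields, and the custom `__eq__` compares only `someid` because those extra attributes would otherwise break field-by-field equality.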
It is indeed quite an interesting use case to use @dataclass with classical mapping.
I'm using pydantic for request validation (the dataclass instance is sent around in the application), but I also wanted to persist the values in this pydantic instance. So here's my solution/hack to implement this transparently.
Model definitions:
```python
from dataclasses import dataclass

from pydantic import confloat
from pydantic.dataclasses import dataclass as pydantic_dataclass

@pydantic_dataclass
class PydanticRequest:
    test: confloat(ge=10, le=100)

@dataclass                                       # this then takes the annotations and builds a new dataclass ;)
@duplicates_pydantic_dataclass(PydanticRequest)  # this prepares __annotations__ dynamically
class Request:
    pass
```
And the decorator that does the job:
```python
def duplicates_pydantic_dataclass(pydantic_cls):
    def wrapper(cls):
        cls.__annotations__ = {}
        for field_key, field_value in pydantic_cls.__dataclass_fields__.items():
            cls.__annotations__[field_key] = field_value.type
        return cls
    return wrapper
```
This way I transparently use pydantic for validation:
```python
pr = PydanticRequest(test=11.0)
```
and the SQLAlchemy mappers are defined for `Request`!
For CRUD operations, I have the following:
```python
from dataclasses import asdict

def instantiate_with_data(cls, instance):
    return cls(**asdict(instance))
```
```python
>>> pr = PydanticRequest(test=11.0)
>>> r = instantiate_with_data(Request, pr)
>>> session.add(r)
```
It works alright, although it only suits _simple_ objects and mappings.
Just to add, since it isn't mentioned here:
Since v1.7, pydantic has supported automatically creating pydantic dataclasses from stdlib dataclasses; see #1817.
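For illustration, a minimal sketch of that feature (behaviour per the v1.7 docs; the class here is hypothetical):
```python
import dataclasses

import pydantic

@dataclasses.dataclass
class User:
    id: int
    name: str

# wrapping an existing stdlib dataclass produces a validating pydantic dataclass
ValidatedUser = pydantic.dataclasses.dataclass(User)

ValidatedUser(id='123', name='spam')    # '123' is coerced to int 123
# ValidatedUser(id='abc', name='spam') would raise a ValidationError
```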