It would be great if pydantic supported TypedDict.
import sys; print(sys.version): 3.7.4
import pydantic; print(pydantic.VERSION): 0.32.1

from pydantic import BaseModel
from mypy_extensions import TypedDict


class Data(TypedDict):
    a: int


class User(BaseModel):
    data: Data


if __name__ == '__main__':
    external_data = {
        'data': {
            'a': 'invalid',
        }
    }
    # should raise exception
    user = User(**external_data)
This is very close to dataclasses or pydantic's own models. What is the use case where TypedDict would be preferable to dataclasses or models?
My primary use case is integrating pydantic into existing source code (which currently uses marshmallow) where plain dictionaries are used. Changing all the source code to use BaseModel instead of plain dicts is infeasible. I can obviously define a pydantic model and then call .dict(), but it will most definitely incur a significant performance penalty (the project processes an event stream and does hundreds or even thousands of validations per second).
I see.
Most of the overhead of parsing data is not in calling .dict(); it's in the actual parsing and (to a lesser extent) in building the model of what the data should look like.
As long as you're creating the models once and then just calling .dict() on them, the performance won't be very different from if we implemented support for TypedDict.
In fact, raw models will be slightly faster: with dataclasses we create a hidden model to do the actual validation, and we would probably have to do the same for TypedDict. So you would have normal pydantic performance plus some overhead.
By the way dict(model) will be slightly faster than model.dict() since it doesn't have to worry about all the exclude/include logic.
Given integration requirements with non-pydantic packages/apis (where you would have to load to / dump from a pydantic model), I could also imagine TypedDict having some static type-checking benefits over subclasses of BaseModel (at least, without the pycharm and/or as-yet-unreleased mypy plugin), since replacing with BaseModel would drop the static checking of keyword arguments.
I looked into this a little, and it looks like it may be difficult to determine at runtime whether a given type is actually a TypedDict. So far, the best check I can find is:
def is_typed_dict_type(type_: AnyType) -> bool:
    # bool() so the function really returns a bool rather than the annotations dict or None
    return lenient_issubclass(type_, dict) and bool(getattr(type_, '__annotations__', None))
@roganov If this is of critical importance to you, I think the same approach used to get validation for Literal types might work here. In particular, you'd need to write and incorporate TypedDict analogs of make_literal_validator and is_literal_type in the appropriate places. I don't currently have the time to implement this myself, but would review a pull request for it. (Though I would also understand if @samuelcolvin wanted to veto in favor of limiting scope creep of supported typing_extensions types.)
If you want to try implementing it yourself, here's a start, though I expect it may require some tweaks before it fully integrates into the field building process:
from typing import Any, Callable

from typing_extensions import TypedDict

from pydantic import BaseModel, AnyType


def make_typed_dict_validator(type_: Any) -> Callable[[Any], Any]:
    # Build a throwaway model that reuses the TypedDict's annotations as its fields
    class TypedDictModel(BaseModel):
        __annotations__ = type_.__annotations__

    TypedDictModel.__name__ = type_.__name__

    def typed_dict_validator(v: Any) -> Any:
        # Parse through the model, then hand back a plain dict
        return TypedDictModel(**v).dict()

    return typed_dict_validator


def is_typed_dict_type(type_: AnyType) -> bool:
    return issubclass(type_, dict) and bool(getattr(type_, '__annotations__', None))


assert not is_typed_dict_type(BaseModel)
assert not is_typed_dict_type(dict)
assert not is_typed_dict_type(TypedDict)


class A(TypedDict):
    x: int


assert is_typed_dict_type(A)

validator = make_typed_dict_validator(A)
print(validator({"x": 1}))
# {'x': 1}
print(validator({"x": "x"}))
"""
pydantic.error_wrappers.ValidationError: 1 validation error for A
x
  value is not a valid integer (type=type_error.integer)
"""
Alternatively, you may be able to come up with a way to use code like the above to produce a custom type that you can use to validate your TypedDicts.
> I can obviously define a pydantic's model and then call .dict(), but it will most definitely incur a significant performance penalty (the project processes event stream and does hundreds or even thousands of validations per second).
The nature of pydantic validation is that it is done via parsing -- I don't think that you'll be able to use pydantic to validate a TypedDict without essentially having it parse the dict as a model.
That said, thanks to cythonization, __slots__, and other performance-oriented design choices, using pydantic may not be much slower than hand-crafted checks (at least if those checks are implemented in Python).
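As a rough illustration of what "validation via parsing" means, here is a toy standard-library sketch. It is not how pydantic is implemented and its coercion rules are far simpler. It only handles flat TypedDicts whose field types are callable constructors such as int or str, and the names Event and parse_as are hypothetical.

```python
from typing import Any, Dict, TypedDict, get_type_hints


class Event(TypedDict):
    id: int
    name: str


def parse_as(typed_dict: type, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Toy parser: coerce each value to its annotated type (flat, simple types only)."""
    parsed: Dict[str, Any] = {}
    for key, expected in get_type_hints(typed_dict).items():
        if key not in raw:
            raise ValueError(f"missing field: {key}")
        try:
            # "Parsing" here just means calling the annotated type on the raw value
            parsed[key] = expected(raw[key])
        except (TypeError, ValueError) as exc:
            raise ValueError(f"{key}: not a valid {expected.__name__}") from exc
    return parsed


event = parse_as(Event, {"id": "42", "name": "login"})
print(event)

try:
    parse_as(Event, {"id": "invalid", "name": "login"})
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The result is a new, coerced dict rather than a yes/no verdict on the input, which is why the work is essentially the same as building a model.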
Thanks @dmontagu I'll try implementing this hopefully next week.
This feature is now present in Python 3.8 (https://docs.python.org/3/library/typing.html#typing.TypedDict)
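For context, a minimal sketch of what the standard-library version does and does not do at runtime (Python 3.8+). The class name Data mirrors the example at the top of the thread; nothing here involves pydantic.

```python
from typing import TypedDict


class Data(TypedDict):
    a: int


# Calling the TypedDict just builds a plain dict; the annotation is not enforced
# at runtime (a static type checker would flag this call, the interpreter does not).
data = Data(a="invalid")
print(type(data) is dict)
print(data)
```

So the built-in TypedDict covers static checking only, and runtime validation still has to come from somewhere else.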
Sometimes python amazes me.
NamedTuple, dataclass, TypedDict. It doesn't really fit with the zen of python:
There should be one-- and preferably only one --obvious way to do it.
Hi guys, this is a highly requested feature :)
Do you have any plans for its implementation?
Happy to accept a PR to implement it.
I don't think in the short term I'll be building it myself.
Sorry for spamming, but maybe this will be useful to someone as a quick solution (as it was in my case).
Let's say we have the following data structure:
from typing import Dict, List, Optional, TypedDict
from uuid import UUID


class SessionUser(TypedDict):
    id: int
    name: str
    uuid: UUID
    email: str
    username: str


class SessionToken(TypedDict):
    user: SessionUser
    session_id: UUID
then it could be parsed as:
from typing import Any, Type
# Note: _TypedDictMeta is a private name in typing and may change between Python versions
from typing import _TypedDictMeta as TypedDictMeta

from pydantic import BaseModel, create_model

# Cache of generated models, keyed by the TypedDict class
types: dict = {}


def parse_dict(typed_dict: TypedDictMeta) -> Type[BaseModel]:
    annotations = {}
    for name, field in typed_dict.__annotations__.items():
        if isinstance(field, TypedDictMeta):
            # Nested TypedDict: build a nested model, required field
            annotations[name] = (parse_dict(field), ...)
        else:
            default_value = getattr(typed_dict, name, ...)
            annotations[name] = (field, default_value)
    return create_model(typed_dict.__name__, **annotations)


def as_typed_dict(
    json_dict: Dict[str, Any],
    typed_dict: TypedDictMeta,
) -> Dict[str, Any]:
    model = types.get(typed_dict)
    if not model:
        model = types[typed_dict] = parse_dict(typed_dict)
    return model(**json_dict).dict()


as_typed_dict({...}, SessionToken)
It would be super cool if one could get a TypedDict from a pydantic class.
Use case: I have code which uses dictionaries a lot. At some point I will convert them to the Pydantic class, but it would help for refactoring to have an in-between state where I use a TypedDict derived from a pydantic class.
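One possible way to approach this at runtime is the functional TypedDict syntax, sketched below with the standard library only. UserModel is a plain annotated class standing in for a model, and typed_dict_from is a hypothetical helper. Whether the same approach works unchanged on a real BaseModel subclass is an assumption that is not verified here.

```python
from typing import TypedDict, get_type_hints


class UserModel:
    # Stand-in for a model class; only its annotations matter in this sketch.
    id: int
    name: str


def typed_dict_from(cls: type) -> type:
    """Build a TypedDict with the same field names and types as ``cls``."""
    return TypedDict(cls.__name__ + "Dict", get_type_hints(cls))


UserDict = typed_dict_from(UserModel)
derived_hints = get_type_hints(UserDict)
print(derived_hints)
print(UserDict(id=1, name="alice"))
```

A caveat on the design: as far as I know, static type checkers such as mypy expect TypedDict definitions to be written out literally, so a TypedDict created dynamically like this is mainly useful for runtime introspection and would likely not give the static checking benefits during refactoring.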
I also have a similar use case. I use celery for managing tasks. The biggest challenge with that is serialisation between tasks (since pickle isn't feasible). To keep everything JSON serialisable I like to use TypedDict, but often I'd like to convert those into a pydantic model instead. Custom classes are obviously advantageous over dicts in many cases.
My challenge is to do this in a "static type safe" way that is also performant. I feel like there is a lot to gain from pydantic for this use case, but I don't feel I have cracked the code completely yet.
Adding basic support for TypedDict is easy and doesn't cost much. Seeing the large number of upvotes, I reckon it can be added. I opened a PR for this. Feedback is more than welcome.