Pydantic: Custom Root Types - validate(), dict() not as expected

Created on 28 Jan 2020 · 10 Comments · Source: samuelcolvin/pydantic

             pydantic version: 1.4
            pydantic compiled: False
                 install path: /home/rtbs106/rtb-test/pydantic/pydantic
               python version: 3.7.5 (default, Nov 20 2019, 09:21:52)  [GCC 9.2.1 20191008]
                     platform: Linux-5.3.0-26-generic-x86_64-with-Ubuntu-19.10-eoan
     optional deps. installed: ['typing-extensions', 'email-validator', 'devtools']

pydantic offers a feature called "Custom Root Type": https://pydantic-docs.helpmanual.io/usage/models/#custom-root-types

For now, support for such objects looks generally poor across the available BaseModel methods.

`validate()` function used by `fastapi` does not consider this setting at all:

from pydantic import BaseModel
from typing import List

class SubModel(BaseModel):
    name: str

class ListModel(BaseModel):
    __root__: List[SubModel]

def test_validate():
    list_model = ListModel.validate([{"name": "foo"}])
    assert isinstance(list_model, ListModel)

This should pass; however, pydantic.errors.DictError: value is not a valid dict is raised instead.
The validate() function also isn't documented at all.
While parse_obj() does handle this case, it does not implement other features that validate() has, for example cls.__config__.orm_mode.
Also, these two functions look pretty much the same. What are the differences between them?

`dict()` method used by `fastapi` returns a value other than expected

While this is documented, and dict() probably should not return anything other than a dict, at the moment there is no function that is the opposite of parse_obj(), i.e. one returning what dict() returns for normal models and the bare __root__ value for custom root type objects. Maybe serialize_obj() should be added?
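
A minimal sketch of what such a helper could look like, written as a free function rather than a BaseModel method. The name serialize_obj comes from the suggestion above and is hypothetical, not a pydantic API. The sketch assumes only that a model exposes dict() and the __custom_root_type__ class flag, so the stand-in classes below mimic those two attributes instead of importing pydantic.

```python
from typing import Any, Dict


def serialize_obj(model: Any) -> Any:
    """Hypothetical inverse of parse_obj(): unwrap __root__ for custom root models."""
    data = model.dict()
    # Assumption: custom root models set __custom_root_type__ = True on the class
    # and dict() returns {"__root__": <value>} for them.
    if getattr(type(model), "__custom_root_type__", False):
        return data["__root__"]
    return data


# Stand-ins that mimic the two attributes the helper relies on (not pydantic classes).
class FakeListModel:
    __custom_root_type__ = True

    def dict(self) -> Dict[str, Any]:
        return {"__root__": [{"name": "foo"}]}


class FakeSubModel:
    __custom_root_type__ = False

    def dict(self) -> Dict[str, Any]:
        return {"name": "foo"}


print(serialize_obj(FakeListModel()))  # [{'name': 'foo'}]
print(serialize_obj(FakeSubModel()))   # {'name': 'foo'}
```

With real models the same function would presumably be called as serialize_obj(list_model). Custom root models nested inside ordinary models are not handled by this sketch.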

Custom Root Type could probably use a single `init` parameter.

This would allow the __root__ argument to be hidden completely.

Custom Root Type models should be separate classes?

Handling them is a separate concern, and putting `if __custom_root_type__` checks everywhere does not seem reasonable.

peku33

👍7

All 10 comments

Reference: #730 (.json() behavior is changed to unwrap __root__)

phy25 on 29 Jan 2020

Yes, .json() works as expected; however, for now there is no way to get the actual value as anything other than a string.

peku33 on 29 Jan 2020

dict() behaviour will change in v2 to default to returning the root object, but that's backwards incompatible so can't happen now. It's also not a bug.

Regarding validate(), it's used internally when validating sub-models, it should probably be explicitly private in v2.

If there are problems with fastAPI, please create an issue there.

samuelcolvin on 30 Jan 2020

👍2

I think this may be related, apologies if not...

But I would really love to be able to instantiate a model that uses __root__ without having to use __root__ as a key in the input data.

This is particularly in the case where the rooted model is a child attr of the parent model you are instantiating.

I realise there is one case which works, you can:

class RootedModel(BaseModel):
    __root__: Dict[str, str]

RootedModel.parse_obj({"dynamic_field": "some value"})

But this fails as soon as you want to instantiate the parent:

>>> class ParentModel(BaseModel):
...     rooted: RootedModel
...
>>> ParentModel.parse_obj({"rooted": {"dynamic_field": "some value"}})

ValidationError: 1 validation error for ParentModel
rooted -> __root__
  field required (type=value_error.missing)

I don't really like that the __root__ "internal special name" is exposed to the public data model at all.

It would be great if the existing special case, for parse_obj, was consistently used everywhere so that instantiation of sub-objects in the same fashion can succeed.
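
Until that happens, one possible stopgap is to wrap the nested value under the __root__ key before parsing. The sketch below is plain dictionary manipulation; wrap_root and its parameters are hypothetical names. Whether the wrapped payload then validates is an assumption based on the error message above (which asks for rooted -> __root__), and has not been checked against every pydantic version.

```python
from typing import Any, Dict


def wrap_root(data: Dict[str, Any], field: str) -> Dict[str, Any]:
    """Return a copy of data with data[field] nested under a "__root__" key."""
    wrapped = dict(data)  # shallow copy so the caller's dict is not mutated
    wrapped[field] = {"__root__": wrapped[field]}
    return wrapped


payload = {"rooted": {"dynamic_field": "some value"}}
print(wrap_root(payload, "rooted"))
# {'rooted': {'__root__': {'dynamic_field': 'some value'}}}
```

The result could then be passed on, e.g. ParentModel.parse_obj(wrap_root(payload, "rooted")). This still leaks __root__ into calling code, so it is a workaround rather than a fix.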

anentropic on 1 Apr 2020

Well, it seems that an easy workaround in this example is to eliminate RootedModel and just:

class ParentModel(BaseModel):
    rooted: Dict[str, str]

ParentModel.parse_obj({"rooted": {"dynamic_field": "some value"}})

...in which case I am not really sure what the point of __root__ is

Well, on large models it allows the validation etc. to be separated more logically. Or maybe you want to reuse the definition of RootedModel across multiple parents. So yes, it would be nice if it was more usable.

anentropic on 1 Apr 2020

👍1

Weird, I had the opposite issue: able to instantiate via a parent class but not directly. To overcome this, I'm now detecting direct instantiations and fixing the parameters passed to BaseModel, explicitly setting the __root__ keyword arg:

from enum import Enum
from pydantic import BaseModel

class MovieGenre(BaseModel):
    class MovieGenreEnum(str, Enum):
        Action = "Action"
        Drama = "Drama"
    __root__: MovieGenreEnum

    def __init__(self, *args, **kwargs):
        if len(args) == 1 and isinstance(args[0], str):
            # a genre was passed positionally - this is a direct instantiation
            # (the nested enum must be qualified; it is not in scope inside the method)
            super().__init__(__root__=MovieGenre.MovieGenreEnum(args[0]), **kwargs)
        else:
            super().__init__(*args, **kwargs)

A similar approach might solve your issue too. I wonder if there's a more elegant solution though.

AAlon on 21 Apr 2020

Regarding validation, validate_model() appears to be considered public.

from typing import Dict
from pydantic import BaseModel, validate_model

class StrDict(BaseModel):
    __root__: Dict[str, str]

value, fields_set, error = validate_model(StrDict, {'foo': 'bar'})
print(error)

yields

ValidationError(model='StrDict', errors=[{'loc': ('__root__',), 'msg': 'field required', 'type': 'value_error.missing'}])

Is this the intended behavior?

patrickkwang on 9 Jun 2020

yes, with validate_model you should use validate_model(StrDict, {'__root__': {'foo': 'bar'}})

samuelcolvin on 11 Jun 2020

Okay. The feeling I'm getting is that the output of dict() should validate, i.e.

foo: BaseModel = ...
value, fields_set, error = validate_model(foo.__class__, foo.dict())
assert error is None

for any foo that is an instance of a subclass of BaseModel.

This seems to be true currently, and if it is meant to be true generally, this indicates a validation bug that mirrors the dict() bug described in #1414.
```python
from typing import Dict
from pydantic import BaseModel, validate_model

class StrDict(BaseModel):
    __root__: Dict[str, str]

class NestedDict(BaseModel):
    v: StrDict

value, fields_set, error = validate_model(NestedDict, {'v': {'foo': 'bar'}})
print(error)
```

patrickkwang on 11 Jun 2020

Hi!
I guess we can just change the order in BaseModel.validate

     @classmethod
     def validate(cls: Type['Model'], value: Any) -> 'Model':
-        if isinstance(value, dict):
+        if cls.__custom_root_type__:
+            return cls.parse_obj(value)
+        elif isinstance(value, dict):
             return cls(**value)
         elif isinstance(value, cls):
             return value.copy()
         elif cls.__config__.orm_mode:
             return cls.from_orm(value)
-        elif cls.__custom_root_type__:
-            return cls.parse_obj(value)
         else:
             try:
                 value_as_dict = dict(value)