Pydantic: Add support for builder pattern

Created on 29 Nov 2020 · 6 Comments · Source: samuelcolvin/pydantic

Checks

  • [x] I added a descriptive title to this issue
  • [x] I have searched (google, github) for similar issues and couldn't find anything
  • [x] I have read and followed the docs and still think this feature/change is needed
  • [x] After submitting this, I commit to one of:

    • Look through open issues and help at least one other person

    • Hit the "watch" button on this repo to receive notifications and I commit to help at least 2 people that ask questions in the future

    • Implement a Pull Request for a confirmed bug

Feature Request

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.7.2
            pydantic compiled: True
                 install path: C:\Users\Foo\AppData\Local\Programs\Python\Python38\Lib\site-packages\pydantic
               python version: 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
                     platform: Windows-10-10.0.18362-SP0
     optional deps. installed: []

The problem I'm having now is that I have a model with lots of required fields and ideally, I'd like to be able to do something like

obj = ObjWithLotsofRequiredFields()
obj.a = ...
obj.b = ...
obj.c = ...
...
obj.z = ...

Unfortunately, I can't do that: a-z are required fields, so Pydantic doesn't let me instantiate ObjWithLotsofRequiredFields() without them.

As an alternative, I can use placeholders for storing a-z and instantiate ObjWithLotsofRequiredFields at the end, like

a = ...
b = ...
c = ...
...
z = ...
obj = ObjWithLotsofRequiredFields(
  a=a, 
  b=b, 
  c=c,
  ...,
  z=z
)

Unfortunately, this makes the code longer and creates duplication.

Hence, I think a builder pattern could come in handy. It would look something like

builder = ObjWithLotsofRequiredFieldsBuilder()
builder.a = ...
builder.b = ...
builder.c = ...
...
builder.z = ...
obj = builder.build()

Note that this code looks almost identical to the first snippet and a lot cleaner than the second one. The ObjWithLotsofRequiredFields object would only get instantiated and validated at the end, in build().

feature request

All 6 comments

Hello @tc8
I'm not sure it should be directly done by BaseModel as it makes things more complex imo. But adding this logic on a builder class is quite easy!

from pydantic import BaseModel


class BuildBaseModel(BaseModel):
    def __init__(self, **data):
        object.__setattr__(self, '__build_values__', data)

    def __setattr__(self, key, value):
        if hasattr(self, '__build_values__'):
            self.__build_values__[key] = value
        else:
            super().__setattr__(key, value)

    def build(self):
        if not hasattr(self, '__build_values__'):
            # The model has already been built
            return
        super().__init__(**self.__build_values__)


class Model(BuildBaseModel):
    a: str
    b: str
    c: str

m = Model()
m.a = 'q'
m.b = 'w'
m.c = 'e'
m.build()
m.build()
print(repr(m))  # Model(a='q', b='w', c='e')

Hope it helps!

Thanks @PrettyWood, that's pretty close to what I'm looking for but ideally I'd like to be able to differentiate the builder class from the model class. That way it's more explicit if something is being built and I can't accidentally pass a partially built model around in place of the real model.

Just move the methods over to a new class:

from __future__ import annotations

from typing import Dict, Generic, Type, TypeVar

from pydantic import BaseModel

TModel = TypeVar('TModel', bound=BaseModel)


class Foo(BaseModel):
    bar: str


class Builder(Generic[TModel]):
    model: Type[TModel]
    values: Dict[str, object]

    def __init__(self, model: Type[TModel]) -> None:
        super().__setattr__('model', model)
        super().__setattr__('values', {})

    def __setattr__(self, name: str, value: object) -> None:
        self.values[name] = value

    def build(self) -> TModel:
        return self.model(**self.values)


foo_builder = Builder(Foo)
foo_builder.bar = 1
foo = foo_builder.build()
assert foo.dict() == {'bar': '1'}
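
As a rough illustration of what "validated at the end" means for this separate builder, here is a small sketch. It repeats the Builder class so it runs on its own; the extra `baz` field and the `missing` variable are only there for the example and are not part of the original snippet. Nothing is checked while attributes are being set, so a forgotten required field only surfaces when build() is called.

```python
from typing import Dict, Generic, Type, TypeVar

from pydantic import BaseModel, ValidationError

TModel = TypeVar('TModel', bound=BaseModel)


class Foo(BaseModel):
    bar: str
    baz: int  # extra required field, added for this example only


class Builder(Generic[TModel]):
    model: Type[TModel]
    values: Dict[str, object]

    def __init__(self, model: Type[TModel]) -> None:
        # Bypass our own __setattr__ so these two land on the instance itself.
        super().__setattr__('model', model)
        super().__setattr__('values', {})

    def __setattr__(self, name: str, value: object) -> None:
        # No validation here: values are only collected.
        self.values[name] = value

    def build(self) -> TModel:
        # The model is instantiated (and validated) only at this point.
        return self.model(**self.values)


builder = Builder(Foo)
builder.bar = 'x'

missing = []
try:
    builder.build()  # 'baz' has not been set yet
except ValidationError as exc:
    # Each error dict carries the offending field name in its 'loc' tuple.
    missing = [err['loc'][0] for err in exc.errors()]

builder.baz = 2
foo = builder.build()  # succeeds now that every required field is present
```

The same builder instance can be completed and built again after a failed attempt, since the collected values are kept in `values`.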

In Rust there's a popular library, rust-derive-builder, that creates a builder for any struct from a single macro annotation.

Without getting into Rust specifics, a Python translation would look like the following.

class Spam(BaseModel):
    foo: str = "oof"
    bar: str
    baz: int

SpamBuilder = Builder(Spam)

# then elsewhere
spam: Spam = SpamBuilder().bar("rab").baz(10).build()

# or in multiple steps:
spamtmp = SpamBuilder()
# some other code
spamtmp.foo("oof").bar("rab")  # each method returns self, so calls can be chained if required
spamtmp.baz(10)
# and finally
foo_final = spamtmp.build()

Makes working with data models so easy. My 2¢.

If this is the API we want, here is a basic POC (note that it won't work with cython as we use eval!)

from typing import Type

from pydantic import BaseModel


class Spam(BaseModel):
    foo: str = "oof"
    bar: str
    baz: int


def Builder(model: Type[BaseModel]):
    builder_code = f"""
class {model.__name__}Builder:
    def __init__(self):
        self.values = {{}}

    def build(self):
        return {model.__name__}(**self.values)"""

    for field_name in model.__fields__:
        builder_code += f"""

    def {field_name}(self, x):
        self.values['{field_name}'] = x
        return self"""

    locs = {}
    builder_code_str = compile(builder_code, f'__{model.__name__}Builder', 'exec')
    eval(builder_code_str, globals(), locs)
    return locs[f'{model.__name__}Builder']


SpamBuilder = Builder(Spam)
print(repr(SpamBuilder().foo('qwe').bar('qwe').baz(1).build()))
# Spam(foo='qwe', bar='qwe', baz=1)
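
For anyone put off by the eval, the same chainable API can probably be had without generating source code, by building the class with `type()` and one closure per field. This is only a sketch: `make_builder` and `_field_names` are names made up for the example, and the `model_fields` lookup is an assumption added so the snippet also runs on pydantic 2.x (the POC above targets 1.x and uses `__fields__`).

```python
from typing import Any, Callable, Dict, List, Type

from pydantic import BaseModel


class Spam(BaseModel):
    foo: str = "oof"
    bar: str
    baz: int


def _field_names(model: Type[BaseModel]) -> List[str]:
    # pydantic 1.x exposes fields as `__fields__`; `model_fields` is the 2.x name.
    fields = getattr(model, "model_fields", None)
    if fields is None:
        fields = model.__fields__
    return list(fields)


def make_builder(model: Type[BaseModel]) -> type:
    """Return a `<Model>Builder` class with one chainable setter per field."""

    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}

    def build(self) -> BaseModel:
        # Instantiation and validation happen only here.
        return model(**self.values)

    def make_setter(field_name: str) -> Callable[..., Any]:
        # A factory is needed so each setter closes over its own field name.
        def setter(self, value: Any):
            self.values[field_name] = value
            return self  # returning self is what allows chaining

        setter.__name__ = field_name
        return setter

    namespace: Dict[str, Any] = {"__init__": __init__, "build": build}
    for name in _field_names(model):
        namespace[name] = make_setter(name)
    return type(f"{model.__name__}Builder", (), namespace)


SpamBuilder = make_builder(Spam)
spam = SpamBuilder().foo('qwe').bar('qwe').baz(1).build()
print(repr(spam))
```

Because `build` closes over the model class directly, this version does not need the model to be visible in the module globals. One caveat it shares with the POC: a model field named `build` or `values` would clash with the builder's own attributes.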

> Just move the methods over to a new class:

You could add field validation to the builder by re-using some of the attribute setting logic:
```python
# Reuses Generic, Type, Dict and TModel from the snippet quoted above (pydantic 1.x API).
from pydantic import ValidationError


class Builder(Generic[TModel]):
    model: Type[TModel]
    values: Dict[str, object]

    def __init__(self, model: Type[TModel]) -> None:
        super().__setattr__('model', model)
        super().__setattr__('values', {})

    def __setattr__(self, name: str, value: object) -> None:
        if name not in self.model.__fields__:
            raise ValueError(f'"{name}" is not a valid attribute name for "{self.model.__name__}".')
        known_field = self.model.__fields__[name]
        # Pass the values collected so far, so validators that depend on other fields can see them.
        other_values = {k: v for k, v in self.values.items() if k != name}
        value, error_ = known_field.validate(value, other_values, loc=name, cls=self.model)
        if error_:
            raise ValidationError([error_], self.model)
        self.values[name] = value

    def build(self) -> TModel:
        return self.model(**self.values)
```
