Pydantic: Generate actual signature for BaseModel.__init__

Created on 25 Nov 2019  路  22Comments  路  Source: samuelcolvin/pydantic

Feature Request

That idea came for me when I saw this issue:
https://github.com/tiangolo/fastapi/issues/318
And this particular comment that shows that standart inspect tools unable to determine signature of a pydantic model:

@tiangolo, I had an opportunity to fiddle a little bit more with this today.
Here is a full example of the different ways to specify parameters using depends: https://gist.github.com/Atheuz/075f4d8fe3b56d034741301ba2574ef1

You can run it using this command uvicorn run:app --reload --host 0.0.0.0 --port 8080

I essence: dataclass classes work if I specify Query(None) and Depends, pure classes work if I specify Query(None) and use Depends, and input parameters in the signature just works.

Pydantic classes do not work, at least in terms of the generated docs, it just says data instead of the expected dt and to_sum.

dataclasses in the generated docs:
billede

pydantic in the generated docs:
billede

This, however is not true for dataclasses, where __init__ is generated on class creation.

@dataclass
class Dataclass:
    id: int
    name: str = 'John'

class Model(BaseModel):
    id: int
    name: str = 'John'

help(Dataclass.__init__)
'''
Help on function __init__:
__init__(self, id: int, name: str = 'John') -> None
'''
help(Model.__init__)
'''
Help on function __init__ in module pydantic.main:
__init__(__pydantic_self__, **data: Any) -> None
    Initialize self.  See help(type(self)) for accurate signature.
'''

Of course, we cannot follow dataclasses in this part and generate __init__ of BaseModel, since it will cause getting TypeError instead of ValidationError on missing values, etc., but we still can help inspect tools to see the right signature of our models.

from functools import update_wrapper
import pydantic

def __init__(self, *, id: int, name: str = 'John') -> None: ...
update_wrapper(Model.__init__, __init__)
help(Model.__init__)
'''
Help on function __init__ in module __main__:
__init__(self, *, id: int, name: str = 'John') -> None
'''

Possible approaches

I see this ways of creating typed __init__:

  • Generate it as a string and use eval()
  • Create function normally and modify its __code__ signature
    and then use functools.update_wrapper(Model.__init__, typed_func)

Example

# using `Model` declared above

def f(): ...

f.__code__ = f.__code__.replace(   # __code__.replace() available only in python >= 3.8
    co_kwonlyargcount=len(Model.__fields__),
    co_argcount=1,
    co_varnames=('__pydantic_self__',) + tuple(Model.__fields__)
)

defaults, annotations = {}, {}
for name, field in Model.__fields__.items():
    if field.default:
        defaults[name] = field.default
    annotations[name] = field.type_

f.__kwdefaults__ = defaults
f.__annotations__ = annotations
help(f)
'''
Help on function f in module __main__:
f(__pydantic_self__, *, id: int, name: str = 'John')
'''
feature request help wanted

All 22 comments

Your approach sounds very good to me, PR welcome.

Questions:

  • I assume this would not impact performance. Am I right?
  • Are there any situations where this could cause problems or backwards incompatibility? E.g. do we need to wait for v2? I assume not.
  • Are there any other places where this could help? E.g. ipython maybe? Presumably not pycharm, vscode or vim?

Would it work to just set the __signature__ attribute of the model __init__? This is enough to work with inspect.signature, which is used, for example, by FastAPI; I'm not sure how it relates to the other tools discussed above.

If you just need to update __signature__, I have a variety of examples doing this for various fastapi-related purposes that may be useful for reference.

See, for example, here:
https://gist.github.com/dmontagu/87e9d3d7795b14b63388d4b16054f0ff#file-fastapi_cbv-py-L37-L47

Also, I have worked through most of the config-related logic for what the generated signature should look like in the mypy plugin. So that might provide a good reference, e.g., if Config.allow_extra is False or similar, and you want to be really careful.

I assume this would not impact performance. Am I right?

This has relatively small impact only on model creation.

Are there any situations where this could cause problems or backwards incompatibility? E.g. do we need to wait for v2? I assume not.

If we implement this right (e. g. respect __code__ differences between python versions), no, I don't think so.

Question to think about:

What if subclass of a model declares __init__? Would we merge it's signature, rewrite it or leave as is?

class MyModel(BaseModel):
    id: int
    name: str = 'John'    

    def __init__(self, foo: int = 41, **data):
        self.foo = foo + 1 
        return super().__init__(**data)

Would it work to just set the __signature__ attribute of the model __init__? This is enough to work with inspect.signature, which is used, for example, by FastAPI; I'm not sure how it relates to the other tools discussed above.

@dmontagu , just to be clear, do you suggest construct new signature from scratch and set it to __init__, or generate function and copy its signature to __init__?

Generate a new signature from scratch; it's not too hard using the objects from the inspect module.

Sounds good. Having __signature__ is enough for FastAPI to understand right parameters.

However, I'm not sure if it's gonna work everywhere, as __signature__ is not present in function type by default and used only in inspect module.

I think the best is to have all: valid __wrapped__, __kwdefaults__, __annotations__ and __signature__ together to make sure any kind of inspection will work properly.

Oh, looks like there's case when we can't use __signature__:

        if not name.isidentifier():
>           raise ValueError('{!r} is not a valid parameter name'.format(name))
E           ValueError: 'this-is-funky' is not a valid parameter name

@MrMrRobat
I think this-is-funky is an invalid signature name.
Is there a case to give a method an invalid name?
If I misread it, sorry.

I'm assuming this is coming from an alias? Yeah, I'm generally against giving methods signatures based on aliases.

In general, aliases don't seem to play super nicely with existing python tools, a reason I heavily prefer having allow_population_by_alias=True.

I think part of the problem is that __init__ is used for both parsing, where JSON schema conventions are important, and initializing in non-parsing contexts, where python conventions are important (e.g., for type checking, auto-docs-generation, IDEs, etc.).

The construct method mitigates this to some extent, but I think there is still some awkwardness, e.g., if you need your validators to run on the inputs. I think I personally would prefer if __init__ always expected non-aliases, and aliases were only relevant for the parse methods.

(But I recognize obviously that's a huge breaking change from how things work now, so will probably never happen, and others may not feel the same way I do on this point anyway 馃槄.)

I think this-is-funky is an invalid signature name.
Is there a case to give a method an invalid name?

Not a method, but field.

I'm assuming this is coming from an alias?

In comes from test_create_model:
https://github.com/samuelcolvin/pydantic/blob/c71326d4a6898612597d3c647a4256f168818e30/tests/test_create_model.py#L127-L136

@MrMrRobat
We should ignore an invalid field name to create __signature__ when using create_model() with an invalid name.
Also, Someone may want to add valid field to __sigunature___ on Model which is created by create_model()

model = create_model('FooModel', **{'this-is-funky': (int, 1), 'bar': (int, ...)})
m = model(**{'this-is-funky': '123', 'bar': '456'})
assert m.dict() == {'this-is-funky': 123, 'bar': 456}

print(inspect.signature(model))
# > (**data: Any) -> None

print(model(bar='789'))
# > FooModel this-is-funky=1 bar=789

I would prefer having something like:

print(inspect.signature(model))
# > (*, bar = '456', **data: Any) -> None

print(model(bar='789'))
# > FooModel this-is-funky=1 bar=789

Though it will be more difficult to implement.

BTW, maybe we should forbid to use such names even in create_model in future and force user use an alias, @samuelcolvin , @dmontagu?

BTW, maybe we should forbid to use such names even in create_model in future and force user use an alias

Perhaps we could do that in v2, but for now, I think

print(inspect.signature(model))
#> (*, bar = '456', **data: Any) -> None

should work well, though I realise it won't be particularly easy.

Done. Also, I think we should leave **data even if all fields are valid identifiers and mentioned in new signature, as actual __init__ takes extra fields and passes them in validate_model

@MrMrRobat If Config.allow_extra=False you could remove the **data, and you could also inspect for aliases. Not sure if the extra complexity is worth it though.

What exactly do you mean by inspect for aliases?

What exactly do you mean by inspect for aliases?

I added a comment on the PR that I think clarifies what I was trying to get at with this.

Given the complexity and amount of effort going into #1034, I just want to revisit the purpose of going beyond setting the __signature__.

Just setting __signature__ would involve a smaller change, would not depend on compilation, would not need to use different APIs for different versions of python, and would accomplish the same end goal for the FastAPI swagger docs (using the same APIs FastAPI uses internally).

I recognize from a theoretical perspective this is perhaps inferior to the high degree of fidelity achievable by updating all of the associated attributes, but I also think this approach will add a substantially larger on-going maintenance burden. So I'm curious -- are there any frequently used tools that will benefit from going this extra mile? The only thing I can think of is the built-in help function, but I've never seen anyone make much use of that on a regular basis. (But I also recognize I may not know everyone's usage patterns.)

Given this is basically finished maybe it's just worth keeping, but I at least just want to sanity-check that we still think this is worth it.


Also, I think we should leave **data even if all fields are valid identifiers and mentioned in new signature, as actual __init__ takes extra fields and passes them in validate_model

Note that leaving **data in the signature means that data will show up as a parameter in the FastAPI docs. (I suppose we could/should address that in FastAPI, but it will probably be at least a little bit annoying to do so.) It might at least be good to respect the Config.extra setting here as a way to disable this.

Just setting __signature__ would involve a smaller change, would not depend on compilation, would not need to use different APIs for different versions of python...

What exactly would be simplified if we'd went with __signature__ only? All effort goes to solving problem with copying __init__ so we can set __signature__ or whatever. If we would have __signature__ only, we still would have the same problem.

Quick example:

```py
class Model(BaseModel):
a: int
b: str

Model.__init__.__signature__ = 'my_shiny_signature: a: int, b: str'
print(BaseModel.__init__.__signature__) # my_shiny_signature: a: int, b: str

class Model2(BaseModel)
foo: str
bar: int

Model.__init__.__signature__ = 'my_shiny_signature: foo: str, bar: int'
print(Model.__init__.__signature__) # my_shiny_signature: foo: str, bar: int
print(BaseModel.__init__.__signature__) # my_shiny_signature: foo: str, bar: int

@dmontagu this was my thought too, bit or doesn't work as far as I could see.

The problem is that there's only one __init__ function which is used by all subclasses. If you change an attribute of that funding it'll be changed in that one function and thereby on all subclasses.

There seems to be no way to copy the __init__ function on cython.

As discussed on #1034 it would be much easier to set Model.__signature__ than Model.__init__.__signature__. Would that solve the original issue here?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jasonkuhrt picture jasonkuhrt  路  19Comments

bradodarb picture bradodarb  路  22Comments

dmfigol picture dmfigol  路  38Comments

Yolley picture Yolley  路  18Comments

rrbarbosa picture rrbarbosa  路  35Comments