This request is to have pydantic models auto-generate docstrings for their parameters, reading each parameter's Schema object for more information. The end result would be model classes whose __doc__ provides details about the parameters the model has. This would be useful for people who generate docs for their models through a program like Sphinx, which could then pick up the more complete docstring automatically.
This can, and probably should, be an optional thing the user sets or calls, since it requires overwriting the __doc__ attribute.
This may turn out to be too dependent on individual user preferences of docstring style to have any viable officially supported flavor(s) in pydantic, but I wanted to propose it anyway.
Below is a crude toy implementation with examples showing the outputs. I have tested it on Python 3.6 and 3.7 with pydantic versions 0.26 and 0.29, and it should run as-is with no external dependencies beyond pydantic itself.
Foreseeable difficulties:
Schema and not variables
Known issues with toy implementation:
Schema vs. non-Schema parameters
:class: TargetClass instead of any further docstring description (which would render in Sphinx's RST format as a link to that class in the docs, not exactly helpful in all cases though)
from enum import Enum
from textwrap import dedent, indent
from typing import Tuple, Dict
from pydantic import BaseModel, Schema, confloat, BaseSettings, validator, ValidationError
####################################
# Start of Auto-Doc Generation block
####################################
class _JsonRefModel(BaseModel):
    """
    Reference model for Json replacement fillers
    Matches style of:
    ``'allOf': [{'$ref': '#/definitions/something'}]}``
    and will always be a length 1 list
    """
    allOf: Tuple[Dict[str, str]]

    @validator("allOf", whole=True)
    def all_of_entries(cls, v):
        value = v[0]
        if len(value) != 1:
            raise ValueError("Dict must be of length 1")
        elif '$ref' not in value:
            raise ValueError("Dict needs to have key $ref")
        elif not isinstance(value["$ref"], str) or not value["$ref"].startswith('#/'):
            raise ValueError("$ref should be formatted as #/definitions/...")
        return v

def doc_formatter(target_object):
    """
    Set the docstring for a Pydantic object automatically based on the parameters
    This could use improvement.
    """
    doc = target_object.__doc__
    # Handle non-pydantic objects
    if doc is None:
        new_doc = ''
    elif 'Parameters\n' in doc or not (issubclass(target_object, BaseSettings) or issubclass(target_object, BaseModel)):
        new_doc = doc
    else:
        # NOTE: 'boolan' is a typo for 'boolean'; as written, bool fields are
        # not remapped and are printed with the JSON Schema name 'boolean'.
        type_formatter = {'boolan': 'bool',
                          'string': 'str',
                          'integer': 'int',
                          'number': 'float'
                          }
        # Add the white space
        if not doc.endswith('\n\n'):
            doc += "\n\n"
        new_doc = dedent(doc) + "Parameters\n----------\n"
        target_schema = target_object.schema()
        # Go through each property
        for prop_name, prop in target_schema['properties'].items():
            # Catch lookups for other Pydantic objects
            if '$ref' in prop:
                # Pre 0.28 lookup
                lookup = prop['$ref'].split('/')[-1]
                prop = target_schema['definitions'][lookup]
            elif 'allOf' in prop:
                # Post 0.28 lookup
                try:
                    # Validation, we don't need output, just the object
                    _JsonRefModel(**prop)
                    lookup = prop['allOf'][0]['$ref'].split('/')[-1]
                    prop = target_schema['definitions'][lookup]
                except ValidationError:
                    # Doesn't conform, pass on
                    pass
            # Get common properties
            prop_type = prop["type"]
            new_doc += prop_name + " : "
            prop_desc = prop['description']
            # Check for enumeration
            if 'enum' in prop:
                new_doc += '{' + ', '.join(prop['enum']) + '}'
            # Set the name/type of object
            else:
                if prop_type == 'object':
                    prop_field = prop['title']
                else:
                    prop_field = prop_type
                new_doc += f'{type_formatter[prop_field] if prop_field in type_formatter else prop_field}'
            # Handle Classes so as not to re-copy pydantic descriptions
            if prop_type == 'object':
                if not ('required' in target_schema and prop_name in target_schema['required']):
                    new_doc += ", Optional"
                prop_desc = f":class:`{prop['title']}`"
            # Handle non-classes
            else:
                if 'default' in prop:
                    default = prop['default']
                    try:
                        # Get the explicit default value for enum classes
                        if issubclass(default, Enum):
                            default = default.value
                    except TypeError:
                        # Default is a plain value rather than a class; use it as-is
                        pass
                    new_doc += f", Default: {default}"
                elif not ('required' in target_schema and prop_name in target_schema['required']):
                    new_doc += ", Optional"
            # Finally, write the detailed doc string
            new_doc += "\n" + indent(prop_desc, "    ") + "\n"
    # Assign the new doc string
    target_object.__doc__ = new_doc

########################
# Start of Example block
########################
class FruitEnum(str, Enum):
    apple = "apple"
    orange = "orange"


class Taxes(BaseModel):
    """The State and Federal Taxes charged for operation"""
    state: float = 0.06
    federal: float = 0.08
    city: float = None


class FruitStandNoDoc(BaseModel):
    """
    My fruit stand that I sell various things from
    """
    fruit: FruitEnum = FruitEnum.apple
    stock: int
    price: confloat(ge=0) = 0.6
    advertising: str = None
    currently_open: bool = False
    taxes: Taxes = Taxes()


class FruitStand(BaseModel):
    """
    My fruit stand that I sell various things from
    """
    fruit: FruitEnum = Schema(
        FruitEnum.apple,
        description="The fruit which I have available at my stand"
    )
    stock: int = Schema(
        ...,
        description="How many of each fruit to keep on hand"
    )
    price: float = Schema(
        0.60,
        description="Price per piece of fruit",
        ge=0
    )
    advertising: str = Schema(
        None,
        description="Advertising message to display"
    )
    currently_open: bool = Schema(
        False,
        description="Is the fruit stand open or not?"
    )
    taxes: Taxes = Schema(
        Taxes(),
        description="Taxes charged by the state and local level"
    )

print(FruitStandNoDoc.__doc__)
print('-'*20)
print(FruitStand.__doc__)
print('-'*20)
doc_formatter(FruitStand)
print(FruitStand.__doc__)
Outputs the following lines:
My fruit stand that I sell various things from
--------------------
My fruit stand that I sell various things from
--------------------
My fruit stand that I sell various things from

Parameters
----------
fruit : {apple, orange}, Default: apple
    The fruit which I have available at my stand
stock : int
    How many of each fruit to keep on hand
price : float, Default: 0.6
    Price per piece of fruit
advertising : str, Optional
    Advertising message to display
currently_open : boolean, Default: False
    Is the fruit stand open or not?
taxes : Taxes, Optional
    :class:`Taxes`
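The two lookup branches in doc_formatter (a bare $ref before 0.28, a single-entry allOf wrapper afterwards) could probably be pulled out into one small helper. The following is only a sketch under assumptions: resolve_ref is a hypothetical name, the schema fragments are hand-written for illustration rather than captured pydantic output, and it checks the wrapper with plain isinstance tests instead of the _JsonRefModel validation used above.

```python
def resolve_ref(prop, definitions):
    """Return the definition a property points at, or the property itself."""
    ref = prop.get("$ref")  # bare reference form
    if ref is None:
        all_of = prop.get("allOf")
        # Only accept the single-entry wrapper form: [{'$ref': '...'}]
        if (isinstance(all_of, (list, tuple)) and len(all_of) == 1
                and isinstance(all_of[0], dict)):
            ref = all_of[0].get("$ref")
    if isinstance(ref, str) and ref.startswith("#/definitions/"):
        return definitions[ref.split("/")[-1]]
    # Not a reference we recognise: leave the property untouched
    return prop


# Hand-written schema fragments for illustration only
definitions = {"Taxes": {"title": "Taxes", "type": "object"}}
bare = {"$ref": "#/definitions/Taxes"}
wrapped = {"title": "Taxes", "allOf": [{"$ref": "#/definitions/Taxes"}]}
plain = {"title": "Stock", "type": "integer"}

for prop in (bare, wrapped, plain):
    print(resolve_ref(prop, definitions)["title"])
```

Keeping the lookup in one place would mean the rest of the formatter only ever sees a resolved property dict, whichever schema layout produced it.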
I'm not opposed to it, my questions/feedback would be:
util function to do create/set a docstring so it has to be called manually?
no that could be no / if_missing / always
Before we add it, would anyone else want this?
I think this would be pretty popular as it would interface with canonical Sphinx documentation tech. Effectively auto-docs from the Schema so that you do not need to write this twice.
For speed, we could use the @property decorator so that the doc string would only be evaluated when called (often during docs generation or Jupyter notebooks).
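A minimal sketch of the lazy idea, under some assumptions: it uses a plain descriptor rather than @property (a property placed in the class body would only be evaluated on instance access, not on Class.__doc__), it uses an ordinary class instead of a pydantic model so that it runs with the standard library alone, and LazyDoc and build_param_doc are hypothetical names.

```python
class LazyDoc:
    """Descriptor that builds a class docstring the first time it is read."""

    def __init__(self, summary, builder):
        self.summary = summary
        self.builder = builder
        self.cache = {}

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        # Build once per class, then serve the cached string
        if owner not in self.cache:
            self.cache[owner] = self.summary + "\n\n" + self.builder(owner)
        return self.cache[owner]


calls = []  # records each time the (potentially expensive) builder runs


def build_param_doc(cls):
    """Stand-in generator: lists annotated attributes in numpydoc style."""
    calls.append(cls.__name__)
    lines = ["Parameters", "----------"]
    for name, tp in getattr(cls, "__annotations__", {}).items():
        lines.append(f"{name} : {getattr(tp, '__name__', str(tp))}")
    return "\n".join(lines)


class FruitStand:
    __doc__ = LazyDoc("My fruit stand", build_param_doc)
    stock: int
    price: float = 0.6
```

Nothing is generated at class-definition time; the builder runs on the first read of FruitStand.__doc__ (for example from Sphinx or help()) and the result is cached afterwards.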
I'm quite in favor of this. I've started creating docs with pydoc-markdown for an API client I'm writing, which uses pydantic for data validation/parsing/coercion and such, and having to rewrite my docstrings, especially for inherited models, is a bit tedious. I would gladly switch all my definitions to Schema()/Field() defs if I could get auto-generated docs for each attribute.
@dgasmith For what it's worth, you could implement it without modifying metaclass by making use of __init_subclass__; that would probably be preferable (at least if we followed a similar approach in pydantic), in order to prevent downstream metaclass conflicts.
So ProtoModel would become:
class ProtoModel(BaseModel):
    def __init_subclass__(cls) -> None:
        cls.__doc__ = AutoPydanticDocGenerator(cls, always_apply=True)
and you could drop the metaclass.
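For anyone who wants to see the hook in isolation, here is a self-contained sketch of the same __init_subclass__ pattern. It is deliberately not pydantic code: AutoDocBase is a plain class standing in for a base model, and build_param_doc is a hypothetical stand-in for a real generator such as the one discussed above, so treat it as an illustration of the mechanism only.

```python
import inspect


def build_param_doc(cls):
    """Hypothetical stand-in generator: lists annotated attributes."""
    lines = ["Parameters", "----------"]
    for name, tp in getattr(cls, "__annotations__", {}).items():
        lines.append(f"{name} : {getattr(tp, '__name__', str(tp))}")
    return "\n".join(lines)


class AutoDocBase:
    """Plain-Python stand-in for a shared base model."""

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass, right after its body is executed
        super().__init_subclass__(**kwargs)
        summary = inspect.cleandoc(cls.__doc__ or "")
        # Leave classes alone if they already document their parameters
        if "Parameters\n" not in summary:
            cls.__doc__ = (summary + "\n\n" + build_param_doc(cls)).strip()


class FruitStand(AutoDocBase):
    """My fruit stand that I sell various things from"""
    stock: int
    price: float = 0.6


print(FruitStand.__doc__)
```

Because the hook lives on an ordinary base class, subclasses pick it up through normal inheritance and no custom metaclass is involved.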
@dmontagu Thanks! I was not aware of this.
@dmontagu's comment was very helpful for finding a way to autogenerate my own documentation. I really like the __init_subclass__ solution. After quite a bit of experimentation I feel like this is a feature that doesn't _have to_ be part of pydantic. Instead, what might be extremely useful is to simply add a short section to pydantic's documentation showing how a user could achieve this. A small example would suffice. If you decide this is a good approach I can submit a pull request.
Just a note that the environ-config library provides a generate_help method, see https://environ-config.readthedocs.io/en/stable/tutorial.html#debugging (with the implementation at https://github.com/hynek/environ-config/blob/0bd960a602878be39cdc24f2d52d2d767d5056a4/src/environ/_environ_config.py#L352).