Just tried to use schema creation and it's awesome, but I found, I think, bug with optional fields.
Here is my model:
from typing import Optional
from pydantic import BaseModel
class TestModel(BaseModel):
foo: Optional[int] = None
print(TestModel.schema_json(indent=2))
Output:
Traceback (most recent call last):
File "test_pydantic_schema.py", line 9, in <module>
print(TestModel.schema_json(indent=2))
File "/pydantic/pydantic/main.py", line 288, in schema_json
return json.dumps(cls.schema(by_alias=by_alias), default=pydantic_encoder, **dumps_kwargs)
File "/pydantic/pydantic/main.py", line 278, in schema
s['properties'] = {f.alias: f.schema(by_alias) for f in cls.__fields__.values()}
File "/pydantic/pydantic/main.py", line 278, in <dictcomp>
s['properties'] = {f.alias: f.schema(by_alias) for f in cls.__fields__.values()}
File "/pydantic/pydantic/fields.py", line 146, in schema
if issubclass(self.type_, Enum):
TypeError: issubclass() arg 1 must be a class
Same issue with Union:
from typing import Union
from pydantic import BaseModel
class TestModel(BaseModel):
foo: Union[str, int]
print(TestModel.schema_json(indent=2))
I think pydantic should cover cases like this, I don't know what is the best way to solve issue with Optional, but in case of Union oneOf or anyOf rule can be used http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.7.
yes.
I guess Optional is just equivalent of Union[None, Foobar].
As discussed on #129 I'm not that bothered by json-schema or it's equivalents, but yes, something like that makes sense.
Perhaps we just have "type": ["int", "float"] and an array for type just means "union"?
For JSON Schemas, "union" means:
{
"type": "object",
"properties": {
"foo": {
"anyOf": [
{ "type": "int },
{ "type": "float" },
]
}
}
}
Also I don't think that we need to use anyOf for Optional, I think better to convert this case to property with default=null.
I'm still not sure how to display unions, I don't think we should return multiple instances of the entire field definition, eg. if we have
"Gender": {
"type": "int",
"title": "Gender",
"required": false,
}
and the value can also be a string, I think
"Gender": {
"type": ["int", "str"],
"title": "Gender",
"required": false,
}
or
"Gender": {
"type": {"anyOf": ["int", "str"]},
"title": "Gender",
"required": false,
}
is clearer than
"Gender": {
"anyOf": [
{
"type": "int",
"title": "Gender",
"required": false,
},
{
"type": "str",
"title": "Gender",
"required": false,
},
]
}
But that get's more complicated in strange situations like Union[str, FoobarModel]
What do you think?
ok, after looking at #218 I propose the following way to deal with this:
type becomes a dictionary.
For "simple" fields, eg. just an int or float, type like:
'type': {
'item_type': 'float',
}
For unions we'd have:
'type': {
'mode': 'any_of',
'item_type': ['str', 'float'],
}
For things like List[int] we'd have:
'type': {
'item_type': 'int',
'shape': 'list',
}
or List[Union[int, str]] we'd have:
'type': {
'mode': 'any_of',
'item_type': ['int', 'str'],
'shape': 'list',
}
For Dict[str, int] we'd have:
'type': {
'item_type': 'int',
'key_type': 'str',
'shape': 'dict',
}
(we could add key_mode if the key is a union, but that's a very rare case)
The names mode, item_type and shape are not great (suggestions very welcome), but I think this is the best way to go.
Questions:
type: {'item_type': 'int', ...} or just add all these items to the main dict?properties on sub models?Union[int, FoobarModel] - I guess we have type as above, then properties. What do we do with Union[FooModel, BarModel] - maybe this is just banned?@Gr1N @x1ah
May missed enum type ?
In my opinion, we should carefully change the field structure. and typing.Union 's field info looks good.
from typing import Union
from pydantic import BaseModel
class Foo(BaseModel):
ubar = Union[int, str]
print(Foo.__fields__['r'].info)
# {'type': 'typing.Union[int, str]',
# 'required': True,
# 'sub_fields': [<Field r_int: type='int', required=True, validators=['int_validator']>,
# <Field r_str: type='str', required=True, validators=['not_none_validator', 'str_validator','anystr_strip_whitespace', 'anystr_length_validator']>]}
and similar for List[int, str], we get
{'type': 'typing.List[int]',
'required': True,
'sub_fields': [<Field r_int: type='int', ...>]}
by this way, this problem will be solved without changing the original fields struct, just put into sub_fields.
What do you think? @samuelcolvin
Just copying python's syntax is ok with me, it'll make it slightly harder for people to read programmatically, but it'll be more concise and extendible.
The case of enums is already implemented via choices.
are their any case's I've missed?
What about Optional[any type]?
And one question, why we want to invent new format? Why not use JSON Schema spec? I understand that spec in some cases can be very explicit, but it covers all needs to my mind. Also it's good to follow specification in cases if we want to generate API specs from pydantic models and give them to another team who works, for example, with different language, and if generated specs follow JSON schema spec we can use them to generate code for any language we want.
I think your suggestion of default=null for Optional is good.
And one question, why we want to invent new format? Why not use JSON Schema spec?
see https://github.com/samuelcolvin/pydantic/issues/129#issuecomment-364754565, also:
Understand. Can you please share how cases Union[int, FoobarModel] and Union[FooModel, BarModel] will be looks in proposed format?
Ok, I've redesigned it, (maybe we're now back at JSON schema? :smile:):
simple:
'type': {
'type': 'float',
}
Union[float, int]:
'type': {
'type': 'any_of',
'types': [
{
'type': 'float',
},
{
'type': 'int',
},
],
}
List[int]:
'type': {
'type': 'list',
'item_type': {
'type': 'int',
},
}
List[Union[int, str]]:
'type': {
'type': 'list',
'item_type': {
'mode': 'any_of',
'types': [
{
'type': 'float',
},
{
'type': 'int',
},
],
}
}
Dict[str, int]:
'type': {
'type': 'dict',
'item_type': {
'type': 'int',
},
'key_type': {
'type': 'str',
},
'shape': 'dict',
}
FooModel:
'type': {
'type': 'object',
'properties': {
...
}
}
Enum(int):
'type': {
'type': 'int',
'choices': [
[1, 'Male'],
[2, 'Female'],
[3, 'Other'],
]
}
Then Union[int, FoobarModel] is just:
'type': {
'mode': 'any_of',
'types': [
{
'type': 'int',
},
'type': 'object',
'properties': {
...
}
}
],
}
and Union[FooModel, BarModel] is:
'type': {
'type': 'any_of',
'types': [
{
'type': 'object',
'properties': {
...
}
},
'type': 'object',
'properties': {
...
}
}
],
}
In other words types are recursive; any_of, list and dict are just types of type.
For a whole field, we then get:
'age': {
'type': {
'type': 'float',
},
'name': 'Age',
'required': True,
}
What do you think?
Update: correct object.
Update 2: mode > type
LGTM, let's try to go with your proposal.
as per discussion on gitter:
type > schema'schema': {'type': 'int'} which is just 'schema': 'int', this applies both at the top level for simple types and where simple types are nested within more complex types.Overal the design is encapsulated by:
'foobar': {
'schema': {
'type': 'list',
'item_type': {
'mode': 'any_of',
'types': [
'float', 'int',
],
}
},
'name': 'Foobar',
'required': True,
}
I've implemented this in #232, it's not exactly as described above, but similar. The nice thing is that this doesn't break much of what's already implemented in schema()
Most helpful comment
as per discussion on gitter:
type>schema'schema': {'type': 'int'}which is just'schema': 'int', this applies both at the top level for simple types and where simple types are nested within more complex types.Overal the design is encapsulated by: