Pydantic: Failed to create schema with optional fields

Created on 29 Jun 2018 · 14Comments · Source: samuelcolvin/pydantic

Just tried to use schema creation and it's awesome, but I found, I think, bug with optional fields.

Here is my model:

from typing import Optional
from pydantic import BaseModel

class TestModel(BaseModel):
    foo: Optional[int] = None

print(TestModel.schema_json(indent=2))

Output:

Traceback (most recent call last):
  File "test_pydantic_schema.py", line 9, in <module>
    print(TestModel.schema_json(indent=2))
  File "/pydantic/pydantic/main.py", line 288, in schema_json
    return json.dumps(cls.schema(by_alias=by_alias), default=pydantic_encoder, **dumps_kwargs)
  File "/pydantic/pydantic/main.py", line 278, in schema
    s['properties'] = {f.alias: f.schema(by_alias) for f in cls.__fields__.values()}
  File "/pydantic/pydantic/main.py", line 278, in <dictcomp>
    s['properties'] = {f.alias: f.schema(by_alias) for f in cls.__fields__.values()}
  File "/pydantic/pydantic/fields.py", line 146, in schema
    if issubclass(self.type_, Enum):
TypeError: issubclass() arg 1 must be a class

Same issue with Union:

from typing import Union
from pydantic import BaseModel

class TestModel(BaseModel):
    foo: Union[str, int]

print(TestModel.schema_json(indent=2))

I think pydantic should cover cases like this, I don't know what is the best way to solve issue with Optional, but in case of Union oneOf or anyOf rule can be used http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.7.

bug

Source

Gr1N

Most helpful comment

as per discussion on gitter:

the top level property should change type > schema
there should be a shortened display of just 'schema': {'type': 'int'} which is just 'schema': 'int', this applies both at the top level for simple types and where simple types are nested within more complex types.

Overal the design is encapsulated by:

'foobar': {
  'schema': {
    'type': 'list',
    'item_type': {
      'mode': 'any_of',
      'types': [
        'float', 'int',
      ],
    }
  },
  'name': 'Foobar',
  'required': True,
}

samuelcolvin on 2 Jul 2018

👍2

All 14 comments

yes.

I guess Optional is just equivalent of Union[None, Foobar].

As discussed on #129 I'm not that bothered by json-schema or it's equivalents, but yes, something like that makes sense.

Perhaps we just have "type": ["int", "float"] and an array for type just means "union"?

samuelcolvin on 29 Jun 2018

For JSON Schemas, "union" means:

{
    "type": "object",
    "properties": {
        "foo": {
            "anyOf": [
                { "type": "int },
                { "type": "float" },
            ]
        }
    }
}

Gr1N on 30 Jun 2018

Also I don't think that we need to use anyOf for Optional, I think better to convert this case to property with default=null.

Gr1N on 30 Jun 2018

I'm still not sure how to display unions, I don't think we should return multiple instances of the entire field definition, eg. if we have

  "Gender": {
    "type": "int",
    "title": "Gender",
    "required": false,
  }

and the value can also be a string, I think

  "Gender": {
    "type": ["int", "str"],
    "title": "Gender",
    "required": false,
  }

  "Gender": {
    "type": {"anyOf": ["int", "str"]},
    "title": "Gender",
    "required": false,
  }

is clearer than

  "Gender": {
    "anyOf": [
      {
        "type": "int",
        "title": "Gender",
        "required": false,
      },
      {
        "type": "str",
        "title": "Gender",
        "required": false,
      },
    ]
  }

But that get's more complicated in strange situations like Union[str, FoobarModel]

What do you think?

samuelcolvin on 1 Jul 2018

ok, after looking at #218 I propose the following way to deal with this:

type becomes a dictionary.

For "simple" fields, eg. just an int or float, type like:

'type': {
  'item_type': 'float',
}

For unions we'd have:

'type': {
  'mode': 'any_of',
  'item_type': ['str', 'float'],
}

For things like List[int] we'd have:

'type': {
  'item_type': 'int',
  'shape': 'list',
}

or List[Union[int, str]] we'd have:

'type': {
  'mode': 'any_of',
  'item_type': ['int', 'str'],
  'shape': 'list',
}

For Dict[str, int] we'd have:

'type': {
  'item_type': 'int',
  'key_type': 'str',
  'shape': 'dict',
}

(we could add key_mode if the key is a union, but that's a very rare case)

The names mode, item_type and shape are not great (suggestions very welcome), but I think this is the best way to go.

Questions:

are their any case's I've missed?
do we do type: {'item_type': 'int', ...} or just add all these items to the main dict?
what do we do when implement tuples #12, I guess something like properties on sub models?
what do we do with Union[int, FoobarModel] - I guess we have type as above, then properties. What do we do with Union[FooModel, BarModel] - maybe this is just banned?

@Gr1N @x1ah

samuelcolvin on 2 Jul 2018

🎉1

May missed enum type ?

In my opinion, we should carefully change the field structure. and typing.Union 's field info looks good.

from typing import Union
from pydantic import BaseModel


class Foo(BaseModel):
    ubar = Union[int, str]

print(Foo.__fields__['r'].info)
# {'type': 'typing.Union[int, str]',
#  'required': True,
#  'sub_fields': [<Field r_int: type='int', required=True, validators=['int_validator']>,
#   <Field r_str: type='str', required=True, validators=['not_none_validator', 'str_validator','anystr_strip_whitespace', 'anystr_length_validator']>]}

and similar for List[int, str], we get

{'type': 'typing.List[int]',
 'required': True,
 'sub_fields': [<Field r_int: type='int', ...>]}

by this way, this problem will be solved without changing the original fields struct, just put into sub_fields.

What do you think? @samuelcolvin

x1ah on 2 Jul 2018

Just copying python's syntax is ok with me, it'll make it slightly harder for people to read programmatically, but it'll be more concise and extendible.

The case of enums is already implemented via choices.

samuelcolvin on 2 Jul 2018

are their any case's I've missed?

What about Optional[any type]?

And one question, why we want to invent new format? Why not use JSON Schema spec? I understand that spec in some cases can be very explicit, but it covers all needs to my mind. Also it's good to follow specification in cases if we want to generate API specs from pydantic models and give them to another team who works, for example, with different language, and if generated specs follow JSON schema spec we can use them to generate code for any language we want.

Gr1N on 2 Jul 2018

I think your suggestion of default=null for Optional is good.

And one question, why we want to invent new format? Why not use JSON Schema spec?

see https://github.com/samuelcolvin/pydantic/issues/129#issuecomment-364754565, also:

It would be masses of work (perhaps more than the rest of pydantic combined) to fully support JSON schema, which is only draft anyway.
pydantic isn't tied to JSON and I think we would run into lots of conflicts, eg. (AFAIK) json schema doesn't cover lots of types with pydantic supports, eg. decimal, uuid, dict keys have to be strings in json

samuelcolvin on 2 Jul 2018

Understand. Can you please share how cases Union[int, FoobarModel] and Union[FooModel, BarModel] will be looks in proposed format?

Gr1N on 2 Jul 2018

Ok, I've redesigned it, (maybe we're now back at JSON schema? :smile:):

simple:

'type': {
  'type': 'float',
}

Union[float, int]:

'type': {
  'type': 'any_of',
  'types': [
    {
      'type': 'float',
    },
    {
      'type': 'int',
    },
  ],
}

List[int]:

'type': {
  'type': 'list',
  'item_type': {
    'type': 'int',
  },
}

List[Union[int, str]]:

'type': {
  'type': 'list',
  'item_type': {
    'mode': 'any_of',
    'types': [
      {
      'type': 'float',
      },
      {
      'type': 'int',
      },
    ],
  }
}

Dict[str, int]:

'type': {
  'type': 'dict',
  'item_type': {
    'type': 'int',
  },
  'key_type': {
    'type': 'str',
  },
  'shape': 'dict',
}

FooModel:

'type': {
  'type': 'object',
  'properties': {
    ...
  }
}

Enum(int):

'type': {
  'type': 'int',
  'choices': [
      [1, 'Male'],
      [2, 'Female'],
      [3, 'Other'],
  ]
}

Then Union[int, FoobarModel] is just:

'type': {
  'mode': 'any_of',
  'types': [
    {
      'type': 'int',
    },
      'type': 'object',
      'properties': {
        ...
      }
    }
  ],
}

and Union[FooModel, BarModel] is:

'type': {
  'type': 'any_of',
  'types': [
    {
      'type': 'object',
      'properties': {
        ...
      }
    },
      'type': 'object',
      'properties': {
        ...
      }
    }
  ],
}

In other words types are recursive; any_of, list and dict are just types of type.

For a whole field, we then get:

'age': {
  'type': {
    'type': 'float',
  },
  'name': 'Age',
  'required': True,
}

What do you think?

Update: correct object.
Update 2: mode > type

samuelcolvin on 2 Jul 2018

LGTM, let's try to go with your proposal.

Gr1N on 2 Jul 2018

as per discussion on gitter:

the top level property should change type > schema
there should be a shortened display of just 'schema': {'type': 'int'} which is just 'schema': 'int', this applies both at the top level for simple types and where simple types are nested within more complex types.

Overal the design is encapsulated by:

'foobar': {
  'schema': {
    'type': 'list',
    'item_type': {
      'mode': 'any_of',
      'types': [
        'float', 'int',
      ],
    }
  },
  'name': 'Foobar',
  'required': True,
}

samuelcolvin on 2 Jul 2018

👍2

I've implemented this in #232, it's not exactly as described above, but similar. The nice thing is that this doesn't break much of what's already implemented in schema()

samuelcolvin on 20 Jul 2018

Was this page helpful?

0 / 5 - 0 ratings