Pydantic: JSON Schema Nullable Required Syntax?

Created on 1 Mar 2020 · 26 Comments · Source: samuelcolvin/pydantic

Question

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.4
            pydantic compiled: False
                 install path: C:\Users\Work\Envs\songspace\Lib\site-packages\pydantic
               python version: 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
                     platform: Windows-10-10.0.19041-SP0
     optional deps. installed: ['typing-extensions']

I would expect the code below to generate something like:
"foo": {"title": "Foo", "type": ["string", "null"]} for one of the permutations attempted. Is there a way to do this without creating a custom nullable datatype for each type? The goal is to telegraph something like what is under discussion in https://github.com/samuelcolvin/pydantic/issues/990 to consumers of the schema. Generally, I find that type: [type1, type2] is the syntax with the broadest support.

I see that Union[str, int] gives anyOf syntax in the schema, but I can't find a way to get even that to work with null.

I'm probably missing something?

from typing import Union, List, Optional
from pydantic import BaseModel, Field

class Widget(BaseModel):
    """A Widget"""
    foo: Optional[str] = None
    bar: Union[str, None] = None
    baz: Optional[str] = Field(
        ...,
    )
    quux: Union[str, None] = Field(
        None,
    )

print(Widget.schema_json())

All 26 comments

I run into a similar issue when I use inspect.signature():

import inspect
from typing import Union, List, Optional
from pydantic import BaseModel, Field

class Widget(BaseModel):
    """A Widget"""
    foo: Optional[str] = None
    bar: Union[str, None] = None
    baz: Optional[str] = Field(
        ...,
    )
    quux: Union[str, int] = Field(
        None,
    )
    barbq: Union[None, str] = ...
    barbbq: Optional[str] = ...

print(inspect.signature(Widget))

I would have expected the signature to explicitly allow null for at least one of these parameters, e.g. Union[str, None].

Actual result:
(*, foo: str = None, bar: str = None, baz: str, quux: Union[str, int] = None, barbq: str, barbbq: str) -> None

I suppose it's implicit in the signatures with default=None that None is allowed, but I suspect I'm missing something here?
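For comparison, a stdlib-only sketch of a signature that spells out nullability explicitly. `build_signature` and the `(name, annotation, allow_none, default)` tuples are made up for this illustration; this is not how pydantic builds its signature.

```python
import inspect
from typing import Optional, Union


def build_signature(fields):
    """Build a keyword-only signature, wrapping nullable annotations in Optional."""
    params = []
    for name, annotation, allow_none, default in fields:
        if allow_none:
            # Make "None is accepted" visible in the annotation itself
            annotation = Optional[annotation]
        params.append(
            inspect.Parameter(
                name,
                inspect.Parameter.KEYWORD_ONLY,
                default=default,
                annotation=annotation,
            )
        )
    return inspect.Signature(params, return_annotation=None)


REQUIRED = inspect.Parameter.empty  # no default means the argument is required

sig = build_signature([
    ("foo", str, True, None),               # nullable with a default
    ("baz", str, True, REQUIRED),           # nullable but required
    ("quux", Union[str, int], True, None),  # union that also allows None
])
print(sig)
```

The exact rendering of `Optional[...]` in the printed signature depends on the Python version, so no output is shown here.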

Something like the following gives me the expected output where I need it (and breaks quite a few tests):

diff --git a/pydantic/schema.py b/pydantic/schema.py
index 462829b..6849296 100644
--- a/pydantic/schema.py
+++ b/pydantic/schema.py
@@ -660,7 +660,10 @@ def field_singleton_schema(  # noqa: C901 (ignore complexity)

     for type_, t_schema in field_class_to_schema:
         if issubclass(field_type, type_):
-            f_schema.update(t_schema)
+            if field.allow_none:
+                f_schema.update({'anyOf': [{'type': 'null'}, t_schema]})
+            else:
+                f_schema.update(t_schema)
             break

     modify_schema = getattr(field_type, '__modify_schema__', None)
@@ -689,7 +692,9 @@ def field_singleton_schema(  # noqa: C901 (ignore complexity)
             nested_models.update(sub_nested_models)
         else:
             nested_models.add(model_name)
-        schema_ref = {'$ref': ref_prefix + model_name}
+        schema_ref: Dict[str, Any] = {'$ref': ref_prefix + model_name}
+        if field.allow_none:
+            schema_ref = {'anyOf': [{'type': 'null'}, schema_ref]}
         if not schema_overrides:
             return schema_ref, definitions, nested_models
         else:

The above is hacky and will cause problems for schema items with title, description, and type, given that the 'type' schema is built separately. But the output does demonstrate what I would expect to see in a standards-compliant JSON Schema for fields that allow Union[type, None].
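One way around the title/description concern is to keep annotation keywords on the outer schema and move only the validation keywords into the anyOf branch. A minimal sketch on plain dicts, assuming `title`, `description`, and `default` are the only annotation keys that matter; `wrap_nullable` is a hypothetical helper, not part of pydantic.

```python
ANNOTATION_KEYS = ("title", "description", "default")  # assumed set, for illustration


def wrap_nullable(field_schema):
    """Return a copy that also accepts null, keeping annotations at the top level."""
    outer = {k: v for k, v in field_schema.items() if k in ANNOTATION_KEYS}
    inner = {k: v for k, v in field_schema.items() if k not in ANNOTATION_KEYS}
    if set(inner) == {"anyOf"}:
        # Already a union: append the null branch instead of nesting anyOf
        outer["anyOf"] = [*inner["anyOf"], {"type": "null"}]
    else:
        outer["anyOf"] = [inner, {"type": "null"}]
    return outer


print(wrap_nullable({"title": "Foo", "type": "string", "maxLength": 10}))
print(wrap_nullable({"title": "Bar", "$ref": "#/definitions/Bar"}))
```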

I ran into the same issue, here is my test case:

from pydantic import BaseModel
from typing import Optional

class Foo(BaseModel):
    # This field must be set, but can be None
    optional_int: Optional[int] = ...

# The schema says that the type of optional_int is 'integer'
assert Foo.schema() == {
    'title': 'Foo',
    'type': 'object',
    'properties': {'optional_int': {'title': 'Optional Int', 'type': 'integer'}},
    'required': ['optional_int'],
}

# This works:
foo_json = '{"optional_int": null}'
foo_object = Foo(optional_int=None)
assert(Foo.parse_raw(foo_json) == foo_object)
assert(foo_object.json() == foo_json)

The json is exactly as I expected it to be. However, when I try to validate the json string {"optional_int": null} with the generated schema (I used jsonschemavalidator.net), it will complain that it expected an integer but got null.

I would have expected the properties of optional_int to be {'title': 'Optional Int', 'type': ['integer', 'null']}.
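To see why a validator rejects null here, a minimal sketch of how the JSON Schema `type` keyword is checked may help. This covers only the `type` keyword, it is not a full validator and not what jsonschemavalidator.net runs; `matches_type` is a hypothetical helper.

```python
# Checks for each JSON Schema primitive type name
JSON_TYPES = {
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    # bool is a subclass of int in Python, so exclude it explicitly
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def matches_type(value, schema):
    """Return True if value satisfies the schema's "type" keyword."""
    if "type" not in schema:
        return True  # no type constraint at all
    declared = schema["type"]
    # "type" may be a single name or a list of names
    names = [declared] if isinstance(declared, str) else declared
    return any(JSON_TYPES[name](value) for name in names)


generated = {"title": "Optional Int", "type": "integer"}
expected = {"title": "Optional Int", "type": ["integer", "null"]}

print(matches_type(None, generated))  # False: null is not an integer
print(matches_type(None, expected))   # True: "null" is listed
print(matches_type(3, expected))      # True
```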

Agreed - would love to see field: Optional[str] or field: Optional[str] = None map to json schema of 'type': ['string', 'null']. Was a deal-breaker for adopting pydantic in a project (which relied on existing json schema validation) today.

How are others working around this limitation currently? Custom field types with __modify_schema__?

Having the same issue, any update?

Yes, I am having this exact same issue. It would be really nice if the generated json schema could accept 'null' as a value for optional types.

Please no more "me too" comments, use the emoji reactions.

I acknowledge the problem, sorry for the slow response, see #1340.

I agree that the correct solution would be to use anyOf in the schema, and Optional[foo] or Union[None, foo, bar] in the signature.

PR welcome to fix this, otherwise I'll work on it when I get a chance.

Hello!

Here is my attempt to fix the problem and produce proper JSON Schema for a model with nullables.
It's not perfect. Ideally pydantic should treat NoneType as a valid property type and get rid of ModelField.allow_none property, IMO.

If the fix is ok, then I need to fix fastapi tests somehow.

Simple example:

from typing import Optional, Union
from pydantic import BaseModel


class Model(BaseModel):
    id: str = None


class ModelWithNullables(BaseModel):
    p_optional: Optional[int]
    p_union: Union[str, None]
    p_none_default: str = None
    p_opt_model: Optional[Model]


print(ModelWithNullables().schema_json(indent=4))

Note that this PR does not fix the Model.__init__() signature, due to #1055.

I've reviewed the PR #1611, and broadly it looks good, just a few things to fix.

The biggest question is whether this is a breaking change and needs to wait until v2. I know what's proposed is more correct for JSON Schema, but many people (myself included) have written code against the current schema that would break with this change.

> Ideally pydantic should treat NoneType as a valid property type and get rid of ModelField.allow_none property, IMO.

Agreed, I hope we can make that change in v2.

> I suppose it's implicit in the signatures with default=None that None is allowed

I think it's fair to expect that - even with mypy you have to switch on a flag for foobar: int = None to be invalid and require foobar: Optional[int] = None to be used instead.

@samuelcolvin Since you keep referring to v2 here, what are the plans for v2? Is there a roadmap?

I would really need this feature/bugfix to be able to use pydantic for my current project (which relies on existing JSON schema validation)

@brot if what you mean by a Roadmap is "Things that will probably change in Version 2", @samuelcolvin has created a milestone with related issues for v2 here: https://github.com/samuelcolvin/pydantic/milestone/2
There is no scheduled release date that I'm aware of. This is a more organic project than that, dependent on unpaid off-hours availability.

Are there any workarounds you know of?

The workaround is to add this method to your model's Config class:

from typing import Optional
from pydantic import BaseModel

class YourModel(BaseModel):
    required_foo: int
    nullable_bar: Optional[str]
    nullable_other: Optional[int]

    class Config:
        @staticmethod
        def schema_extra(schema, model):
            for prop, value in schema.get('properties', {}).items():
                # Your actual nullable fields go in this list.
                if prop in ["nullable_bar", "nullable_other"]:
                    was = value["type"]
                    value["type"] = [was, "null"]
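Note that `value["type"]` raises KeyError for properties that have no `type` key, such as `$ref` or `anyOf` entries. A slightly more defensive sketch of the same idea, written against a plain schema dict so it runs without pydantic; `add_null_type` is a hypothetical helper name.

```python
def add_null_type(schema, nullable_fields):
    """Add "null" to the "type" of the named properties, in place."""
    for prop, value in schema.get("properties", {}).items():
        if prop not in nullable_fields or "type" not in value:
            continue  # skip $ref / anyOf properties, which have no "type"
        current = value["type"]
        types = current if isinstance(current, list) else [current]
        if "null" not in types:
            types = types + ["null"]
        value["type"] = types
    return schema


schema = {
    "properties": {
        "required_foo": {"title": "Required Foo", "type": "integer"},
        "nullable_bar": {"title": "Nullable Bar", "type": "string"},
        "nested": {"$ref": "#/definitions/Nested"},
    }
}
add_null_type(schema, {"nullable_bar", "nested"})
print(schema["properties"]["nullable_bar"])
# {'title': 'Nullable Bar', 'type': ['string', 'null']}
```

The function body could be called from `schema_extra` with the `schema` argument it receives.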

Hi everyone

In the meantime here is a generic workaround:

from pydantic import BaseModel as PydanticBaseModel


class BaseModel(PydanticBaseModel):
    class Config:
        @staticmethod
        def schema_extra(schema, model):
            for prop, value in schema.get('properties', {}).items():
                # retrieve right field from alias or name
                field = [x for x in model.__fields__.values() if x.alias == prop][0]
                if field.allow_none:
                    # only one type e.g. {'type': 'integer'}
                    if 'type' in value:
                        value['anyOf'] = [{'type': value.pop('type')}]
                    # only one $ref e.g. from other model
                    elif '$ref' in value:
                        if issubclass(field.type_, PydanticBaseModel):
                            # add 'title' in schema to have the exact same behaviour as the rest
                            value['title'] = field.type_.__config__.title or field.type_.__name__
                        value['anyOf'] = [{'$ref': value.pop('$ref')}]
                    value['anyOf'].append({'type': 'null'})

A full example using it is below:

from pydantic import Field

from typing import Optional, Union

class FooModel(BaseModel):
    foo: str


class BarModel(BaseModel):
    bar: str


class YourModel(BaseModel):
    int_foo: int
    str_foo: str
    model_foo: FooModel
    opt_str_foo: Optional[str] = Field(alias='opt_str_foo_alias')
    opt_model_foo: Optional[FooModel]
    union_foo: Union[int, str]
    union_model_foo_str: Union[FooModel, str]
    union_model_foo_bar: Union[FooModel, BarModel]
    opt_union_foo: Union[int, str] = None
    opt_union_foo2: Optional[Union[int, str]]
    opt_union_model_foo_bar: Optional[Union[FooModel, BarModel]]

assert YourModel.schema() == {
    'type': 'object',
    'title': 'YourModel',
    'properties': {
        'int_foo': {
          'title': 'Int Foo',
          'type': 'integer',
        },
        'str_foo': {
            'title': 'Str Foo',
            'type': 'string',
        },
        'model_foo': {'$ref': '#/definitions/FooModel'},
        'opt_str_foo_alias': {
            'title': 'Opt Str Foo Alias',
            'anyOf': [{'type': 'string'}, {'type': 'null'}],
        },
        'opt_model_foo': {
            'title': 'FooModel',
            'anyOf': [{'$ref': '#/definitions/FooModel'}, {'type': 'null'}]
        },
        'union_foo': {
            'title': 'Union Foo',
            'anyOf': [{'type': 'integer'}, {'type': 'string'}],
        },
        'union_model_foo_str': {
            'title': 'Union Model Foo Str',
            'anyOf': [{'$ref': '#/definitions/FooModel'}, {'type': 'string'}],
        },
        'union_model_foo_bar': {
            'title': 'Union Model Foo Bar',
            'anyOf': [{'$ref': '#/definitions/FooModel'}, {'$ref': '#/definitions/BarModel'}],
        },
        'opt_union_foo': {
            'title': 'Opt Union Foo',
            'anyOf': [{'type': 'integer'}, {'type': 'string'}, {'type': 'null'}],
        },
        'opt_union_foo2': {
            'title': 'Opt Union Foo2',
            'anyOf': [{'type': 'integer'}, {'type': 'string'}, {'type': 'null'}],
        },
        'opt_union_model_foo_bar': {
            'title': 'Opt Union Model Foo Bar',
            'anyOf': [{'$ref': '#/definitions/FooModel'}, {'$ref': '#/definitions/BarModel'}, {'type': 'null'}],
        },
    },
    'required': ['int_foo', 'str_foo', 'model_foo', 'union_foo', 'union_model_foo_str', 'union_model_foo_bar'],
    'definitions': {
        'FooModel': {
            'title': 'FooModel',
            'type': 'object',
            'properties': {
                'foo': {'title': 'Foo', 'type': 'string'}
            },
            'required': ['foo']
        },
        'BarModel': {
            'title': 'BarModel',
            'type': 'object',
            'properties': {
                'bar': {'title': 'Bar', 'type': 'string'}
            },
            'required': ['bar']
        }
    }
}

Hope it helps!

Hi everyone
A generic workaround in the meantime could be

```python
from typing import Optional, Union

from pydantic import BaseModel as PydanticBaseModel, Field

class BaseModel(PydanticBaseModel):
    class Config:
        @staticmethod
        def schema_extra(schema, model):
            for prop, value in schema.get('properties', {}).items():
                # retrieve right field from alias or name
                field = [x for x in model.__fields__.values() if x.alias == prop][0]
                if field.allow_none:
                    # only one type e.g. {'type': 'integer'}
                    if 'type' in value:
                        value['anyOf'] = [{'type': value.pop('type')}]
                    value['anyOf'].append({'type': 'null'})
```

Hi @PrettyWood, I tried your code but it raises KeyError: 'anyOf' so it seems to be an issue with the logic appending the null value (around the if)

Hello @HacKanCuBa
Which version of _pydantic_ do you use? Could you please provide a gist or something to help me understand

Sure! Is there any way to chat this somewhere else? :P I don't want to flood this thread so much (you can find me on twitter or telegram with the same username)

In [8]: import pydantic

In [9]: pydantic.version.VERSION
Out[9]: '1.6.1'

Apparently it fails on a model relating another model:

class SomeModel(BaseModel):

    field1: int
    field2: str


class RelatedModel(BaseModel):

    field1: int
    related: Optional[SomeModel]
In [3]: RelatedModel.schema()
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-0c3b236f3e85> in <module>
----> 1 RelatedModel.schema()

~/.pyenv/versions/3.8.5/envs/[REDACTED]/lib/python3.8/site-packages/pydantic/main.cpython-38-x86_64-linux-gnu.so in pydantic.main.BaseModel.schema()

~/.pyenv/versions/3.8.5/envs/[REDACTED]/lib/python3.8/site-packages/pydantic/schema.cpython-38-x86_64-linux-gnu.so in pydantic.schema.model_schema()

~/.pyenv/versions/3.8.5/envs/[REDACTED]/lib/python3.8/site-packages/pydantic/schema.cpython-38-x86_64-linux-gnu.so in pydantic.schema.model_process_schema()

<ipython-input-1-621096e38e2c> in schema_extra(schema, model)
     15                     if 'type' in value:
     16                         value['anyOf'] = [{'type': value.pop('type')}]
---> 17                     value['anyOf'].append({'type': 'null'})
     18 

KeyError: 'anyOf'

Hi @HacKanCuBa
I just updated my answer with the generic version to add the case of $ref when we have a BaseModel in the schema.
Please tell me if it solves your problem!
If yes, I'll just hide all our conversation to keep this issue readable ;)

It certainly seems to work but something seemed off. After a while I noticed that there's a typo in this line: value['anyOf'] = [{'ref': value.pop('$ref')}] should be value['anyOf'] = [{'$ref': value.pop('$ref')}] (note the $ref as key where the symbol $ is missing)

@PrettyWood I found another issue with your solution: 'null' becomes a string "null" in the OpenAPI schema and not the correct JSON value null. On several validators this produces:

Structural error at components.schemas.RelatedModel.properties.related.anyOf.1.type
should be equal to one of the allowed values
allowedValues: array, boolean, integer, number, object, string

I tried with None but it doesn't work. I'm currently researching and will come back here if I find a solution.

UPDATE 1: the JSON null value is also not valid: Structural error at components.schemas.....anyOf.1.type: should be string, so I'm a bit lost. I'm checking the specs now.

UPDATE 2: quoting the specs

Note that there is no null type; instead, the nullable attribute is used as a modifier of the base type.

UPDATE 3: this seems to work:

class BaseModel(PydanticBaseModel):
    class Config:

        @staticmethod
        def schema_extra(schema, model):
            for prop, value in schema.get('properties', {}).items():
                # retrieve right field from alias or name
                field = [x for x in model.__fields__.values() if x.alias == prop][0]
                if field.allow_none:
                    value['nullable'] = True

UPDATE 4: $refs need to be wrapped in allOf or anyOf for nullable to be valid (I chose anyOf here because it makes more sense to go with nullable, but either choice is fine AFAIK):

class BaseModel(PydanticBaseModel):
    class Config:

        @staticmethod
        def schema_extra(schema, model):
            for prop, value in schema.get('properties', {}).items():
                # retrieve right field from alias or name
                field = [x for x in model.__fields__.values() if x.alias == prop][0]
                if field.allow_none:
                    if '$ref' in value:
                        if issubclass(field.type_, PydanticBaseModel):
                            # add 'title' in schema to have the exact same behaviour as the rest
                            value['title'] = field.type_.__config__.title or field.type_.__name__
                        value['anyOf'] = [{'$ref': value.pop('$ref')}]

                    value['nullable'] = True
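The same conversion can also be applied to an already generated schema. A minimal sketch, assuming the input uses the JSON Schema style `anyOf` with a `{'type': 'null'}` branch and that only top-level properties matter; `to_openapi30_nullable` is a hypothetical helper, not part of pydantic or FastAPI.

```python
import copy


def to_openapi30_nullable(schema):
    """Replace {'type': 'null'} anyOf branches with OpenAPI 3.0 'nullable: true'."""
    result = copy.deepcopy(schema)  # leave the input untouched
    null_branch = {"type": "null"}
    for value in result.get("properties", {}).values():
        branches = value.get("anyOf")
        if not branches or null_branch not in branches:
            continue
        rest = [b for b in branches if b != null_branch]
        if len(rest) == 1 and "$ref" not in rest[0]:
            # Single plain type: inline it next to nullable
            del value["anyOf"]
            value.update(rest[0])
        else:
            # $ref or a real union: keep the anyOf wrapper, as in UPDATE 4
            value["anyOf"] = rest
        value["nullable"] = True
    return result


schema = {
    "properties": {
        "name": {"title": "Name", "anyOf": [{"type": "string"}, {"type": "null"}]},
        "related": {
            "title": "SomeModel",
            "anyOf": [{"$ref": "#/definitions/SomeModel"}, {"type": "null"}],
        },
    }
}
converted = to_openapi30_nullable(schema)
print(converted["properties"]["name"])
print(converted["properties"]["related"])
```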

@HacKanCuBa Glad you could test it and catch new errors! Thanks for that 👍
It means https://github.com/samuelcolvin/pydantic/pull/1611 will need some changes!

I was checking that, too

I don't want to ruin the party with bad news, but the sad situation is that OpenAPI 3 deviates from JSON Schema. It does not recognise null as a type; instead, it insists on a boolean field, nullable. OpenAPI 3.1 will be fully JSON Schema compatible.

nullable is actually easier to fake in:

email: Optional[str] = Field(..., nullable=True)

So basically we could have an enum option in the config, like extra. IMO it makes no sense to put it at field level, as you either want your schema with or without nullables.
Disclaimer: names to be changed. It's just for the example.

class Config:
    schema_nullable = 'hidden'  # or '3.0' or '3.1'

By default we would keep the 'hidden' behaviour, so the feature wouldn't break anything and wouldn't require a v2, as is currently the case for the open PR.
If '3.0' is chosen, we add nullable: true for the nullable fields in the schema.
If '3.1' is chosen, we add type: 'null' in the anyOf part.
Maybe we should also add a 'both' option to the enum, as the two don't conflict. It would probably make things easier for people working with different tools.
WDYT?
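To make the proposal concrete, here is a sketch of what each mode could do to a single property schema. `schema_nullable` and its values are only the names proposed above, not an existing pydantic option, and `apply_nullable` is a hypothetical helper working on plain dicts.

```python
def apply_nullable(prop_schema, mode="hidden"):
    """Return a copy of prop_schema marked nullable according to mode."""
    if mode not in ("hidden", "3.0", "3.1", "both"):
        raise ValueError(f"unknown mode: {mode!r}")
    result = dict(prop_schema)
    if mode == "hidden":
        return result  # current behaviour: nullability is not shown
    if mode in ("3.1", "both"):
        # JSON Schema style: move type/$ref into anyOf and add a null branch
        branches = list(result.pop("anyOf", []))
        for key in ("type", "$ref"):
            if key in result:
                branches.append({key: result.pop(key)})
        result["anyOf"] = branches + [{"type": "null"}]
    if mode in ("3.0", "both"):
        # OpenAPI 3.0 style: boolean modifier; a bare $ref is wrapped first
        if "$ref" in result:
            result["anyOf"] = [{"$ref": result.pop("$ref")}]
        result["nullable"] = True
    return result


for mode in ("hidden", "3.0", "3.1", "both"):
    print(mode, apply_nullable({"title": "Email", "type": "string"}, mode))
```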

That sounds plausible, I like the idea. Is it possible to set said config "externally", e.g. with FastAPI setting the correct value for the OpenAPI spec version in use? If we can do that, then the problem would be mostly solved.

This would mean having a model like this:

class MyModel(BaseModel):
    field1: int
    field2: str

...

MyModel.Config.schema_nullable = '3.0'  # or '3.1'

Does that sound sane, patching EVERY model like this? I mean, I think we need to figure out a solution that allows the user and/or the framework to set the right value. Perhaps some global setting, so that monkeypatching every model isn't required?

You could just do

from pydantic import BaseConfig

BaseConfig.schema_nullable = '...'

or define your own custom BaseModel since Config is inherited from parent classes
