Hi there, thanks for a super cool library! Pydantic has slowly been replacing all other data validation libraries in my projects. :)
As to my question:
I want to validate a JSON that can have one or two keys (each gets a default value if it is not present) but at least one is required.
Let's call the two keys "a" and "b" => in that case I would like these JSONs to be valid
{
    "a": 42
}
{
    "b": 42
}
{
    "a": 42,
    "b": 42
}
and these to be invalid
{}
{
    "a": 42,
    "some_other_key": 42
}
With Jsonschema I can do this, like so:
{
    "anyOf": [{"required": ["a"]}, {"required": ["b"]}],
    "properties": {
        "a": {"type": "number", "default": 42},
        "b": {"type": "number", "default": 42}
    }
}
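To make the intended rule concrete, here is a minimal plain-Python sketch (standard library only, neither jsonschema nor pydantic) that classifies the example payloads above. The names ALLOWED_KEYS and is_valid are made up for illustration. Note that rejecting "some_other_key" is an extra rule on top of the anyOf/required part; in JSON Schema that would typically need "additionalProperties": false, which the schema above does not set.

```python
ALLOWED_KEYS = {"a", "b"}  # hypothetical name: the only keys a payload may contain


def is_valid(payload: dict) -> bool:
    """Return True if payload uses only allowed keys and has at least one of them."""
    keys = set(payload)
    if not keys <= ALLOWED_KEYS:  # unknown key such as "some_other_key"
        return False
    return len(keys) >= 1  # mirrors anyOf: "a" is required or "b" is required


examples = ({"a": 42}, {"b": 42}, {"a": 42, "b": 42}, {}, {"a": 42, "some_other_key": 42})
for payload in examples:
    print(payload, is_valid(payload))
```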
Any ideas how to do this with Pydantic?
Any help is greatly appreciated :)
No problem, pleased it's helpful.
The simplest solution is to use a validator, something like:
class MyModel(BaseModel):
    a: str = None
    b: str = None

    @validator('b')
    def check_a_or_b(cls, v, values):
        if 'a' not in values and not b:
            raise ValueError('either a or b is required')
        return b
Unfortunately, pydantic currently has no concept of errors that don't apply to a field, so if you wanted something else you'd have to implement it manually at the moment.
Let me know if that works for you.
Ah ok, unfortunately I need to be able to generate the correct JSON Schema from it, so I guess this is currently not supported. But thanks for the quick response anyway!
Happy to try and accept it, but this would probably be a big change for pydantic since it lives "above" the fields where most validation currently goes on. I'm not even sure how you would define it for the schema.
Hi there! I'm facing a similar issue here, where I'm trying to ensure that either a or b is being passed in. I'm using a dataclass validator.
It's unclear if the params are required - pre=True, always=True - the validator didn't seem to trigger otherwise.
Also, considering that the purpose of using type hints is to convey notation, this actually triggers a mypy warning:
Incompatible types in assignment (expression has type "None", variable has type "str")
mypy(error)
I tried wrapping the value with typing.Optional but that didn't work either. Is there something I'm overlooking on the right way to express that one of these two values _must_ be present?
@miketheman can you share the code that is resulting in mypy errors?
@timonbimon As of pydantic 0.32, it is now possible to manually specify updates to the generated json schema via the schema_extra attribute on the model's Config. This is discussed at the bottom of the section of the docs on schema creation.
This should make it possible for you to generate your intended JSON schema.
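A minimal sketch of that suggestion, assuming the pydantic 1.x API (Config.schema_extra and Model.schema()). The try/except import is an addition so the snippet also runs where pydantic 2.x is installed, which ships the 1.x API under pydantic.v1. The field types and defaults are taken from the schema in the question. Keep in mind that schema_extra only changes the generated schema; it does not enforce the rule during validation, so a validator is still needed for that.

```python
try:
    # pydantic 2.x ships the 1.x API under pydantic.v1
    from pydantic.v1 import BaseModel
except ImportError:
    # plain pydantic 1.x
    from pydantic import BaseModel


class MyModel(BaseModel):
    a: float = 42
    b: float = 42

    class Config:
        # Merged into the generated JSON schema; validation is unaffected.
        schema_extra = {
            'anyOf': [{'required': ['a']}, {'required': ['b']}],
        }


schema = MyModel.schema()
print(schema['anyOf'])
print(schema['properties'])
```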
@dmontagu certainly! I basically took the code from the example earlier in this thread - https://github.com/samuelcolvin/pydantic/issues/506#issuecomment-489348525
from pydantic import validator
from pydantic.dataclasses import dataclass

@dataclass
class MyModel:
    a: str = None
    b: str = None

    @validator('b')
    def check_a_or_b(cls, v, values):
        if 'a' not in values and not b:
            raise ValueError('either a or b is required')
        return b
And mypy will return these errors:
$ mypy sample.py
sample.py:7: error: Incompatible types in assignment (expression has type "None", variable has type "str")
sample.py:8: error: Incompatible types in assignment (expression has type "None", variable has type "str")
@miketheman annotate the str as Optional[str] (using typing.Optional) and that error should go away. I know you said you tried that, but if that also doesn’t work can you show the error?
@dmontagu Certainly!
Adding the Optional type removes the mypy errors, but then the validator doesn't get applied. Here's a couple of examples:
from typing import Optional
from pydantic import validator
from pydantic.dataclasses import dataclass

@dataclass
class MyModel:
    a: Optional[str] = None
    b: Optional[str] = None

    @validator('b')
    def check_a_or_b(cls, v, values):
        if 'a' not in values and not b:
            raise ValueError('either a or b is required')
        return b

mm = MyModel(a="a")
print(mm)
Output:
MyModel(a='a', b=None)
So it appears that the validator isn't being called yet.
If I add always=True to the @validator, I get a NameError:
...
@validator('b', always=True)
def check_a_or_b(cls, v, values):
...
Output:
Traceback (most recent call last):
File "sample.py", line 17, in <module>
mm = MyModel(a="a")
File "<string>", line 4, in __init__
File "pyenv/lib/python3.7/site-packages/pydantic/dataclasses.py", line 72, in _pydantic_post_init
d = validate_model(self.__pydantic_model__, self.__dict__, cls=self.__class__)[0]
File "pyenv/lib/python3.7/site-packages/pydantic/main.py", line 757, in validate_model
v_, errors_ = field.validate(value, values, loc=field.alias, cls=cls or model.__class__) # type: ignore
File "pyenv/lib/python3.7/site-packages/pydantic/fields.py", line 317, in validate
v, errors = self._validate_singleton(v, values, loc, cls)
File "pyenv/lib/python3.7/site-packages/pydantic/fields.py", line 443, in _validate_singleton
value, error = field.validate(v, values, loc=loc, cls=cls)
File "pyenv/lib/python3.7/site-packages/pydantic/fields.py", line 317, in validate
v, errors = self._validate_singleton(v, values, loc, cls)
File "pyenv/lib/python3.7/site-packages/pydantic/fields.py", line 450, in _validate_singleton
return self._apply_validators(v, values, loc, cls, self.validators)
File "pyenv/lib/python3.7/site-packages/pydantic/fields.py", line 457, in _apply_validators
v = validator(cls, v, values, self, self.model_config)
File "pyenv/lib/python3.7/site-packages/pydantic/class_validators.py", line 171, in <lambda>
return lambda cls, v, values, field, config: validator(cls, v, values=values)
File "sample.py", line 15, in check_a_or_b
return b
NameError: name 'b' is not defined
Adding pre=True to the @validator appears to have no impact.
Any ideas?
def check_a_or_b(cls, v, values):
    if 'a' not in values and not b:
        raise ValueError('either a or b is required')
    return b
b is not defined — you should be using v instead, since that is the name of the second argument in the signature.
@dmontagu Thanks for pointing that out, it led me further along. I completely overlooked it: the original example already had the error, and I simply copied it without actually understanding what it was doing. I changed v to b to better convey the intent.
Once corrected, it still doesn't produce the behavior needed, because of the values inspection: by the time b is evaluated, values = {'a': None}, so 'a' not in values evaluates to False (a _is_ in values). So I modified the check a little to get the right concept:
@validator('b')
def check_a_or_b(cls, b, values):
    if not values.get('a') and not b:
        raise ValueError('either a or b is required')
    return b
So now the validator _logic_ seems correct, as the .get() will return None. However, there's an output question that follows.
My understanding from the docs is that if a value is not supplied, the validator won't run, so the @validator needs to have the always - per https://pydantic-docs.helpmanual.io/#validate-always
Adding that to the @validator signature triggers the validator - yay! - and produces the desired validation error, but it provides two errors:
pydantic.error_wrappers.ValidationError: 2 validation errors for MyModel
b
none is not an allowed value (type=type_error.none.not_allowed)
b
either a or b is required (type=value_error)
The docs call out that adding pre=True to the validator is likely the right answer here:
You’ll often want to use this together with pre since otherwise the with always=True pydantic would try to validate the default None which would cause an error.
So changing it to add pre=True as well replaces the (type=type_error.none.not_allowed) error with a duplicate, validator-supplied (type=value_error):
Sample code, updated, for reference, and ought to be runnable:
from typing import Optional
from pydantic import validator
from pydantic.dataclasses import dataclass

@dataclass
class MyModel:
    a: Optional[str] = None
    b: Optional[str] = None

    @validator('b', pre=True, always=True)
    def check_a_or_b(cls, b, values):
        if not values.get('a') and not b:
            raise ValueError('either a or b is required')
        return b

mm = MyModel()
Error output:
Traceback (most recent call last):
File "sample.py", line 16, in <module>
mm = MyModel()
File "<string>", line 4, in __init__
File "pyenv/lib/python3.7/site-packages/pydantic/dataclasses.py", line 72, in _pydantic_post_init
d = validate_model(self.__pydantic_model__, self.__dict__, cls=self.__class__)[0]
File "pyenv/lib/python3.7/site-packages/pydantic/main.py", line 785, in validate_model
raise err
pydantic.error_wrappers.ValidationError: 2 validation errors for MyModel
b
either a or b is required (type=value_error)
b
either a or b is required (type=value_error)
This last bit is more in line with what I've been looking for - thanks for taking the time to read through my learning process and guiding! - but the duplicate validation error feels incorrect, so I'm wondering if there's some other elegant way to solve this.
Yes, this gets me occasionally too. The reason for the duplicate error is that Optional[str] is equivalent to Union[None, str]. When pydantic finds a union, it builds one "route" for each possible option in the union; if none of the routes succeed, all the errors are included in the output.
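As a rough illustration of why the same message shows up twice, here is a toy model using only the standard library. It is not pydantic's implementation: validate_union and ROUTES are hypothetical names, and the "route" handling is reduced to running the same check once per union member and collecting every failure.

```python
from typing import Optional, Union, get_args

# Optional[str] is shorthand for a two-member Union.
assert Optional[str] == Union[None, str]
ROUTES = get_args(Optional[str])  # the two union members: str and NoneType


def check_a_or_b(b, values):
    """Same check as the validator in the thread."""
    if not values.get('a') and not b:
        raise ValueError('either a or b is required')
    return b


def validate_union(value, values, routes=ROUTES):
    """Toy model: run the check once per route and keep every failure."""
    errors = []
    for route in routes:
        try:
            return check_a_or_b(value, values), []
        except ValueError as exc:
            errors.append((route.__name__, str(exc)))
    return None, errors


result, errors = validate_union(None, {'a': None})
print(errors)  # one entry per route, both with the same message
```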
The solution here is to use
from typing import Optional
from pydantic import validator
from pydantic.dataclasses import dataclass

@dataclass
class MyModel:
    a: Optional[str] = None
    b: Optional[str] = None

    @validator('b', pre=True, always=True, whole=True)
    def check_a_or_b(cls, b, values):
        if not values.get('a') and not b:
            raise ValueError('either a or b is required')
        return b

mm = MyModel()
Because we have whole=True the validator is run once for the field, not for each "route". You can also get the same result by changing Optional[str] to just str although then your type hints are slightly wrong.
This is ugly and I don't like it, but I'm not sure how to fix it.
Possible solutions:
None, but this would mean validators aren't called for the None case which is the whole point in always=True
none is not an allowed value, this error message is almost never any use. In that case, the None route would fail silently, could be confusing when people find their validator being called twice, but with no explanation.
@samuelcolvin I think my favorite of those options is deduplicating in the .errors() call.
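A minimal sketch of what deduplicating could look like, applied after the fact to a list shaped like the output of ValidationError.errors() (dicts with loc, msg and type keys). dedupe_errors is a hypothetical helper, not part of pydantic, and it says nothing about how pydantic itself would implement this.

```python
def dedupe_errors(errors):
    """Drop repeated errors, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for err in errors:
        key = (tuple(err["loc"]), err["msg"], err["type"])
        if key not in seen:
            seen.add(key)
            unique.append(err)
    return unique


# The duplicated error from the traceback above, written out by hand.
errors = [
    {"loc": ("b",), "msg": "either a or b is required", "type": "value_error"},
    {"loc": ("b",), "msg": "either a or b is required", "type": "value_error"},
]
print(dedupe_errors(errors))
```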
okay, I might try the third option, or some better solution than not_none_validator and see how I get on.
I think this is fixed on master, create a new issue or comment here if you think I'm wrong.
Confirming that on pydantic 1.0b2 I can remove any of the validator keyword parameters to get the desired result. From the last example:
- @validator('b', pre=True, always=True, whole=True)
+ @validator('b')
And the output:
$ python repro.py
Traceback (most recent call last):
File "repro.py", line 16, in <module>
mm = MyModel()
File "<string>", line 4, in __init__
File "pyenv/lib/python3.7/site-packages/pydantic/dataclasses.py", line 77, in _pydantic_post_init
raise validation_error
pydantic.error_wrappers.ValidationError: 1 validation error for MyModel
b
either a or b is required (type=value_error)
Thanks @samuelcolvin & others! I couldn't figure out which commit exactly solved this, but happy with the path to 1.0 release.
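For anyone landing here later, a consolidated sketch of the check the thread converged on, written against BaseModel instead of the dataclass and assuming the pydantic 1.x API. Two things here are additions rather than something from the thread: the try/except import (so it also runs where pydantic 2.x is installed, via pydantic.v1) and the error_messages helper, which is a made-up name. With BaseModel a field that is missing from the input is not validated by default, so always=True is kept.

```python
from typing import Optional

try:
    # pydantic 2.x ships the 1.x API under pydantic.v1
    from pydantic.v1 import BaseModel, ValidationError, validator
except ImportError:
    # plain pydantic 1.x
    from pydantic import BaseModel, ValidationError, validator


class MyModel(BaseModel):
    a: Optional[str] = None
    b: Optional[str] = None

    # always=True: run the check even when "b" is missing from the input.
    @validator('b', always=True)
    def check_a_or_b(cls, b, values):
        # "a" is declared first, so it is already in values by the time "b" is checked.
        if not values.get('a') and not b:
            raise ValueError('either a or b is required')
        return b


def error_messages(**data):
    """Hypothetical helper: return the validation messages, or [] if the input is valid."""
    try:
        MyModel(**data)
    except ValidationError as exc:
        return [err['msg'] for err in exc.errors()]
    return []


print(error_messages())
print(error_messages(a='x'))
print(error_messages(b='y'))
```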