Pydantic: `each_item` should iterate only at the top-level instead of going through nested items

Created on 17 Sep 2020  路  8Comments  路  Source: samuelcolvin/pydantic

Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":

             pydantic version: 1.5.1
            pydantic compiled: True
                 install path: ...snip...
               python version: 3.6.9 (default, Jun 26 2020, 13:24:16)  [GCC 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.16)]
                     platform: Darwin-18.7.0-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions']

I'm seeing what appears to be inconsistent behavior with each_item=True when I'm using a field that's a list of tuples.

For instance, a 'normal' model illustrates what seems to be logical behavior.

from typing import List, Tuple
from pydantic import BaseModel, validator

class Thing(BaseModel):
    things: List[str]

    @validator("things", each_item=False)
    def validate_all_things(cls, val):
        print(f"all vals: {val}")
        return val

    @validator("things", each_item=True)
    def validate_single_thing(cls, val):
        print(f"single val: {val}")
        return val


>>> Thing(things=["one", "two", "three,four"])

single val: one
single val: two
single val: three,four
all vals: ['one', 'two', 'three,four']
Thing(things=['one', 'two', 'three,four'])

But if the field is anything like List[Tuple[str], str]], then behavior is absolutely different.

class Thing2(BaseModel):
    things: List[Tuple[str, str]]

    @validator("things", each_item=False)
    def validate_all_things(cls, val):
        print(f"all vals: {val}")
        return val

    @validator("things", each_item=True)
    def validate_single_thing(cls, val):
        print(f"single val: {val}")
        return val


>>> Thing2(things=[("one", "two"), ("three", "four")])

single val: one
single val: two
single val: three
single val: four
all vals: [('one', 'two'), ('three', 'four')]
Thing2(things=[('one', 'two'), ('three', 'four')])

The output from validate_all_things makes perfect sense - it's just the entire list passed in.

I would expect, however, that each_item would do just what it implies, namely calling the validator on each item of the list no matter what the item is. Instead, it's operating on every value within each tuple within the list, and I cannot get it to perform in a way where val == ("one", "two").

Is this expected behavior? If so, why?

I mean, strings are iterable just like tuples, so why isn't each_item iterating over the characters of each string in the first case?

bug

All 8 comments

Hello @kataev
I agree the behaviour is not the expected one and for me this is a bug.

I can see it both ways. But my main problem is that this could well be a breaking change for some people who rely on this behaviour.

Doesn't this strictly speaking, need to wait for v2?

Good question @samuelcolvin. TBH I was so surprised when I saw this issue I really can't imagine this behaviour to be the expected one. Unfortunately it's always hard to make changes that could break someone case but I'm quite sure it's easily fixable.
When I think about it if we really want to be safe we could do what you suggested for improved unions: some kind of feature flag that would be all removed in v2.

@samuelcolvin Good point, I can actually see existing behavior be something that's relied on.. I'm normally not a fan of "just add another kwarg" so adding some keyword like validate_nested that defaults to False could work but might be a bit smelly... I dunno.

from typing import List
from pydantic import BaseModel, validator


class Thing(BaseModel):
    things: List[List[str]]

    # Existing behavior: operates on whole value for property
    @validator("things", each_item=False)
    def one(cls, val):
        assert len(val) > 1, "List must have multiple things in it"
        return val

    # Existing behavior: iterates through nested lists of property
    @validator("things", each_item=True)
    def two(cls, val):
        assert val, "Strings cannot be empty"
        return val

    # NEW BEHAVIOR: opt out of iterating through nested lists
    @validator("things", each_item=True, validate_nested=False)
    def three(cls, val):
        assert len(val) > 1, "Nested list must have multiple things in it"
        return val

# Invalid because of validator `one`
Thing(things=[["a", "b"]])

# Invalid because of validator `two`
Thing(things=[["a",  ""], ["c", "d"]])

# Invalid because of validator `three`
# Currently can't do this without iterating manually in a validator with `each_item=False`
Thing(things=[["a", "b"], ["c"]])

# Valid
Thing(things=[["a", "b"], ["c", "d"]])

We've been bitten by this a few times.
It's especially tricky because people tend to check their validators with the expected behaviour (simple list for example) and it works as expected, and then get really confused by the bugs that occur when any sort of nested data is passed.
I can't really imagine anyone actually expecting this from the get go, but I guess people could rely on this behaviour after figuring it out.
If this can't be fixed until 2.0 perhaps it would be good to add a big warning in the docs on how this works?

Any resolution probably also resolves #1255 just to link the issue

Thanks for the other issue @daviskirk I linked it to my PR as well!
The more I think about it the more I'm convinced it's not the expected behaviour. But we could add an extra kwarg next to each_item like recursive or something else.
v1.8 would hence have a breaking change but the fix would be very easy for people relying on the recursive behaviour: just add this kwarg in the validator.
@samuelcolvin WDYT?

Ah, thanks for linking @daviskirk - I searched prior to adding issue but didn't see that one. Should we close mine @samuelcolvin?

Sorry for the slow reply

Thanks for the other issue @daviskirk I linked it to my PR as well!
The more I think about it the more I'm convinced it's not the expected behaviour. But we could add an extra kwarg next to each_item like recursive or something else.
v1.8 would hence have a breaking change but the fix would be very easy for people relying on the recursive behaviour: just add this kwarg in the validator.
@samuelcolvin WDYT?

I think this is the most sensible solution. @PrettyWood if you update #1991 I'll merge it.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

sommd picture sommd  路  3Comments

gangefors picture gangefors  路  3Comments

nav picture nav  路  3Comments

dmontagu picture dmontagu  路  3Comments

timonbimon picture timonbimon  路  3Comments