Marshmallow: Dump Validation

Created on 7 Feb 2019 · 11 Comments · Source: marshmallow-code/marshmallow

Hi everyone!

I'm having some trouble with dump. I want the library to raise an error when an attribute declared in the schema is missing from the object.

I wrote the test below to illustrate; the assert fails because nothing is raised.

Is this a bug in dump?

```python
from marshmallow import Schema, fields


def test_dump():
    class TesteClass:
        def __init__(self):
            self.variable_a = 1  # variable_b is never set

    class TesteSchemaClass(Schema):
        variable_a = fields.Integer(required=True)
        variable_b = fields.Integer(required=True)

    teste = TesteClass()
    schema = TesteSchemaClass()

    # Tuple unpacking as in marshmallow 2.x, where dump() returns (data, errors)
    data, error = schema.dump(teste)

    # Fails: no error is reported for the missing variable_b
    assert len(error.keys()) == 1
```

question

Bernardoow

Most helpful comment

The correct way to do this is to dump the data, then load it. If the data types are wrong you will probably get errors during dump. This is how it always worked, you just get the real errors instead of ValidationErrors now. If the values are wrong, you will get ValidationErrors during the load operation.

deckar01 on 24 Mar 2020

👍3

All 11 comments

It is a design choice to only validate on load.

lafrech on 7 Feb 2019

Schema.dump() also returns a dictionary of errors, which will include any ValidationErrors raised during serialization. However, required, allow_none, validate, @validates, and @validates_schema only apply during deserialization.

https://marshmallow.readthedocs.io/en/3.0/quickstart.html#validation

Not being able to trust the structure of the object is usually a sign that it is actually data that needs to be loaded. You can use schema.validate(data), but under the hood it is doing the same thing as schema.load(data), it just returns the errors instead of raising them.

https://marshmallow.readthedocs.io/en/3.0/api_reference.html#marshmallow.Schema.validate

deckar01 on 7 Feb 2019

👍2

Hi @lafrech, @deckar01, thanks for the help! I think using schema.validate will be enough for me.

In the future, an option to run validation when using dump would be great!

Bernardoow on 7 Feb 2019

Sorry to bump an old topic.

> It is a design choice to only validate on load.

I'm trying to provide a library that prevents users from returning data that doesn't match what they declared, so I need to validate what they send back. I expected dump to do this (wrongly, as a warning in the validation docs points out).

In the meantime I can use load then dump, so this is not blocking, but I'd like to know whether it would be acceptable to add a validate=bool parameter to dump and dumps. We'd keep the default performance-friendly behaviour, of course. As a bonus, it would help make it clear that validation is not done on dump, even without looking at the docs.

LukeMarlin on 24 Mar 2020

I think the main objection to this is that it requires a significant refactor.

See https://github.com/marshmallow-code/marshmallow/issues/1190#issuecomment-532826813 and below.

We could open a dedicated issue for this refactor with milestone 4.0.

lafrech on 24 Mar 2020

Does it? It looks like a "general" issue regarding loading, I was merely asking about a solution like so:

```python
def dump(..., validate=False):
    if validate:
        self.load(...)
    # remainder of the actual .dump() method
```

Not to say that the topic you linked isn't important, by the way. It just feels like a different issue to me (after a quick read)

LukeMarlin on 24 Mar 2020

You can't really just call self.load() to validate. This won't work in many cases (data_key, attribute, pre/post processors, ...).

We'd need to make validation independent from load, so that it can be called from load or dump.

lafrech on 24 Mar 2020

Hm so maybe I misunderstood this statement:

> You can use schema.validate(data), but under the hood it is doing the same thing as schema.load(data)

If load doesn't really validate, does that mean that load itself is broken? If yes, how does one properly validate if we cannot rely on load, dump and validate?

From the issue you linked, it looks like you want a stricter separation of concerns, which would be great. However, wouldn't it be OK, for the moment, to use the same kinda-broken behaviour for both situations (that is, loading and dumping)?

LukeMarlin on 24 Mar 2020

load validates and is not broken. validate loads, which means it requires data in a serialized format. Loading deserialized data is not a supported use case, but will work for certain simple types.

deckar01 on 24 Mar 2020

👍1

deckar01 on 24 Mar 2020

👍3

Sorry to answer again here, but I think it's relevant: the proposed solution isn't quite correct. dump seems to silently correct some types. See the following example:

```python
from marshmallow import Schema, fields


class A(Schema):
    test = fields.Str()


a = A()
a.load(a.dump({"test": 3}))  # same with {"test": {}}
```

This does not raise, because dump transforms 3 into "3", but if you loaded {"test": 3} directly you would get a ValidationError. This might be intentional, as it is probably simpler to coerce to string automatically, but for the present case of trying to validate as load would, it's not good :(

LukeMarlin on 9 Apr 2020
