Marshmallow: Dump Validation

Created on 7 Feb 2019 · 11 Comments · Source: marshmallow-code/marshmallow

Hi everyone!

I'm having some trouble with dump. I want the library to raise an error when an attribute declared in the schema is missing from the object.

I wrote the test below to illustrate. The assert fails because no error is raised.

Is this a bug in dump?

```python
from marshmallow import Schema, fields


def test_dump():
    class TesteClass:
        def __init__(self):
            self.variable_a = 1  # variable_b is deliberately missing

    class TesteSchemaClass(Schema):
        variable_a = fields.Integer(required=True)
        variable_b = fields.Integer(required=True)

    teste = TesteClass()
    schema = TesteSchemaClass()

    # marshmallow 2.x: dump() returns a (data, errors) tuple
    data, error = schema.dump(teste)

    # Fails: dump() does not report the missing required field
    assert len(error.keys()) == 1
```

All 11 comments

It is a design choice to only validate on load.

Schema.dump() also returns a dictionary of errors, which will include any ValidationErrors raised during serialization. However, required, allow_none, validate, @validates, and @validates_schema only apply during deserialization.

https://marshmallow.readthedocs.io/en/3.0/quickstart.html#validation

Not being able to trust the structure of the object is usually a sign that it is actually data that needs to be loaded. You can use schema.validate(data), but under the hood it is doing the same thing as schema.load(data), it just returns the errors instead of raising them.

https://marshmallow.readthedocs.io/en/3.0/api_reference.html#marshmallow.Schema.validate

Hi @lafrech, @deckar01, thanks for the help! I think using schema.validate will be enough for me.

An option to run validation during dump would be a great addition in the future!

Sorry to bump an old topic.

> It is a design choice to only validate on load.

I'm trying to provide a library that prevents users from returning data that doesn't match what they declared, so I need to validate what they send back. I expected dump to do this (wrongly; there is a warning about it in the validation docs).

In the meantime I can use load then dump, so this is not blocking, but I'd like to know whether it would be acceptable to add a validate=bool parameter to dump and dumps. We'd keep the default performance-friendly behaviour, of course. As a bonus, that would help make it clear that validation is not done on dump, even without looking at the docs.

I think the main objection to this is that it requires a significant refactor.

See https://github.com/marshmallow-code/marshmallow/issues/1190#issuecomment-532826813 and below.

We could open a dedicated issue for this refactor with milestone 4.0.

Does it? That looks like a more general issue about loading. I was merely asking about a solution like this:

def dump(..., validate=False):
    if validate:
        self.load(...)
    # rest of the actual .dump() method

Not to say that the topic you linked isn't important, by the way. It just feels like a different issue to me (after a quick read).

You can't really just call self.load() to validate. This won't work in many cases (data_key, attribute, pre/post processors, ...).

We'd need to make validation independent from load, so that it can be called from load or dump.

Hm, so maybe I misunderstood this statement:

> You can use schema.validate(data), but under the hood it is doing the same thing as schema.load(data)

If load doesn't really validate, does that mean that load itself is broken? If so, how does one properly validate if we cannot rely on load, dump, or validate?

From the issue you linked, it looks like you want a stricter separation of concerns, which would be great. However, wouldn't it be OK, for the moment, to use the same kinda-broken behaviour in both situations (that is, loading and dumping)?

load validates and is not broken. validate loads, which means it requires data in a serialized format. Loading deserialized data is not a supported use case, but will work for certain simple types.

The correct way to do this is to dump the data, then load it. If the data types are wrong you will probably get errors during dump. This is how it always worked, you just get the real errors instead of ValidationErrors now. If the values are wrong, you will get ValidationErrors during the load operation.

Sorry to answer again here, but I think it's relevant: the proposed solution isn't quite correct. dump seems to silently correct some types. See the following example:

from marshmallow import Schema, fields

class A(Schema):
    test = fields.Str()

a = A()
# dump() coerces 3 to "3", so the load() below succeeds
a.load(a.dump({"test": 3}))  # same with {"test": {}}

This does not raise, because dump transforms 3 into "3", whereas loading {"test": 3} directly raises a ValidationError. This might be intentional, as it is probably simpler to coerce to string automatically, but when the goal is to validate as load would, it's not good :(
