Marshmallow: @post_load function not invoked if data dict is updated from @pre_load?

Created on 11 Sep 2016 · 13 Comments · Source: marshmallow-code/marshmallow

Hi guys,

I'm new to marshmallow and I'm trying to deserialize JSON data where any key can potentially have a 'None' string value (in place of the JSON null type).

I'd like to set these to Python's None type and after carefully reading the docs it seems like @pre_load would be the way to do this. I'd like to return a POPO as a result, so I also have a @post_load for that.

The problem I'm experiencing is that, once I make an update to the data dict in @pre_load, I get a dict back in result instead of an instance of class User.

If @pre_load doesn't touch the data (for example, if we pass in user_json2 which has no 'None' strings) then the result is an instance of class User. As soon as we update a key, we're getting back a dictionary. It would appear that post_load is not being invoked.

Here's the example code:

# Our Plain Old Python Object (POPO)
class User(object):
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

# Marshmallow Schema
from marshmallow import Schema, fields, pre_load, post_load
class UserSchema(Schema):
    firstName = fields.Str(load_from='FIRST_NAME')
    middleNames = fields.Str(load_from='MIDDLE_NAMES')
    lastName = fields.Str(load_from='LAST_NAME')
    fullLegalName = fields.Str(load_from='FULL_LEGAL_NAME')
    knownAs = fields.Str(load_from='KNOWN_AS')
    emailAddress = fields.Str(load_from='EMAIL_ADDRESS')

    @pre_load
    def NoneToNoneType(self, data):
        # Replace the literal string 'None' with Python's None
        for key, value in data.items():
            if value == 'None':
                data[key] = None
        return data

    @post_load
    def OutputDomainObject(self, data):
        # This print statement is never shown if user_json is passed in...
        print('post_load invoked!')
        return User(**data)

user_json = '{"FIRST_NAME": "Kali","LAST_NAME": "Subra","KNOWN_AS": "None","FULL_LEGAL_NAME": "Kali Subra","EMAIL_ADDRESS": "None","MIDDLE_NAMES": "None"}'
user_json2 = '{"FIRST_NAME": "Kali","LAST_NAME": "Subra","KNOWN_AS": "Z","FULL_LEGAL_NAME": "Kali Subra","EMAIL_ADDRESS": "Z","MIDDLE_NAMES": "Z"}'

user_schema = UserSchema()
result = user_schema.loads(user_json)

All 13 comments

@nochristrequired

I ran into a similar issue, which may be related to yours, and traced it to existing schema errors.

Inside marshmallow/schema.py just before post processors run you will see:

if not errors and postprocess:

Schema errors prevent the post processors from running. Address your schema errors and @post_load will fire.
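To make the effect of that check concrete, here is a minimal sketch in plain Python. It is not marshmallow's actual code: the `load`, `require_str`, and `make_user` names are hypothetical and exist only to show how a hook gated on "no errors" leaves you with a plain dict when any field fails validation.

```python
def load(data, validators, post_load=None):
    """Toy loader: validate each field, then run post_load only if nothing failed."""
    result, errors = {}, {}
    for key, validate in validators.items():
        try:
            result[key] = validate(data.get(key))
        except ValueError as exc:
            errors[key] = [str(exc)]
    # Same shape as the check quoted above: the hook is skipped when errors exist.
    if not errors and post_load is not None:
        result = post_load(result)
    return result, errors


def require_str(value):
    """Hypothetical validator that rejects anything that is not a string."""
    if not isinstance(value, str):
        raise ValueError("Not a valid string.")
    return value


def make_user(data):
    """Stand-in for a post_load hook that builds a domain object."""
    return ("User", data)


validators = {"firstName": require_str, "knownAs": require_str}

ok_result, ok_errors = load({"firstName": "Kali", "knownAs": "Z"}, validators, make_user)
bad_result, bad_errors = load({"firstName": "Kali", "knownAs": None}, validators, make_user)
```

In the second call the `None` value fails validation, so `bad_result` stays a partial dict and the reason only shows up in `bad_errors`.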

@amcclanaghan is correct; post_load methods will only execute if the data are valid. In your example, None is considered invalid input to your fields, so post_load won't execute. You can pass allow_none=True to your fields if you consider None valid input.

I know this is an old bug, but I must say it's not very helpful that it merely silently fails when there are schema errors. It makes it very hard to debug what's going on when your schema returns a dictionary instead of an object, unless you already know this behaviour.

In marshmallow 3, errors always raise an exception, so it won't silently fail.

class InventorySchema(Schema):

    Id = fields.Int()
    Company = fields.Int()
    LastModified = fields.DateTime()
    Name = fields.Str()
    Value = fields.Str()
    Description = fields.Str()
    Service = fields.Str()
    DateOfResults = fields.DateTime()

    @post_load
    def toModel(self, data):
        return Inventory(**data)

    @post_dump
    def toDict(self, data):
        return dict(data)

>>> row
{'Id': 1, 'Company': 1, 'LastModified': '2018-04-09T18:14:32', 'Service': 'EC2', 'Name': 'Running Instances', 'Value': '4', 'Description': 'Across 8 Regions', 'DateOfResults': '2014-08-18T20:31:21'}
>>> try:
...     item = InventorySchema().load(row)
... except ValidationError as e:
...     print(e)
...
...
>>> item
{'Id': 1}
>>>

I could be mistaken but I'd consider this a silent failure. While this is fixed with a pre_load, should I _have_ to pre_load in order to post_load to catch this issue?

I'm sorry @ptdel, I don't understand.

What's the validation error, here? Can you please provide a complete example? Which version of marshmallow are you using?

3.0.0b8

In the above example I have a post_load and a post_dump decorator defined. A load operation should return a class instantiated with the row, and a dump should return the row as a dict. Despite doing a load operation, I get a dict back (i.e. I get the result of post_dump and not post_load). The load operation (that somehow results in a dump?) doesn't return any errors.

I don't think the post_dump is needed. (By default) the Schema returns a dict anyway. That's the point of serialization.

I don't understand your issue and I might be totally mistaken, but I think it is not related to the OP, which was apparently due to a validation error preventing the post_load method from being called. Is there a validation error in your case? AFAIU, no.

So I don't understand what happens. How come only Id appears in the result?

Do you pass through toModel? What's the definition of class Inventory?

I might be hitting a different issue, good point. In this instance, though, I'm not trying to just return a dict; this should return an instance of Inventory(), not a dict(). I get the dict though, if you catch my drift :)

    __tablename__ = 'inventory'
    __table_args__ = {'mysql_engine': 'InnoDB', 'mysql_charset': 'utf8'}

    Id = Column(Integer, primary_key=True)
    Company = Column(Integer)
    LastModified = Column(DateTime)
    Name = Column(String(255))
    Value = Column(String(255))
    Description = Column(String(255))
    Service = Column(String(255))
    DateOfResults = Column(DateTime)

As to why only Id shows up? I would really like to know. I figured that maybe some validation had failed and those invalid fields (though I don't see anything invalid) were being silently left out of the error messages list.

I ran the example with 3.0.0b8 and did not get the reported issue.

from marshmallow import Schema, fields, post_load, post_dump

class Inventory:
    def __init__(self, **data):
        pass

class InventorySchema(Schema):
    Id = fields.Int()
    Company = fields.Int()
    LastModified = fields.DateTime()
    Name = fields.Str()
    Value = fields.Str()
    Description = fields.Str()
    Service = fields.Str()
    DateOfResults = fields.DateTime()

    @post_load
    def toModel(self, data):
        return Inventory(**data)

    @post_dump
    def toDict(self, data):
        return dict(data)

row = {
  'Id': 1, 'Company': 1, 'LastModified': '2018-04-09T18:14:32',
  'Service': 'EC2', 'Name': 'Running Instances', 'Value': '4',
  'Description': 'Across 8 Regions',
  'DateOfResults': '2014-08-18T20:31:21'
}

item = InventorySchema().load(row)

item
# <__main__.Inventory object at 0x1045a7400>

@deckar01 I was about to say "prepare to fight" because I just ran it yesterday, but as I run it again from the REPL:

>>> row = {
...   'Id': 1, 'Company': 1, 'LastModified': '2018-04-09T18:14:32',
...   'Service': 'EC2', 'Name': 'Running Instances', 'Value': '4',
...   'Description': 'Across 8 Regions',
...   'DateOfResults': '2014-08-18T20:31:21'
... }
>>> item = InventorySchema().load(row)
>>> item
<__console__.Inventory object at 0x7f551f3f4da0>
>>> try:
...     item2 = InventorySchema().load(row)
... except ValidationError as e:
...     print(e)
...
...
>>> item2
<__console__.Inventory object at 0x7f551f7c85c0>
>>>

I have no idea what happened here but it must be something transient that I managed to bork. Disregard.

Maybe a SQLAlchemy issue?

Anyway, I still think the post_dump is useless.

That would be my guess. I'm using scoped sessions, so it might be that I instantiated that class outside of the session or something and it got lost. Sorry to hassle y'all, I don't think this is a Marshmallow issue.
