Hi guys,
I'm new to marshmallow and I'm trying to deserialize JSON data where any key can potentially have a 'None' string value (in place of the JSON null type).
I'd like to set these to Python's None type and after carefully reading the docs it seems like @pre_load would be the way to do this. I'd like to return a POPO as a result, so I also have a @post_load for that.
The problem I'm experiencing is that, once I make an update to the data dict in @pre_load, I get a dictionary back instead of a User instance.
If @pre_load doesn't touch the data (for example, if we pass in user_json2 which has no 'None' strings) then the result is an instance of class User. As soon as we update a key, we're getting back a dictionary. It would appear that post_load is not being invoked.
Here's the example code-
#Our Plain Old Python Object (POPO)
class User(object):
    def __init__(self, **kwargs):
        for key, value in kwargs.iteritems():
            setattr(self, key, value)

#Marshmallow Schema
from marshmallow import Schema, fields, pre_load, post_load

class UserSchema(Schema):
    firstName = fields.Str(load_from='FIRST_NAME')
    middleNames = fields.Str(load_from='MIDDLE_NAMES')
    lastName = fields.Str(load_from='LAST_NAME')
    fullLegalName = fields.Str(load_from='FULL_LEGAL_NAME')
    knownAs = fields.Str(load_from='KNOWN_AS')
    emailAddress = fields.Str(load_from='EMAIL_ADDRESS')

    @pre_load
    def NoneToNoneType(self, data):
        for key, value in data.iteritems():
            if value == 'None':
                data[key] = None
        return data

    @post_load
    def OutputDomainObject(self, data):
        #This print statement is never shown if user_json is passed in...
        print('post_load invoked!')
        return User(**data)

user_json = '{"FIRST_NAME": "Kali","LAST_NAME": "Subra","KNOWN_AS": "None","FULL_LEGAL_NAME": "Kali Subra","EMAIL_ADDRESS": "None","MIDDLE_NAMES": "None"}'
user_json2 = '{"FIRST_NAME": "Kali","LAST_NAME": "Subra","KNOWN_AS": "Z","FULL_LEGAL_NAME": "Kali Subra","EMAIL_ADDRESS": "Z","MIDDLE_NAMES": "Z"}'

user_schema = UserSchema()
result = user_schema.loads(user_json)
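For reference, the 'None'-string cleanup itself does not depend on marshmallow. Below is a minimal Python 3 sketch that does it while decoding the JSON, using the standard library's `object_hook`. The helper name `none_strings_to_none` is made up for illustration, and the sketch assumes the 'None' strings only appear as object values (strings inside JSON arrays are left untouched).

```python
import json


def none_strings_to_none(obj):
    """Return a copy of a decoded JSON object with 'None' strings replaced by None."""
    return {key: (None if value == "None" else value) for key, value in obj.items()}


user_json = (
    '{"FIRST_NAME": "Kali", "LAST_NAME": "Subra", "KNOWN_AS": "None", '
    '"FULL_LEGAL_NAME": "Kali Subra", "EMAIL_ADDRESS": "None", "MIDDLE_NAMES": "None"}'
)

# json.loads calls object_hook for every decoded JSON object, nested ones included.
cleaned = json.loads(user_json, object_hook=none_strings_to_none)

print(cleaned["KNOWN_AS"])    # None
print(cleaned["FIRST_NAME"])  # Kali
```

The helper builds a new dict rather than assigning into the one being iterated, which avoids any questions about mutating a dict during iteration.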
@nochristrequired
I ran into a similar issue, which may be related to yours, and traced it to existing schema errors.
Inside marshmallow/schema.py just before post processors run you will see:
if not errors and postprocess:
Schema errors prevent the post processors from running. Address your schema errors and @post_load will fire.
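To make the effect of that check concrete, here is a toy model in plain Python. It is not marshmallow's implementation: `toy_load` and `require_str` are hypothetical names, and the only thing borrowed from the library is the shape of the `if not errors and postprocess:` condition quoted above.

```python
def require_str(value):
    """Hypothetical field validator: accept only strings, so None is rejected."""
    if not isinstance(value, str):
        raise ValueError("Not a valid string.")
    return value


def toy_load(data, post_load, postprocess=True):
    """Toy stand-in for a schema load; not marshmallow's actual code."""
    result, errors = {}, {}
    for key, value in data.items():
        try:
            result[key] = require_str(value)
        except ValueError as exc:
            errors[key] = [str(exc)]
    # Same shape as the quoted check: the hook is skipped as soon as any error exists.
    if not errors and postprocess:
        result = post_load(result)
    return result, errors


class User:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


ok, ok_errors = toy_load({"firstName": "Kali"}, lambda d: User(**d))
bad, bad_errors = toy_load({"firstName": "Kali", "knownAs": None}, lambda d: User(**d))

print(type(ok).__name__, ok_errors)    # User {}
print(type(bad).__name__, bad_errors)  # dict {'knownAs': ['Not a valid string.']}
```

In the second call the caller gets a plain dict back, and the only hint that something went wrong is the errors mapping returned alongside it.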
@amcclanaghan is correct; post_load methods will only execute if the data are valid. In your example, None is considered invalid input to your fields, so post_load won't execute. You can pass allow_none=True to your fields if you consider None valid input.
I know this is an old bug, but I must say it's not very helpful that it merely silently fails when there are schema errors. It makes it very hard to debug what's going on when your schema returns a dictionary instead of an object, unless you already know this behaviour.
In marshmallow 3, errors always raise an exception, so it won't silently fail.
class InventorySchema(Schema):
    Id = fields.Int()
    Company = fields.Int()
    LastModified = fields.DateTime()
    Name = fields.Str()
    Value = fields.Str()
    Description = fields.Str()
    Service = fields.Str()
    DateOfResults = fields.DateTime()

    @post_load
    def toModel(self, data):
        return Inventory(**data)

    @post_dump
    def toDict(self, data):
        return dict(data)
>>> row
{'Id': 1, 'Company': 1, 'LastModified': '2018-04-09T18:14:32', 'Service': 'EC2', 'Name': 'Running Instances', 'Value': '4', 'Description': 'Across 8 Regions', 'DateOfResults': '2014-08-18T20:31:21'}
>>> try:
...     item = InventorySchema().load(row)
... except ValidationError as e:
...     print(e)
...
...
>>> item
{'Id': 1}
>>>
I could be mistaken but I'd consider this a silent failure. While this is fixed with a pre_load, should I _have_ to pre_load in order to post_load to catch this issue?
I'm sorry @ptdel, I don't understand.
What's the validation error, here? Can you please provide a complete example? Which version of marshmallow are you using?
3.0.0b8
In the above example I have a post_load and a post_dump decorator defined. A load operation should return a class instantiated with the row, and a dump should return the row as a dict. Despite doing a load operation, I get a dict back (i.e. I get the result of post_dump and not post_load). The load operation (that somehow results in a dump?) doesn't return any errors.
I don't think the post_dump is needed. (By default) the Schema returns a dict anyway. That's the point of serialization.
I don't understand your issue and I might be totally mistaken, but I think it is not related to the OP, which was apparently due to a validation error preventing the post_load method from being called. Is there a validation error in your case? AFAIU, no.
So I don't understand what happens. How come only Id appears in the result?
Do you pass through toModel? What's the definition of class Inventory?
I might be hitting a different issue, good point. In this instance, though, I'm not trying to just return a dict; this should return an instance of Inventory(), not a dict(). I get the dict though, if you catch my drift :)
__tablename__ = 'inventory'
__table_args__ = {'mysql_engine': 'InnoDB', 'mysql_charset': 'utf8'}
Id = Column(Integer, primary_key=True)
Company = Column(Integer)
LastModified = Column(DateTime)
Name = Column(String(255))
Value = Column(String(255))
Description = Column(String(255))
Service = Column(String(255))
DateOfResults = Column(DateTime)
As to why only Id shows up? I would really like to know. I figured that maybe some validation had failed and those invalid fields (though I don't see anything invalid) were being silently left out of the error messages list.
I ran the example with 3.0.0b8 and did not get the reported issue.
from marshmallow import Schema, fields, post_load, post_dump

class Inventory:
    def __init__(self, **data):
        pass

class InventorySchema(Schema):
    Id = fields.Int()
    Company = fields.Int()
    LastModified = fields.DateTime()
    Name = fields.Str()
    Value = fields.Str()
    Description = fields.Str()
    Service = fields.Str()
    DateOfResults = fields.DateTime()

    @post_load
    def toModel(self, data):
        return Inventory(**data)

    @post_dump
    def toDict(self, data):
        return dict(data)

row = {
    'Id': 1, 'Company': 1, 'LastModified': '2018-04-09T18:14:32',
    'Service': 'EC2', 'Name': 'Running Instances', 'Value': '4',
    'Description': 'Across 8 Regions',
    'DateOfResults': '2014-08-18T20:31:21'
}

item = InventorySchema().load(row)
item
# <__main__.Inventory object at 0x1045a7400>
@deckar01 I was about to say "prepare to fight" because I just ran it yesterday, but as I run it again from the REPL:
>>> row = {
...     'Id': 1, 'Company': 1, 'LastModified': '2018-04-09T18:14:32',
...     'Service': 'EC2', 'Name': 'Running Instances', 'Value': '4',
...     'Description': 'Across 8 Regions',
...     'DateOfResults': '2014-08-18T20:31:21'
... }
>>> item = InventorySchema().load(row)
>>> item
<__console__.Inventory object at 0x7f551f3f4da0>
>>> try:
...     item2 = InventorySchema().load(row)
... except ValidationError as e:
...     print(e)
...
...
>>> item2
<__console__.Inventory object at 0x7f551f7c85c0>
>>>
I have no idea what happened here but it must be something transient that I managed to bork. Disregard.
Maybe a SQLAlchemy issue?
Anyway, I still think the post_dump is useless.
That would be my guess. I'm using scoped sessions, so it might be that I instantiated that class outside of the session or something and it got lost. Sorry to hassle y'all, I don't think this is a marshmallow issue.