Marshmallow: Merge nested data with the same 'data_key'

Created on 26 Sep 2018 · 8 Comments · Source: marshmallow-code/marshmallow

from marshmallow import Schema, fields, EXCLUDE
from pprint import pprint

data = {
    'username': 'John',
    'address': {
        'phone': '88000111000',
        'street': 'Kolobko Str',
        'home': 15,
    },
}

class Address:
    def __init__(self, *args, **kwargs):
        self.street = kwargs.get('street')
        self.home = kwargs.get('home')

class Contacts:
    def __init__(self, *args, **kwargs):
        self.phone = kwargs.get('phone')

class User:
    def __init__(self, *args, **kwargs):
        self.name = kwargs.get('name')
        self.address = kwargs.get('address')
        self.contacts = kwargs.get('contacts')


class AddressSchema(Schema):
    street = fields.String()
    home = fields.Integer()

class ContactsSchema(Schema):
    phone = fields.String()

class UserSchema(Schema):

    name = fields.String(data_key='username')
    address = fields.Nested(AddressSchema, only=['home', 'street'], unknown=EXCLUDE)
    contacts = fields.Nested(ContactsSchema, only=['phone'], data_key='address', unknown=EXCLUDE)

schema = UserSchema()
user = schema.load(data)
res = schema.dump(user)
pprint(res, indent=2)

The code above produces different results from one run to the next:

➜ python test_marshmallow/test_nested_fields.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}

➜ python test_marshmallow/test_nested_fields.py
{'address': {'phone': '88000111000'}, 'username': 'John'}

while the expected result is:

{
    'username': 'John',
    'address': {
        'home': 15, 
        'street': 'Kolobko Str',
        'phone': '88000111000'
    }
}

marshmallow==3.0.0b16

Labels: backwards incompat, enhancement, help wanted

All 8 comments

Nice.

21:10:34$ python test.py
{'address': {'phone': '88000111000'}, 'username': 'John'}
21:10:37$ python test.py
{'address': {'phone': '88000111000'}, 'username': 'John'}
21:10:39$ python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}

The third run returned a different result. Here is an explanation: https://stackoverflow.com/questions/20192950/why-items-order-in-a-dictionary-changed-in-python
And an example with a fixed hash seed, where the result no longer changes:

21:17:19$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:02$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:04$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:05$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:06$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:06$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:07$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:08$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:09$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:10$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
21:18:11$ PYTHONHASHSEED=1 python test.py
{'address': {'home': 15, 'street': 'Kolobko Str'}, 'username': 'John'}
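The effect of PYTHONHASHSEED on iteration order can be reproduced without marshmallow. This is only a sketch: it assumes the varying output comes from iterating a hash-based container of field names, which the thread does not confirm. The helper name set_order and the three-name set are made up for illustration.

```python
import os
import subprocess
import sys

# Printed in a child process so each run gets its own hash seed.
SNIPPET = "print(list({'address', 'contacts', 'name'}))"


def set_order(seed):
    """Return the iteration order of a small set of strings for a given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    completed = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# The same seed always gives the same order; different seeds may differ.
for seed in (1, 1, 2, 3):
    print(seed, set_order(seed))
```

With a fixed seed the order is stable, which matches the identical output in the runs above.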

Merging sounds smelly (if that makes sense) and would not resolve the ordering issue: as in a dict update, whoever speaks last wins.
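To make the dict update comparison concrete, here is a plain-Python illustration (not marshmallow internals) of two writes to the same key, using the values from the issue:

```python
result = {}

# The first nested field writes its output under the shared key.
result.update({'address': {'home': 15, 'street': 'Kolobko Str'}})

# The second field uses the same key and replaces the whole value;
# nothing from the first write is merged in.
result.update({'address': {'phone': '88000111000'}})

print(result)  # {'address': {'phone': '88000111000'}}
```

Whichever field is processed last determines the final value, so the outcome depends on processing order.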

However, we could add a warning when a data_key is set multiple times in the same schema, or when it conflicts with a field name. The same goes for attribute.

Let's tag this as enhancement.

Or rather, raise an Exception.

Having this raise an exception seems like it's the correct solution, given that this is essentially a configuration error. Ideally this would be raised during creation of the schema, helping with visibility.

Quietly (and non-deterministically) doing stuff to the data going through the schema is likely to result in serious and hard to understand bugs.

I just sent a PR to raise a ValueError in case of duplicate: https://github.com/marshmallow-code/marshmallow/pull/992.
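For readers who want a feel for what such a check involves, here is a rough sketch. It is not the code from the PR: check_unique_data_keys is a hypothetical helper that works on a plain mapping of field name to data_key, and it assumes a field without a data_key falls back to its own name.

```python
from collections import Counter


def check_unique_data_keys(data_keys):
    """Raise ValueError if two fields resolve to the same key.

    data_keys maps field name -> data_key (None when not set).
    """
    # A field without an explicit data_key is keyed by its own name.
    effective = [
        key if key is not None else name
        for name, key in data_keys.items()
    ]
    duplicates = sorted(k for k, n in Counter(effective).items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate data_key values: {duplicates}")


# Mirrors UserSchema from the issue: 'address' and 'contacts' collide.
try:
    check_unique_data_keys(
        {'name': 'username', 'address': None, 'contacts': 'address'}
    )
except ValueError as exc:
    print(exc)
```

Running a check like this when the schema is created, rather than on load or dump, surfaces the configuration error early, as suggested above.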

@ehles, @edelooff, please have a look and comment if you have the time.

Looks like a sane interface to me @lafrech :+1:

I've added one small comment on the PR, but that mostly relates to project style, which I can't really speak to.

Looks good, thanks!
