When using Nested fields, passing a single string field to only does not work as expected when applying a transform with on_bind_field
from marshmallow import fields, Schema
def test_nested_only_on_bind_field():
def to_camel_case(snake_str):
components = snake_str.split('_')
return components[0] + ''.join(x.title() for x in components[1:])
class CamelCaseSchema(Schema):
def on_bind_field(self, field_name, field_obj):
field_obj.data_key = to_camel_case(field_name)
class UserRoleSchema(CamelCaseSchema):
user_id = fields.Number()
role_name = fields.String()
class UserSchema(CamelCaseSchema):
id = fields.Number()
roles = fields.Nested(UserRoleSchema, many=True, only='role_name')
user = {
'id': 1,
'roles': [
{'user_id': 1, 'role_name': 'admin'},
{'user_id': 1, 'role_name': 'test'},
]
}
serialized_user = UserSchema().dumps(user)
assert serialized_user == {"id": 1, "roles": ["admin", "test"]}
This results in the following error:
/env/lib/python3.6/site-packages/marshmallow/schema.py:476: in dumps
serialized = self.dump(obj, many=many, update_fields=update_fields)
/env/lib/python3.6/site-packages/marshmallow/schema.py:428: in dump
**kwargs
/env/lib/python3.6/site-packages/marshmallow/marshalling.py:147: in serialize
index=(index if index_errors else None)
/env/lib/python3.6/site-packages/marshmallow/marshalling.py:68: in call_and_store
value = getter_func(data)
/env/lib/python3.6/site-packages/marshmallow/marshalling.py:141: in <lambda>
getter = lambda d: field_obj.serialize(attr_name, d, accessor=accessor)
/env/lib/python3.6/site-packages/marshmallow/fields.py:250: in serialize
return self._serialize(value, attr, obj)
/env/lib/python3.6/site-packages/marshmallow/fields.py:455: in _serialize
return utils.pluck(ret, key=self.only)
/env/lib/python3.6/site-packages/marshmallow/utils.py:318: in pluck
return [d[key] for d in dictlist]
KeyError: 'role_name'
This seems to be only affecting the single string case, as passing only=['role_name'] works as expected.
Would it make sense to remove this support entirely for v3? ref https://github.com/marshmallow-code/marshmallow/issues/802
For example, only on Schema does not support a string at all.
I was able to reproduce this error in v3.0.0b10. I reproduced the same issue in v2.15.2 using dump_to.
Here is a simplified repro case (for v3.0.0b10):
from marshmallow import fields, Schema
class UserRoleSchema(Schema):
user_id = fields.Number()
role_name = fields.String(data_key='roleName')
class UserSchema(Schema):
id = fields.Number()
roles = fields.Nested(UserRoleSchema, many=True, only='role_name')
user = {
'id': 1,
'roles': [
{'user_id': 1, 'role_name': 'admin'},
{'user_id': 1, 'role_name': 'test'},
]
}
serialized_user = UserSchema().dumps(user)
The issue appears to be that only is being used by Schema.__filter_fields() as a key into the unserialized data and by Nested._serialize() as a key into the serialized data. Nested_serialize() needs to use the same data_key logic as Marshaller.serialize().
Would it make sense to remove this support entirely for v3?
I think modulating the Nested functionality this much based on the type of a parameter is confusing. The feature is useful though.
What about a dedicated field type to encapsulate this behavior?
class UserSchema(Schema):
id = fields.Number()
roles = fields.Pluck(UserRoleSchema, 'role_name', many=True)
I think that interface would communicate the behavior without needing to dig into the docs.
Sorry, I don't mean removing only, I mean removing supporting a string for only, as this isn't supported on Schema. In other words, instead of supporting:
child = fields.Nested(ChildSchema, only='field_name')
Support only:
child = fields.Nested(ChildSchema, only=('field_name',))
@taion That is how I interpreted your original comment. Using only as a string isn't a shortcut for keeping one field, it actually replaces the object with a field value. I agree with getting rid of the string option for only, but I think we should add an interface to replace and continue supporting the pluck use case.
It's confusing that only can be a list of strings or a string, and it is even more confusing that a string is not just a shortcut for a list containing a single string.
Plucking a single field is a different feature than embedding a schema, and using the same Field is strange. If that use case must be supported, a different Field like Pluck type sounds better suited.
I think modulating the Nested functionality this much based on the type of a parameter is confusing. The feature is useful though.
It's confusing that only can be a list of strings or a string, and it is even more confusing that a string is not just a shortcut for a list containing a single string...Plucking a single field is a different feature than embedding a schema, and using the same Field is strange.
Valid points. In retrospect, the current API is awkward.
I think @deckar01 's fields.Pluck proposal makes sense. Let's go with it.
Oh, I totally misread. Sorry!
As a side note, if we go ahead with https://github.com/marshmallow-code/marshmallow/pull/810, then __filter_fields is gone anyway. Though I agree that fields.Pluck makes more sense regardless.
Most helpful comment
I think modulating the
Nestedfunctionality this much based on the type of a parameter is confusing. The feature is useful though.What about a dedicated field type to encapsulate this behavior?
I think that interface would communicate the behavior without needing to dig into the docs.