I'm trying to create a very flexible serializer, such that users can generate additional fields in the future. Let's say that today they only need the defaults I've provided:
class PostSerializer(Serializer):
    id = fields.String()
    title = fields.String(default="Untitled")
    body = fields.String(default=None)
    author = fields.List(fields.String)
The user creates several posts, and they decide they want a field for "category." I provide an interface where they set a new category field. Now perhaps I store this field in a dictionary.
additional_fields = {
    "category": "list"
}
When I modify the serializer on the fly (the only way that seems to work is via Meta.additional; setattr never seems to work):
s = PostSerializer
PostSerializer.Meta.additional = additional_fields.keys()
Posts which were created without the 'category' field will cause the following AttributeError:
AttributeError: "category" is not a valid field for {'id': '123456', 'title': 'Cool Post', 'body': 'Lorem Ipsum...', 'author': ['John', 'Steve']}
How can I maintain flexibility to add user generated fields, but also protect myself in the future? Is there a way to set a global default for additional fields?
I suppose I can check through the list of posts before serializing and add the missing attributes (set to None), but if there is a way to set a global default, that would be awesome.
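The pre-serialization workaround mentioned above might look like the following sketch. It is plain Python and does not touch marshmallow. It assumes the posts are dicts, as the error message suggests, and the sample post data is made up for illustration.

```python
# User-defined fields, as stored in the question's dictionary.
additional_fields = {"category": "list"}

# Hypothetical sample data: one post predates the "category" field.
posts = [
    {"id": "123456", "title": "Cool Post", "author": ["John", "Steve"]},
    {"id": "123457", "title": "Newer Post", "author": ["Ann"], "category": ["music"]},
]

# Give every post each additional field, defaulting to None when absent.
for post in posts:
    for name in additional_fields:
        post.setdefault(name, None)

print(posts[0]["category"])  # None
print(posts[1]["category"])  # ['music']
```

This keeps existing values intact and only fills the gaps, at the cost of one pass over the posts before every serialization.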
There are a couple of ways to handle your use case. Which one you choose will depend on your desired output.
Let's start with our "model" and a few instances:
class Post(object):
    def __init__(self, title):
        self.title = title
post_no_categories = Post('A post with no categories')
post_with_categories = Post('A post with categories')
post_with_categories.categories = ['music', 'video']
If you want the serialized output to always contain the additional fields and it's fine if they are None, you can use optional fields, e.g., set required=False on the fields.
from marshmallow import Serializer, fields, pprint
class PostSerializer(Serializer):
    title = fields.String(default='Untitled')
    categories = fields.List(fields.String, required=False)
PostSerializer(post_no_categories).data
# {"categories": null, "title": "A post with no categories"}
PostSerializer(post_with_categories).data
# {"categories": ["music", "video"], "title": "A post with categories"}
If you want to only include the additional fields if they are defined on an instance, you could do so with a custom data handler. This is perhaps more flexible than option 1. See the docs on Transforming Data.
class PostSerializer2(Serializer):
    title = fields.String(default='Untitled')
additional_fields = {
    'categories': fields.List(fields.String())
}
# Register a custom data handler that will add the extra fields
@PostSerializer2.data_handler
def add_additional_fields(serializer, data, obj):
    for name, field_obj in additional_fields.items():
        if hasattr(obj, name):
            data[name] = field_obj.output(name, obj)
    return data
PostSerializer2(post_no_categories).data
# {"title": "A post with no categories"}
PostSerializer2(post_with_categories).data
# {"title": "A post with categories", "categories": ["music", "video"]}
Hope that helps!
Yes! Option 2 is fantastic. However, I'm finding the data_handlers are really persistent.
I'm creating instances of the PostSerializer on the fly and altering the fields via URL arguments (e.g. ?include=category), and I notice that the data_handler sticks around.
So if on one query I request /posts?include=category so that the posts include the category, but on the next query I want to include only the subcategory /posts?include=subcategory, the category will still persist.
Is there a way to "zero out" the data_handler between requests?
It seems as though applying the decorator to a new instance of the class registered the handler on the parent class. I'm now creating a new child class and setting the decorators on the child, and that seems to be working much better. Thank you.
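A toy model of the behaviour described above, assuming handlers are stored on the class rather than on the instance. This is a stdlib-only sketch and not marshmallow's implementation; `ToySerializer` and `serializer_for` are hypothetical names.

```python
class ToySerializer(object):
    """Toy stand-in for a serializer whose handlers live on the class."""

    _handlers = []

    def __init__(self, obj):
        self.obj = obj

    @classmethod
    def data_handler(cls, func):
        # Registration mutates class-level state, so it outlives one request.
        cls._handlers.append(func)
        return func

    @property
    def data(self):
        result = {"title": self.obj["title"]}
        for handler in self._handlers:
            result = handler(self, result, self.obj)
        return result


def serializer_for(include):
    """Build a throwaway subclass that owns a fresh handler list."""
    child = type("RequestSerializer", (ToySerializer,), {"_handlers": []})

    @child.data_handler
    def add_included(serializer, data, obj):
        for name in include:
            if name in obj:
                data[name] = obj[name]
        return data

    return child


post = {"title": "Cool Post", "category": "music", "subcategory": "jazz"}
first = serializer_for(["category"])(post).data
second = serializer_for(["subcategory"])(post).data
print(first)   # {'title': 'Cool Post', 'category': 'music'}
print(second)  # {'title': 'Cool Post', 'subcategory': 'jazz'}
```

Because each request gets its own subclass, the handler registered for one query never leaks into the next, and the parent class stays untouched.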
Glad to hear that worked!
And what should we do now that data_handler was removed in 2.0, @sloria?
Thanks!
@skqr You can use a @post_dump method to do the same thing as data_handler.
Yet another alternative is to update a schema's fields on __init__.
from marshmallow import Schema, fields
class MySchema(Schema):
    def __init__(self, additional_fields=None, **kwargs):
        super().__init__(**kwargs)
        # Guard against the default of None before merging in the extra fields
        self.declared_fields.update(additional_fields or {})
additional_fields = {
    'foo': fields.Int()
}
sch = MySchema(additional_fields=additional_fields)
print(sch.dump({'foo': '123'}).data)  # {'foo': 123}
So tired. I cannot access the original object via post_dump.
My case: I have many models in my Django project with various text data, and I am going to integrate django-modeltranslation to internationalize them. I would rather not add every translated field to the schemas manually; I want to add them on the fly depending on settings.LANGUAGES, for example.
Some code:
class InterfaceI18N(models.Model):
    label = models.CharField('string id', max_length=150, db_index=True)
    # after django-modeltranslation integration this field will have copies for i18n purposes and this model will be extended by fields: value_de, value_en, value_fr and others
    value = models.TextField('value', default='')
    class Meta:
        verbose_name = 'interface string'
        verbose_name_plural = 'interface strings'
    def __str__(self):
        return self.label
class InterfaceI18NSchema(Schema):
    label = fields.String()
    value = fields.String()
    @post_dump
    def add_i18n_fields(self, *args, **kwargs):
        raise Exception('Can not add all value_XX fields for all languages :(')
What do you think about this case, @sloria? I cannot find a proper way to do it without writing all the i18n fields in the schema manually.
@MrYoda I would create a custom metaclass that takes a list of i18n field names, creates clones, and adds post_dump processors.
Although, unless you're writing a translation tool, the whole idea looks weird.
@MrYoda Here is an example of how to do that (if I understand the problem correctly):
import marshmallow as m
import marshmallow.fields as mf
from marshmallow.compat import with_metaclass
from collections import namedtuple, OrderedDict
LANGUAGES = ['en', 'de', 'fr']
class I18NMeta(type):
    def __new__(cls, name, bases, attrs):
        new_attrs = OrderedDict(attrs)
        if 'Meta' in attrs:
            # Loop variable is field_name so the class name argument is not shadowed
            for field_name in getattr(attrs['Meta'], 'i18n_fields', []):
                if field_name not in attrs:
                    continue
                field = attrs[field_name]
                for lang in LANGUAGES:
                    new_attrs['%s_%s' % (field_name, lang)] = field
        return type(name, bases, new_attrs)
class ModelSchema(with_metaclass(I18NMeta, m.Schema)):
    class Meta:
        ordered = True
        i18n_fields = ['value']
    value = mf.String()
Model = namedtuple('Model', ['value', 'value_en', 'value_de', 'value_fr'])
print(ModelSchema().dump(Model('Hello', 'Hello', 'Hallo', 'Bonjour')))
# => MarshalResult(data=OrderedDict([(u'value', u'Hello'), (u'value_en', u'Hello'), (u'value_de', u'Hallo'), (u'value_fr', u'Bonjour')]), errors={})
Alternatively, you can use a special type to mark localized strings:
import marshmallow as m
import marshmallow.fields as mf
from marshmallow.compat import with_metaclass, iteritems
from collections import namedtuple, OrderedDict
LANGUAGES = ['en', 'de', 'fr']
class LocalizedString(mf.String):
    pass
class I18NMeta(type):
    def __new__(cls, name, bases, attrs):
        new_attrs = OrderedDict(attrs)
        # Add one copy per language for every field marked as LocalizedString
        for field_name, field in iteritems(attrs):
            if not isinstance(field, LocalizedString):
                continue
            for lang in LANGUAGES:
                new_attrs['%s_%s' % (field_name, lang)] = field
        return type(name, bases, new_attrs)
class ModelSchema(with_metaclass(I18NMeta, m.Schema)):
    value = LocalizedString()
Model = namedtuple('Model', ['value', 'value_en', 'value_de', 'value_fr'])
print(ModelSchema().dump(Model('Hello', 'Hello', 'Hallo', 'Bonjour')))
@maximkulkin Thank you for the very good snippet!
Another data point for the record, in case someone else runs into this problem.
I had the same AttributeError: description is not a valid field for <some.Model object>. For me the problem was that the object had a description @property and this method tried to get a related item from a database and get the description from there. When no object was found, this gave an AttributeError. I fixed it by returning None in that case.
In other words: the real error might be hiding behind the AttributeError raised by marshmallow.
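The masking effect can be reproduced without marshmallow. In this sketch, `Model` and `SafeModel` are hypothetical classes, and `related = None` stands in for the missing database row.

```python
class Model(object):
    """Hypothetical model whose property fails internally."""

    related = None  # stands in for a related row that was not found

    @property
    def description(self):
        # self.related is None, so this raises AttributeError from inside
        # the property, even though the property itself exists.
        return self.related.description


class SafeModel(Model):
    @property
    def description(self):
        # Return None when the related item is absent.
        if self.related is None:
            return None
        return self.related.description


print(hasattr(Model(), "description"))      # False
print(hasattr(SafeModel(), "description"))  # True
print(SafeModel().description)              # None
```

Any code that looks attributes up with `hasattr` or `getattr` sees the first model as if `description` did not exist at all, which is why the reported message points at the field name rather than at the real failure.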