For example, suppose we have the following JSON schema:
...
"metrics": {
  "type": "object",
  "patternProperties": {
    "^[a-zA-Z]+$": {
      "type": "object",
      "required": ["v", "date"],
      "properties": {
        "v": {"type": "number"},
        "date": {"type": "string"}
      },
      "additionalProperties": false
    }
  }
}
...
How can I define a marshmallow schema for it?
I can validate it like this:
import re

from marshmallow import ValidationError

@MetricsSchema.validator
def validate_metrics(schema, input_data):
    # Check every metric, not only the first one.
    for metric_key, metric_value in input_data['metrics'].items():
        if not re.match(r'^[a-zA-Z]+$', metric_key):
            raise ValidationError('Metric names must contain only letters.')
        errors = PredefinedMetricSchema().validate(metric_value)
        if errors:
            raise ValidationError(errors)
But how do I define a schema for serializing it? I think there should be a clear solution, because this case can also occur in schemas in your new project, smore.
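For reference, the constraints in the JSON schema above can be written out in plain Python with no marshmallow involved. This is only a sketch: `is_valid_metrics` is a hypothetical helper, and it assumes (as the validator above does) that names not matching the pattern should be rejected, which `patternProperties` on its own would not enforce.

```python
import re

METRIC_NAME = re.compile(r"^[a-zA-Z]+$")


def is_valid_metrics(metrics):
    """Check a 'metrics' object against the rules of the JSON schema."""
    if not isinstance(metrics, dict):
        return False
    for name, metric in metrics.items():
        # Assumption: names that do not match the pattern are rejected.
        if not METRIC_NAME.match(name):
            return False
        # "required" plus "additionalProperties": false means exactly these keys.
        if not isinstance(metric, dict) or set(metric) != {"v", "date"}:
            return False
        v = metric["v"]
        # bool is a subclass of int in Python but is not a JSON number.
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            return False
        if not isinstance(metric["date"], str):
            return False
    return True
```

The function only answers yes or no; it does not convert `date` into a datetime or report which metric failed.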
Just to clarify the issue: you want to be able to generate a marshmallow Schema object from a JSON schema. Is this correct?
smore currently has some functionality to generate Swagger objects (which are based off the JSON schema spec) from marshmallow schemas. It shouldn't be too much of a stretch to go in the other direction (JSON schema -> marshmallow).
If this is the requested feature, could you please open an issue on the smore issue tracker?
No, I want to validate and serialize a JSON object whose fields can have arbitrary names (matching r'^[a-zA-Z]+$'), but where each of these fields has a predictable, predefined structure. For example:
One object can be:
{
  "metrics": {
    "firstMetricName": {
      "v": 100,
      "date": "2015-01-09T04:33:17+00:00"
    },
    "secondMetricName": {
      "v": 110,
      "date": "2015-01-09T04:33:17+00:00"
    }
  }
}
Another object can be:
{
  "metrics": {
    "somethingOne": {
      "v": 100,
      "date": "2015-01-09T04:33:17+00:00"
    },
    "somethingAgain": {
      "v": 110,
      "date": "2015-01-09T04:33:17+00:00"
    }
  }
}
JSON Schema allows structures like this to be defined. How can I handle them with marshmallow? A field name in a Schema can't be declared as a pattern or a variable range of characters.
I looked through the smore project code and didn't find anything about this case.
To be clear, structures like this can be defined in MongoEngine as:
class Metric(db.EmbeddedDocument):
    value = db.FloatField(required=True, db_field='v')
    date = db.DateTimeField(required=True)

class User(db.Document):
    metrics = db.DictField(db.StringField, db.EmbeddedDocumentField(Metric))
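The same shape can be sketched without any library, to make the two directions explicit: loading maps the stored key `v` to an attribute `value`, and dumping maps it back. The names `load_metrics` and `dump_metrics` are hypothetical helpers introduced here for illustration, and the sketch assumes dates are ISO 8601 strings as in the examples above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Metric:
    value: float
    date: datetime


def load_metrics(raw):
    """Turn {"name": {"v": ..., "date": ...}} into {"name": Metric(...)}."""
    return {
        name: Metric(value=float(item["v"]),
                     date=datetime.fromisoformat(item["date"]))
        for name, item in raw.items()
    }


def dump_metrics(metrics):
    """Inverse of load_metrics: Metric objects back to plain dicts."""
    return {
        name: {"v": m.value, "date": m.date.isoformat()}
        for name, m in metrics.items()
    }
```

Neither helper validates metric names; that check would sit in front of `load_metrics`.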
@vovanbo Apologies for the delayed response. Here's a stab at a DictField that might meet your use case:
from marshmallow import Schema, fields, validate

class DictField(fields.Field):
    """A dict whose keys and values are each handled by their own field."""

    def __init__(self, key_field, nested_field, *args, **kwargs):
        fields.Field.__init__(self, *args, **kwargs)
        self.key_field = key_field
        self.nested_field = nested_field

    # marshmallow 2.x also passes attr and data to _deserialize.
    def _deserialize(self, value, attr=None, data=None):
        ret = {}
        for key, val in value.items():
            k = self.key_field.deserialize(key)
            v = self.nested_field.deserialize(val)
            ret[k] = v
        return ret

    def _serialize(self, value, attr, obj):
        ret = {}
        for key, val in value.items():
            k = self.key_field._serialize(key, attr, obj)
            v = self.nested_field.serialize(key, self.get_value(attr, obj))
            ret[k] = v
        return ret

class MetricSchema(Schema):
    value = fields.Float()
    date = fields.DateTime()

class UserSchema(Schema):
    metrics = DictField(
        fields.Str(validate=validate.Regexp(r'^[a-zA-Z]+$')),
        fields.Nested(MetricSchema)
    )
metrics = {
    "metrics": {
        "firstMetricName": {
            "value": 100,
            "date": "2015-01-09T04:33:17+00:00"
        },
        "secondMetricName": {
            "value": 110,
            "date": "2015-01-09T04:33:17+00:00"
        }
    }
}

s = UserSchema(strict=True)
s.load(metrics).data
# {'metrics': {'firstMetricName': {'value': 100.0,
#     'date': datetime.datetime(2015, 1, 9, 4, 33, 17, tzinfo=tzutc())},
#   'secondMetricName': {'value': 110.0,
#     'date': datetime.datetime(2015, 1, 9, 4, 33, 17, tzinfo=tzutc())}}}
Thank you, @sloria! That's a good example of a custom field. What about including it as one of the standard fields in marshmallow 2.0?
I'm going to hold off on adding the Dict field to marshmallow in order to reduce maintenance burden and get 2.0-a out the door as soon as possible. Perhaps we can add it to the docs though. Closing this for now.
Would you reconsider adding a Dict field type as part of the standard library?
+1
I noticed you added a regular Dict(), but are there any plans to support the DictField() as presented above? I have lots of maps of named objects for which this functionality would be useful.
+1 for having built-in support for defining dict fields with prescribed schema for values.
I had the same requirement. I created my own NamedObjectMap field based on the code example above. The main additions are:
Chris
try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping  # Python 2

from marshmallow import ValidationError, fields

class NamedObjectMap(fields.Field):
    default_error_messages = {
        'invalid': 'Not a valid mapping type.'
    }

    def __init__(self, nested_field, *args, **kwargs):
        fields.Field.__init__(self, *args, **kwargs)
        self.name_field = fields.Str()
        self.nested_field = nested_field

    def _add_to_schema(self, field_name, schema):
        super(NamedObjectMap, self)._add_to_schema(field_name, schema)
        self.nested_field.parent = self
        self.nested_field.name = field_name

    def _deserialize(self, value, attr, data):
        # Make sure we have a map
        if not isinstance(value, Mapping):
            self.fail('invalid')
        ret = {}
        errors = {}
        for key, val in value.items():
            k = self.name_field.deserialize(key)
            if val is None and self.nested_field.missing:
                v = self.nested_field.missing
            else:
                try:
                    v = self.nested_field.deserialize(val)
                except ValidationError as e:
                    # Collect errors per key instead of stopping at the first one
                    errors[key] = e.messages
                    continue
            ret[k] = v
        if errors:
            raise ValidationError(errors)
        return ret

    def _serialize(self, value, attr, obj):
        ret = {}
        for key, val in value.items():
            k = self.name_field._serialize(key, attr, obj)
            v = self.nested_field._serialize(val, key, obj)
            ret[k] = v
        return ret
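The error handling in `_deserialize` above follows a pattern that may be easier to see in isolation: try every entry, remember failures under the key that caused them, and raise once at the end. Below is a library-free sketch of that pattern. `MappingError` and `load_mapping` are hypothetical names, and catching `TypeError`/`ValueError` stands in for catching marshmallow's `ValidationError`.

```python
class MappingError(Exception):
    """Carries the collected error messages (hypothetical, for illustration)."""

    def __init__(self, messages):
        super().__init__(messages)
        self.messages = messages


def load_mapping(data, load_value):
    """Apply load_value to every value in data, collecting errors per key."""
    if not isinstance(data, dict):
        raise MappingError("Not a valid mapping type.")
    result, errors = {}, {}
    for key, raw in data.items():
        try:
            result[key] = load_value(raw)
        except (TypeError, ValueError) as exc:
            # Keep going so that every bad entry is reported at once.
            errors[key] = [str(exc)]
    if errors:
        raise MappingError(errors)
    return result
```

Called as `load_mapping({"a": "1", "b": "x"}, int)`, this raises a `MappingError` whose `messages` dict has a single entry under `"b"`, so a caller can point at the offending name rather than at the mapping as a whole.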