I might not know the right term for what I am trying to achieve here, but basically I want to JSON-serialize the rows coming from a query and group them into nested schemas. My stack is Python Flask + SQLAlchemy + Marshmallow. Let me explain:
Imagine you have the following result from a query:
category_id | type_id | value
------------ | ------------- | -------------
animal | cat | 2
animal | dog | 10
plant | cactus | 5
plant | flower | 15
What I want to get as a response:
```{json}
[
  {
    "category_id": "animal",
    "types": [
      {
        "type_id": "cat",
        "value": 2
      },
      {
        "type_id": "dog",
        "value": 10
      }
    ]
  },
  {
    "category_id": "plant",
    "types": [
      {
        "type_id": "cactus",
        "value": 5
      },
      {
        "type_id": "flower",
        "value": 15
      }
    ]
  }
]
```
**What I get**
```{json}
[
  { "category_id": "animal" },
  { "category_id": "animal" },
  { "category_id": "plant" },
  { "category_id": "plant" }
]
```
My schemas:
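```python
from marshmallow import Schema, fields


# TypeSchema is defined first so CategorySchema can reference it.
class TypeSchema(Schema):
    class Meta:
        fields = (
            'type_id',
            'value'
        )


class CategorySchema(Schema):
    types = fields.Nested(TypeSchema, many=True)

    class Meta:
        fields = (
            'category_id',
            'types'
        )
```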
Is there a way to say: "create one CategorySchema, group by category_id"?
Thanks!
I would start by partitioning the data into the desired structure before passing it into marshmallow. If you need to reuse that operation in a generic way you can start decomposing the functionality into custom fields.
```python
from marshmallow import Schema, fields

data = [
    {'category_id': 'animal', 'type_id': 'cat', 'value': 2},
    {'category_id': 'animal', 'type_id': 'dog', 'value': 10},
    {'category_id': 'plant', 'type_id': 'cactus', 'value': 5},
    {'category_id': 'plant', 'type_id': 'flower', 'value': 15},
]


# TypeSchema must exist before CategorySchema references it.
class TypeSchema(Schema):
    class Meta:
        fields = (
            'type_id',
            'value'
        )


class CategorySchema(Schema):
    types = fields.Nested(TypeSchema, many=True)

    class Meta:
        fields = (
            'category_id',
            'types'
        )


# Build one group per category before handing the data to marshmallow.
groups = [
    {
        'category_id': category,
        'types': [item for item in data if item['category_id'] == category],
    }
    # dict.fromkeys keeps first-seen order; a set would yield arbitrary order.
    for category in dict.fromkeys(item['category_id'] for item in data)
]

CategorySchema(many=True).dump(groups)
```
Actually this might be difficult to achieve with custom fields due to how much the structure changes. A @pre_dump hook with a generic partition function would be easier.
```python
from marshmallow import Schema, fields, pre_dump


def partition(items, key, container_name):
    """Group items by items[key], nesting each group under container_name."""
    return [
        {
            key: index,
            container_name: [item for item in items if item[key] == index],
        }
        # dict.fromkeys keeps first-seen order; a set would yield arbitrary order.
        for index in dict.fromkeys(item[key] for item in items)
    ]


data = [
    {'category_id': 'animal', 'type_id': 'cat', 'value': 2},
    {'category_id': 'animal', 'type_id': 'dog', 'value': 10},
    {'category_id': 'plant', 'type_id': 'cactus', 'value': 5},
    {'category_id': 'plant', 'type_id': 'flower', 'value': 15},
]


# TypeSchema must exist before CategorySchema references it.
class TypeSchema(Schema):
    class Meta:
        fields = (
            'type_id',
            'value'
        )


class CategorySchema(Schema):
    types = fields.Nested(TypeSchema, many=True)

    class Meta:
        fields = (
            'category_id',
            'types'
        )

    # pass_many=True hands the hook the whole collection, not one row at a time.
    # **kwargs absorbs any extra arguments newer marshmallow versions may pass.
    @pre_dump(pass_many=True)
    def partition_categories(self, data, many, **kwargs):
        return partition(data, 'category_id', 'types')


CategorySchema(many=True).dump(data)
```
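
If the result set is large, a single-pass variant of the `partition` helper may be worth considering. The following is only a sketch, not part of marshmallow: it keeps the same signature so it could be swapped into the `@pre_dump` hook, assumes the rows are plain dicts as in the example data, and uses the illustrative names `rows` and `grouped`.

```python
def partition(items, key, container_name):
    """Group dicts by item[key] in one pass, keeping first-seen group order."""
    groups = {}
    for item in items:
        # Create the group the first time its key value is seen.
        group = groups.setdefault(item[key], {key: item[key], container_name: []})
        group[container_name].append(item)
    # Plain dicts preserve insertion order, so groups come out first-seen first.
    return list(groups.values())


rows = [
    {'category_id': 'animal', 'type_id': 'cat', 'value': 2},
    {'category_id': 'animal', 'type_id': 'dog', 'value': 10},
    {'category_id': 'plant', 'type_id': 'cactus', 'value': 5},
    {'category_id': 'plant', 'type_id': 'flower', 'value': 15},
]

grouped = partition(rows, 'category_id', 'types')
```

Each nested item still carries its `category_id`; that is harmless here because `TypeSchema` only serializes `type_id` and `value`.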
Thanks, the pre_dump hook does marvels!