Marshmallow: Serializing schemas per group

Created on 10 Oct 2018 · 3 Comments · Source: marshmallow-code/marshmallow

I might not know the right term for what I am trying to achieve here, but basically I want to JSON-serialize data coming from a query and group the rows into nested schemas. My stack is Python Flask + SQLAlchemy + Marshmallow. Let me explain:

Imagine you have the following result from a query:

category_id | type_id | value
------------ | ------------- | -------------
animal | cat | 2
animal | dog | 10
plant | cactus | 5
plant | flower | 15

What I want to get as a response:

```{json}
[
  {
    "category_id": "animal",
    "types": [
      {
        "type_id": "cat",
        "value": 2
      },
      {
        "type_id": "dog",
        "value": 10
      }
    ]
  },
  {
    "category_id": "plant",
    "types": [
      {
        "type_id": "cactus",
        "value": 5
      },
      {
        "type_id": "flower",
        "value": 15
      }
    ]
  }
]
```

**What I get**
```{json}
[
  { "category_id": "animal" },
  { "category_id": "animal" },
  { "category_id": "plant" },
  { "category_id": "plant" }
]
```

My schemas:

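from marshmallow import Schema, fields

# TypeSchema is defined first so that CategorySchema can reference it
class TypeSchema(Schema):
    class Meta:
        fields = (
            'type_id',
            'value'
        )

class CategorySchema(Schema):
    types = fields.Nested(TypeSchema, many=True)
    class Meta:
        fields = (
            'category_id',
            'types'
        )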

Is there a way to say: "create one CategorySchema, group by category_id"?

Thanks!

All 3 comments

I would start by partitioning the data into the desired structure before passing it into marshmallow. If you need to reuse that operation in a generic way you can start decomposing the functionality into custom fields.

from marshmallow import Schema, fields


data = [
    {'category_id': 'animal', 'type_id': 'cat', 'value': 2},
    {'category_id': 'animal', 'type_id': 'dog', 'value': 10},
    {'category_id': 'plant', 'type_id': 'cactus', 'value': 5},
    {'category_id': 'plant', 'type_id': 'flower', 'value': 15},
]

# TypeSchema is defined first so that CategorySchema can reference it
class TypeSchema(Schema):
    class Meta:
        fields = (
            'type_id',
            'value'
        )

class CategorySchema(Schema):
    types = fields.Nested(TypeSchema, many=True)
    class Meta:
        fields = (
            'category_id',
            'types'
        )

# Build one entry per distinct category_id; iterating a set means the
# order of the groups is not guaranteed
groups = [
    {
        'category_id': category,
        'types': [item for item in data if item['category_id'] == category],
    }
    for category in set(item['category_id'] for item in data)
]

CategorySchema(many=True).dump(groups)
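
The comprehension above rescans `data` once per category and iterates a set, so the group order can vary between runs. As a possible alternative (a minimal sketch using only the standard library, not something proposed in the thread), the rows can be sorted by `category_id` and grouped with `itertools.groupby`. The `by_category` name is just an illustration.

```python
from itertools import groupby
from operator import itemgetter

data = [
    {'category_id': 'animal', 'type_id': 'cat', 'value': 2},
    {'category_id': 'animal', 'type_id': 'dog', 'value': 10},
    {'category_id': 'plant', 'type_id': 'cactus', 'value': 5},
    {'category_id': 'plant', 'type_id': 'flower', 'value': 15},
]

# Key function shared by the sort and the grouping step
by_category = itemgetter('category_id')

# groupby only merges adjacent rows, so the rows are sorted by the same key
# first; sorted() is stable, so rows keep their relative order within a group
groups = [
    {'category_id': category, 'types': list(items)}
    for category, items in groupby(sorted(data, key=by_category), key=by_category)
]
```

The resulting `groups` list has the same shape as the one built above, so it should be usable with `CategorySchema(many=True).dump(groups)` in the same way.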

Actually this might be difficult to achieve with custom fields due to how much the structure changes. A @pre_dump hook with a generic partition function would be easier.

from marshmallow import Schema, fields, pre_dump


def partition(items, key, container_name):
    # One entry per distinct value of key; the matching items are nested
    # under container_name. Iterating a set means group order is not guaranteed.
    return [
        {
            key: index,
            container_name: [item for item in items if item[key] == index],
        }
        for index in set(item[key] for item in items)
    ]

data = [
    {'category_id': 'animal', 'type_id': 'cat', 'value': 2},
    {'category_id': 'animal', 'type_id': 'dog', 'value': 10},
    {'category_id': 'plant', 'type_id': 'cactus', 'value': 5},
    {'category_id': 'plant', 'type_id': 'flower', 'value': 15},
]

# TypeSchema is defined first so that CategorySchema can reference it
class TypeSchema(Schema):
    class Meta:
        fields = (
            'type_id',
            'value'
        )

class CategorySchema(Schema):
    types = fields.Nested(TypeSchema, many=True)
    class Meta:
        fields = (
            'category_id',
            'types'
        )
    # pass_many=True hands the whole collection to the hook; **kwargs lets
    # the hook tolerate extra keyword arguments from newer marshmallow versions
    @pre_dump(pass_many=True)
    def partition_categories(self, data, many, **kwargs):
        return partition(data, 'category_id', 'types')

CategorySchema(many=True).dump(data)
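
Since the original question mentions query results rather than plain dicts, here is a hedged sketch of a `partition` variant that groups in a single pass, keeps the groups in first-seen order, and reads the key from either a mapping or an attribute. The names `read_key` and `partition_ordered` are hypothetical, and `SimpleNamespace` only stands in for row objects that expose columns as attributes, which is an assumption about what the query returns.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def read_key(item, key):
    """Read key from a mapping, otherwise fall back to attribute access."""
    if isinstance(item, Mapping):
        return item[key]
    return getattr(item, key)


def partition_ordered(items, key, container_name):
    """Single-pass partition that keeps groups in first-seen order."""
    buckets = {}  # dicts preserve insertion order on Python 3.7+
    for item in items:
        buckets.setdefault(read_key(item, key), []).append(item)
    return [
        {key: index, container_name: members}
        for index, members in buckets.items()
    ]


# Stand-ins for rows that expose columns as attributes (an assumption)
rows = [
    SimpleNamespace(category_id='plant', type_id='cactus', value=5),
    SimpleNamespace(category_id='animal', type_id='cat', value=2),
    SimpleNamespace(category_id='plant', type_id='flower', value=15),
]

partitioned = partition_ordered(rows, 'category_id', 'types')
```

It takes the same arguments as `partition`, so it could be called from the `@pre_dump` hook in its place.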

Thanks, the pre_dump hook does marvels!
