Marshmallow: JOIN and GROUP_BY for nested schema

Created on 31 May 2018  路  4Comments  路  Source: marshmallow-code/marshmallow

I'm trying to achieve what I think would be the equivalent of a JOIN and GROUP_BY for nested schema. I don't want to use the employee.accounts relationship because in this context it makes more sense for them to sit under services.accounts. My data model is effectively:

employees (1->n) employees_services (n <-1) services (1->n) accounts

The EmployeeSchema:

class EmployeeSchema(marshmallow.Schema):
    class Meta:
        fields = ('employee_id', 'services')

    employee_id = fields.Str(dump_only=True)
    services = marshmallow.Nested(
        'ServiceSchema',
        many=True,
        dump_only=True,
        only=('id', 'name', 'description', 'accounts.privilege_level', 'accounts.username', 'accounts.employee.employee_id' ))

employee_schema = EmployeeSchema()
employees_schema = EmployeeSchema(many=True)

Currently, this produces all accounts for all employees rather than the accounts for the employee from the parent schema, like so (employee2/PW2 should not be in the accounts list):

[
  {
    "employee_id": "PW1", 
    "login_account": {
      "email": "[email protected]", 
    },
    "services": [
      {
        "accounts": [
          {
            "employee": {
              "employee_id": "PW2"
            },
            "username": "[email protected]"
          }, 
          {
            "employee": {
              "employee_id": "PW1"
            },
            "username": "[email protected]"
          }
        ], 
        "description": "Manage user accounts", 
        "id": 1, 
        "name": "User Account Manager"
      }
    ],
    ...
  }, 
  ...

The relevant code from my models:

class Employee(ResourceMixin, db.Model):
    __tablename__ = 'employees'

    employee_id = db.Column(db.String(10), primary_key=True)
    services = association_proxy(
        'employee_services',
        'service',
        creator=lambda srv: EmployeeService(service=srv))
    accounts = db.relationship('Account',
                               lazy='select',
                               backref=db.backref('employee',
                                                  lazy='joined'))

    def __init__(self, **kwargs):
        super(Employee, self).__init__(**kwargs)
class EmployeeService(ResourceMixin, db.Model):
    __tablename__ = 'employees_services'

    employee_id = db.Column(db.String(10),
                        db.ForeignKey('employees.employee_id'),
                        primary_key=True)
    service_id = db.Column(db.Integer,
                           db.ForeignKey('services.id'),
                           primary_key=True)
    employee = db.relationship('Employee',
        backref=db.backref('employee_services',
            cascade='all, delete-orphan'))
    service = db.relationship('Service',
        backref=db.backref('service_employees',
            cascade='all, delete-orphan'))


    def __init__(self, **kwargs):
        super(EmployeeService, self).__init__(**kwargs)



md5-f52418adee95e015a84a5caa26cbf662



class Service(ResourceMixin, db.Model):
    __tablename__ = 'services'

    id = db.Column(db.Integer, primary_key=True)
    name = db.Column('service', db.String(255),
                     unique=True,
                     index=True,
                     nullable=False,
                     server_default='')
    employees = association_proxy(
        'service_employees',
        'employee',
        creator=lambda emp: EmployeeService(employee=emp))
    accounts = db.relationship('Account',
                               lazy='select',
                               backref=db.backref('service', lazy='joined'))

    def __init__(self, **kwargs):
        super(Service, self).__init__(**kwargs)



md5-f52418adee95e015a84a5caa26cbf662



class Account(ResourceMixin, db.Model):
    __tablename__ = 'accounts'

    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(50),
                         index=True,
                         nullable=False,
                         server_default='')
    employee_id = db.Column(db.String(10),
                            db.ForeignKey('employees.employee_id'),
                            index=True,
                            nullable=False)
    service_id = db.Column(db.Integer,
                           db.ForeignKey('services.id'),
                           index=True,
                           nullable=False)

    def __init__(self, **kwargs):
        # Call Flask-SQLAlchemy's constructor.
        super(Account, self).__init__(**kwargs)



md5-30f2a60943a458362f9813036bfe74e1



    @post_dump
    def remove_non_parent_employee_accounts(self, data):
        # Remove service accounts that don't belong to the parent employee
        # record. This is due to accessing as employee.services.accounts
        # when there isn't a direct relationship and I can't find a way to
        # do a JOIN and GROUP_BY in marshmallow.
        if 'services' in data:
            employee_id = data['employee_id']
            orig_services = data['services']
            new_services = []
            for orig_service in orig_services:
                new_service = {
                    'id': orig_service['id'],
                    'description': orig_service['description'],
                    'name': orig_service['name']
                }
                orig_accounts = orig_service['accounts']
                new_accounts = []
                for orig_account in orig_accounts:
                    if orig_account['employee']['employee_id'] == employee_id:
                        new_account = {
                            'privilege_level': orig_account['privilege_level'],
                            'username': orig_account['username']
                        }
                        new_accounts.append(new_account)
                new_service['accounts'] = new_accounts
                new_services.append(new_service)
            data['services'] = new_services
            return data

Is there a less fragile way to achieve limiting the nested schema to just show the accounts of the parent employee?

Most helpful comment

Would make sense to me to do the grouping and joining at the data level and then expose a schema that worked with that data, rather than attempting to have the schema do that work for you.

All 4 comments

That is a lot of code. Can you provide a smaller example and describe more concisely the fragile nesting you are trying to avoid?

Would make sense to me to do the grouping and joining at the data level and then expose a schema that worked with that data, rather than attempting to have the schema do that work for you.

Yes, I can look at a smaller example and update later.

I'll also have a look at the query I am using to get the data - I just remembered that I do actually have to query the database and pass that data to marshmallow. Obviously having a moment!

Closing this for now, as it has gone stale. Feel free to re-open if further discussion is needed.

Was this page helpful?
0 / 5 - 0 ratings