Sqlalchemy: Support filter on subqueryload or selectinload

Created on 26 Dec 2019  路  1Comment  路  Source: sqlalchemy/sqlalchemy

I read how to load custom-filtered relationship objects by using join and contains_eager.

https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html?highlight=contains_eager#using-contains-eager-to-load-a-custom-filtered-collection-result

But, it has a problem when I use limit and it has one to many(or many to many) relationship.

For example,

session.query(User).join(User.addresses).\
            filter(Address.email.like('%ed%')).\
            options(contains_eager(User.addresses))

This query is example in doc. If I wanna select 10 users,

session.query(User).join(User.addresses).
            filter(Address.email.like('%ed%')).
            options(contains_eager(User.addresses)).
            limit(10)

But it can return less than 10 users even if there are over 10 users who has email accounts with containing "ed" words, because user and addresses are one to many relationship, so the result can have many rows for one user.

I found other ways to filter on subqueryload by using primaryjoin like,

class User(Base):
  ....
  addresses = relationship('Address', uselist=True, primaryjoin='and_(User.user_id == Address.user_id, Address.email.like("%ed%"))')

It works well, but I can't filter the addresses dynamically. (If I wanna filter address containing "ee", I have to add additional relationship in class)

Is there any solution for filtering on relationship like below?

session.query(User)
    .options(selectinload(User.addresses).filter(Address.email.like('%ed%')))
    .limit(10)
question

Most helpful comment

hi there -

for your contains_eager,when you want to use limit, you need to use a subquery, which the special from_self() modifier can get for you with a very low level of verbosity:

session.query(User).limit(10).from_self().join(User.addresses).
        filter(Address.email.like('%ed%')).
        options(contains_eager(User.addresses))

to run a query such as selectinload, do it manually:



users = session.query(User).limit(10).all()

addresses = session.query(Address).filter(Address.email.like("%ed%")).filter(Address.user_id.in_([u.id for u in users])).order_by(Address.user_id).all()

import itertools
from sqlalchemy.orm.attributes import set_committed_value

user_dict = {u.id: u for u in users}
for key, collection in itertools.groupby(addresses, key=lambda address: address.user_id):
    set_committed_value(user_dict[key], list(collection))

the use case of "eagerly load partial collections" is not a main use case in the SQLAlchemy ORM and it is generally not considered to be that great an idea, because that object you are creating is now in the Session and is not accurately representing the database state it claims to represent. additionally it would be a lot of effort to create a feature for selectinload and especially subqueryload to be able to add arbitrary filters to these, it would basically have to be on the option itself and would have to work for any kind of eager loading, it would be a major effort for a behavior that is seldom needed and SQLAlchemy doesn't really recommend anyway so I think the manual approach above is what you need to do if you stick with this pattern.

>All comments

hi there -

for your contains_eager,when you want to use limit, you need to use a subquery, which the special from_self() modifier can get for you with a very low level of verbosity:

session.query(User).limit(10).from_self().join(User.addresses).
        filter(Address.email.like('%ed%')).
        options(contains_eager(User.addresses))

to run a query such as selectinload, do it manually:



users = session.query(User).limit(10).all()

addresses = session.query(Address).filter(Address.email.like("%ed%")).filter(Address.user_id.in_([u.id for u in users])).order_by(Address.user_id).all()

import itertools
from sqlalchemy.orm.attributes import set_committed_value

user_dict = {u.id: u for u in users}
for key, collection in itertools.groupby(addresses, key=lambda address: address.user_id):
    set_committed_value(user_dict[key], list(collection))

the use case of "eagerly load partial collections" is not a main use case in the SQLAlchemy ORM and it is generally not considered to be that great an idea, because that object you are creating is now in the Session and is not accurately representing the database state it claims to represent. additionally it would be a lot of effort to create a feature for selectinload and especially subqueryload to be able to add arbitrary filters to these, it would basically have to be on the option itself and would have to work for any kind of eager loading, it would be a major effort for a behavior that is seldom needed and SQLAlchemy doesn't really recommend anyway so I think the manual approach above is what you need to do if you stick with this pattern.

Was this page helpful?
5 / 5 - 1 ratings