Orm: DDC-885: Allow LIMIT and OFFSET in DQL queries

Created on 21 Nov 2010  路  20Comments  路  Source: doctrine/orm

Jira issue originally created by user nd987:

Currently, the only way to limit a DQL query is to use

Query::setMaxResults($maxResults)
Query::setFirstResult($offset)

for manipulating LIMIT's and OFFSETs. It would be much more user friendly to make these part of DQL.

Improvement

Most helpful comment

Sorry to dig that issue out but if there is no way to limit a subquery in Doctrine then this is a unacceptable limitation.

When doing a JOIN with a subrequest, one can need to limit the number of returned results to 1, because the subrequest orders result in any way, and only the first ordered result is needed for the JOIN condition.

Is it possible to do that?
select * from users join widgets on widgets.id = ( select id from widgets where widgets.user_id = users.id order by created_at desc limit 1 )

All 20 comments

Comment created by romanb:

This was a conscious design decision. It simplifies Dql and avoids one of the most common causes for SQL/dql injections. Limit/Offset are not needed in dql and are SQL terms.

Comment created by romanb:

I will try to clarify the reasoning more for reference in the future. Let's assume we would have limit/offset in DQL. Since limit/offset are almost always dynamic parameters in real applications this would encourage the following, undesired practice:

$em->createQuery("select ... from ... where ... limit $limit offset $offset")

The proper way via binding the parameters would look like this:

$em->createQuery("select ... from ... where ... limit ?1 offset ?2")
       ->setParameter(1, $limit)
       ->setParameter(2, $offset)

This results in limit/offset in DQL being just boilerplate, considering that you can do just this:

$em->createQuery("select ... from ... where ...")
       ->setMaxResults($limit)
       ->setFirstResult($offset)

As can be seen, limit/offset would be just boilerplate in DQL that would encourage a bad practice. Furthermore, LIMIT/OFFSET is something that only makes "sense" to people coming from Mysql or Postgres and re-using a vendor-specific SQL fragment in DQL is not a good idea either.

Comment created by nd987:

Thanks for the explanation Roman...that makes sense. Not being familiar with Oracle and Mongo, I was unaware that it did not support LIMIT and OFFSET natively.

I agree that the best way to create a query is with createQuery() and then setting parameters, and in that case it is just as easy to ->setMaxResults.

I'm working on a CMS and was attempting to allow a way to execute straight DQL without having to build a query (basically just created a function that accepts DQL and parameters as arguments and does all the Doctrine work for you (create query, replace params, execute, return results), in which case the end user would not have access to the query object directly, and has no way to set limits.

I suppose the best way is to force users to create a query object, which I will do.

Are there any plans for creating some convenience methods on the Doctrine Query object? For example...currently you must call

$query->add('orderBy', 'column')

but it would be more user friendly to have convenience methods that just proxy to add(), ala Zend Framework:

$query->orderBy('column');
$query->where('u.username = :username');

Are there plans for this currently? Would you consider adding these functions?

Comment created by @beberlei:

  1. These functions already exist on the QueryBuilder object (orderBy, where and such).
  2. Another reason for LIMIT is not IN DQL is caching the DQL Query. With the LIMIT values outside of the DQL you can cache the query for any given inputs of limit and offset.
  3. You can still have your CMS users execute straight dql without having to work with the query object. you just have to add additional parameters for limit and offset. I don't see no reason why to give up this approach if you want that.

Issue was closed with resolution "Won't Fix"

Sorry to dig that issue out but if there is no way to limit a subquery in Doctrine then this is a unacceptable limitation.

When doing a JOIN with a subrequest, one can need to limit the number of returned results to 1, because the subrequest orders result in any way, and only the first ordered result is needed for the JOIN condition.

Is it possible to do that?
select * from users join widgets on widgets.id = ( select id from widgets where widgets.user_id = users.id order by created_at desc limit 1 )

@amorel-ljsl you can always use native query api to achieve that.

@lcobucci Sure, obviously i can, i even wrote the native query, but that wasn't my point.

@amorel-ljsl's point is a valid one.

Sure, in text book examples DQL syntax for limit/offset is not required but once you start using joins or sub-selects this is a huge limitation.

Using a native query each time (especially since we use both MS Sql and MySql) would mean lots of duplication.

Is this considered at all in the future?

@iluuu1994 no, it's not currently considered. For the scenarios where an explicit limit is used, native SQL is indeed a fair solution, compared to all the quirks of OFFSET and LIMIT.

@Ocramius Unfortunately I do not have any knowledge of Doctrine's code base and my database knowledge doesn't go much beyond MS SQL/MySQL. What are the quirks when it comes to OFFSET and LIMIT?

I can safely say that I've run into this limitation plenty of times in the last two years that I've worked with Doctrine. We have a pretty complex database schema with lots of relations and nested sets. Joining and sub-selecting rows with a limit clause would be extremely useful.

This limitation often caused developers at our company to overuse lazy-loading (obviously our fault). Thus I think people are more likely to work-around the issue using business logic than using the database that is much more sophisticated at this task.

Furthermore, creating a native query makes it very hard to use Doctrine extensions like Translatable. Filters are not applied either which opens the door for bugs.

Also, @romanb mentioned that this could lead to SQL injections:

$em->createQuery("select ... from ... where ... limit $limit offset $offset")

I'm not sure if this is a good argument since the same could be said for any parameter.

Sorry for being annoying, this is just one of the very few issues I've encountered that forces me to write native queries. If this one could be avoided that would be a huge plus for me 馃憤

Anyway, thanks for your hard work, Doctrine is awesome 馃

@iluuu1994 Quite simple. Take as one example this schema data:

Table: groups

| ID | Name          |
+----+---------------+
|  1 | Administrator |
|  2 | Moderator     |
|  3 | User          |
|  4 | Ghost         |
+----+---------------+

Table: users

| ID | Group ID | Name             |
+----+----------+------------------+
|  1 |        1 | John Rambo       |
|  2 |        1 | Chuck Norris     |
|  3 |        2 | Guilherme Blanco |
|  4 |        2 | Marco Pivetta    |
|  5 |        2 | Benjamin Eberlei |
|  6 |        3 | Luis Cobucci     |
|  7 |        3 | Ilija Tovilo     |
|  8 |        3 | Adrien           |
+----+----------+------------------+

Now assume I decide to run the following query:
SELECT g.*, u.* FROM groups g LEFT JOIN users u ON u.group_id = g.id.
I'd have the following result set back:

| ID | Name          | ID   | Group ID | Name             |
+----+---------------+------+----------+------------------+
|  1 | Administrator |    1 |        1 | John Rambo       |
|  1 | Administrator |    2 |        1 | Chuck Norris     |
|  2 | Moderator     |    3 |        2 | Guilherme Blanco |
|  2 | Moderator     |    4 |        2 | Marco Pivetta    |
|  2 | Moderator     |    5 |        2 | Benjamin Eberlei |
|  3 | User          |    6 |        3 | Luis Cobucci     |
|  3 | User          |    7 |        3 | Ilija Tovilo     |
|  3 | User          |    8 |        3 | Adrien           |
|  4 | Ghost         | NULL |     NULL | NULL             |
+----+---------------+------+----------+------------------+

Now let's add your LIMIT and OFFSET:

SELECT g.*, u.* FROM groups g LEFT JOIN users u ON u.group_id = g.id LIMIT 2 OFFSET 2

You'll now have:

| ID | Name          | ID   | Group ID | Name             |
+----+---------------+------+----------+------------------+
|  2 | Moderator     |    3 |        2 | Guilherme Blanco |
|  2 | Moderator     |    4 |        2 | Marco Pivetta    |
+----+---------------+------+----------+------------------+

Remember we're talking about OO here. Whenever you return Group->userList, the collection needs to be complete (and potentially initialized). However you're just returning 2 Users in the userList collection. It's not complete (basically it's missing Benjamin Eberlei). No matter how much you try to initialize the Collection again, you'll always only have 2 Users now. A deep refreshmight be your only solution.

But let's suppose we don't care about it and we blindly add LIMIT/OFFSET support. I'd say as part of the same request, I want to change the Group->name. From _Moderator_ to _Developer_. Routine Doctrine, right? change, persist, flush.

UnitOfWork will now compute changes in your object graph. And it'll "realize" you removed a User from the collection. With orphanRemoval configured, you'll just issue a DELETE query to the users table.

See the problem now?

@guilhermeblanco I might be wrong but wouldn't adding a WHERE clause on one of the joined entities cause the exact same situation?

SELECT g.*, u.* 
FROM groups g 
LEFT JOIN users u ON u.group_id = g.id 
WHERE u.name = 'Guilherme Blanco'

The group Moderator's user list would now only contain the user Guilherme Blanco.

@guilhermeblanco @Ocramius

I've examined the problem @guilhermeblanco has mentioned and have found a possible bug.

$result = $em->getRepository('AppBundle:Many')->createQueryBuilder('many')
    ->select('many, one')
    ->leftJoin('many.one', 'one', 'WITH', 'true = false')
    ->getQuery()->getResult();

foreach ($result as $many) {
    $many->setName($many->getName().' renamed');
    $many->setOne(null);
}

$em->flush();

Doctrine conveniently ignores the fact that one has been set to null. The change is simply ignored. The code snippet works as expected as soon as the , 'WITH', 'true = false' is removed from the left join.

Would you agree that this doesn't behave as expected?

Here is my use-case where I also miss the LIMIT support in DQL. I will try to simplify the example to the core problem.

This is the schema (cardinality is specified in parentheses):

------           ---------           ---------
|Vote| (n)---(1) |Article| (1)---(n) |Comment|
------           ---------           ---------

Notes:

  • Vote entity has value attribute that can be either 1 (for up-vote) or -1 (for down-vote).
  • Also, articles have category.

Now, I want to get all articles from certain category ordered by number of votes (most positive first) along with the latest comment for the article.

This is the PHP code:

$lastCommentTextQueryBuilder = $this->em->createQueryBuilder()
    ->select('c.text')
    ->from(Comment::class, 'c')
    ->where('c.article = a')
    ->orderBy('c.createdAt', 'DESC')
    ->setMaxResults(1);

$articleListQueryBuilder =  $this->em->createQueryBuilder()
    ->select('a, SUM(v.value) AS votes')
    ->addSelect('(' . $lastCommentTextQueryBuilder->getDQL() . ') AS lastCommentText')
    ->from(Article::class, 'a')
    ->join(Vote::class, 'v')
    ->where('a.category = :category')
    ->setParameter('category', $someCategory)
    ->groupBy('a')
    ->orderBy('votes', 'DESC');

$articleListQueryBuilder->getQuery()->getResult();

If I execute the query I get the following error:

SQLSTATE[21000]: Cardinality violation: 7 ERROR: more than one row returned by a subquery used as an expression")

This is referring to the $lastCommentTextQueryBuilder->getDQL() subquery that does not contain LIMIT 1 even when setMaxResults(1) was specified.

There are more use-cases described on Stack Overflow:

Any update on this? Currently I'm facing the same issues as described in the above comments.

Same thing here. Cannot limit a subquery, so I'm stuck while using DQL.

To the lost souls which landed here in search for LIMIT of sub selects:
Use custom function:

final class First extends FunctionNode
{

    /**
     * @var Subselect
     */
    private $subselect;

    /**
     * {@inheritdoc}
     */
    public function parse(Parser $parser)
    {
        $parser->match(Lexer::T_IDENTIFIER);
        $parser->match(Lexer::T_OPEN_PARENTHESIS);
        $this->subselect = $parser->Subselect();
        $parser->match(Lexer::T_CLOSE_PARENTHESIS);
    }

    /**
     * {@inheritdoc}
     */
    public function getSql(SqlWalker $sqlWalker)
    {
        return '(' . $this->subselect->dispatch($sqlWalker) . ' LIMIT 1)';
    }
}

Then you can use it like this:
->addSelect('FIRST(' . $yourSubquery->getDQL() . ') AS subQueryResult')

@escapeboy Nice solution!

@Ocramius Anyway this is not normal, i don't quite understand why "limit" is supposed to be "platform specific" functionality, while every major database have it.

Doctrine is anyway using setMaxResults() incorrectly in terms of common sense, because it is using limit internally, while should return real ENTITY limit, not the actual limit of raw rows for joined queries where subset can be boosted N times on OneToMany joins, and this will result in loss of data, if not handled correctly by developers who are not aware of this "feature" in 80% cases.

For example if you execute select EntityX left join EntityY, where x -> y is one to many, and try getting data using -setMaxResults, doctrine will limit the actual raw database rows, not real EntityX entities, which will result in a loss, which is not a logical way to do it in terms of "entity" semantics, not row semantics which is the case in NativeQuery for example.

So considering above, u should really just rename setMaxResults to limit and allow the use in subquery. It will, at least be correct in terms of semantics, because making setMaxResults limiting real EntityX thus counting or using distinct is not possible or quite difficult in reality.

Was this page helpful?
0 / 5 - 0 ratings