Pomelo.entityframeworkcore.mysql: LINQ generated SQL changes with method call order for Take() and OrderByDescending()

Created on 17 Nov 2020  路  6Comments  路  Source: PomeloFoundation/Pomelo.EntityFrameworkCore.MySql

Steps to reproduce

The issue

The following code

Code 1:

var RssNewsQuery = _repositoryContext.Article.Take(5).OrderByDescending(article => article.PublishDate);
var allArticleIds = RssNewsQuery.Distinct().ToList();
return Ok(allArticleIds);

Generated SQL 1:

SELECT DISTINCT `t`.`Id`, `t`.`CreatedBy`, `t`.`CreatedDate`, `t`.`Description`, `t`.`ImageURL`, `t`.`IsDeleted`, `t`.`Link`, `t`.`ModifiedDate`, `t`.`PublishDate`, `t`.`RSSSourceId`, `t`.`Title`, `t`.`UpdatedBy`, `t`.`UpdatedDate`
      FROM (
          SELECT `a`.`Id`, `a`.`CreatedBy`, `a`.`CreatedDate`, `a`.`Description`, `a`.`ImageURL`, `a`.`IsDeleted`, `a`.`Link`, `a`.`ModifiedDate`, `a`.`PublishDate`, `a`.`RSSSourceId`, `a`.`Title`, `a`.`UpdatedBy`, `a`.`UpdatedDate`
          FROM `Article` AS `a`
          LIMIT @__p_0
      ) AS `t`

Code 2 (Take(2) and OrderByDescending replaced with each other):

var RssNewsQuery = _repositoryContext.Article.OrderByDescending(article => article.PublishDate).Take(5);
var allArticleIds = RssNewsQuery.Distinct().ToList();
return Ok(allArticleIds);

Generated SQL 2:

SELECT DISTINCT `t`.`Id`, `t`.`CreatedBy`, `t`.`CreatedDate`, `t`.`Description`, `t`.`ImageURL`, `t`.`IsDeleted`, `t`.`Link`, `t`.`ModifiedDate`, `t`.`PublishDate`, `t`.`RSSSourceId`, `t`.`Title`, `t`.`UpdatedBy`, `t`.`UpdatedDate`
      FROM (
          SELECT `a`.`Id`, `a`.`CreatedBy`, `a`.`CreatedDate`, `a`.`Description`, `a`.`ImageURL`, `a`.`IsDeleted`, `a`.`Link`, `a`.`ModifiedDate`, `a`.`PublishDate`, `a`.`RSSSourceId`, `a`.`Title`, `a`.`UpdatedBy`, `a`.`UpdatedDate`
          FROM `Article` AS `a`
          ORDER BY `a`.`PublishDate` DESC
          LIMIT @__p_0
      ) AS `t`

As can be seen on SQL1 there is not even an "ORDER BY" clause. Based on the result I get, the ordering seems to be done in client side. But I couldn't find any warning doing so. As per the changelog of EFCore 3.x LINQ queries are no longer evaluated on the client

Note: I don't even know why there is an inner sql in the first place.

Further technical details

MySQL version: 5.7
Operating system: Windows 10
Pomelo.EntityFrameworkCore.MySql version: 3.1.1
Microsoft.AspNetCore.App version: 3.1.4

type-question

Most helpful comment

@kocburak Yes, that is correct. From the Remarks section of Enumerable.Distinct Method:

[...] The Distinct(IEnumerable) method returns an unordered sequence that contains no duplicate values. It uses the default equality comparer, Default, to compare values. [...]

It is true though, that using LINQ to Objects, while it is not guaranteed that the sequence will be in order, it is in practice when you order it first and then call Distinct().

I have not done real research on how other databases handle a combination of DISTINCT and ORDER BY, but a quick search found these two StackOverflow questions that show, that database handling of DISTINCT and ORDER BY is more complex than it looks like:

So considering all of that, it makes sense to me that Distinct() does not preserve any ordering.

From a practical point of view, you usually order the data as the last operation (unless you use Skip() and/or Take()).
Also, since DISTINCT is basically just a fancy GROUP BY, and GROUP BY is handled before ORDER BY, you want to put your Distinct() call before the OrderBy() call.

All 6 comments

See upstream issue at dotnet/efcore#23267.

The contract of Distinct, does not guarantee the ordering hence once Distinct is applied the inner order by is gone. If ordering is important after applying Distinct then please add OrderBy operation after Distinct.

As pointed out by @smitpatel in the upstream issue linked by @mguinness, you need to apply the OrderByDescending() call _after_ the Destinct() call, because DISTINCT does not explicitly preserve the previous order.

@kocburak Depending on what you want the result to be, one of the following two queries should be what you are looking for:

```c#
var topFiveArticles = _repositoryContext.Article
.Distinct()
.Take(5)
.OrderByDescending(article => article.PublishDate)
.ToList();

var topFiveOrFewerArticles = _repositoryContext.Article
.Take(5)
.Distinct()
.OrderByDescending(article => article.PublishDate)
.ToList();
```

On side-note, there is no ordering done on the client, it is just whatever implicit ordering database gave to us. That ordering could change based on database and its implementation of Distinct operation.

@smitpatel so what you are saying the following two codes are not different from each other from LINQ perspective ? This seems a little bit odd to me.

 _repositoryContext.Article.Take(5).OrderByDescending(article => article.PublishDate).Distinct().ToList();
 _repositoryContext.Article.Take(5).Distinct().ToList()

@kocburak Yes, that is correct. From the Remarks section of Enumerable.Distinct Method:

[...] The Distinct(IEnumerable) method returns an unordered sequence that contains no duplicate values. It uses the default equality comparer, Default, to compare values. [...]

It is true though, that using LINQ to Objects, while it is not guaranteed that the sequence will be in order, it is in practice when you order it first and then call Distinct().

I have not done real research on how other databases handle a combination of DISTINCT and ORDER BY, but a quick search found these two StackOverflow questions that show, that database handling of DISTINCT and ORDER BY is more complex than it looks like:

So considering all of that, it makes sense to me that Distinct() does not preserve any ordering.

From a practical point of view, you usually order the data as the last operation (unless you use Skip() and/or Take()).
Also, since DISTINCT is basically just a fancy GROUP BY, and GROUP BY is handled before ORDER BY, you want to put your Distinct() call before the OrderBy() call.

Thank you very much everyone for taking time to help. I thought this was some kind of bug but since the behavior is design intent, issue can be closed.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

mason-chase picture mason-chase  路  4Comments

lauxjpn picture lauxjpn  路  3Comments

IonRobu picture IonRobu  路  3Comments

Toemsel picture Toemsel  路  3Comments

cetubig picture cetubig  路  3Comments