Efcore: Query: query with group by in subquery produces invalid SQL

Created on 8 Apr 2019  路  18Comments  路  Source: dotnet/efcore

I am running a Query with a GroupBy and it evaluates on the client.

I get the exception because I am logging all queries that evaluates on the client.

Exception message:

InvalidOperationException: Error generated for warning 'Microsoft.EntityFrameworkCore.Query.QueryClientEvaluationWarning: The LINQ expression 'GroupBy(1, [y])' could not be translated and will be evaluated locally.'. This exception can be suppressed or logged by passing event ID 'RelationalEventId.QueryClientEvaluationWarning' to the 'ConfigureWarnings' method in 'DbContext.OnConfiguring' or 'AddDbContext'.

Stack trace:

Microsoft.EntityFrameworkCore.Diagnostics.EventDefinition<TParam>.Log<TLoggerCategory>(IDiagnosticsLogger<TLoggerCategory> logger, WarningBehavior warningBehavior, TParam arg, Exception exception)
Microsoft.EntityFrameworkCore.Internal.RelationalLoggerExtensions.QueryClientEvaluationWarning(IDiagnosticsLogger<Query> diagnostics, QueryModel queryModel, object queryModelElement)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.VisitResultOperator(ResultOperatorBase resultOperator, QueryModel queryModel, int index)
Remotion.Linq.Clauses.ResultOperatorBase.Accept(IQueryModelVisitor visitor, QueryModel queryModel, int index)
Remotion.Linq.QueryModelVisitorBase.VisitResultOperators(ObservableCollection<ResultOperatorBase> resultOperators, QueryModel queryModel)
Remotion.Linq.QueryModelVisitorBase.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.LiftSubQuery(IQuerySource querySource, SubQueryExpression subQueryExpression)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.CompileMainFromClauseExpression(MainFromClause mainFromClause, QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.VisitMainFromClause(MainFromClause fromClause, QueryModel queryModel)
Remotion.Linq.Clauses.MainFromClause.Accept(IQueryModelVisitor visitor, QueryModel queryModel)
Remotion.Linq.QueryModelVisitorBase.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.ProjectionExpressionVisitor.VisitSubQuery(SubQueryExpression expression)
Remotion.Linq.Clauses.Expressions.SubQueryExpression.Accept(ExpressionVisitor visitor)
System.Linq.Expressions.ExpressionVisitor.VisitAndConvert<T>(ReadOnlyCollection<T> nodes, string callerName)
Remotion.Linq.Parsing.RelinqExpressionVisitor.VisitNew(NewExpression expression)
Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.RelationalProjectionExpressionVisitor.VisitNew(NewExpression newExpression)
Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.VisitSelectClause(SelectClause selectClause, QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.VisitSelectClause(SelectClause selectClause, QueryModel queryModel)
Remotion.Linq.Clauses.SelectClause.Accept(IQueryModelVisitor visitor, QueryModel queryModel)
Remotion.Linq.QueryModelVisitorBase.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.VisitQueryModel(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.CreateAsyncQueryExecutor<TResult>(QueryModel queryModel)
Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQueryCore<TFunc>(object cacheKey, Func<Func<QueryContext, TFunc>> compiler)
Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.ExecuteAsync<TResult>(Expression query)
Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable<TResult>.System.Collections.Generic.IAsyncEnumerable<TResult>.GetEnumerator()
System.Linq.AsyncEnumerable.Aggregate_<TSource, TAccumulate, TResult>(IAsyncEnumerable<TSource> source, TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> accumulator, Func<TAccumulate, TResult> resultSelector, CancellationToken cancellationToken) 

Steps to reproduce

The original query is more complex but the following one replicates the error:

c# var result = await context.Packages.Select(x => new { Value = x.Products.GroupBy(y => 1).Select(y => new { Result = 20 }) }).ToListAsync();

Further technical details

EF Core version: 2.2.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: macOs Mojave 10.14.4
IDE: Visual Studio Code 1.33.0

area-query customer-reported regression type-bug

Most helpful comment

Well, here is the point of view of someone earning money with making software:

I HAVE TO SHIP. You obviously do not, at least not something working, but I do have to ship.

Yes, this whole post will come out as a rant - but please understand it is arant build on months of frustration with a team that seems to not understand at all that their product is USED in commercial software and contineus to not care about fixing bugs, constantly delaing them repeatedly to the next version.

You talk about breaking things, for me EfCore 3.1 is in the same state ANY VERSION WAS: UNUSABLE. Seriously. Unusable except for the most simplistic cases and please do not use any feature - up before 3.1 global query filters broke the whole stack. NICE. There is nothing to break.

I have projects moved to .NET Core (now 3.1) and the only way we could ever use EfCore was to load the whole table into memory on every request, then filter in memory. EfCore was fundamentally broken since I started using it in 2.1. Now you tell me that this will stay so for the foreseeable future - what are you afraid of breaking in a product that is unusable? What is the logic there? Why not make a beta release so people can try it out, like - modern software development?

3.1 solved most of the issues that forced my workaround - mostly the total inability to use global query filters by fixing the bug in there that was overlooked then deemed not worthy fixing. It introduced a "one step forward, 5 steps back" approach by ripping out core functionality and STILL not generating valid SQL and then telling people to wait. Well, I release next week.

So, here is what I will do. I will rip out EfCore, the sperior product (except: it is not) and replace it with Ef non core - because THAT ONE WORKS. It may be architecturally inferior (but working), it is slower (except it is not), it has less support for non sql sources (which we do not use), it has less options for migrations (which are an antipattern for enterprise software), it has less features (but they actually WORK, not like EfCore features that are broken and should not even count) and it lacks global query filters (except there is a third party branch EntityFramework Clasic that does have them) and it has less linq features (which actually is wrong beacuse try doing a union or intersect in efcore and you realize that it only supports a VERY small subset of LINQ operations). Generally it is now - with 3 MAJOR releases of EfCore, STILL the superior product and one actually usable.

You REALLY need to get a point of view outside of the "we are super, why do people complain, lets delay critical fixes" and start thinking with the view of someone who has to ship. I have tried to wait as long as I could - time to move back and accept that the best thing which happened to EfCore now is the ability to dump it.

ANY bad sql bug is critial because you totally invalidate the use case for an ORM mapper if you are unable to actually use it.

In any sensible world, you would put out a beta branch with fixes and see whether people complain. You would actually fix our product and not tell people ALL OVER FOR EVERY RELEASE TO WAIT FOR THE NEXT ONE. I waited for fixes to EfCore 2.1, then was told in 2.2 to wait for 3.0 then was told ithe fixes where delayed for 3.1 then NOW I read that yeah, maybe in November we fix our crappy SQL generation, please delay your products AGAIN. EfCore turned from a rework of the overengineered but working EF product into the joke of the whole stack. Maybe it turns usable before it reaches version 10 - but I actually have to ship.

For anyone else who is in the same situation:
in Dotnetcore 3.1 you CAN use Ef - non core - and that one actually works.
If you prefer to have global query filters and a lot more functionality you can head over to https://entityframework-classic.net/ which is a branch that works in netstandard 2.0 and incorporates global query filters.

I personally - i have a test project now and for every major release that is released I will check whether it FINALLY is usable there, if not - no need for me to waste my time even opening bug reports.

All 18 comments

Have you considered reworking it? FIRST A group by, THEN a select based on the returned data into the new object?

@NetTecture What do you mean? My real query is something closer to the following:

var result = await context.Packages.Select(x => new {
  Value = x.Products
      .Select(y => new {
          Price = y.SalePrice * y.Coefficient / 2.4,
          Potential = y.SalePrice * y.Valorization / 4.8
      })
      .GroupBy(y => 1)
      .Select(y => new {
         Result = y.Sum(z => z.Price),
         Next = y.Sum(z => z.Potential)
      }).ToListAsync()
  })

Basically, I am doing the following:

1 - Use a first Select to calculate the Price and Potential with values of the Entity.

2 - Group By 1 to create only one group.

3 - Apply a second Select to calculate Result and Next
Result and Next are the SUM of the Price and Potential of all records.

I tried a few variations but I was never able to solve the client evaluation.

If you have an alternative please let me know ...

I would like to solve this before a fix is pushed.

Thank You

The issue here is the nested collection. You are not running group by query on products directly. You are running it on packages. i.e. for each package run Group by query. It cannot be translated to SQL GROUP BY directly and requires outer apply at best case. (Currently EF Core fails to lift it in single query and generates N+1 queries).
Since you are grouping all products of given package into one, it is very easy to group by PackageId instead.
Following query will generate same result and does server evaluation. The shape of result is slightly different. In your case, it will be List<List<'a>> where each inner list is going to contain only 1 entry. In following query the result is List<'a> which is easy to convert to desired shape on client side.
```C#
var query = db.Set()
.Select(y => new
{
PackageId = EF.Property(y, "PackageId"),
Price = y.SalePrice * y.Coefficient / 2.4,
Potential = y.SalePrice * y.Valorization / 4.8
})
.GroupBy(p => p.PackageId)
.Select(y => new
{
Result = y.Sum(z => z.Price),
Next = y.Sum(z => z.Potential)
}).ToList();

Generated SQL
```SQL
      SELECT SUM(([y].[SalePrice] * [y].[Coefficient]) / 2.3999999999999999E0) AS [Result], SUM(([y].[SalePrice] * [y].[Valorization]) / 4.7999999999999998E0) AS [Next]
      FROM [Product] AS [y]
      GROUP BY [y].[PackageId]

@smitpatel Thank you for the answer. I have been testing your suggestion in different scenarios and it seems to solve the problem.

Will GroupBy with Sum and conditions be supported by EF Core 3.0 ?


SELECT Id, Count(Num) as  ICount, 
   CASE WEHN Count(Id) == 0 
    Then 1.0
    ELSE   SUM(CASE WHEN Num % 2 = 0 THEN 1.0 ELSE 0.0 END) / Count(Num)
    END as  IRate
FROM tb
GROUP BY Id

the linq should be

```C#

ctx.Tb.GroupBy(x=>x.Id)
.Select(x=> new
{
Id = x.Key,
ICount = x.Count(),
IRate = x.Count() == 0 ? 1.0 : g.Sum(y=>y.Num % 2 == 0 ? 1.0 : 0.0) / x.Count()
//OR
IRate = x.Count() == 0 ? 1.0 : g.Count(y=>y.Num % 2 == 0) / x.Count()
// Count should also be translate to Sum when condition
})
```

Basic Sum in GroupBy projection is tested by GroupBy_Property_Select_Sum_Min_Key_Max_Avg. Conditionality in GroupBy is tested on key by GroupBy_aggregate_projecting_conditional_expression_based_on_group_key, #17442 also adds conditional on the aggregate function itself.

The first query is still untested.
C# var result = await context.Packages.Select(x => new { Value = x.Products.GroupBy(y => 1).Select(y => new { Result = 20 }) }).ToListAsync();

This scenario is not working in 3.0 - we produce incorrect SQL.

query1:

ss.Set<Customer>()
                    .Select(c => new
                    {
                        Key = c.CustomerID,
                        Subquery = c.Orders
                            .Select(o => new
                            {
                                First = o.CustomerID,
                                Second = o.OrderID
                            })
                            .GroupBy(x => x.First)
                            .Select(g => new
                            {
                                Count = g.Count(x => x.First.StartsWith("A")),
                                Sum = g.Sum(x => x.Second),
                            }).ToList()

sql1 (group by is missing):

SELECT [c].[CustomerID], COUNT(*), SUM([o].[OrderID]), [o].[OrderID]
FROM [Customers] AS [c]
LEFT JOIN [Orders] AS [o] ON [c].[CustomerID] = [o].[CustomerID]
ORDER BY [c].[CustomerID], [o].[OrderID]

exception:

Column 'Customers.CustomerID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

query2:

ss.Set<Customer>()
                    .Select(c => new
                    {
                        Key = c.CustomerID,
                        Subquery = c.Orders
                            .Select(o => new
                            {
                                First = o.OrderID,
                                Second = o.Customer.City + o.CustomerID
                            })
                            .GroupBy(x => x.Second)
                            .Select(g => new
                            {
                                Sum = g.Sum(x => x.First),
                                Count = g.Count(x => x.Second.StartsWith("Lon"))
                            }).ToList()
                    })

sql2 (group by is there but additional columns are added to the projection):

SELECT [c].[CustomerID], [t].[c], [t].[c0], [t].[OrderID]
FROM [Customers] AS [c]
LEFT JOIN (
    SELECT SUM([o].[OrderID]) AS [c], COUNT(*) AS [c0], [o].[OrderID], [o].[CustomerID]
    FROM [Orders] AS [o]
    LEFT JOIN [Customers] AS [c0] ON [o].[CustomerID] = [c0].[CustomerID]
    GROUP BY [c0].[City] + [o].[CustomerID]
) AS [t] ON [c].[CustomerID] = [t].[CustomerID]
ORDER BY [c].[CustomerID], [t].[OrderID]

exception:

Message: 
Column 'Orders.OrderID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

@smitpatel wrt case 1. after making the change the scenario is still broken - we now produce group by clause, but still there are extra columns in the subquery projection:

SELECT [c].[CustomerID], [t].[c], [t].[OrderID]
FROM [Customers] AS [c]
LEFT JOIN (
    SELECT SUM([o].[OrderID]) AS [c], [o].[OrderID], [o].[CustomerID]
    FROM [Orders] AS [o]
    GROUP BY [o].[CustomerID]
) AS [t] ON [c].[CustomerID] = [t].[CustomerID]
ORDER BY [c].[CustomerID], [t].[OrderID]

related to/duplicate of #15873

I see that the milestone 3.1.0 got removed. Does this mean we shouldn't expect the issue to be fixed in 3.1.0?

Thank you!

@SebastiaanPolfliet Correct.

@ajcvickers Any chance to get a fix into 3.1.2? This bug blocks us writing some projections which are used by OData endpoints.

@prochnowc It's pretty unlikely that we will patch this in a 3.1.x release given the current information. Changes to the query pipeline are risky, so the customer impact of the issue would have to be very large to offset that.

Well, here is the point of view of someone earning money with making software:

I HAVE TO SHIP. You obviously do not, at least not something working, but I do have to ship.

Yes, this whole post will come out as a rant - but please understand it is arant build on months of frustration with a team that seems to not understand at all that their product is USED in commercial software and contineus to not care about fixing bugs, constantly delaing them repeatedly to the next version.

You talk about breaking things, for me EfCore 3.1 is in the same state ANY VERSION WAS: UNUSABLE. Seriously. Unusable except for the most simplistic cases and please do not use any feature - up before 3.1 global query filters broke the whole stack. NICE. There is nothing to break.

I have projects moved to .NET Core (now 3.1) and the only way we could ever use EfCore was to load the whole table into memory on every request, then filter in memory. EfCore was fundamentally broken since I started using it in 2.1. Now you tell me that this will stay so for the foreseeable future - what are you afraid of breaking in a product that is unusable? What is the logic there? Why not make a beta release so people can try it out, like - modern software development?

3.1 solved most of the issues that forced my workaround - mostly the total inability to use global query filters by fixing the bug in there that was overlooked then deemed not worthy fixing. It introduced a "one step forward, 5 steps back" approach by ripping out core functionality and STILL not generating valid SQL and then telling people to wait. Well, I release next week.

So, here is what I will do. I will rip out EfCore, the sperior product (except: it is not) and replace it with Ef non core - because THAT ONE WORKS. It may be architecturally inferior (but working), it is slower (except it is not), it has less support for non sql sources (which we do not use), it has less options for migrations (which are an antipattern for enterprise software), it has less features (but they actually WORK, not like EfCore features that are broken and should not even count) and it lacks global query filters (except there is a third party branch EntityFramework Clasic that does have them) and it has less linq features (which actually is wrong beacuse try doing a union or intersect in efcore and you realize that it only supports a VERY small subset of LINQ operations). Generally it is now - with 3 MAJOR releases of EfCore, STILL the superior product and one actually usable.

You REALLY need to get a point of view outside of the "we are super, why do people complain, lets delay critical fixes" and start thinking with the view of someone who has to ship. I have tried to wait as long as I could - time to move back and accept that the best thing which happened to EfCore now is the ability to dump it.

ANY bad sql bug is critial because you totally invalidate the use case for an ORM mapper if you are unable to actually use it.

In any sensible world, you would put out a beta branch with fixes and see whether people complain. You would actually fix our product and not tell people ALL OVER FOR EVERY RELEASE TO WAIT FOR THE NEXT ONE. I waited for fixes to EfCore 2.1, then was told in 2.2 to wait for 3.0 then was told ithe fixes where delayed for 3.1 then NOW I read that yeah, maybe in November we fix our crappy SQL generation, please delay your products AGAIN. EfCore turned from a rework of the overengineered but working EF product into the joke of the whole stack. Maybe it turns usable before it reaches version 10 - but I actually have to ship.

For anyone else who is in the same situation:
in Dotnetcore 3.1 you CAN use Ef - non core - and that one actually works.
If you prefer to have global query filters and a lot more functionality you can head over to https://entityframework-classic.net/ which is a branch that works in netstandard 2.0 and incorporates global query filters.

I personally - i have a test project now and for every major release that is released I will check whether it FINALLY is usable there, if not - no need for me to waste my time even opening bug reports.

This is probably not be the best place to discuss EF Core policy, but I want to answer to some points @NetTecture made that I consider valid :

  • Releases are too few and far between. For a highly-used product like this (NuGet says 3.1.0 was downloaded over 1.3 million times), having to wait over a month for a critical fix https://github.com/dotnet/efcore/milestone/78 doesn't cut it. I don't expect a release every day, but every other week would be much more acceptable.
  • Too many bugs get punted to oblivion. As someone who has been using .Net Core and EF Core every day since 1.0.0-rc1-update1 in 2016, and who opened a lot of issues here, more often than not we are told to wait until the next major release. This was especially a problem in the transition from 2.0 to 2.1 and 2.2 where some issues were stuck for months. If there is no workaround, you gotta re-engineer everything and wait 6 months.
  • The policy is unclear at best. 3.1.0 is supposed to be LTS, yet many bug fixes are punted to 5.0.0, which will no doubt bring a whole lot of new bugs, and the cycle will restart again. This already happened with the 2.1 LTS branch where many issues were punted to 3.0, which defeats the purpose of having LTS.
  • Rewriting rewriting rewriting. 1.0 was the "beta test", that's OK. Then, everything got rewritten for 2.0, and then again for 3.0, each time breaking stuff that had to be fixed after the fact. 5.0 will probably be the same.

All of that being said, for me, EF Core works. I've been using it continuously since 2016 and I'm very happy about it. Most of the time, it's transparent to use and doesn't get in the way. When it does, usually there's a fix, and unless you get some edge case (which happened to me) you can get your code working fast.

I wouldn't go back to .Net 4 to save my life. The EF and .Net Core team are doing a tremendous job and I hope that this will continue improving going forward 馃憤

Feel free to move this post to a more appropriate place if needed.

@maumar - Can you clean this up when you get time? We need to figure out action item here.

Was this page helpful?
0 / 5 - 0 ratings