Efcore: Query: GroupBy with Where produces invalid SQL

Created on 4 Feb 2018  路  5Comments  路  Source: dotnet/efcore

Using GroupBy with Where produces invalid SQL

Exception message:
   Microsoft.Data.Sqlite.SqliteException: 'SQLite Error 1: 'misuse of aggregate function COUNT()'.'
Stack trace:
   at Microsoft.Data.Sqlite.SqliteException.ThrowExceptionForRC(Int32 rc, sqlite3 db)
   at Microsoft.Data.Sqlite.SqliteCommand.<PrepareAndEnumerateStatements>d__62.MoveNext()
   at Microsoft.Data.Sqlite.SqliteCommand.ExecuteReader(CommandBehavior behavior)
   at Microsoft.Data.Sqlite.SqliteCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at System.Data.Common.DbCommand.ExecuteReader()
   at Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommand.Execute(IRelationalConnection connection, DbCommandMethod executeMethod, IReadOnlyDictionary`2 parameterValues)
   at Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommand.ExecuteReader(IRelationalConnection connection, IReadOnlyDictionary`2 parameterValues)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.BufferlessMoveNext(DbContext _, Boolean buffer)
   at Microsoft.EntityFrameworkCore.Storage.Internal.NoopExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.MoveNext()
   at Microsoft.EntityFrameworkCore.Query.QueryMethodProvider.<_GroupBy>d__19`3.MoveNext()
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at Microsoft.EntityFrameworkCore.Query.Internal.LinqOperatorProvider.ExceptionInterceptor`1.EnumeratorExceptionInterceptor.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at MegaDedup.Program.Main(String[] args) in Program.cs:line 24

Steps to reproduce

```c#
class Program
{
public static readonly LoggerFactory MyLoggerFactory = new LoggerFactory(new[] { new ConsoleLoggerProvider((_, __) => true, true) });

static void Main(string[] args)
{
    var context = new FileContext();
    context.Database.EnsureCreated();

    context.FileEntries
        .GroupBy(x => x.Size)
        .Where(x => x.Count() > 1)
        .Select(x => x.Key)
        .ToList();
}

}

class FileContext : DbContext
{
public DbSet FileEntries { get; set; }

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    var builder = new SqliteConnectionStringBuilder();
    builder.DataSource = "db.sqlite";
    optionsBuilder
        .UseLoggerFactory(Program.MyLoggerFactory)
        .UseSqlite(builder.ToString());
}

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<FileEntry>()
        .HasIndex(x => x.Size);
}

}

class FileEntry
{
public long Id { get; set; }
public long Size { get; set; }
public string Path { get; set; }
}


### Produced SQL
```sql
SELECT "x"."Id", "x"."Path", "x"."Size"
FROM "FileEntries" AS "x"
WHERE COUNT(*) > 1
ORDER BY "x"."Size"

Expected SQL

SELECT "x"."Size"
FROM "FileEntries" AS "x"
GROUP BY "x"."Size"
HAVING COUNT(*) > 1

Further technical details

EF Core version: 2.1.0-preview1-28226
Database Provider: Microsoft.EntityFrameworkCore.Sqlite
Operating system: Windows 10 Pro
IDE: Visual Studio 2017 15.6 p3

closed-fixed type-bug

Most helpful comment

It seems like almost all relational providers have Having clause supported in SelectExpression. We will try to translate to Having clause instead of causing a subquery (like EF6).

All 5 comments

@Suchiman - Would following SQL work?

SELECT "t"."Size"
FROM (
    SELECT "x"."Size", COUNT(*) AS c
    FROM "FileEntries" AS "x"
    GROUP BY "x"."Size"
) AS "t"
WHERE "t"."c" > 1

Related #10012

@smitpatel Of course, thats pretty much the SQL EF6 did generate (for MSSQL, but works for SQLite as well).

SELECT 
[GroupBy1].[K1] AS [Size]
FROM ( SELECT 
    [Extent1].[Size] AS [K1], 
    COUNT(1) AS [A1]
    FROM [dbo].[FileEntries] AS [Extent1]
    GROUP BY [Extent1].[Size]
)  AS [GroupBy1]
WHERE [GroupBy1].[A1] > 1

It seems like almost all relational providers have Having clause supported in SelectExpression. We will try to translate to Having clause instead of causing a subquery (like EF6).

We generate SQL like this now

SELECT [o].[CustomerID] AS [Key], COUNT(*) AS [Count]
FROM [Orders] AS [o]
GROUP BY [o].[CustomerID]
HAVING COUNT(*) > 4"

Unless I'm mistaken, HAVING is part of the SQL standard so it should be fine to use it. It's great that EF Core will be using it.

Was this page helpful?
0 / 5 - 0 ratings