Efcore: Pagination is broken when Include is used.

Created on 27 Mar 2020  路  4Comments  路  Source: dotnet/efcore


A query using an Include and Skip and Take creates SQL with the pagination keywords nested into the first Include JOIN query.
This makes pagination work wrong, skipping e.g. 100 rows when Skip(1) is called (if there are 100 rows that would join with that 1 skipped base row).

```C#
dbContext.Set().AsQueryable()
.Include(x => x.Manager)
.Skip(10).Take(20)


This gets roughly translated to _nested pagination_:
```SQL
SELECT * FROM (
    SELECT * FROM Workplace w
    ORDER BY (SELECT 1)
    OFFSET 10 ROWS FETCH NEXT 20 ROWS ONLY
) AS [t0]
LEFT JOIN Manager m ON (...)

Which returns different results than the expected _outer pagination_ form:

SELECT * FROM (
    SELECT * FROM Workplace w
) AS [t0]
LEFT JOIN Manager m ON (...)
ORDER BY (SELECT 1)
OFFSET 10 ROWS FETCH NEXT 20 ROWS ONLY

The problem is that when executing the EF query subsequently for next pages of data, it can return empty or smaller pages, while the following pages would still have data.
Also since skip is applied too soon, a Skip of 1 can result in skipping all rows it would then join with

I guess one could first get a COUNT and SKIP forever until getting enough rows.
But wouldn't that be vulnerable to a row being deleted by someone else at the same time?

Shouldn't the pagination clauses be on the outside of the query, at least to be intuitive?

The EF Core query uses the default ORDER BY (SELECT 1) discussed in #19870, but that does not seem to cause the problem. Even with a column to ORDER BY the query still returns less results than specified in TAKE, while returning thousands without the paging.

Further technical details

EF Core version: 3.1.2
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019 16.3

closed-question customer-reported

All 4 comments

@maumar to take a look.

@JakubFojtik can you share code listings of your entity classes and DbContext (especially OnModelCreating)?

Based on what you provided so far I reverse engineered the model to something like this:

        public class Workplace
        {
            public int Id { get; set; }
            public int? ManagerId { get; set; }
            public Manager Manager { get; set; }
        }

        public class Manager
        {
            public int Id { get; set; }
            public List<Workplace> Workplaces { get; set; }
        }


        public class MyContext : DbContext
        {
            public DbSet<Workplace> Workplaces { get; set; }
            public DbSet<Manager> Managers { get; set; }


            protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
            {
                optionsBuilder.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Repro20431;Trusted_Connection=True;MultipleActiveResultSets=true");
            }
        }

It does produce the sql that you are reporting, but that is expected and correct sql for the query. There can be only one Manager corresponding to any Workplace so actually putting OFFSET before of after the join yields the same results, but if you query the other side of relationship (assuming manager can manage multiple workplaces):

                var query = ctx.Set<Manager>().AsQueryable()
                    .Include(x => x.Workplaces)
                    .Skip(10).Take(20).ToList();

you want to apply OFFSET operation on the manager's table, before joining the workspaces - i.e. the query is supposed to paginate over the managers and include all of their workspaces.

If your intention is (looking at the query I wrote above) to return all the managers but only return their paginated workspaces, then it's a filtered include scenario, which is currently not supported but we just implemented it and it will be available in the next preview - details here: #1833

I am sorry, bug was in my code as I should have expected in the first place.
The cause was simply me forgetting to define FK relationships as nullable.

This understandably resulted in INNER joins instead of LEFT joins and the loss of rows observed.
I was wondering how could LEFT join possibly lose rows, and today I found a later commit correcting the relationship nullability.

I was also looking for the paginated relationship feature, thank you for that!

Sorry again for the useless issue report.

Hey, I want to open this issue again. I'm having the same issue. I want to set the pagination after the joins and sub queries.

Code:

```c#
var users = await _enlineaDbContext.Usuario
.Where(u => !u.Borrado)
.Include(u => u.Provincia)
.Include(u => u.Perfil.PerfilRole)

        .Include(u => u.UsuarioRole)

           .OrderBy(x => x.Activo)
           .Skip(pageSize * (pageNumber - 1))
           .Take(pageSize)
          .ToListAsync().ConfigureAwait(false);

This generates this code [columns removed]: 

```SQL
SELECT *
FROM (
    SELECT *
    FROM [Usuario] AS [u]
    WHERE [u].[Borrado] <> CAST(1 AS bit)
    ORDER BY [u].[Activo]
    OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
INNER JOIN [Provincias] AS [p] ON [t].[ProvinciaId] = [p].[Id]
INNER JOIN [Perfil] AS [p0] ON [t].[PerfilId] = [p0].[Id]
LEFT JOIN [PerfilRole] AS [p1] ON [p0].[Id] = [p1].[PerfilId]
LEFT JOIN [UsuarioRole] AS [u0] ON [t].[Id] = [u0].[UsuarioId]
ORDER BY [t].[Activo], [t].[Id], [p].[Id], [p0].[Id], [p1].[Id], [u0].[Id]

Expected:

SELECT *
FROM (
    SELECT *
    FROM [Usuario] AS [u]
    WHERE [u].[Borrado] <> CAST(1 AS bit)
    ORDER BY [u].[Activo]

) AS [t]
INNER JOIN [Provincias] AS [p] ON [t].[ProvinciaId] = [p].[Id]
INNER JOIN [Perfil] AS [p0] ON [t].[PerfilId] = [p0].[Id]
LEFT JOIN [PerfilRole] AS [p1] ON [p0].[Id] = [p1].[PerfilId]
LEFT JOIN [UsuarioRole] AS [u0] ON [t].[Id] = [u0].[UsuarioId]
ORDER BY [t].[Activo], [t].[Id], [p].[Id], [p0].[Id], [p1].[Id], [u0].[Id]
 OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY

I Can replicate what I want doing the following:

```c#
List users = await _enlineaDbContext.Usuario
.Where(u => !u.Borrado)
.Where(filtro)//Properties custom
.Include(u => u.Provincia)
.Include(u => u.Perfil.PerfilRole)
.ThenInclude(pr => pr.Role)
.Include(u => u.UsuarioRole)
.ThenInclude(ur => ur.Role)
.OrderBy(u => u.Activo)
.ToListAsync().ConfigureAwait(false);

        users = users.Skip(pageSize * (pageNumber - 1))
        .Take(pageSize).ToList();

```
But here we already made the fetch, and I don't want items than I'm not going to use.

Just want to use Take() and Skip() in the outside query.


Technical details
EF Core version: 3.1.2
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019

Was this page helpful?
0 / 5 - 0 ratings