Hey, I want to open this issue again. I'm having the same issue. I want to set the pagination after the joins and sub queries.
Code:
```c#
var users = await _enlineaDbContext.Usuario
.Where(u => !u.Borrado)
.Include(u => u.Provincia)
.Include(u => u.Perfil.PerfilRole)
.Include(u => u.UsuarioRole)
.OrderBy(x => x.Activo)
.Skip(pageSize * (pageNumber - 1))
.Take(pageSize)
.ToListAsync().ConfigureAwait(false);
This generates this code [columns removed]:
```SQL
SELECT *
FROM (
SELECT *
FROM [Usuario] AS [u]
WHERE [u].[Borrado] <> CAST(1 AS bit)
ORDER BY [u].[Activo]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
) AS [t]
INNER JOIN [Provincias] AS [p] ON [t].[ProvinciaId] = [p].[Id]
INNER JOIN [Perfil] AS [p0] ON [t].[PerfilId] = [p0].[Id]
LEFT JOIN [PerfilRole] AS [p1] ON [p0].[Id] = [p1].[PerfilId]
LEFT JOIN [UsuarioRole] AS [u0] ON [t].[Id] = [u0].[UsuarioId]
ORDER BY [t].[Activo], [t].[Id], [p].[Id], [p0].[Id], [p1].[Id], [u0].[Id]
Expected:
SELECT *
FROM (
SELECT *
FROM [Usuario] AS [u]
WHERE [u].[Borrado] <> CAST(1 AS bit)
ORDER BY [u].[Activo]
) AS [t]
INNER JOIN [Provincias] AS [p] ON [t].[ProvinciaId] = [p].[Id]
INNER JOIN [Perfil] AS [p0] ON [t].[PerfilId] = [p0].[Id]
LEFT JOIN [PerfilRole] AS [p1] ON [p0].[Id] = [p1].[PerfilId]
LEFT JOIN [UsuarioRole] AS [u0] ON [t].[Id] = [u0].[UsuarioId]
ORDER BY [t].[Activo], [t].[Id], [p].[Id], [p0].[Id], [p1].[Id], [u0].[Id]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
I Can replicate what I want doing the following:
```c#
List
.Where(u => !u.Borrado)
.Where(filtro)//Properties custom
.Include(u => u.Provincia)
.Include(u => u.Perfil.PerfilRole)
.ThenInclude(pr => pr.Role)
.Include(u => u.UsuarioRole)
.ThenInclude(ur => ur.Role)
.OrderBy(u => u.Activo)
.ToListAsync().ConfigureAwait(false);
users = users.Skip(pageSize * (pageNumber - 1))
.Take(pageSize).ToList();
```
But here we already made the fetch, and I don't want items than I'm not going to use.
Just want to use Take() and Skip() in the outside query.
Technical details
EF Core version: 3.1.2
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019
_Originally posted by @EnmanuelParedesR in https://github.com/dotnet/efcore/issues/20431#issuecomment-676509323_
It is not the same issue. The original issue had user error in it and there was no OrderBy in the query (causing ORDER BY (SELECT 1))
Based on generated SQL, you are including some collection navigations which will causes results of outer to be repeated for every inner element of collection. Generated SQL is accurate. The expected SQL would actually give incorrect results.
If you are seeing error in client side result objects then please share a complete repro otherwise everything looks correct here.
@AndriySvyryd thanks for the rename. That's actually what I need to fix.
@smitpatel yes, your right, but I'm having issues trying to what I explained above. Can you help me with it? How can I do all this in just one execution?
Thanks for the time.
Why the offset is unexpected on inner query here? Is it invalid SQL?
No, it's not an error. But, I want to accomplish the offset in the outer query, not in the inner one. As I show in the example, I want to make the Pagination [skip and take] after the result set. Now, with the query, the 'pagination' is inside the inner query, this gives wrong results.
I want something like this:
```c#
List
.Where(u => !u.Borrado)
.Where(filtro)//Properties custom
.Include(u => u.Provincia)
.Include(u => u.Perfil.PerfilRole)
.ThenInclude(pr => pr.Role)
.Include(u => u.UsuarioRole)
.ThenInclude(ur => ur.Role)
.OrderBy(u => u.Activo)
.ToListAsync().ConfigureAwait(false);
users = users.Skip(pageSize * (pageNumber - 1))
.Take(pageSize).ToList();
```
But with one sentence, I just want to prevent calling more items than what i really need.
But pagination on outer query is wrong results.
If you remove all the includes then you are asking for pagination over Usario with some filter and ordering. This determines all the Usario supposed to be in result set.
Now since you are doing collection include, EF Core needs to join additional tables to fetch related data. This additional tables need to be joined after pagination so that related data is only fetched for Usario which are in result set. That is what pagination on inner query achieves.
Since you are doing collection include EF Core repeats outer element (Usario) for each unique inner element, that is how collection include works in SQL for a relational database. If your result set contains 1 row only and you are including one collection which contains 4 rows for associated Usario in result set then SQL result set will be 4 rows but EF Core query will combine those 4 rows to generate 1 result (which will contain 4 records for collection) in result set. If you put pagination on outer then you get wrong results. Since EF Core combines multiple rows for collection include in single result on client, the pagination you are applying on client is exactly equivalent of pagination on inner query on Usario (since client result or query both contains enumerable of Usario only).
Yes, I understand that first paragraph, but, in my case, I'm not getting the same results.
This is my scenario:
I have an Usuario[users] table with approximately 30k users but a lot of them have wrong or null data. The query [without the pagination] give me just 70 rows [this is correct because those users are the only one with data in the other relationship tables]. What's happening then with the pagination?
The inner query
SELECT *
FROM [Usuario] AS [u]
WHERE [u].[Borrado] <> CAST(1 AS bit)
ORDER BY [u].[Activo]
OFFSET @__p_0 ROWS FETCH NEXT @__p_1 ROWS ONLY
Extract, from example, the first 10 users order by active, without being sure if they have their relationship data, this may give fewer results than what is. If just 6 of those users have the relationship it will give me just those 6, but there are 4 users left with good relationships, they just were out of the offset because of their position in the order by.
Sorry if it is difficult to understand, but that's the reason why the code in the client site works as I want, but I'm getting more info than what I actually want. Any workaround with this?
Thanks for the time!
The pagination on inner query gives you back 10 users correctly.
The only way related data could reduce the result size if any of the INNER JOIN does not have matching data (Perfil and Provincias tables).
Since you haven't shared model yet, based on query my guess is
So for each row [t].[ProvinciaId] & [t].[PerfilId] must have a non-null value which exists in corresponding table too.
If the result set is getting shortened by the inner joins then either your relationship configuration in EF Core model is incorrect or the data in database is incorrect.
In either case, if the model does not match the database schema and data in database does not create a consistent graph from model perspective, the query cannot give correct results. The only way it could work is if you make that your model matches data in database, otherwise there is no work-around in just query.
If any of above is wrong then please provide full repro with data in database too.
Yes, you're correct. I understand your point. What I should do is have the data consistent in my database, and also, right now those users can't be used in the platform [we are migrating those users right now, that's the big issue]. I think we can close this issue, I'm going to create a way [script with default values for those users or idk] to make the user's data consistent so the query generated can work.
In the mid-time, I'm going to do this client-side code. This shouldn't happen if the users were correct in the new platform.
Thanks for the time and support, this makes me understand some stuff.
Most helpful comment
The pagination on inner query gives you back 10 users correctly.
The only way related data could reduce the result size if any of the
INNER JOINdoes not have matching data (PerfilandProvinciastables).Since you haven't shared model yet, based on query my guess is
So for each row
[t].[ProvinciaId]&[t].[PerfilId]must have a non-null value which exists in corresponding table too.If the result set is getting shortened by the inner joins then either your relationship configuration in EF Core model is incorrect or the data in database is incorrect.
In either case, if the model does not match the database schema and data in database does not create a consistent graph from model perspective, the query cannot give correct results. The only way it could work is if you make that your model matches data in database, otherwise there is no work-around in just query.
If any of above is wrong then please provide full repro with data in database too.