I looked up for this problem and didn't find anything. Sorry if I'm making a dup or I messed up something.
This
```c#
context.Blogs
.Select(t => new BlogDto
{
Url = t.Url
})
.Distinct()
.ToList();
resulted in the following query:
```sql
SELECT [t].[Url]
FROM [Blogs] AS [t]
and I expected:
SELECT DISTINCT [t].[Url]
FROM [Blogs] AS [t]
So, now I'm using the following workaround:
```c#
context.Blogs
.Select(t => new
{
Url = t.Url
})
.Distinct()
.Select(t => new BlogDto
{
Url = t.Url
})
.ToList();
which results in:
```sql
SELECT [t1].[Url]
FROM (
SELECT DISTINCT [t0].[Url]
FROM [Blogs] AS [t0]
) AS [t1]
```c#
public class BlogDto
{
public string Url { get; set; }
}
public class BlogContext : DbContext
{
public DbSet<Blog> Blogs { get; set; }
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder.UseSqlServer(@"Data Source=.;Initial Catalog=Blogging;Integrated Security=True;MultipleActiveResultSets=True");
//optionsBuilder.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=CollectionsTest;Trusted_Connection=True;ConnectRetryCount=0");
}
}
public class Program
{
private static void Main()
{
using (var context = new BlogContext())
{
var results = context.Blogs
.Select(t => new BlogDto
{
Url = t.Url
})
.Distinct()
.ToList();
}
}
}
Just to be clear, the following works as expected:
```c#
context.Blogs
.Select(t => new
{
Url = t.Url
})
.Distinct()
.ToList();
produces:
SELECT DISTINCT [t].[Url]
FROM [Blogs] AS [t]
I have one simple question regarding terminology. When you say 'projection' ('projecting' to some object different from the model), does that include anonymous objects? If so, the title of this issue should be "Projecting to a specific object(class) with Distinct is not working as expected"
Thank you!
EF Core version: 1.1.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Win 8.1 64bit
IDE: Visual Studio 2015 Update 3
Based on documentation of Distinct()
The Distinct
(IEnumerable ) method returns an unordered sequence that contains no duplicate values. It uses the default equality comparer, Default, to compare values.
The default equality comparer, Default, is used to compare values of the types that implement the IEquatablegeneric interface. To compare a custom data type, you need to implement this interface and provide your own GetHashCode and Equals methods for the type.
For anonymous type in C#,
Because the Equals and GetHashCode methods on anonymous types are defined in terms of the Equals and GetHashCode methods of the properties, two instances of the same anonymous type are equal only if all their properties are equal.
Select Clause in T-Sql
DISTINCT
Specifies that only unique rows can appear in the result set. Null values are considered equal for the purposes of the DISTINCT keyword.
For database, the rows are the same if all columns have same values.
The key here is the Equals function.
For anonymous type, Equals compares all property values, which is same as database, hence it is possible to evaluate Distinct on server.
When it comes to a class type (like BlogDto above), we don't really know the details of Equals on the class. 2 BlogDto can be equal even if their corresponding URLs are different or 2 BlogDto can be different even with same URL. Therefore doing Distinct URL on server may not give you correct results and it will require client side evaluation.
Therefore Distinct is actually working as expected to give you correct results.
Thanks for the response! You are right, but when working with a DbContext, I personally think of Distinct(IQueryable) as a way to force a distinct to be executed on the server(query provider). I'm sure that a lot of people are used to this way of query construction.
Also, if you run the following using EF 6
```c#
context.Blogs
.Select(t => new BlogDto
{
Url = t.Url
})
.Distinct()
.ToList();
you will get the following translation:
```sql
SELECT
[Distinct1].[C1] AS [C1],
[Distinct1].[Url] AS [Url]
FROM ( SELECT DISTINCT
[Extent1].[Url] AS [Url],
1 AS [C1]
FROM [dbo].[Blogs] AS [Extent1]
) AS [Distinct1]
which is the _same_ query as the one I mentioned in the workaround and again I think a lot of people are used to this mindset when constructing a query. It's just confusing for me that somehow the server should be aware of comparing c# classes.
I saw the following
EF Core keeps the developer experience from EF6.x
at https://docs.microsoft.com/en-us/ef/efcore-and-ef6/
which is not true in this case. If you are transitioning from EF6 to EFCore you will have to refactor some of your queries including Distinct.
Even more, if you run:
```c#
context.Blogs
.Distinct()
.ToList();
you will get
```sql
SELECT DISTINCT [b].[BlogId], [b].[Url]
FROM [Blogs] AS [b]
and I didn't find an override of the Equals method in the DbSet.
Maybe I'm missing something conceptually or I'm just too used to EF6.
Thanks, again!
EF Team Triage: We discussed this and we believe we should try to support this query and the translation to DISTINCT on the server.
I think I may have addressed this at least somewhat with #7543:
https://github.com/aspnet/EntityFramework/pull/7543/files#diff-2c68636402e93e86a3d0108bbd4ba1ddR2131
@tuespetre - If you think the PR resolves this thing then can you please add a test case.
@smitpatel the linked spot in the diff shows two applicable tests, looks like they would need to be renamed
Most helpful comment
EF Team Triage: We discussed this and we believe we should try to support this query and the translation to
DISTINCTon the server.