Efcore: ToQueryString() "optimizes" the Where clause in an undesirable way

Created on 21 Mar 2020 · 14Comments · Source: dotnet/efcore

When calling ToQueryString() with a parameter that is known to be invalid (for example, null on a required column), that Where clause is excluded.

Steps to reproduce

```C#
var paramInt = 42;
string? paramString = null;
var q = db.Set()
.Where(e => e.MyInt == paramInt || e.MyRequiredString == paramString)
.ToQueryString();


The output is:

```SQL
.param set @__p_0 42

SELECT "k"."MyInt", "k"."MyRequiredString"
FROM "MyEntity" AS "k"
WHERE "k"."MyInt" = @__p_0

Note that e.MyRequiredString == paramString is missing.

This is great when the query is meant to be executed by EF Core, but when wanting to export the query in order to run it somewhere else, we want all the parameters to be preserved so that we can change their values.

In this case, we could "blame" the developer for providing a parameter that is clearly forbidden, but there may be other (more subtle) cases... For example, it is more confusing with complex AND clauses that end up being optimized as SELECT ... FROM ... WHERE 0.

By the way, I stumbled on this issue when writing testing code: The query was always meant to be turned into SQL with ToQueryString(), so I just set all its parameters as default (which is null for strings).

Edit: If this behavior is desirable in some scenarios, maybe we can add a ToQueryString(optimize: false) option.

Further technical details

EF Core version: .NET Core 5.0 Preview 2
Database provider: I tried with SQLite, but I expect this behavior to be the same for all providers
Target framework: .NET Core 3.1
Operating system: Windows 10
IDE: Visual Studio 2019 16.5

closed-question customer-reported

Source

KPixel

All 14 comments

ToQueryString simply prints out the command that would have been sent to the database. At this point redundant term in the filter has already been optimized. So this would have to be a query specific flag.

But this also is opening a can of worms, we perform a lot of different optimizations, some are purely "cosmetic" but we rely on others for the correctness of the sql (e.g. #20357). How granular should the flag(s) be? etc...

maumar on 21 Mar 2020

👍1

There is an API inside EF Core that outputs the SQL text without optimizations. (Edit: At least, without this specific optimization.)
It can be accessed through reflection.

That's what I was using before the addition of ToQueryString().

Can you take a look?

KPixel on 21 Mar 2020

@KPixel

By the way, I stumbled on this issue when writing testing code:

What kind of testing code are you writing?

ajcvickers on 23 Mar 2020

I am working on a library that uses reflection to benchmark queries in the data access layer:

It records the speed of the query method as well as the speed of running the SQL command directly (using a DbCommand).
If the SQL command is slower than a certain threshold or the delta between the query and the SQL command is high enough (unusual overhead), a warning is triggered.

KPixel on 23 Mar 2020

@KPixel In that case wouldn't it mean that your testing gives misleading results if it is not using the actual queries that EF generates?

ajcvickers on 23 Mar 2020

I don't think so because these optimizations are probably also done by the database engine.
And I still need to be able to tests different scenarios by injecting different values for the parameters (which is not possible if they have already been "optimized away").

By the way, as I said in my initial post, now that I have identified that this optimization only happens when the parameter is "invalid", I can manually take steps to avoid that. But I still find that less than ideal, and I'm concerned that there will be another optimization that I haven't stumbled on yet in the future.

This approach works already, so I assumed that it wouldn't be a big hurdle to make it publicly available.

KPixel on 23 Mar 2020

these optimizations are probably also done by the database engine

That's certainly not true in general, and I doubt it is true even for the majority of queries.

this optimization only happens when the parameter is "invalid"

This is not true. Sniffing and eliminating nulls is a normal part of creating correct queries. It's incorrect to characterize this kind of query difference as an optimization, even if it appears to be in this specific case.

assumed that it wouldn't be a big hurdle to make it publicly available.

This demonstrates how these internal hacks can be misleading. This approach does not always generate correct queries.

ajcvickers on 23 Mar 2020

I think we are not talking about the same thing:

I want an API that allows me to convert an IQueryable into a SQL text where the values of the parameters are not taken into consideration (they should be considered unknown).

Excluding my current scenario, there is the basic scenario where you have a big query (let's say one with a lot of joins and stuff), and you want to debug/analyse it in SQL Server Management Studio.

Having to think about what parameters you are passing to not violate any "rule" seems like extra work. For me, in this scenario, I would pass default to all these parameters because I don't care about their values at that point.

KPixel on 23 Mar 2020

I want an API that allows me to convert an IQueryable into a SQL text where the values of the parameters are not taken into consideration

Then ToQueryString is not the right API to use, and EF Core in general is not the right thing to use because it doesn't work in this way.

ajcvickers on 23 Mar 2020

The release notes clearly mentioned that debugging/analysis scenario...
Can you please elaborate? Why is evaluating the parameters values compulsory?

By the way, to be clearer, here is a simplified example of what a query method in our data access layer looks like (internally):

```C#
internal static IQueryable MyQuery(DbContext db, int paramInt = default, string? paramString = default)
{
return from e in db.MyEntity where e.MyInt == paramInt || e.MyRequiredString == paramString select e;
}


Then, my library can just call:

```C#
var sql = MyQuery().ToQueryString();

KPixel on 23 Mar 2020

@KPixel Yes, it is intended for debugging and analysis _of the queries that EF actually runs against the database_. You are asking for queries that EF would never generate.

ajcvickers on 23 Mar 2020

@KPixel Yes, it is intended for debugging and analysis _of the queries that EF actually runs against the database_. You are asking for queries that EF would never generate.

But EF is capable of generating that "generic" SQL text (as proven by that reflection hack)...

I am not asking to change the current behavior of ToQueryString(). I am hoping that you will add another API.
Edit: And if not, why would that feature be a bad thing?

KPixel on 23 Mar 2020

That reflection hack fails if you are using Contains over a client side enumerable which translates to column IN (a, b, c) to server or when you use FromSql API with parameters. EF generates a query for given IQueryable with given set of parameters. EF does not generate query which would work for any values of parameter. Further if the query has binary short-circuiting behavior based on client side variables then different parameter values would generate quite a different query.
Essentially, what you are asking is not something EF does, not it intend to do that ever.