I've started to inline queries more often to reduce round-trips to the server for fetching IDs that later are used in other queries.
This is partly due to the limitations of using .Contains() on an int-array.
Instead I use something similar to the following (stupid) example, of course my real queries are different in terms of complexity. Still, this demonstrates the issue clearly.
```C#
var offerId = 1;
var inlineQuery = context.Offers.Where(o => o.Id == offerId);
var firstQuery = context.OfferActions.Where(oa => inlineQuery.Contains(oa.Offer) && oa.Action == OfferActions.Established);
var secondQuery = context.OfferActions.Where(oa => inlineQuery.Contains(oa.Offer) && oa.Action != OfferActions.Established);
var query = firstQuery.Concat(secondQuery);
await query.ToListAsync();
SELECT [o].[Id], [o].[Action], [o].[OfferId], [o].[TimeCreatedUtc]
FROM [OfferActions] AS [o]
INNER JOIN [Offers] AS [o0] ON [o].[OfferId] = [o0].[Id]
WHERE EXISTS (
SELECT 1
FROM [Offers] AS [o1]
WHERE ([o1].[Id] = @__offerId_0) AND ([o1].[Id] = [o0].[Id])) AND ([o].[Action] = 1)
UNION ALL
SELECT [o2].[Id], [o2].[Action], [o2].[OfferId], [o2].[TimeCreatedUtc]
FROM [OfferActions] AS [o2]
INNER JOIN [Offers] AS [o3] ON [o2].[OfferId] = [o3].[Id]
WHERE EXISTS (
SELECT 1
FROM [Offers] AS [o4]
WHERE ([o4].[Id] = @__offerId_1) AND ([o4].[Id] = [o3].[Id])) AND ([o2].[Action] <> 1)
```
As you can see in the above SQL output, I now have two parameters __offerId_0 and __offerId_1 even though I used the same local variable offerId.
Is there any way to force EF Core to reuse the parameters so that I only get a single parameter __offerIdused in both subqueries?
In my real-world example, sometimes I end up with >10 parameters of the exact same value and I think this gives the SQL server a poor chance to optimize the query properly.
EF Core version: 5.0 rc2
Database provider: Microsoft.EntityFrameworkCore.SqlServer
If you inline the inlineQuery then it would generate one parameter I believe.
The reason it generated different parameters is the way visitation worked for parameterizing.
query causes parameterization firstQuery/secondQuery tree. At this point, inlineQuery is actually a parameter since it is coming from closure. But since this parameter value is IQueryable, we inline it and extract parameters recursively, each of this has no details about the other trees so ends up generating different parameters.
It would also depend on what expression tree we are getting, if the expression for offerId is same in both then may be we can generate one parameter.
The inlineQuery is in real life a query that is used for authorization, so it's pretty verbose and would like to benefit from code reuse. Still, for experimenting with this, I tried _inlining_ the inlineQuery, but the result is actually the same - I still get two parameters.
```c#
var firstQuery = context.OfferActions.Where(oa => context.Offers.Where(o => o.Id == offerId).Contains(oa.Offer) && oa.Action == OfferActions.Established);
var secondQuery = context.OfferActions.Where(oa => context.Offers.Where(o => o.Id == offerId).Contains(oa.Offer) && oa.Action != OfferActions.Established);
I'm not enough versed in the inner workings of EF Core, but could there be a way to have a special constant parameter expressed somehow in the query? Like (pseudocode)
```c#
var inlineQuery = context.Offers.Where(o => o.Id == EF.MakeConstantParameter(offerId));
and it would somehow hash the argument and maintain some sort of hashmap to check if the parameter is exactly the same as some other parameter in the tree and then link them together? Or just check by reference if they point to the same memory or something similar?
Funny, I even inlined the inlines;
```c#
var query = context.OfferActions.Where(oa => context.Offers.Where(o => o.Id == offerId).Contains(oa.Offer) && oa.Action == OfferActions.Established).Concat(context.OfferActions.Where(oa => context.Offers.Where(o => o.Id == offerId).Contains(oa.Offer) && oa.Action != OfferActions.Established));
and still got
```sql
SELECT [o].[Id], [o].[Action], [o].[OfferId], [o].[TimeCreatedUtc]
FROM [OfferActions] AS [o]
INNER JOIN [Offers] AS [o0] ON [o].[OfferId] = [o0].[Id]
WHERE EXISTS (
SELECT 1
FROM [Offers] AS [o1]
WHERE ([o1].[Id] = @__offerId_0) AND ([o1].[Id] = [o0].[Id])) AND ([o].[Action] = 1)
UNION ALL
SELECT [o2].[Id], [o2].[Action], [o2].[OfferId], [o2].[TimeCreatedUtc]
FROM [OfferActions] AS [o2]
INNER JOIN [Offers] AS [o3] ON [o2].[OfferId] = [o3].[Id]
WHERE EXISTS (
SELECT 1
FROM [Offers] AS [o4]
WHERE ([o4].[Id] = @__offerId_1) AND ([o4].[Id] = [o3].[Id])) AND ([o2].[Action] <> 1)
So my first guess that it was about the inlined IQueryable seems wrong, it seems to be wider than that.
Seems like this is happening because Concat argument is not a lambda. so generating a separate tree.
Would that be the same for a similar construct to the following, which is more like the one I have in real life where it builds an anonymous dto object;
(the Concat was just for the simple case)
```c#
var query = context.Ads
.Where(ad => ad.Id == 1)
.Select(ad => new
{
HasSomeOffers = firstQuery.Where(oa => oa.Offer.AdId == ad.Id).Any(),
HasSomeOtherOffers = secondQuery.Where(oa => oa.Offer.AdId == ad.Id).Any()
});
var item = await query.SingleOrDefaultAsync();
This also generates two different `offerId` parameters:
```sql
SELECT 1
FROM [OfferActions] AS [o]
INNER JOIN [Offers] AS [o0] ON [o].[OfferId] = [o0].[Id]
WHERE (EXISTS (
SELECT 1
FROM [Offers] AS [o1]
WHERE ([o1].[Id] = @__offerId_0) AND ([o1].[Id] = [o0].[Id])) AND ([o].[Action] = 1)) AND ([o0].[AdId] = [a].[Id])) THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END AS [HasSomeOffers], CASE
WHEN EXISTS (
SELECT 1
FROM [OfferActions] AS [o2]
INNER JOIN [Offers] AS [o3] ON [o2].[OfferId] = [o3].[Id]
WHERE (EXISTS (
SELECT 1
FROM [Offers] AS [o4]
WHERE ([o4].[Id] = @__offerId_1) AND ([o4].[Id] = [o3].[Id])) AND ([o2].[Action] <> 1)) AND ([o3].[AdId] = [a].[Id])) THEN CAST(1 AS bit)
ELSE CAST(0 AS bit)
END AS [HasSomeOtherOffers]
FROM [Ads] AS [a]
WHERE [a].[Id] = 1
But still, do you think it would be possible to get around this parameter generation somehow?
Let me investigate the expression trees. My understanding is, anytime a queryable object from closure is used, a different scope for parameters starts generating new parameters always. In all above example that is consistent. I will verify and see if it can be changed to generate one parameter only.
@smitpatel out of curiosity, started experimenting a bit on main branch.
Changing row 331 to 337 in https://github.com/dotnet/efcore/blob/0b3165096d6b55443fc06ae48404c2b037dd73e7/src/EFCore/Query/Internal/ParameterExtractingExpressionVisitor.cs
```c#
parameterName
= QueryCompilationContext.QueryParameterPrefix
+ parameterName
+ "_"
+ _parameterValues.ParameterValues.Count;
_parameterValues.AddParameter(parameterName, parameterValue);
var parameter = Expression.Parameter(expression.Type, parameterName);
to the following
```c#
var parameterPrefix = QueryCompilationContext.QueryParameterPrefix
+ parameterName
+ "_";
ParameterExpression parameter;
var matchingParameters = _parameterValues.ParameterValues.Where(p => p.Key.StartsWith(parameterPrefix, StringComparison.InvariantCulture) && p.Value.Equals(parameterValue));
if (matchingParameters.Any())
{
var lastParameter = matchingParameters.Last();
parameter = Expression.Parameter(expression.Type, lastParameter.Key);
}
else
{
parameterName
= parameterPrefix + _parameterValues.ParameterValues.Count;
_parameterValues.AddParameter(parameterName, parameterValue);
parameter = Expression.Parameter(expression.Type, parameterName);
}
.. will actually resolve this issue.
Also added this test to GearsOfWarQueryTestBase.cs to prove it
[ConditionalTheory]
[MemberData(nameof(IsAsyncData))]
public virtual Task Query_reusing_parameter_with_concat_doesnt_declare_duplicate_parameter(bool async)
{
var squadId = 1;
return AssertQuery(
async,
ss =>
{
var innerQuery = ss.Set<Squad>().Where(s => s.Id == squadId);
var outerQuery = ss.Set<Gear>().Where(g => innerQuery.Contains(g.Squad));
return outerQuery.Concat(outerQuery).OrderBy(g => g.FullName);
},
assertOrder: true);
}
public override async Task Query_reusing_parameter_with_concat_doesnt_declare_duplicate_parameter(bool async)
{
await base.Query_reusing_parameter_with_concat_doesnt_declare_duplicate_parameter(async);
AssertSql(
@"@__squadId_0='1'
SELECT [t].[Nickname], [t].[SquadId], [t].[AssignedCityName], [t].[CityOfBirthName], [t].[Discriminator], [t].[FullName], [t].[HasSoulPatch], [t].[LeaderNickname], [t].[LeaderSquadId], [t].[Rank]
FROM (
SELECT [g].[Nickname], [g].[SquadId], [g].[AssignedCityName], [g].[CityOfBirthName], [g].[Discriminator], [g].[FullName], [g].[HasSoulPatch], [g].[LeaderNickname], [g].[LeaderSquadId], [g].[Rank]
FROM [Gears] AS [g]
INNER JOIN [Squads] AS [s] ON [g].[SquadId] = [s].[Id]
WHERE EXISTS (
SELECT 1
FROM [Squads] AS [s0]
WHERE ([s0].[Id] = @__squadId_0) AND ([s0].[Id] = [s].[Id]))
UNION ALL
SELECT [g0].[Nickname], [g0].[SquadId], [g0].[AssignedCityName], [g0].[CityOfBirthName], [g0].[Discriminator], [g0].[FullName], [g0].[HasSoulPatch], [g0].[LeaderNickname], [g0].[LeaderSquadId], [g0].[Rank]
FROM [Gears] AS [g0]
INNER JOIN [Squads] AS [s1] ON [g0].[SquadId] = [s1].[Id]
WHERE EXISTS (
SELECT 1
FROM [Squads] AS [s2]
WHERE ([s2].[Id] = @__squadId_0) AND ([s2].[Id] = [s1].[Id]))
) AS [t]
ORDER BY [t].[FullName]");
}
And it succeeds, with most other tests except for a failing test due to Swedish culture (see #22901). Only other remaining failing test seems to be Ternary_should_not_evaluate_both_sides but I think it could be made working with just a little bit more effort.
Is this a totally wrong angle to resolve this issue, or should I try pursuing further and perhaps try to wrap it together in a PR?
It doesn't look incorrect fix but there is also opposite side of it in https://github.com/dotnet/efcore/issues/22524 where 2 values are kinda same but should generate different parameters. Parameter extraction is a "best-guess" solution and what and how to reuse the query. Without having any particular example in mind, I feel trying to find existing parameter matching value to be somewhat fragile. A more robust solution should be to identify that both expression trees (member access on closure) are the same, so we don't need multiple parameters for it. I need to inspect actual tree to evaluate how to achieve those.
Perhaps the reentrancy case should not clear out evaluated values? https://github.com/dotnet/efcore/blob/0b3165096d6b55443fc06ae48404c2b037dd73e7/src/EFCore/Query/Internal/ParameterExtractingExpressionVisitor.cs#L91
Could experiment a bit further on that track instead. Pursued a bit further on parameter matching solution, but seems like Ternary_should_not_evaluate_both_sides fails due to some optimization possibly in RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.cs when several fields are mapped to the same parameter, but debugging the expression trees are a bit overwhelming for me. It should produce Data1, Data2 and Data3 all equal to @__p_0, but the last two are dropped.
@smitpatel only removing the line where evaluated values are cleared in https://github.com/dotnet/efcore/blob/0b3165096d6b55443fc06ae48404c2b037dd73e7/src/EFCore/Query/Internal/ParameterExtractingExpressionVisitor.cs#L91 will resolve this issue and successfully run my new test for inlined queries. Sometimes removing code is very satisfactory. 馃槃
But now I get problems with Microsoft.EntityFrameworkCore.Query.NorthwindIncludeQuerySqlServerTest.Include_duplicate_collection where the constants in OFFSET 2 ROWS FETCH NEXT 2 ROWS ONLY suddenly gets translated to OFFSET @__p_0 ROWS FETCH NEXT @__p_0 ROWS ONLY for some reason. @__p_0='2' so it probably won't affect actual results - but it was a bit unexpected side-effect. Is this an acceptable change?
edit: ah, it's probably because there is also a SELECT TOP(@__p_0) in the query so the constant 2 value is probably reused from there.
Likely that part shouldn't get reused. Though no functional impact. But bottom line is that we need to clear that dict. Just skip it in reentrancy case.
I've changed behaviour to actually clear the dictionary when not leaving a re-entry, but the tests I mentioned still failed since they actually are re-entrant and affected by this new code.
I believe this new implementation is correct and suggest changing the tests. Since the row limiting operation is parameterized for TOP, I think it should also be for CROSS APPLY otherwise a lot of different query plans will be cached in SQL for different row counts.
PR draft linked.
Eager to solve this since I've done perf tests on my solution and queries that heavily inline queries will benefit a query time decrease around 30% by using the same parameter instead of introducing 10+ parameters with the same value.
@joakimriedel - This won't be released before 6.0 (would certainly be in earlier preview). But 6.0 RTM is next year November timeframe.
@smitpatel So I've heard, but I still hold my thumbs for a 5.1 before next summer. :)