An example from https://github.com/aspnet/EntityFramework/issues/6649 is being able to append OPTION (MAXRECURSION 2000)
to the end of a query.
While the idea of integrating arbitrary provider-specific hints in queries seems useful, treating this as some raw SQL to be tacked at the end of the query may be too limited. A specific provider's SQL dialect may require the additional SQL to be integrated somewhere in the middle, etc...
Some arbitrary annotations on the query itself which would be picked up by the provider-specific query SQL generator may cover more cases.
@roji yeah I wasn't intending to limit it to appending text to the end of the query - I removed the word "append" from the title 😄 .
Ok great :)
Note from triage: we need to investigate what commonality exists here across relational database systems. For example, where can hints be attached? Is there non-SQL Server value in a query-hint only (i.e. always at the end of the query) solution?
/cc @roji @caleblloyd @ErikEJ
Here's the PostgreSQL documentation page on querying.
I'm not sure how many things it has which can be tacked on without impacting the shape of the results (in which case they'd be beyond the scope of this issue). Some ideas that come to mind:
ORDER BY ... NULLS { FIRST | LAST }
, impacting the sort order.FOR UPDATE
). Users would then be able to do stuff like issue updates in raw SQL, keeping the rows locked until the end of the transaction. It's nice to be able to express the query (i.e. set of rows to lock) with EF Core LINQ.Seems like ideally providers would be able to define an extension method on the IQueryable (like AsNoTracking()
), which would be picked up by the SQL generator.
Design Notes:
No Lock
on tables will apply to all tables in the query, however they are brought into the SQL. Based on customer feedback more granular API can be determined.Changes to relational level remain pre-requisite for provider level work but the issue remains in backlog for now. Community contributions appreciated.
cc: @ralmsdeveloper
Another use case: specifying per-query collation. PostgreSQL allows the following on an expression:
SELECT a < b COLLATE "de_DE" FROM test1;
or on a clumn:
SELECT a COLLATE "de_DE" < b FROM test1;
@smitpatel agree that the way forward here is by allowing annotations to be specified by the user in the query, and for those annotations to flow to the provider's query SQL generator. The question is indeed granularity and which elements of the query expression tree can be thus annotated.
Thoughts:
What @roji spoke makes sense to me, I believe we could create a method for annotation on EFCore.Ralational to meet the following needs, following SQL Server as the basis, but the providers would be responsible for consolidating the end information .
We would have to have one more property in TableExpression (public virtual string Hint {get;set;}).
Remember that when using the annotation, this would be replicated in the "Include" tables:
x.Include(y => y.Table)
Or have one more parameter in the extension method: "NoInclude"
public static IQueryable<TEntity> With<TEntity>(
this IQueryable<TEntity> source,
string hint,
string optionalParameter = null);
var query = _db
.Blogs
.With("NOLOCK", "INDEX(myIndex)") // Second parameter optional
.ToList();
SELECT [p].[Id], [p].[Date], [p].[Name]
FROM [Blogs] AS [p] WITH (NOLOCK, INDEX(myIndex))
--or
SELECT [p].[Id], [p].[Date], [p].[Name]
FROM [Blogs] AS [p] WITH (NOLOCK)
This would prevent user typing error.
PAGLOCK, NOLOCK, READCOMMITTEDLOCK, ROWLOCK, TABLOCK ou TABLOCKX.
var query = _db
.Blogs
.With(Hints.NOLOCK)
.ToList();
For the @roji example:
https://github.com/aspnet/EntityFrameworkCore/issues/6717#issuecomment-397403491
I believe we can do this in EF.Functions.
Or just:
public static IQueryable<TEntity> Hint<TEntity>(
this IQueryable<TEntity> source,
string hint,
bool repeatForInclude = true);
var query = _db
.Blogs
.Hint("WITH (NOLOCK)")
.ToList();
This is left under the responsibility of the user.
In my thinking I believe that this also becomes more flexible, the user can theoretically use all available hints, since initially this would be designed only for queries!
@ralmsdeveloper - That is what we decided explicitly not to do it. Unless there is any common query hint across all providers.
Especially we cannot use any method with string parameter where string is going to be appended in SQL, that is just asking for SQL injection.
And the enum may require different values based on provider.
Hence in the current design, there will not be any method in relational provider. Providers have to write methods in whichever way they want. It could be single method with enum values or it could be individual methods for each hint.
Relational layer will just provide the storage facility in terms of query annotations. So provider methods can add query annotations, (whatever shape), and those annotations will be flowed to SQL generator by relational, where provider can look at annotations and determine what to write in SQL. Further it also avoids breaking TableExpression since there is no need to add anything like Hint
in any expression. "Hints" will be available globally to SelectExpression while printing, providers can determine where each hint will be printed.
In addition to what @smitpatel wrote above, a simple API to tack strings inside the query is also a pretty bad user experience - the user would need to have knowledge about specific strings and where they go inside the query. It's much better to have provider-defined extension methods which add structured annotations, which are then picked up by the query SQL generator and rendered in SQL. This is also in line with the current way of specifying provider-specific settings on the model, which get translated to SQL by the migrations generator.
@smitpatel if I understand correctly the initial plan is to only allow annotations on the SelectExpression? While it's true this should cover most basic cases, it's a good idea to at least consider what the more flexible/advanced version of this would look like. As I've written above there are viable cases where a column expression, or indeed even a comparison expression could be annotated (see the collation example). I'm not saying these should be supported by day one (in fact I've received no actual request for anything of the sort) but it's good to keep it in mind.
I get it
Especially we cannot use any method with string parameter where string is going to be appended in SQL, that is just asking for SQL injection.
And the enum may require different values based on provider.
Hence in the current design, there will not be any method in relational provider. Providers have to write methods in whichever way they want. It could be single method with enum values or it could be individual methods for each hint.
Like I said before, it's just thoughts.
My view was just in annotation to table, so I talked about creating another property, why providers could override "VisitTable
"
However, if we can retrieve anottations from SelectExpression
would be enough, so the writers of the providers can simply do what they think best, to meet the more specific needs of each provider.
@roji - The current plan is to flow QueryAnnotations to SelectExpression. While it is obvious that query hints (SelectExpression level annotations) are covered by that, you can still use the same infra to have table hints (or other constructs like ordering null) since you have select expression available while printing out table or order by clause. So it is not restricted to just query hints, it could be any part of SQL. I believe it would be better way to transport the data instead of creating properties on individual expressions (like TableExpression/JoinExpression etc.). Perhaps the restriction is on API side, in terms of providers can add API on IQueryable only. Though the methods could take whatever parameters needed to locate the piece of data in SQL.
For the column/comparison example you gave above, how granular it becomes in the query? Do you have any ideas of API. One thing we have in mind for specifying store type for column (since current inference is not great, and perhaps EF6 had similar AsUnicode API), is to have EF.Functions.StoreType
kind API, which can be handled as MethodCallTranslator, the same way how providers can currently add custom methods on EF.Functions.
@roji - The current plan is to flow QueryAnnotations to SelectExpression. While it is obvious that query hints (SelectExpression level annotations) are covered by that, you can still use the same infra to have table hints (or other constructs like ordering null) since you have select expression available while printing out table or order by clause. So it is not restricted to just query hints, it could be any part of SQL. I believe it would be better way to transport the data instead of creating properties on individual expressions (like TableExpression/JoinExpression etc.). Perhaps the restriction is on API side, in terms of providers can add API on IQueryable only. Though the methods could take whatever parameters needed to locate the piece of data in SQL.
I understand, but unless I'm mistaken that would make it impossible (or very awkward) to apply different hints to different parts of the query, right? For example, if I want want ORDER BY
to sort nulls first, and another to sort nulls last, it would be very difficult to express that, won't it?
For the column/comparison example you gave above, how granular it becomes in the query? Do you have any ideas of API. One thing we have in mind for specifying store type for column (since current inference is not great, and perhaps EF6 had similar AsUnicode API), is to have EF.Functions.StoreType kind API, which can be handled as MethodCallTranslator, the same way how providers can currently add custom methods on EF.Functions.
Just to be extra clear - this seems to be a pretty rare requirement and I'm fairly certain 95% of needs will be met by specifying an annotation "globally" at the query level.
For the column/comparison example you gave above, how granular it becomes in the query? Do you have any ideas of API. One thing we have in mind for specifying store type for column (since current inference is not great, and perhaps EF6 had similar AsUnicode API), is to have EF.Functions.StoreType kind API, which can be handled as MethodCallTranslator, the same way how providers can currently add custom methods on EF.Functions.
EF.Functions.StoreType()
does seem like a simple case that can probably be handled with a simple method call translator (although maybe consider a shorter name such as EF.StoreType()
?). In the more general case I'm not sure exactly what you're asking with respects to granularity... collation can be specified on each single column, or on a comparison expression involving two columns.
It indeed difficult to design an API for such cases. Table hints is also one such case. What if the user wants to apply No Lock
on 1 table and not on the other. Due to difficulty in arriving at design of it, we differed to feedback from users. The No Lock
at query level could/should/would append No Lock
to each table appearing in the query. Order By falls under same umbrella. Especially if user wants to specify nulls first/last differently on different order bys then perhaps a better options would be overload of OrderBy
(just throwing out some ideas)`. Of course how to flow that information is still unknown but certainly cannot be query level annotations.
As you said, 95% cases should be covered by "global" level. We can look into the API and data flow, once 5% comes up with feature request to add "granular" control.
By "granularity", what I mean is, if user needs to pin point the part of query it needs to applied (like particular column or comparison) then query level "hints" (methods) are not suitable for that API. Hence out of the scope of this issue.
If you have ideas for API for things which cannot be specified at "global" level, then please file a new issue with API ideas, we can explore how to flow information to SQL.
Because my English is bad, I prefer to write short texts.
@smitpatel then look at this here:
https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql?view=sql-server-2017
Not everyone has this need, but to make a provider much more powerful and flexible, some of these HINTS should be supported at least:
https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table?view=sql-server-2017#syntax
It's worth mentioning that this is a totally isolated SQL Server thing.
Actually now would be to know what kind of API would be global, since the cases are totally different for SQL Server and PostgreSQL.
Then we would have to have two methods:
I believe that the utility method for the PostgreSQL Provider would be only WithLock, which would translate something like:
SELECT * FROM BAR FOR UPDATE
When I thought about using ENUM I was thinking of supporting all this in one way:
| FORCESCAN
| FORCESEEK
| HOLDLOCK
| NOLOCK
| NOWAIT
| PAGLOCK
| READCOMMITTED
| READCOMMITTEDLOCK
| READPAST
| READUNCOMMITTED
| REPEATABLEREAD
| ROCKLOCK
| SERIALIZABLE
| SNAPSHOT
| TABLOCK
| TABLOCKX
| UPDLOCK
| XLOCK
@ralmsdeveloper - Yes, there can be methods like WithNoLock
& WithLock
but it would be in SqlServer provider. Same way Postgre can add more. Unless we identify hints which are "uniform" across multiple providers like SqlServer, PostgreSQL, SQLite, MySQL, Oracle etc, adding a relational surface may cause more burden on provider writers. (Also there would not be a good default for SQL which we need and use in DefaultQuerySqlGenerator.
@smitpatel & @roji, sorry for my poor communication.
But I'm working hard to communicate better with you.
@smitpatel Anyway, what you said was very clear to me now, that would be the way.
As a suggestion, there could be a parameter in these methods, as I mentioned above. repeatForInclude
for the same annotation if replicate the remaining query tables (INNER JOIN / SUBQUERY)
Looks like SQLite has just one hint: INDEXED BY
And NOT INDEXED
...for all those times you wish the query planner would just stop using all those pesky indexes. 😖⁉
Would this also be supported under this, or is it another issue, or is there a current way to implement it?
WITH CHANGE_TRACKING_CONTEXT (context)
Just to reiterate the importance of something like EF.StoreType(value, storeType)
, we've recently seen lots of issues where there's a need to explicitly specify store types when the same CLR type can be mapped to more than one store type (see bottom of issue for some examples).
I just realized that this is a bit more complicated than a simple method translator, since we need to do it for literals/constants as well. In other words, when I specify a literal in my query, EF Core looks up the mapping for that CLR type and uses that to render a literal SQL representation - but we need to be able to specify another type. This choice should affect the mapping that is selected, rather than add a server-side casting operator.
I'm not sure that this problem (both the parameter version and the literal version) is best tracked by this issue, which is more about flowing annotations to the provider's SQL generator. It's also more urgent than general-purpose query SQL hints. Maybe we should split it off and handle separately?
List of scenarios where this is needed:
macaddr
type and a newer 6- or 8-byte macaddr8
type, both mapped to CLR PhysicalAddress
. PhysicalAddress
is mapped to the older macaddr
by default (backwards compat), and working with the newer one has issues.DateTime
is mapped to PostgreSQL timestamp
but also to date
(since .NET has no date-only type). This again causes some severe issues when trying to work with the non-default mapping (i.e. date
). See https://github.com/npgsql/Npgsql.EntityFrameworkCore.PostgreSQL/issues/493#issuecomment-401086741 (fortunately NodaTime can be used to work around this).@roji Thanks for the input. We're tracking this feature as #4978. I have removed that issue from the backlog so we can discuss priority in triage.
Oh sorry, I forgot that #4978 already existed. Thanks for giving this your attention.
Hi, I'm not sure from the discussion where the current thinking on QUERY/TABLE hints has landed and whether a solution is in mind. IMO, the biggest need for many developers is simply control of locking during queries (i.e. table hints). Just that one feature would be really welcome.
For EF 6, we created a With() extension -- just like Rafael described (except with an enum to avoid injection issues). In our implementation, the With() call has to come after an entity collection so we know how to translate the expression and correctly associate the injected WITH statement to the right FROM or JOIN. We also support accepting an index hint, but the feature never gets used (and it's a potential injection issue because the caller passes the index name as a string).
Anyway, the lack of any way to set a table hint for locking in EF Core is our main blocker to migrating our entire commercial ERP app to .Net Core. So I'm super interested in how this issue progresses. Thanks,
-Erik
@erikj999 great to hear from you! I have a couple of questions:
Hi @divega, great to hear from you, too. Here's what worries me about propagating a lock hint throughout a query. With READ COMMITTED SNAPSHOT mode, which we use and I think is the default for SQL Azure and given the query:
BEGIN TRANSACTION -- Otherwise nothing gets locked
SELECT c.Company, c.CustNum, c.Name
FROM Customer AS [c] WITH(UPDLOCK)
JOIN Company AS [c1] ON (c.Company = c1.Company)
WHERE c1.Country = N'USA'
SQL will hold update intent locks on the Customer table, just as the hint prescribes, but won't hold any locks on the joined Company table. So if a Linq statement includes a lock hint and EF Core propagates that lock hint across all tables in the query, that behavior will be inconsistent with SQL and would cause undesired locking.
IMO, if we only get one table hint to specify in an EF query, then it should only be applied to the FROM table only. It would be better if we could have a table hint attached to any entity in the query, where it would be applied correctly to the FROM or JOIN when the T-SQL is expressed, to make the lock behavior more explicit. Hope that makes sense.
The lock hints we use are:
NOLOCK (does nothing but make us feel good, considering we use Read-Committed Snapshot)
HOLDLOCK
READPAST
UPDLOCK (We use this one a ton)
FORCESEEK
One hint we do not use, but I see used in the wild is NOWAIT which is used in logic with retry constructs.
-Erik
lol @ size~1-week tag!!
I am happy to build a cross-reference of query hints across database engines and versions. Could take some time. Note sure I understand why QueryAnnotations should be used for this. I think if @smitpatel could demo NOLOCK hint with SQL Server, that would be helpful, but there are just so many additional user stories I would like to see capture:
UseCoveringIndex<Entity>()
with the fields I wanted covered. I could then use reflection in an integration test with the generated database to ensure there exists a covering index for those columns.It's going to take >1 week just to review this thread :).
I looked for a query hint matrix online thinking someone must have done this before, but couldn't find much, especially for lock control. And after looking across the database engines with providers in EF Core, I came away thinking it's not really possible to put a useful query hint feature into EF Core at the "Relational" level. Even NOLOCK isn't universally supported.
I think table/query hints should be implemented as provider-specific extensions e.g.:
(from x in db.Things.With(LockHint.UpdLock) select new {x.this, x.that}).MaxDOP(4);
...where the table hint With() and MaxDOP are extensions method in the SqlServer provider namespace. And use either with a non-SQLServer provider would throw a runtime error. I agree it's good to keep away from hints that require string arguments (like index names) because of the potential for injection attacks. Maybe index hints could be expressed in the model, but that might preclude provider-agnostic models.
BTW, I think the code that creates the T-SQL syntax for the SQL Server provider is in the Relational library -- not the SqlServer provider itself. I'm guessing by having it there, it serves as a default syntax generator for provider projects. But if query hints get promoted to the road map and can't be generalized usefully from a relational POV, that code would be likely forked, right?
I looked for a query hint matrix online thinking someone must have done this before, but couldn't find much, especially for lock control. And after looking across the database engines with providers in EF Core, I came away thinking it's not really possible to put a useful query hint feature into EF Core at the "Relational" level. Even NOLOCK isn't universally supported.
The PostgreSQL people seem to think very differently about query tuning than SQL Server engineers. The PostgreSQL people do not hint at the "relational level", but rather believe in tuning
Planner Cost Constants. In effect, their approach is a lot like financial engineering and making a decision based on a "mark-to-market hurdle adjustment". They state a model factor, such as cpu_index_tuple_cost, and bump it up until it disables a decision tree in the Cost-Based Optimizer/Query Planner. For example, setting random_page_cost and the seq_page_cost both to 0 emulates a query where all the table data is already mmap'd (pinned to memory). In this sense, PostgreSQL simply thinks about locking differently than SQL Server:
Here is a link to MySQL's Optimization documentation, including specific links for 8.11.2 Table Locking Issues and 8.9 Controlling the Query Optimizer. MySQL seems fairly similar to SQL Server, especially 8.9.4 Index Hints:
tbl_name [[AS] alias] [index_hint_list] index_hint_list: index_hint [index_hint] ... index_hint: USE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list]) | IGNORE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list) | FORCE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list) index_list: index_name [, index_name] ...
The latest documentation is available here: Oracle Release 18c: Influencing the Optimizer
Does anyone know how far back Oracle releases are supported right now?
Here is a table from Donald Burleson (famous Oracle guy), summarizing the various hints:
Hints are expressed in special comments that start with +.
/*+ <hint_expression> */
and
--+ <hint_expression>
are both valid ways to express a hint.
| Hint | Meaning |
| ----- | --------- |
|ALL_ROWS | Use the cost based approach for best throughput.|
|CHOOSE | Default, if statistics are available will use cost, if not, rule.|
|FIRST_ROWS | Use the cost based approach for best response time.|
|RULE | Use rules based approach; this cancels any other hints specified for this statement.|
| Hint | Meaning |
| ----- | --------- |
|CLUSTER(table) | This tells Oracle to do a cluster scan to access the table.|
|FULL(table) | This tells the optimizer to do a full scan of the specified table.|
|HASH(table) | Tells Oracle to explicitly choose the hash access method for the table.|
|HASH_AJ(table) | Transforms a NOT IN subquery to a hash anti-join.|
|ROWID(table) | Forces a rowid scan of the specified table.|
|INDEX(table [index]) | Forces an index scan of the specified table using the specified index(s). If a list of indexes is specified, the optimizer chooses the one with the lowest cost. If no index is specified then the optimizer chooses the available index for the table with the lowest cost.|
|INDEX_ASC (table [index]) | Same as INDEX only performs an ascending search of the index chosen, this is functionally identical to the INDEX statement.|
|INDEX_DESC(table [index]) | Same as INDEX except performs a descending search. If more than one table is accessed, this is ignored.|
|INDEX_COMBINE(table index) | Combines the bitmapped indexes on the table if the cost shows that to do so would give better performance.|
|INDEX_FFS(table index) | Perform a fast full index scan rather than a table scan.|
|MERGE_AJ (table) | Transforms a NOT IN subquery into a merge anti-join.|
|AND_EQUAL(table index index [index index index]) | This hint causes a merge on several single column indexes. Two must be specified, five can be.|
|NL_AJ | Transforms a NOT IN subquery into a NL anti-join (nested loop).|
|HASH_SJ(t1, t2) | Inserted into the EXISTS subquery; This converts the subquery into a special type of hash join between t1 and t2 that preserves the semantics of the subquery. That is, even if there is more than one matching row in t2 for a row in t1, the row in t1 is returned only once.|
|MERGE_SJ (t1, t2) | Inserted into the EXISTS subquery; This converts the subquery into a special type of merge join between t1 and t2 that preserves the semantics of the subquery. That is, even if there is more than one matching row in t2 for a row in t1, the row in t1 is returned only once.|
|NL_SJ | Inserted into the EXISTS subquery; This converts the subquery into a special type of nested loop join between t1 and t2 that preserves the semantics of the subquery. That is, even if there is more than one matching row in t2 for a row in t1, the row in t1 is returned only once.|
| Hint | Meaning |
| ----- | --------- |
|ORDERED | This hint forces tables to be joined in the order specified. If you know table X has fewer rows, then ordering it first may speed execution in a join.|
|STAR | Forces the largest table to be joined last using a nested loops join on the index.|
|STAR_TRANSFORMATION | Makes the optimizer use the best plan in which a start transformation is used.|
|FACT(table) | When performing a star transformation use the specified table as a fact table.|
|NO_FACT(table) | When performing a star transformation do not use the specified table as a fact table.|
|PUSH_SUBQ | This causes nonmerged subqueries to be evaluated at the earliest possible point in the execution plan.|
|REWRITE(mview) | If possible forces the query to use the specified materialized view, if no materialized view is specified, the system chooses what it calculates is the appropriate view.
|NOREWRITE | Turns off query rewrite for the statement, use it for when data returned must be concurrent and can't come from a materialized view.|
|USE_CONCAT | Forces combined OR conditions and IN processing in the WHERE clause to be transformed into a compound query using the UNION ALL set operator.|
|NO_MERGE (table) | This causes Oracle to join each specified table with another row source without a sort-merge join.|
|NO_EXPAND | Prevents OR and IN processing expansion.
Oracle Hints for Join Operations: |
|USE_HASH (table) | This causes Oracle to join each specified table with another row source with a hash join.|
|USE_NL(table) | This operation forces a nested loop using the specified table as the controlling table.|
|USE_MERGE(table,[table, - ]) | This operation forces a sort-merge-join operation of the specified tables.|
|DRIVING_SITE | The hint forces query execution to be done at a different site than that selected by Oracle. This hint can be used with either rule-based or cost-based optimization.|
|LEADING(table) | The hint causes Oracle to use the specified table as the first table in the join order.
Oracle Hints for Parallel Operations: |
|[NO]APPEND | This specifies that data is to be or not to be appended to the end of a file rather than into existing free space. Use only with INSERT commands.|
|NOPARALLEL (table | This specifies the operation is not to be done in parallel.|
|PARALLEL(table, instances) | This specifies the operation is to be done in parallel.|
|PARALLEL_INDEX | Allows parallelization of a fast full index scan on any index.|
| Hint | Meaning |
| ----- | --------- |
|CACHE | Specifies that the blocks retrieved for the table in the hint are placed at the most recently used end of the LRU list when the table is full table scanned.|
|NOCACHE | Specifies that the blocks retrieved for the table in the hint are placed at the least recently used end of the LRU list when the table is full table scanned.|
|[NO]APPEND | For insert operations will append (or not append) data at the HWM of table.|
|UNNEST | Turns on the UNNEST_SUBQUERY option for statement if UNNEST_SUBQUERY parameter is set to FALSE.|
|NO_UNNEST | Turns off the UNNEST_SUBQUERY option for statement if UNNEST_SUBQUERY parameter is set to TRUE.|
|PUSH_PRED | Pushes the join predicate into the view.|
Adaptive Server Enterprise 12.5.1 > Performance and Tuning: Optimizer and Abstract Plans
This doesn't come from IBM documentation but from a Gold-level partner named Craig S. Mullins. I chose this over the docs because the docs are extremely full of IBM-speak and I don't speak IBM-speak.
lol @ how complex/obtuse this is (ho-lee cow!):
Influencing the DB2 Optimizer: Part 1
Influencing the DB2 Optimizer: Part 2 - Standard Methods
Influencing the DB2 Optimizer: Part 3 - Tweaking SQL Statements
lol @ this, too:
At times, you may need to disable a specific index from being considered by the optimizer. One method of achieving this is to append OR 0 = 1 to the predicate. For example, consider a query against the EMP table on which two indexes exist: one on EMPNO and one on WORKDEPT. Appending OR 0 = 1 (as shown next) to the WORKDEPT predicate will cause DB2 to avoid using an index on WORKDEPT.
SELECT EMPNO, WORKDEPT, EDLEVEL, SALARY FROM EMP WHERE EMPNO BETWEEN ‘000020’ AND ‘000350’ AND (WORKDEPT > ‘A01’ OR 0 = 1);
The
OR 0 = 1
clause does not change the results of the query, but it can change the access path chosen.There are other tweaks that can be used to dissuade the DB2 optimizer from choosing a specific index, such as multiplying or dividing by 1, adding or subtracting 0, or appending an empty string to a character column.
Influencing the DB2 Optimizer: Part 4 - OPTIMIZE FOR n ROWS
Influencing the DB2 Optimizer: Part 5 - Changing DB2 Catalog Statistics
Influencing the DB2 Optimizer: Part 6 - Using Optimization Hints
@erikj999 Given the above, how would you template a "query hint matrix"? You mentioned one category: lock control.
I have created a repo on GitHub to place such a matrix: https://github.com/jzabroski/SqlQueryHintMatrix
Here is the list of known EF Providers: https://docs.microsoft.com/en-us/ef/core/providers/
@jzabroski Well, that DB2 trick to disable an index is probably the only query hint in the bunch you can today execute in EF :). Seriously, that's crazy.
I'm still skeptical a rich set of query hints can be defined at a general level. But I've been wrong before and I'll go through the links you've provided to get a better sense of it.
But what if we started with a simple enumeration for locks -- NoLock, ShareLock, UpdateLock -- which could be applied at an entity level (both FROM and JOIN) or at the query level?
For index hints, we could start with just one extension method UseIndex(object index) which is also applied to the entity. I think the index would need to be codified in the model somehow so we can reference it without having to pass its name as a string (to avoid injection attacks).
I think those two ideas could be safely defined in EntityFrameworkCore.Relational. Providers could ignore hints that don't apply to the specific engine (like NoLock on Oracle) or throw a runtime error if so configured.
From a cursory look query hints seem sufficiently different across the different databases, that it doesn't necessarily make sense to include anything specific in relational. EF Core should definitely provide the infrastructure/plumbing to allow arbitrary hints to be defined and flowed to the provider, but I really see little value in defining a locking enum, which would be irrelevant for some (most?) databases, and not contain the right range (or terminology) of options for others...
There just seems to be too much diversity here, and there's enough infrastructure work needed in any case just to make hints possible (plus some discussions, such as somehow applying hints to individual components of the query rather than globally).
@roji Looking at this I found it very pertinent!
https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion
@ralmsdeveloper to be clear, I'm very much for query hints and think they're relevant to PostgreSQL, but I'm much less sure about a set of"standard" hints which live in EFCore.Relational and are supposedly valid for all/most databases...
@roji
I agree! If we push it into the Relational library, then it becomes impossible to detect at compile-time any hints that will break when changing database targets. It should be possible to statically link to hints, and guarantee that if you include a Hints library for your provider, you get relative safety that those hints are known to the provider (the provider may fail to generate valid SQL, but at least "blatantly invalid" won't happen.)
This also addresses the fact that Oracle puts hints in "hint comments", whereas Microsoft extends the T-SQL grammar. If you move from Oracle to Microsoft via raw SQL, then your code will "just work". However, if you move from Oracle to Microsoft via EFCore, then your code might not work.
One common ground here is to perhaps do the following:
@ralmsdeveloper This is off-topic, but basically, most databases existing in the modern world started off as research projects of a Stanford guy named Michael Stonebraker (look him up). Including PostgreSQL and Ingres - and Ingres co-developers later formed Sybase which eventually sold a co-license of the source code to Microsoft and eventually became Microsoft SQL Server 6.5 when Microsoft bought it out and then forked it.
Most of the concepts in relational databases and their individual philosophies stem from his "family tree" of projects and Ph.D. students. The reason some databases don't have lock hints is because lock hints don't really make a ton of sense to a Cost-Based Optimizer in a database configured for multi-version concurrency control (MVCC). MVCC by its very nature is intended to be a lock-free design, so hinting an UPDLOCK or NOLOCK is silly to a system that doesn't really "know" about locking. PostgreSQL in general does not have the contention issues Microsoft SQL Server has; SQL Server tends to "lock up" with a sneeze, but thankfully has a ton of tools for DBAs to reduce locking at the expense of maintainability. Yet, Microsoft SQL Server has a pretty amazing _Transactional_ DDL that is a gift from the Gods, and the ability to control locking on a per-table level combined with Transactional DDL makes upgrading systems under high loads not so painful.
If you want to learn more and talk further, ping me offline - my first and last name at gmail dot com. I love this academic stuff and am a bit of a computer science historian.
From a cursory look query hints seem sufficiently different across the different databases, that it
doesn't necessarily make sense to include anything specific in relational.
@roji That was my conclusion as well. I the enumeration idea was a thought exercise (and I almost didn't include ShareLock). Obviously, pushing all this to the provider level means using a hint of any sort causes the query to be statically bound to a specific provider. I wasn't sure how palatable that would be to the EF Core community.
I agree! If we push it into the Relational library, then it becomes impossible to detect at compile-time any hints that will break when changing database targets.
I don't think that is an issue and we need add all those behavior (Fail/Log/None etc.). At present, if you specify raw SQL in FromSql
and then change database provider then we don't do anything different. If the SQL does not work in the second database, it fails. While it is possible to use different database providers with EF Core, we don't provide fully database agnostic model. (FromSql for example). Further, hints may have option to ignore/fail but FromSql
got no choice like that, and has to fail. If we already have one blocking behavior then there is no need to work differently for other API. In the essence, if you are using "provider" specific things in your application, then you cannot change provider and expect everything to work.
It's going to take >1 week just to review this thread :).
True. :trollface:
@smitpatel I don't really agree with your strawman argument. That's like saying because the CLR allows unsafe IO then we shouldn't allow safe IO. We have safety when we can have it - and "un-safety" when we can't reasonably expect it.
In the essence, if you are using "provider" specific things in your application, then you cannot change provider and expect everything to work.
I agree here. What I'm suggesting is that if Providers want (and believe Provider writers should be encouraged to write it the way @erikj999 suggested) then they can write it as extension methods and therefore assist people when they want to write portable code.
Look, if you don't do this I'll just have a class with a bunch of static readonly strings that surface such an API. And that's kind of ugly, because then I have to chase down people who wrote a one off string and drill it in people's heads to use the static readonly strings.
There's a broad continuum of ok to great solutions here.
As already quoted by @erikj999 I have also been analyzing and researching something universal for the core "EFCore.Relational", I came to the conclusion that this is practically impossible, hints are very peculiar to each database.
So I believe that each provider must write its own extensions to meet the specific needs of users, in SQL Server I use NOLOCK and UPDLOCK a lot.
I've been coding something for my provider (this may not be the desired end result).
@smitpatel do you have any suggestions on what I did?
(Please do not consider my tests 🙈, I'm working to leave 100% aligned with EF Core)
https://github.com/ralmsdeveloper/EntityFrameworkCore.FirebirdSQL/commit/db46134f8e947f09736abd84e2a4b74ad2ac1249
@roji do you have any regrets too.
@roji you also have more thoughts on this?
When operations are not working for most cases, then there is no point of providing safety for subset of them. It is pit of failure and confuses users more than helping out. Some magic is fine, but a lot of magic and you are left with userbase who cannot figure out what works and what does not.
@smitpatel I think you need specific examples. You're just generalizing.
I gave a specific use case I like, and even explained how I might code around it if there is no static typing:
public static class QueryHints
{
public static string WithNoLock
{
get { return "WITH (NOLOCK)"; }
}
}
Now, see how @ralmsdeveloper implemented it with a method.
I guess your opposition would be some syntax like this:
var foo = (from p in dbContext.Set<Person>().AddAnnotation(QueryHints.WithNoLock)).ToList()
is somehow superior to this:
var foo = (from p in dbContext.Set<Person>().ToFbHint().WithNoLock()).ToList()
and that the "somehow" rationale is that the core API is in Relational.
Note: It is pretty unfair to call this "magic". Let's not romanticize a purely objective measure like type safety and Open-Closed Principle.
On subjective measures, rather than assume what users find confusing, please explain how it could possibly be confusing to have the following user story:
_I want to switch from Firebird to MSSQL in a large code base and have Visual Studio underline in red ink the breaking API changes for hints._
@jzabroski
I want to switch from Firebird to MSSQL in a large code base and have Visual Studio underline
in red ink the breaking API changes for hints.
This is the downside to having table/query hints implemented as extension methods via a provider-specific namespace. Attaching a query hint effectively kills the ability for that query to be portable to other providers. Worse, the only way Visual Studio would know what code to red-underline is if the developer replaces "using ...Firebird.QueryHints" with "using ...MSSQL.QueryHints" or some such.
I still think this is the best approach for EF Core, though. Developers shouldn't be required to use "FromSql" for something so essential and ubiquitous as basic lock management. Having said that, other ORMs haven't solved this problem either.
I'm curious what opinion @rowanmiller, @divega, or anyone else has on this. Given there is no way to abstract query hints from database engines, the only way forward is through provider-specific extension methods. But using such an extension means in a query means execution is limited to a specific provider. Is that tenable for EF Core, which in other respects is provider agnostic?
I guess another way forward is with an extensibility scheme like the EF 6 wrapping provider which at least lets developers build in their own query extensions. But it isn't easy.
@erikj999 just to mention that this is the pattern already used for database-specific customization in the model/migratons: you use a provider-specific extensions method on on the different model builders (e.g. ForSqlServerIsMemoryOptimized()
).
@jzabroski yes, if you switch providers you get red squigglies, but switching databases is a relatively rare and huge change to your program. Think about differences in type mapping, in client/server evaluation changes (i.e. what used to be server-evaluated is now suddenly client-evaluated), and all the other subtle difference between databases. Honestly, a red squigglie (which you can easily remove) is going to be the least of your concerns...
@roji
I thought about all that, and I think I still want as much help as possible. Thinking about differences in type mapping, client/server evaluation, etc. are outside the scope of this issue. User stories are not intended to solve every single problem a developer encounters. User stories can take on many forms, but one of the most beneficial is the User-Action-Benefit triple. The "As a, I want, so that" user story template is pretty common, because it helps people identify with strangers. Identifying with strangers is one of the things great API designers like Joshua Bloch say is the hardest part about designing APIs. I won't drag on about this, though. If you have any further comments let's go offline or use gitter.im to discuss.
@jzabroski it's an interesting and healthy discussion, I don't really see a reason to take it offline...
My point comparing to other areas where provider-specific instructions are implemented as strongly-typed extension methods, was that a product-level decision has already been made: if you want something specific to SQL Server, you use a strongly-typed method from the SQL Server provider assembly. This doesn't mean we can't be flexible, but if we start introducing provider-specific query hints via weakly-typed strings (or some other mechanism) we provide a very inconsistent API experience to users. IMHO one has to look at the larger context of how the product already looks like.
Your argument against this way of doing things is that migrating between databases creates red squigglies. In my opinion that is actually a benefit. It seems quite dangerous for query hints to silently stop having any effect, which is what would happen if they're weakly-typed: your program may indeed depend on behavior set up by some weakly-typed hint, that doesn't exist or is spelled differently in your new provider. It seems like a much better user story to have the compiler guide them through the places where they're using provider-specific code (e.g. hints), encouraging them to think about what the migration means in each context.
Also, since your argument concerns changing databases, it seems very relevant to see other issues that happen as part of that process (type mapping, client/server evaluation...). Since the red squigglies here are very likely to be a small hurdle compared to the others, and since it's easy to deal with - even if tedious - simply because it's a compile-time error (rather than a subtle runtime error), it just doesn't doesn't seem to be worth losing the benefits of strongly-typed hints.
Finally, I'm not sure in which direction your user story description is supposed to guide us :) I spend a lot of my time trying to listen and think about which APIs users want (but also which APIs are good for them despite what they say they want). My opinion is just different from yours...
@roji uh... re-read the thread. I am pro red squiggles. I described my user
story of wanting red squiggles. Smit seemed to be against it.
But, good to know you're on my side.
(At this point, I'm scratching my head wondering if I had a severe typo or
poor explanation on the order of "A panda, eats, shoots, and leaves.")
On Wed, Aug 22, 2018, 5:19 PM Shay Rojansky notifications@github.com
wrote:
@jzabroski https://github.com/jzabroski it's an interesting and healthy
discussion, I don't really see a reason to take it offline...My point comparing to other areas where provider-specific instructions are
implemented as strongly-typed extension methods, was that a product-level
decision has already been made: if you want something specific to SQL
Server, you use a strongly-typed method from the SQL Server provider
assembly. This doesn't mean we can't be flexible, but if we start
introducing provider-specific query hints via weakly-typed strings (or some
other mechanism) we provide a very inconsistent API experience to users.
IMHO one has to look at the larger context of how the product already looks
like.Your argument against this way of doing things is that migrating between
databases creates red squigglies. In my opinion that is actually a benefit.
It seems quite dangerous for query hints to silently stop having any
effect, which is what would happen if they're weakly-typed: your program
may indeed depend on behavior set up by some weakly-typed hint, that
doesn't exist or is spelled differently in your new provider. It seems like
a much better user story to have the compiler guide them through the places
where they're using provider-specific code (e.g. hints), encouraging them
to think about what the migration means in each context.Also, since your argument concerns changing databases, it seems very
relevant to see other issues that happen as part of that process (type
mapping, client/server evaluation...). Since the red squigglies here are
very likely to be a small hurdle compared to the others, and since it's
easy to deal with - even if tedious - simply because it's a compile-time
error (rather than a subtle runtime error), it just doesn't doesn't seem to
be worth losing the benefits of strongly-typed hints.Finally, I'm not sure in which direction your user story description is
supposed to guide us :) I spend a lot of my time trying to listen and think
about which APIs users want (but also which APIs are good for them despite
what they say they want). My opinion is just different from yours...—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/aspnet/EntityFrameworkCore/issues/6717#issuecomment-415183751,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAbT_QXDBZxZgCtA5YysDlGcLctACGYrks5uTcrFgaJpZM4KRSfR
.
@jzabroski apologies, my bad... My head must have gotten turned around in this long thread.
Rereading and summarizing, there seems to be an agreement that query hints should be expressed via strongly-typed extension method calls which come from provider assemblies (@smitpatel seems to agree, https://github.com/aspnet/EntityFrameworkCore/issues/6717#issuecomment-397738849).
I think there's also a consensus that there's no set of standard query hints that would justify any sort of cross-database extension methods living in EFCore.Relational.
The outstanding point of disagreement/misunderstood seems to be your suggestion in https://github.com/aspnet/EntityFrameworkCore/issues/6717#issuecomment-411443194 to have an ExplicitQueryHintInvalidBehavior
enum which controls behavior in case hint is unknown to a provider. I indeed have trouble understanding this:
I'm guessing I might be misunderstanding your suggestion though. Once again, ignoring is what currently happens when a database-specific model extension (e.g. ForSqlServerIsMemoryOptimized()
) is used on a different database. That seems like the right thing to do.
Let me know if I'm somehow mis-summarized things.
Oh, wow, you're right. I see the confusion is all my fault., and completely tied to my enum idea.
Here is my thought process on this enum:
1) Above, I listed query hints for many different providers, and concluded that for Oracle, since their hints are comment-based, if EF had an AnnotationTranslators namespace similar to Microsoft.EntityFrameworkCore.Query.ExpressionTranslators, then that could be used to provide a generic core for EF to translate expressions. Then the concrete EF Query Provider could pattern match on the annotation to see if the contents were a hint or not. However, if you're migrating from Oracle to MSSQL, an MSSQL provider might see an annotation as /*+ as the start of a basic comment and just inject comment text into the query. However, if instead of just AnnotationTranslators there was HintTranslators, MSSQL provider could omit a warning or throw an exception when its pattern matcher detects it couldn't recognize the hint.
2) Besides just Oracle->MSSQL, I was thinking of cases like "SQL Server <2016"->"SQL Server >2016 SP1", where 2016 SP1 added the new USE HINT construct. My user story is as follows, and based on my past life working for erecruit:
As an ERP developer at Well-Tempered Software Company who has customers who self-host my software, and are on different versions of SQL Server, I want to maintain a single source line of code and still be able to hotfix urgent issues for customers whose databases don't understand newer hints. And user story 2: I want my customers to have painless upgrades.
To be clear, I personally can fulfill this user story through erecuit.Expr and calling Expand() and having Expand dig out query annotations at run-time that don't apply to the customer's database. The EF Query Provider would never even know I performed magic. However, I think this is something worth considering making internal to the code. Even last night I was watching Philip Carter's "Why you should use F#", and he talked about the Microsoft Notes and Tasks team needing to write code that solves syncing data across Exchange 2008, Outlook Mobile, etc. through current o365 releases. Which reminded me: The world is full of Speedy developers and a few Mr. Rogers.
Above, I listed query hints for many different providers, and concluded that for Oracle, since their hints are comment-based, if EF had an AnnotationTranslators namespace similar to Microsoft.EntityFrameworkCore.Query.ExpressionTranslators, then that could be used to provide a generic core for EF to translate expressions. Then the concrete EF Query Provider could pattern match on the annotation to see if the contents were a hint or not. However, if you're migrating from Oracle to MSSQL, an MSSQL provider might see an annotation as /*+ as the start of a basic comment and just inject comment text into the query. However, if instead of just AnnotationTranslators there was HintTranslators, MSSQL provider could omit a warning or throw an exception when its pattern matcher detects it couldn't recognize the hint.
Unless I'm mistaken, I think we're in agreement that however we represent hints in EF Core, they cannot simply be strings that the user specifies and the provider tacks on in some place in the SQL (this would create SQL injection vulnerabilities etc.). In other words, a hint annotation is something "abstract" (even if it's provider-specific), which the provider picks up and then renders into SQL in some way (which is irrelevant - comment, syntax extension, whatever). So I don't think anyone is suggesting a model where MSSQL sees some annotation from Oracle which then gets somehow injected as a comment into MSSQL T-SQL and silently ignored... Just like provider-specific model extensions, hints would be (structured) annotations which get interpreted and rendered into SQL, rather than SQL fragments that gets integrated somewhere in your SQL query.
However, if instead of just AnnotationTranslators there was HintTranslators, MSSQL provider could omit a warning or throw an exception when its pattern matcher detects it couldn't recognize the hint.
I think you're still disregarding the multiple database scenario. It's completely fine to use both Oracle and MSSQL in the same application, meaning hints from both databases may potentially exist on the same query. I think this is something we want to enable/allow, so we don't want providers to warn or throw when they see unrecognized hint annotations. We just want them to be ignored.
To be clear, you still get red squigglies when migrating from one database to another, because you've dropped the reference to the old provider's assembly, and the strongly-typed extension methods which produce the annotations no longer exist. The database migrations scenario is very different from the scenario where multiple databases are supported at the same time.
Hope this is clear, maybe we're still just agreeing with one another...
I don't think you addressed MSSQL2017 -> MSSQL2016 hotfix path.
I would like to ask about the possibility of at least providing a specific mechanism for UPDLOCK. I'm not sure how that would translate to DB engines other than SQL Server and would probably not make sense (?) for non-relational db types but it would be very helpful. Syntactically it could be used something like ctx.People.ForUpdate().Where(...) and ctx.People.ForUpdate().FirstOrDefaultAsync(...).
In serious scenarios the usage of transactions is critical and in many cases this translates to either SNAPSHOT or SERIALIZABLE.
A typical scenario is reading an object, updating it in memory based on business rules, and saving the changes. With SERIALIZABLE, if two transactions execute concurrently, the read will succeed for both, but when one tries to execute Save Changes the transactions would deadlock. Of course, that is a transient condition which could be retried but if we could explicitly indicate our intention to update the object then, rather than deadlocking, one of the concurrent transactions would wait its turn to read the object. I find this preferable.
Does this make sense? Should I post this as a separate issue?
Thank you!
@apacurariu this seems like exactly one of the scenarios enabled by query hints. This issue is currently in the backlog, so it will be considered for a future release.
@roji Thank you for getting back! It's great news that this will be considered.
````
public void UpdateThing(Thing clientThing) {
var foundThing = context.things
.Where(thing =>thing.id == clientThing.id
&& thing.lastUpdateCounter == clientThing.lastUpdateCounter)
.With(ExclusiveLock) //block the 2nd concurrent updater until the first completes
.Singleordefault();
if (foundThing ==null) {
throw new exception("sorry your thing has been updated since you last read it.");
}
foundThing .lastUpdateCounter++;
foundThing .Name = clientThing.Name;
context.Things.Update(foundThing );
context.SaveChanges();
}
````
I believe could be handled with a Concurrency Token, since EF will supplement the UPDATE with a WHERE condition on the value of the column attributed as Concurrency token, expect it only compares with the values just read, it only works for race conditions, it doesn't handle stale updates. ie: https://docs.microsoft.com/en-us/ef/core/saving/concurrency
Placing an intent lock allows solving both Stale Updates and Concurrent Race Conditions, imho.
@mikerains Update Locks are used to avoid deadlocks. The point of an Update Lock is literally to prevent a shared lock.
Concurrency tokens by comparison do not necessarily stop an update lock from being needed, such as bulk updates to a table with concurrency tokens might need to use an update lock to prevent deadlocks. In this sense, you don't normally see UPDATE LOCK requested as a feature in a SQL Server ORM, because normally it's code written by a SQL Database Engineer, not a C# engineer. If the code is sensitive enough to require an UPDATE LOCK, you probably don't want to leave the SQL to execute to depend on a c# code generator like an ORM.
They're two separate use cases, although it doesn't stop developers from using one when the other should be used.
Yes, I think I meant an Exclusive lock. In any event, a "With Hint" capability would allow control over the locking behavior and implement strategies such as Stale Update prevention in a way that doesn't suffer from Concurrency Updates.
I am going to edit my post to reflect an Exclusive Lock, as my intent was that the attempt to read would be blocked while another thread is updating. I was able to place Intent locks using EF6 some years ago to achieve this.
I read lot of posts here, and the basic problem is "to try adding something specific to providers in an universal way".
The first hasty answer is it is obviously not possible. An Enum never will be portable between providers. What comes in mind is Http's StatusCodes. Its a big list of const integers, because the list can be bigger or even customized (you can return a custom StatusCode if you want it OR you don't know what specific (or custom?) StatusCode comes from a request).
The question is, if you added something provider specific in YOUR CODE, you don't expect it will work when changing providers. The real question is WHEN the code will break. I imagine the 3 basic ways to code break is:
I really don't know the correct answer to this, but I can suggest EFCore developers must build some kind of API to help "Provider Designers" in Compile Time. The "Provider Designers" then ensure in Run Time everything is good and safe (by example, if the HINT constants exist from some list, if they are supported, etc). The "Provider Designers" will be obviously responsible to provide these constants someway, and OBVIOUSLY, if the user changes providers the code will not compile or run, and this is not EFCore developer's fault or provider designer's fault.
Another point to think about the specifics of Hints is some hints cannot be used together (like UPDLOCK and SERIALIZABLE in Sql Server). This logic can be handled only by Provider Developers. I cannot think an easy or simple way to EFCore devs handle this.
PS:
One interesting way to handle this, using inheritance:
Using SQL Server Query Hints with Entity Framework
The EFCore can provide "HintBase", "QueryHint", "TableHint", etc, base classes, and provider devs can then inherit and create specialized classes.
@GitClickOk I can follow your train of thought but this is not how computer scientists solve such problems as:
the basic problem is "to try adding something specific to providers in an universal way".
The concept you are looking for is expression problem. I will rephrase the basic problem ever so slightly so you can see that it matches the definition on Wikipedia:
The problem is "to be able to independently add new providers (data types), and new operations (query hints) in a universal way".
In this sense, most computer scientists would spit out their morning coffee if they saw the solution was to just keep a list of integers in some StatusCodes class. The reason is simple: Doing it this way requires updating (both code editing and recompiling) and redeploying the same assembly/nuget package. In computer science land, that's too easy and also too error prone. As you are aware, it requires making sure everyone coordinates and doesn't use the other programmer's integer codes. This "sort of works" for international standards like HTTP StatusCodes defined by an RFC, as well as your car's ODB status codes, but it doesn't really work for things where people have different release timelines and need to extend functionality at the very moment they need to extend it without having to get approval from some vendor to add their solution to a sacred list of integer codes.
I skimmed through your example article, and the basic problem is that it is using DbCommandInterceptor, which will intercept every query. This is not great for a number of reasons, the most basic being the caller needs to know their DbContext configuration in addition to their local requirements.
@jzabroski I understand you, sorry if I expressed myself bad. I'm not suggesting "keep a list of integers", I was just comparing this list with a list of strings like "NOLOCK" or "FOR UPDATE". I ever said "I really don't know the correct answer to this", I'm just adding topics for brainstorming.
About the article, I think a nice idea the "architecture" used with some base classes and the providers can then inherits to add new "Hint" classes. It does not need specifically use DbCommandInterceptor because we are talking with EFCore devs, they can do the way they want, the article writer is using what is available, but the EFCore devs can use the idea of base classes, inheritance (or even interfaces) and allow Provider devs to extend as will.
Another unrelated idea (I thought while I answered you) can be the way as .Net deals with SqlDbType and DbType. You can read it here. We have DbType (generic) and SqlDbType (SqlServer specific). Each provider can add custom DbTypes.
I'm sorry if I'm not being specific, as I said before, I don't know the solution, I'm just "brainstorming", listing ideas in order to maybe help to develop some solution.
Smit Patel has already explained what the EFCore team seems as the API design challenges: They view EFCore querying as two sets of APIs: a relational-level API that contains basic relational algebra concepts, and provider-level API that contains provider-specific extensions. Smit even covers a third API layer implicitly, which is, _how do you put query hints on things that are included implicitly?_ Quoting him:
Further API needs to consider what would happen for the tables which are indirectly brought in through auto-include of owned entities or nav expansion.
The other practical reason nothing is done here is that there is a clear workaround for this problem: Move logic into stored procedures, or encapsulate tables with views that apply hints internally. There are no workarounds for bugs in the product, however, so the EFCore team needs to prioritize those as well as getting people to move to .NET Core (which means porting EF6 to .NET Core rather than tackle fixing bugs and adding new slick features like this one).
Hope this helps.
@rowanmiller as #12263 said, it's critical in a high-performance environment, but it's been more than one year, there's still no progress. could you speed things up and add this to milestone 3.0?
Is there any consideration for option(optimize for (@p0 = 1))
and hints like it? I have some queries that are suffering from parameter sniffing. It'd be nice to turn it off selectively where it hurts rather than turn it off at the DB level
@mburbea The solution for now is to create a view and map an entity to the view, or move the generated query into a stored procedure after you capture it in profiler, and apply your hint inside the wrapper.
A view doesn't help me here as you can't stick an option clause on a view statement. I can switch to a store procedure route, but that's kind of ugly and removes the whole point of using EF.
I can go down the road of using a diagnostic listener and intercept it that way, but I'd prefer to avoid that mess. It'd be nice to end a query with a .Hint("option optimize for (@p0=1)")
.
@smitpatel do you have thoughts about this?
If so, I would like to work on it.
If you have nothing, I can play a few bits to restart this thread!
cc/ @divega
_Edited because of the lousy English hehehe!_
@ralmsdeveloper as a team, we are really focused on finishing EF Core 3.0. AFAIR, the general design was settled a long time ago and captured in a few of the 69 comments above, but at this point, even if you sent a perfect PR that addresses all the concerns, we would probably have to ask you to wait until we have a chance to review it, most likely in the next release after 3.0. Even trying to separate the noise from valuable insights to summarize what the design should be, has become such a challenge that I have to tell you it is out of scope for this release. Sorry.
@divega
We know the commitment that the team has, let's wait for more about it, if there is time left I will create an extension for version 3.0 so those who need this functionality could use it.
@divega Could I try to create a new, clean issue that tries to summarize some of the talking points?
I've started extracting some of my comments into sub-pages here: https://github.com/jzabroski/SqlQueryHintMatrix - my thought would be to continue to improve this repository, have your team review it, then create a new issue from my refactored README.md
I think Smit's thoughts are valuable, especially things like many-to-many tables and future features like not requiring many-to-many intermediary CLR types and how someone would query hint such things.
@jzabroski thanks for collecting that information. It doesn’t help now, but i may help when we revisit this issue in the future.
Hi guys, I see this has been opened for a while. I just raised an issue with ForceIndex for MySQL but I can see this is a more generic thing. Is there a way I can help?
Removing milestone to discuss in triage.
@neoaisac Thanks for your interest. There's a lot of discussion here which I am hoping @divega can distill down into some key points. However, it could be that this ends up being quite complicated to implement and it may end up being something that, realistically, needs to be done by the team.
@neoaisac thanks for your interest. As I mentioned in https://github.com/aspnet/EntityFrameworkCore/issues/6717#issuecomment-491473311,
...as a team, we are really focused on finishing EF Core 3.0. AFAIR, the general design was settled a long time ago and captured in a few of the 69 comments above, but at this point, even if you sent a perfect PR that addresses all the concerns, we would probably have to ask you to wait until we have a chance to review it, most likely in the next release after 3.0. Even trying to separate the noise from valuable insights to summarize what the design should be, has become such a challenge that I have to tell you it is out of scope for this release. Sorry.
I understand there has been interest from multiple people in this issue, so I am going to try to describe some of the dimensions we have identified in this discussion that I believe will be relevant for a future design, even if we don't intend to take this for now.
We came to the conclusion some time ago that the feature would be divided in two layers:
A set of high-level and (most likely) strongly-typed APIs that would allow representing the hints that can be associated with queries when targeting a specific type of database. These APIs would most likely be defined as extension methods by providers.
A set of low-level and (most likely) loosely-typed support APIs defined in relational and/or core that would provide a unified mechanism for hints to flow from the high-level APIs to the bottom of the query pipeline, where providers can then process them and generate the corresponding SQL.
I think there is consensus that we don't need to try to identify common hints that could apply across multiple providers. The common code (e.g. the code in relational or core) only provides the _transport_ for the hints and each provider redefines the full set of hints it accepts.
Open issue - Naming: should these extension methods include a prefix in the name to make sure there are no collisions when multiple providers are in scope? E.g. ForSqlServerWithNoLock()
We also came to the conclusion that we should not conflate what are actually separate features. Hints at the whole query level, per subquery or specific to tables may require different treatment both on the user facing high level API and potentially on the low level implementation. Different databases also support different hints at different labels, although some are common.
The API for query hints results in hints that apply to the whole SQL query or to specific subqueries, hence regular extension methods on IQueryable
Table hints need to be constrained to a specific table participating in a query. When multiple tables participate from the same query, they can carry different hints. Even when the same table participates more than once in the same query (e.g. a self-join) it can carry different hints on each appearance. In general this can translate to having methods available on DbSet but also simple sugar methods that are easier to call when just one hint applies: In scenarios in which entities and tables don't map 1:1 (e.g. table splitting, entity splitting, TPT, and TPC), there might be some misalignment, and we need to come up with ways to handle that acceptably. For example:var pendingTransactions = context.Transactions
.WithTableHints(hints => hints.NoLock().ForceScan())
.Where(t => t.State == TransactionState.Pending && t.UserId == id);
var pendingTransactions = context.Transactions
.WithNoLock()
.Where(t => t.State == TransactionState.Pending && t.UserId == id);
query table hint but accepts the name of the table that the hint is associated with as an argument.
We could allow default table hints to be indicated in the model. In fact, hints can apply not just to SELECT queries but also to UPDATE, INSERT, DELETE operations, so having APIs that enable granular control and affect CUD SQL generation could enable new scenarios.
This happens to be true for query hints as well (i.e. they can apply to UPDATE, INSERT, and DELETE), but it is harder to think how this could be represented.
Low level API in relational or core that allows annotating a query with hints. We already other have things that look a lot like this, like query tags so we may be able to reuse infrastructure. As mentioned above, we could have one mechanism that covers both query hints and table hints, as long as for the latter it is possible to provide the specific table the hint applies to as an argument.
Most used query and table hints for some provider.
Table hints in the model.
@divega one small note... when talking about table vs. query hints, it may be worth thinking about query hints which apply to specific SELECT statements. A query may involve multiple subqueries, and there may be scenarios where specifying different hints for these may be desirable. I'm not sure whether there's enough real-world scenarios to justify this - just mentioning it.
@roji good point. We will need to survey how multiple database engines work to arrive to a conclusion on whether subqueries can carry they own hints. Do you have any cases in mind?
@divega This is an excellent write-up. One point:
In scenarios in which entities and tables don't map 1:1 (e.g. table splitting, entity splitting, TPT, and TPC), there might be some misalignment, and we need to come up with ways to handle that acceptably. For example:
[...]
- For cases that require complete control, we can provide an API that looks like a query hint (e.g. can appear anywhere in the query) but accepts the name of the table that the hint is associated with as an argument. In fact, it may very well be the case that this is the basic building block that we provide in the relational layer.
While this seems like the correct catch-all, I think one edge case @smitpatel mentioned is important:
My main observation is that many-to-many is still a work in progress, and I don't want hints to create technical debt that slows progress on many-to-many: I can workaround lack of table hints, I cannot easily workaround exposing many-to-many tables as classes in my domain model.
@divega I guess null ordering (see https://github.com/aspnet/EntityFrameworkCore/issues/6717#issuecomment-390110534) could be an example, where you want one null ordering in a subquery and another in the main. A possibly weaker one is could be SELECT ... FOR UPDATE ...
, which locks the selected rows for the duration of the transaction. In theory locking behavior is per SELECT per table (although you can leave out the table to lock rows coming from all tables), but in practice I'm not sure it really matters to have one subquery lock and another not to lock... It might be possible to imagine such a case if I think about it more.
One issue with all this is that in general, subqueries aren't really expressed in the LINQ query - they are formed internally by the query pipeline in order to translate certain operations. So it's not clear how someone would express anything on a subquery... Perhaps a hint method call that holds for all previous LINQ operator calls, up until the last method call.
I had started messing around with this, but the cover was exclusiem to a tableexclusively on a table, I liked the scenario that @divega pointed out about DbSet
@roji All of this just reinforces that we are nowhere near a final design :smile:
Anyway, I think in many cases subqueries are actually present in the LINQ version of the query as well.
Based on what you describe, I am inclined to support query hints anywhere in the query, just like regular LINQ operators. The provider may decide where to generate it. E.g. the Npgsql provider can decide to generate hints for the closest subquery boundary, while the SQL Server provider may always lift the query hint all the way to the whole query to generate OPTIONS correctly.
@jzabroski good point about hints for objects that are hidden like the many-to-many link table. We may need also leverage a low-level API for this.
Interestingly, I was reading that SQL Server (and I believe also Oracle) support specifying some table hints _globally_ (e.g. as query hints) by providing the identity of the table via its _exposed_object_name_ (essentially the alias used in the query to refer to that table). Not sure it will help though. It seems a design that includes these might already be too far down the path of diminishing returns.
Having an option to tell EFCore which transactions scope is requested, would be very useful.
@sunsnow-tail can you confirm that you mean transaction isolation (e.g. serializable, repeatable read), and not the System.Transactions.TransactionScope class? If so, that is not part of the query itself, but rather set when the transaction gets created. See the docs for controlling the transaction.
@roji transaction isolation it is.
@sunsnow-tail thanks, so yeah, as shown in the docs above, you can pass your desired isolation level to BeginTransaction, e.g.:
c#
using (var transaction = context.Database.BeginTransaction(IsolationLevel.Serializable))
{
// ...
transaction.Commit();
}
@roji EFCore 1.1.2.0 has no overloads for IDbContextTransaction BeginTransaction()
EFCore 2.2.6.0 is the same
The overload that accepts an isolation level is RelationalDatabaseFacadeExtensions.BeginTransaction.
@roji Thank you.
Install-Package : Could not install package 'Microsoft.EntityFrameworkCore.Relational 3.0.0'. You are trying to install this package into a project that targets '.NETFramework,Version=v4.8', but the package does not contain any assembly references or content
files that are compatible with that framework. For more information, contact the package author.
Do I try install the right package?
If not: what is the right one?
@sunsnow-tail please don't continue the conversation on this issue (it's unrelated) - open a new issue if you have further issues.
EF Core 3.0 dropped support for .NET Framework (see breaking change note).
please add to milestore 5
I fail to see why this would be part of the transaction mechanics... :(
Just wanted to voice my opinion that something like @divega posted above would be extremely helpful for me. Right now in my current assignment I either have to wrap everything in a transaction or set at the DB level and with the proposed API there it seems like it would be a LOT more natural using LINQ.
Any workaround until major updates?
I´m using ef 3.1, with some SQL background tasks that frequently update my tables, and I need ef queries to not being locked by that.
The database is configured with snapshot.
I appreciate any help or tip.
Thanks
@lixaotec Consider using an IDbCommandInterceptor.
Will give a look @ajcvickers , thank you!
Is there any chance this is going to make it into EF Core 5 so we will be able to start adding with (nolock) on our queries?
@wklingler EF Core 5.0 is shipped. The next release will be EF Core 6.0 in November 2021. We are currently going through the planning process for 6.0 and will consider this item. However, keep in mind that there are many other high priority features with which it will be competing for resources. Make sure to add your vote (👍) above.
That's too bad, thanks for such a quick response!
Voting for the feature having ran into https://github.com/dotnet/efcore/issues/23580
For now the only way is to create views on tables with nolock.
Most helpful comment
please add to milestore 5