Am I correctly deducing from SqlNullabilityProcessor, that a SQL function with arguments, that will always return NULL as long as at least one argument is NULL (a typical implementation for a SQL function), but that can also return NULL even if all arguments are NOT NULL, _must_ have the argumentsPropagateNullability array set to false for _all_ arguments?
With argumentsPropagateNullability set to to true:
WHERE SOMEFUNC('SomeConstantNonNullValue') IS NOT NULL
/* will simply be optimized to: */
WHERE TRUE
With argumentsPropagateNullability set to to false:
WHERE SOMEFUNC('SomeConstantNonNullValue') IS NOT NULL
/* will not be optimized: */
WHERE SOMEFUNC('SomeConstantNonNullValue') IS NOT NULL
So argumentsPropagateNullability doesn't just let EF Core know, that a function will return NULL, once that propagating argument IS NULL, but will also tell it the opposite, that a function will never return NULL, if all null propagating arguments are indeed NOT NULL.
This would also mean, that a function with 19 not-null propagating arguments and 1 null propagating argument, cannot return NULL, once that 1 null propagating argument is in fact NOT NULL.
And as initially stated, it also means that a function that can return NULL, even if all its arguments are NOT NULL, _must_ have argumentsPropagateNullability set to false for _all_ arguments.
Is that correct?
Am I correctly deducing from SqlNullabilityProcessor, that a SQL function with arguments, that will always return NULL as long as at least one argument is NULL (a typical implementation for a SQL function), but that can also return NULL even if all arguments are NOT NULL, must have the argumentsPropagateNullability array set to false for all arguments?
From my understanding, no - argumentsPropagateNullability only tells EF which argument propagates NULLs, i.e. arguments which, when null, would cause the entire function to be null. There's the separate flag for the nullability of the function as a whole, which expresses whether that function ever returns null. So argumentsPropagateNullability should only be false for arguments which, even when NULL, do not cause the function to return false (this is really rare in the SQL world).
Are you seeing other behavior? If so, can you post an exact function configuration plus the code you're using to test for the outcome?
From my understanding, no - argumentsPropagateNullability only tells EF which argument propagates NULLs, i.e. arguments which, when null, would cause the entire function to be null.
That is how I thought it works as well until about an hour ago.
Are you seeing other behavior? If so, can you post an exact function configuration plus the code you're using to test for the outcome?
Yes, then I believe there is a bug in the SqlNullabilityProcessor.ProcessNullNotNull() implementation (together with the SimplifyLogicalSqlBinaryExpression() call).
Our translation code is the following:
_(From https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/blob/b31b2acd4225cb927600282dff7f97d4eacddb79/src/EFCore.MySql/Query/ExpressionTranslators/Internal/MySqlJsonDbFunctionsTranslator.cs#L94-L106)_
```c#
nameof(MySqlJsonDbFunctionsExtensions.JsonSearchAny)
=> _sqlExpressionFactory.IsNotNull(
_sqlExpressionFactory.Function(
"JSON_SEARCH",
Array.Empty
.Append(json(args[0]))
.Append(_sqlExpressionFactory.Constant("one"))
.Append(args[1])
.AppendIfTrue(args.Length >= 3, () => args.Length >= 4
? args[3]
: _sqlExpressionFactory.Constant(null, RelationalTypeMapping.NullMapping))
.AppendIfTrue(args.Length >= 3, () => args[2]),
typeof(bool))),
We have got the following test:
_(From https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/blob/b31b2acd4225cb927600282dff7f97d4eacddb79/test/EFCore.MySql.FunctionalTests/Query/JsonMicrosoftDomQueryTest.cs#L405-L417)_
```c#
[Fact]
public void JsonSearchAny()
{
using var ctx = CreateContext();
var x = ctx.JsonEntities.Single(e => EF.Functions.JsonSearchAny(e.CustomerElement, "J%", "$.Name"));
Assert.Equal("Joe", x.CustomerElement.GetProperty("Name").GetString());
AssertSql(
@"SELECT `j`.`Id`, `j`.`CustomerDocument`, `j`.`CustomerElement`
FROM `JsonEntities` AS `j`
WHERE JSON_SEARCH(`j`.`CustomerElement`, 'one', 'J%', NULL, '$.Name') IS NOT NULL
LIMIT 2");
}
Instead of what we check for, it just generates the following SQL (and also returns multiple items):
SELECT `j`.`Id`, `j`.`CustomerDocument`, `j`.`CustomerElement`
FROM `JsonEntities` AS `j`
LIMIT 2
The WHERE clause is removed in SqlNullabilityProcessor, because all 3 null-propagating arguments are NOT NULL, resulting in 3 true values that are being aggregated/optimized to a single true value:
Logic implemented is somewhat faulty.
There are 2 flags
SqlFunctionExpression.IsNullable which tells if the function can take null value or not. It would optimize Function IS (not) NULL calls if set to false.
If function can be nullable, then we need to look at if Arguments are configured to propagate nulls (could use better name). The values of which implies that function is null is eqv. of those arguments is null checks. Which allows skipping function call in SQL.
Opposite of that - Function can take null value when arguments are non-null, requires passing propagates nullability to be fully false (or may be additional flag). So above optimization should only run when there is at least one argument which propagates nullability otherwise we don't have a way to decompose.
Here are two simple more commonly executable tests that should work, but don't:
```c#
[ConditionalTheory]
[MemberData(nameof(IsAsyncData))]
public virtual async Task Where_nullable_function_with_null_propagating_argument_is_not_null(bool async)
{
await AssertQuery(
async,
ss => ss.Set
entryCount: 92);
AssertSql(
@"SELECT [c].[CustomerID], [c].[Address], [c].[City], [c].[CompanyName], [c].[ContactName], [c].[ContactTitle], [c].[Country], [c].[Fax], [c].[Phone], [c].[PostalCode], [c].[Region]
FROM [Customers] AS [c]
WHERE LOWER([c].[CustomerID]) IS NOT NULL");
}
[ConditionalTheory]
[MemberData(nameof(IsAsyncData))]
public virtual async Task Where_nullable_function_with_null_propagating_argument_is_null(bool async)
{
await AssertQuery(
async,
ss => ss.Set
entryCount: 0);
AssertSql(
@"SELECT [c].[CustomerID], [c].[Address], [c].[City], [c].[CompanyName], [c].[ContactName], [c].[ContactTitle], [c].[Country], [c].[Fax], [c].[Phone], [c].[PostalCode], [c].[Region]
FROM [Customers] AS [c]
WHERE LOWER([c].[CustomerID]) IS NULL");
}
```
@lauxjpn - How are those tests are failing? If "LOWER" is set with correct argument propagates null then LOWER shouldn't be in AssertSql as it would go to argument null check.
The way it is currently implemented, "propagates nullability" is supposed to simplify/help with the most common case - function that is null if and only if any of the subset of it's arguments is null, i.e. f(a, b) == null <=> a == null || b == null For more complicated cases (function can be null despite all arguments being not null, nullability depending on more complicated interactions between arguments, say, f(a, b) == null <=> a == null && b == null, etc) all values for argumentsPropagateNullability (+instance) should be set to false, so that we skip the optimization.
@smitpatel Currently, LOWER is translated like the following for SQL Server:
c#
if (_toLowerMethodInfo.Equals(method)
|| _toUpperMethodInfo.Equals(method))
{
return _sqlExpressionFactory.Function(
_toLowerMethodInfo.Equals(method) ? "LOWER" : "UPPER",
new[] { instance },
nullable: true,
argumentsPropagateNullability: new[] { true },
method.ReturnType,
instance.TypeMapping);
}
So this SQL function is able to return NULL _for whatever reason_ and will definitely return NULL if its only argument actually IS NULL.
However, the "_for whatever reason_" part is important. While we both personally know, that LOWER _should_ only return NULL in case its only parameter is NULL, this does not mean that EF Core can make this connection. EF Core can currently only know, that LOWER() can return NULL, but it cannot know for sure all cases when it does.
Lets assume there is a character, that can only be written uppercase. A server could implement the LOWER function in a way, that it returns NULL when it encounters this character.
So there can be more (internal implementation) reasons for a function to return NULL (when nullable: true) than just the arguments. Therefore, as long as there is no flag explicitly specifying, that _only_ the arguments matter to decide, whether the function will return null or not, _all_ nullable functions need to be handled in a way that assumes they can return NULL at any time for any reason.
The way it is currently implemented, "propagates nullability" is supposed to simplify/help with the most common case - function that is null if and only if any of the subset of it's arguments is null, i.e.
f(a, b) == null <=> a == null || b == nullFor more complicated cases (function can be null despite all arguments being not null, nullability depending on more complicated interactions between arguments, say,f(a, b) == null <=> a == null && b == null, etc) all values for argumentsPropagateNullability (+instance) should be set to false, so that we skip the optimization.
@maumar So that would mean, that my initial assessment is correct then:
So
argumentsPropagateNullabilitydoesn't just let EF Core know, that a function will returnNULL, once that propagating argumentIS NULL, but will also tell it the opposite, that a function will never returnNULL, if all null propagating arguments are indeedNOT NULL.This would also mean, that a function with 19 not-null propagating arguments and 1 null propagating argument, cannot return
NULL, once that 1 null propagating argument is in factNOT NULL.And as initially stated, it also means that a function that can return
NULL, even if all its arguments areNOT NULL, _must_ haveargumentsPropagateNullabilityset tofalsefor _all_ arguments.
In other words, if a function can return NULL for other reasons than one if its arguments being NULL, _all_ arguments have to be marked as not-propagating-null by setting all argumentsPropagateNullability elements to false.
@lauxjpn correct. My understanding was that vast majority of nullable functions (at least on sql server) behave this way.
We will implement the following methods for Pomelo to more explicitly support the correct usage:
``c#
/// <summary>
/// Use for any function that could returnNULLfor *any* reason.
/// </summary>
/// <param name="name">The SQL name of the function.</param>
/// <param name="arguments">The arguments of the function.</param>
/// <param name="returnType">The CLR return type of the function.</param>
/// <param name="onlyNullWhenAnyNullPropagatingArgumentIsNull">
/// Set tofalseif the function can returnNULLeven if all of the arguments are notNULL`. This will disable null-related
/// optimizations by EF Core.
///
///
///
public virtual SqlFunctionExpression NullableFunction(
string name,
IEnumerable
Type returnType,
bool onlyNullWhenAnyNullPropagatingArgumentIsNull)
=> NullableFunction(name, arguments, returnType, null, onlyNullWhenAnyNullPropagatingArgumentIsNull);
/// <summary>
/// Use for any function that could return `NULL` for *any* reason.
/// </summary>
/// <param name="name">The SQL name of the function.</param>
/// <param name="arguments">The arguments of the function.</param>
/// <param name="returnType">The CLR return type of the function.</param>
/// <param name="typeMapping">The optional type mapping of the function.</param>
/// <param name="onlyNullWhenAnyNullPropagatingArgumentIsNull">
/// Set to `false` if the function can return `NULL` even if all of the arguments are not `NULL`. This will disable null-related
/// optimizations by EF Core.
/// </param>
/// <param name="argumentsPropagateNullability">
/// The optional nullability array of the function.
/// If omited and <paramref name="onlyNullWhenAnyNullPropagatingArgumentIsNull"/> is
/// `true` (the default), all parameters will propagate nullability (meaning if any parameter is `NULL`, the function will
/// automatically return `NULL` as well).
/// If <paramref name="onlyNullWhenAnyNullPropagatingArgumentIsNull"/> is explicitly set to `false`, the
/// null propagating capabilities of the arguments don't matter at all anymore, because the function will never be optimized by
/// EF Core in the first place.
/// </param>
/// <remarks>See https://github.com/dotnet/efcore/issues/23042</remarks>
/// <returns>The function expression.</returns>
public virtual SqlFunctionExpression NullableFunction(
string name,
IEnumerable<SqlExpression> arguments,
Type returnType,
RelationalTypeMapping typeMapping = null,
bool onlyNullWhenAnyNullPropagatingArgumentIsNull = true,
IEnumerable<bool> argumentsPropagateNullability = null)
{
Check.NotEmpty(name, nameof(name));
Check.NotNull(arguments, nameof(arguments));
Check.NotNull(returnType, nameof(returnType));
var typeMappedArguments = new List<SqlExpression>();
foreach (var argument in arguments)
{
typeMappedArguments.Add(ApplyDefaultTypeMapping(argument));
}
return new SqlFunctionExpression(
name,
typeMappedArguments,
true,
onlyNullWhenAnyNullPropagatingArgumentIsNull
? (argumentsPropagateNullability ?? Statics.GetTrueValues(typeMappedArguments.Count))
: Statics.GetFalseValues(typeMappedArguments.Count),
returnType,
typeMapping);
}
/// <summary>
/// Use for any function that will never return `NULL`.
/// </summary>
/// <param name="name">The SQL name of the function.</param>
/// <param name="arguments">The arguments of the function.</param>
/// <param name="returnType">The CLR return type of the function.</param>
/// <param name="typeMapping">The optional type mapping of the function.</param>
/// <remarks>See https://github.com/dotnet/efcore/issues/23042</remarks>
/// <returns>The function expression.</returns>
public virtual SqlFunctionExpression NonNullableFunction(
string name,
IEnumerable<SqlExpression> arguments,
Type returnType,
RelationalTypeMapping typeMapping = null)
{
Check.NotEmpty(name, nameof(name));
Check.NotNull(arguments, nameof(arguments));
Check.NotNull(returnType, nameof(returnType));
var typeMappedArguments = new List<SqlExpression>();
foreach (var argument in arguments)
{
typeMappedArguments.Add(ApplyDefaultTypeMapping(argument));
}
return new SqlFunctionExpression(
name,
typeMappedArguments,
false,
Statics.GetFalseValues(typeMappedArguments.Count),
returnType,
typeMapping);
}
// ...
}
```
We recently implemented the argument null propagation handling in https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/pull/1218.
So from our side, there is nothing left to be done with this issue anymore (unless you want to add our custom NullableFunction/NonNullableFunction methods or @smitpatel still sees issues with the the current implementation).
While checking every single SQL function translation I took the opportunity to document them all, so we can later provide users with a list of what gets translated and how.
While I did that, I also documented whether a function can return null and whether it will do so only when an argument is null or also in other cases.
Here are the results:
Category | Number of SQL Functions
-- | --
Total SQL Functions | 102
Functions that return NULL | 96
Functions with arguments propagating NULL | 96
Functions that can return NULL when all arguments are NOT NULL | 45
Of the Functions that can return NULL when all arguments are NOT NULL category, 27 functions are spatial related (and can usually return NULL when an empty geometry is used as a function argument).
After discussing with @maumar and re-examine the code, filtering around that happens earlier and there is at least no bug which I was thinking happening. @maumar will verify it once more and close the issue.