While working on nullability annotation for System.Data.Common, I ran across some odd and inconsistent behavior when GetSchemaTable and GetColumnSchema are called and a resultset isn't present (e.g. a non-SELECT statement was executed, or NextResult was called and returned false).
I vote for option 3. It would allow the method to cleanly return a non-nullable DataTable
c#
[Fact]
public virtual void GetSchemaTable_returns_null_when_no_resultset()
{
using var connection = CreateOpenConnection();
using var command = connection.CreateCommand();
command.CommandText = "SELECT 1";
using var reader = command.ExecuteReader();
reader.NextResult();
Assert.Null(reader.GetSchemaTable());
}
/cc @cheenamalhotra @David-Engel @saurabh500 @Wraith2 @bgrainger @bricelam @ajcvickers @AndriySvyryd
/cc @ErikEJ
/cc @FransBouma
The current behaviour in MySqlConnector was due to https://github.com/mysql-net/AdoNetApiTest/issues/28; see also https://github.com/mysql-net/MySqlConnector/issues/678.
Forgot about that conversation :)
How would you feel about my suggestion of returning an empty DataTable/column collection for these methods? It wouldn't be a breaking change for you (as the methods currently throw), and it seems consistent with the idea of no resultset having "zero columns" (as expressed by FieldCount).
I vote for option 3, too. Seems cleaner, more logical, and easier to code against.
- We could make SqlClient and Npgsql return an empty list
Do you mean "DataTable", not "list"?
How would you feel about my suggestion of returning an empty DataTable/column collection for these methods?
I have no objection; there's value in consistency and I doubt anyone is relying on the current behaviour (of throwing an exception).
Do you mean "DataTable", not "list"?
Yeah, thanks - corrected it (thinking about GetSchemaTable and GetColumnSchema at the same time).
I like the idea of an empty DataTable instead of a null. However I wonder if any code takes a decision based on the value of the schema object returned like don't show a UI grid if a null value is received. @roji has already mentioned that it is a breaking change.
I don't have data though. Putting a hypothetical situation here.
It might be worth following up with SMO.
SMO - SQL management objects library
If you execute queries without knowing whether they will return a resultset, you in general will do that with ExecuteReader(), and first examine the GetSchemaTable. If it's null, you can assume there wasn't a resultset returned, so the query likely was a DML query.
If there is a datatable with columns, you can assume there's a resultset.
But indeed the 'null' isn't reliable as there's no documentation that this should happen (in my code I assume it's null, but perhaps some obscure providers fail here). So I opt for option 3: the empty datatable is a good solution, as its emptiness (no columns nor rows) is a reliable indicator no resultset will be returned by the reader.
It would be a breaking change for my desktop facing designer system but it's a simple change to make. What's more important tho is: how to enforce all ADO.NET providers to do the right thing here. As there's code out there relying on this (has to be) for e.g. 'mysql's behavior', it won't be easy to persuade these provider writers to change their API too (as it enforces a breaking change onto their users they might not want to do that)
Thanks for your feedbacks everyone!
@saurabh500
It might be worth following up with SMO.
Sure. But it's important to note that this would only be a documented, breaking change in .NET 5 - a major part of the appeal of .NET Core is the side-by-side nature, which would only affect a product such as SMO if and when they actually switch to .NET 5. This is in contrast with the old, in-place .NET Framework dates, where we could probably never even have this conversation. But regardless it would be great to get more feedback from any other interested parted here.
@FransBouma
If you execute queries without knowing whether they will return a resultset, you in general will do that with ExecuteReader(), and first examine the GetSchemaTable. If it's null, you can assume there wasn't a resultset returned, so the query likely was a DML query.
Yes, anyone doing this would be broken by this change (if and when they switch to .NET 5). Interestingly I tend to think about the resultset checking more via FieldCount being zero - I was somewhat surprised to find that GetSchemaTable returns null.
What's more important tho is: how to enforce all ADO.NET providers to do the right thing here
One important resource here is @bgrainger's ADO.NET spec tests, which provide a relatively easy way to check behavior across many providers. In fact, as I'm working through the ADO.NET API surface for nullability annotation (which is what spawned this issue), I'm using that project and adding tests to it. Of course, not all providers are represented there.
Yes, anyone doing this would be broken by this change (if and when they switch to .NET 5).
I'm not following how the behaviour change would be linked with / limited to .NET 5. DbDataReader.GetSchemaTable is virtual, so the actual behaviour (returning null or an empty DataTable) would depend on the specific ADO.NET provider being used (and its version).
Unless you're proposing/assuming multitargeting where ADO.NET providers only implement the new behaviour for builds specifically targeting netcoreapp5.0 (or whatever the proper TFM is), which seems even harder to enforce.
You're partially right, I'm currently investigating other breaking changes in S.D.C itself that I'm thinking about (will come in a separate issue). However, since SqlClient specifically has been part of the Framework, side-by-side did not apply to it (and breaking changes could therefore almost never be introduced). Regardless, as .NET itself has (thankfully) is now side-by-side, it seems there's more general openness towards introducing some breaking changes in major versions, compared to the earlier reluctance.
Only newly compiled consumers will take notice of the nullable annotations and care if an old version of the library returns a null contrary to the annotation. It seems unlikely that anyone would try to backdate the in-box version of sqlclient specifically in that scenario. As long as the new expectation is documented for provider authors I don't see it as a big problem.
@Wraith2 the issue here specifically is that we're also proposing modifying the behavior of SqlClient and other providers to return an empty DataTable from GetSchemaTable - that's a visible behavior change regardless of whether you're opted into the new nullable feature.
To be clear, yeah - the idea is to annotate the return value of DbConnection.GetSchemaTable as non-nullable. But to have SqlClient and Npgsql compatible with that annotation they would also need to change behavior.
/cc @mgravell who I left out before (sorry!)
I'm also in favor of option 3, although I am concerned about the breaking change. Any option for consistency results in at least one provider being broken--if the standard behavior becomes "return null" then SQLite needs to make a breaking change. That being said, breaking SQLite is not as bad as breaking SQLClient because it has a lot lower usage.
So I think this boils down to whether or not SQLClient is okay with the break. If so, then 3 seems to clearly win.
A consensus has been reached to return an empty table when there is no resultset - when we annotate System.Data.Common for nullability, we can annotate the returned DataTable as non-nullable. Related ADO provider issues:
@terrajobst @stephentoub this discussion was the basis of runtime changes in ADO providers but not in System.Data itself (though it will influence how we null-annotate it). OK to leave the labels/milestone like this or should we remove from the milestone as no actual change happened in the runtime?
Everyone, unfortunately I have to propose to revert this change for 5.0. When we originally had the discussion above, I didn't sufficiently consider the wider effect of the breaking change... since then, I've done a lot of work on System.Data nullability, and have come to better understand the bar for breaking changes in System.Data and their overall impact. I apologize for the mess this is creating.
Some reasoning:
Once again, apologies for this.
PS The PR making GetSchemaTable nullable in System.Data (and Odbc/OleDb) is #41082.
To confirm: the desired new behaviour is that when there is no result set, GetSchemaTable will return null (and GetColumnSchema will continue returning an empty collection)?
Since MySqlConnector changed from throwing an exception to returning an empty DataTable, I don't think there would be many users relying on the new return value, and changing it to null shouldn't have a significant impact.
To confirm: the desired new behaviour is that when there is no result set, GetSchemaTable will return null (and GetColumnSchema will continue returning an empty collection)?
Yeah, that's correct. This includes the case where NextResult has been called on the reader, and it returned false (i.e. consumed all result sets). You should see this behavior with System.Data.SqlClient, and with Microsoft.Data.SqlClient prior to 2.0.0; if you're seeing anything different please let me know...
Since MySqlConnector changed from throwing an exception to returning an empty DataTable, I don't think there would be many users relying on the new return value, and changing it to null shouldn't have a significant impact.
That's good to hear.. I now understand the extent to which behavioral changes in the public-facing abstraction methods of ADO.NET should be done really, really carefully.
Or not at all? 馃榾
A certain number of "reallys" starts to be equivalent to "not at all" for sure 鈽癸笍
Most helpful comment
Everyone, unfortunately I have to propose to revert this change for 5.0. When we originally had the discussion above, I didn't sufficiently consider the wider effect of the breaking change... since then, I've done a lot of work on System.Data nullability, and have come to better understand the bar for breaking changes in System.Data and their overall impact. I apologize for the mess this is creating.
Some reasoning:
Once again, apologies for this.
PS The PR making GetSchemaTable nullable in System.Data (and Odbc/OleDb) is #41082.