The bulk insert functionality of SQL Server supports an ORDER
hint to speed up inserting of already ordered data into a table with a clustered index. Without this hint, SQL Server forces an intermediate sort of the incoming data before inserting, which can add significant overhead (think tempdb
spills). Existing unmanaged ways to bulk copy (bcp.exe
, BULK INSERT
, OPENROWSET(BULK, ...)
) all offer ways to specify this hint, but SqlBulkCopy
does not, leaving managed code out in the cold.
public class SqlBulkCopy
{
// Existing members left out
public SqlBulkCopyOrderHintColumnCollection OrderHintColumns { get; }
}
// New class
public sealed class SqlBulkCopyOrderHintColumn {
public SqlBulkCopyOrderHintColumn(string column, System.Data.SqlClient.SortOrder sortOrder);
public string Column { get; set; }
public SortOrder SortOrder { get; set; } // throws ArgumentException on SortOrder.Unspecified
}
// New class, same shape as SqlBulkCopyColumnMappingCollection
public class SqlBulkCopyOrderHintColumnCollection : System.Collections.CollectionBase {
public SqlBulkCopyOrderHintColumn Add(SqlBulkCopyOrderHintColumn bulkCopyOrderHintColumn);
// throws ArgumentException on SortOrder.Unspecified
public SqlBulkCopyOrderHintColumn Add(string column, SortOrder sortOrder);
public void Clear();
public bool Contains(SqlBulkCopyOrderHintColumn value);
public void CopyTo(SqlBulkCopyOrderHintColumn[] array, int index);
public System.Collections.IEnumerator GetEnumerator();
public int IndexOf(SqlBulkCopyOrderHintColumn value);
public void Insert(int index, SqlBulkCopyOrderHintColumn value);
public void Remove(SqlBulkCopyOrderHintColumn value);
public void RemoveAt(int index);
}
using (var sqlBulkCopy = new SqlBulkCopy(connection)) {
// leaving out other options
sqlBulkCopy.OrderHintColumns.Add("Column1", SortOrder.Descending);
sqlBulkCopy.OrderHintColumns.Add("Column2", SortOrder.Ascending);
sqlBulkCopy.OrderHintColumns.Add("Column3", SortOrder.Descending);
sqlBulkCopy.DestinationTableName = "Table1";
sqlBulkCopy.WriteToServer();
}
This should result in SqlBulkCopy
issuing a statement of the form
insert bulk Table1 (...) with (ORDER ([Column1] DESC, [Column2] ASC, [Column3] DESC))
Ensuring the input rows are actually ordered according to the hint is the programmer's responsibility; SQL Server will either ignore the hint (if the table is a heap) or produce an error (if the table is clustered) and SqlBulkCopy
need not further verify this.
SqlBulkCopyOrderHintColumnCollection
a strongly typed collection to get with the times, or maintain consistency with the existing ColumnMapping
property and use a non-generic collection?SqlBulkCopyOrderHintColumn
be immutable, again, mostly getting with the times? SqlBulkCopyColumnMapping
is very much the opposite; it even has a parameterless constructor, which seems quite undesirable.See #5114 for previous discussion.
I propose that you should keep the new API in style with the current
As recently announced in the .NET Blog, focus on new SqlClient features an improvements is moving to the new Microsoft.Data.SqlClient package. For this reason, we are moving this issue to the new repo at https://github.com/dotnet/SqlClient. We will still use https://github.com/dotnet/corefx to track issues on other providers like System.Data.Odbc and System.Data.OleDB, and general ADO.NET and .NET data access issues.
See #5114 for previous discussion.
After moving this issue to dotnet/SqlClient, the above reference from the original post should now say #14.
Yes, this is a needed feature for me...without it, I drop the clustered index and re-create it afterwards. It's reasonable for a bulk-load tool to have this option.
Closing the issue as https://github.com/dotnet/SqlClient/pull/540 is now merged.
This feature will be available starting Microsoft.Data.SqlClient v2.0.0
That's great to hear! :)
Thanks!!
Most helpful comment
Closing the issue as https://github.com/dotnet/SqlClient/pull/540 is now merged.
This feature will be available starting Microsoft.Data.SqlClient v2.0.0