Description
Describe the bug
While running and processing a large (multi-GB) query, the debugger shows the GC running almost constantly. After reducing allocations as much as possible in my code, I ran it through JetBrains DotMemory, and it said that the process was allocating almost 5 GB of char[]
buffers in TdsPaser.TryReadPlpUnicodeChars
. The stack trace it gave shows that this is being called as part of reading data from a result set, apparently to read a string buffer and turn it into a string.
Would it be possible to cache and reuse the char buffer rather than (I assume; the relevant code doesn't appear to be in this repo) allocating a new buffer for each TryReadPlpUnicodeChars invocation?
To reproduce
- Grab and restore this StackOverflow data dump.
- Create a
SqlConnection
and open a connection to the database. - Run the following:
void Process(SqlConnection conn)
{
using var cmd = conn.CreateCommand();
cmd.CommandText = "select * from dbo.Posts";
using var reader = cmd.ExecuteReader();
var buffer = new object[reader.FieldCount];
while (reader.Read())
{
reader.GetValues(buffer);
}
}
Expected behavior
As we're processing one row at a time, memory usage should remain fairly low.
Observed behavior:
Constant GC pressure. Profiling shows the vast majority of it comes from char buffer allocations as described above.
Further technical details
Microsoft.Data.SqlClient version: nuget 4.1.0
.NET target: .NET 7
SQL Server version: SQL Server 15.0.2095.3
Operating system: Windows 10