HTTPs and "Connection: close" - huge performance degradation on windows #18488
Comments
Since this is focused on Kestrel, I've moved it for now to dotnet/aspnetcore. If investigation demonstrates it's instead due to something lower in the stack, we can move it back. Thanks. |
With or without TLS (HTTPS)? lsass implies TLS. |
@Tratcher Tests were run with TLS (HTTPS) |
@stephentoub @halter73 haven't we seen other reports of SslStream handshakes being extremely expensive? |
We can do a quick investigation with some timers to confirm, but I'm pretty sure it's the SslStream handshake issues we've seen for a while (@karelz you're already tracking SSL performance improvements, right?). |
@anurse we track SSL perf, but more for read/write throughput. While I've heard about handshake perf several times, I didn't see any data or anything. |
Yep, we'll try to do a quick bit of work to isolate what we can. If I remember correctly, the biggest signal we have is our benchmarks show a significant performance gap between ConnectionClose and ConnectionClose with HTTPS. Improving HTTPS perf in general is on our radar, so this is an area we'll be working on. |
Maybe the pull request dotnet/runtime#1949 helps. |
No, I don't think so. However, dotnet/performance#1146 may be interesting background. My Linux and Windows machines are pretty similar. They produce similar results for steady encrypt/decrypt, but a handshake is MUCH slower on Linux. I'm wondering if there could be something wrong with your Windows Kestrel server. You can check out the perf repo and try to run the standard SSL benchmarks. As for lsass, that should be expected. On Windows, schannel does the SSL handshake and key management in this one daemon. After that is done, decrypt/encrypt with the session key is done in-process. Could you share your test code and methodology @samsosa? I would like to see if I can reproduce your results. |
Running the System.Net.Security benchmarks gave me these results
For the benchmark I used multiple Azure VMs and fortio running on Ubuntu 18.04. The code is pretty simple and builds on top of the Kestrel sample.
Program.cs
namespace KestrelSample
{
    using Microsoft.AspNetCore.Hosting;
    using Microsoft.Extensions.Hosting;
    using Microsoft.Extensions.Logging;
    using System;
    using System.IO;
    using System.Security.Cryptography.X509Certificates;
    using System.Text.RegularExpressions;
    public class Program
    {
        public static void Main(string[] args)
        {
            CreateHostBuilder(args).Build().Run();
        }
        public static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureLogging(config => {
                    config.ClearProviders();
                })
                .ConfigureWebHostDefaults(webBuilder =>
                {
                    webBuilder.UseUrls("http://0.0.0.0:11115", "https://0.0.0.0:21115");
                    webBuilder.UseStartup<Startup>();
                });
    }
}
Startup.cs
namespace KestrelSample
{
    using Microsoft.AspNetCore.Builder;
    using Microsoft.AspNetCore.Hosting;
    using Microsoft.AspNetCore.Hosting.Server.Features;
    using Microsoft.AspNetCore.Http;
    using Microsoft.AspNetCore.Http.Extensions;
    using Microsoft.Extensions.Hosting;
    using Newtonsoft.Json;
    using System;
    using System.IO;
    using System.Text;
    using System.Threading.Tasks;
    internal class Response
    {
        [JsonProperty("method")]
        public string Method { get; set; }
        [JsonProperty("schema")]
        public string Schema { get; set; }
        [JsonProperty("protocol")]
        public string Protocol { get; set; }
        [JsonProperty("host")]
        public string Host { get; set; }
        [JsonProperty("headers")]
        public IHeaderDictionary Headers { get; set; }
        [JsonProperty("path")]
        public string Path { get; set; }
        [JsonProperty("query")]
        public IQueryCollection Query { get; set; }
        [JsonProperty("queryString")]
        public QueryString QueryString { get; set; }
        [JsonProperty("body", NullValueHandling = NullValueHandling.Ignore)]
        public string Body { get; set; }
    }
    public class Startup
    {
        public void Configure(IApplicationBuilder app)
        {
            var serverAddressesFeature =
                app.ServerFeatures.Get<IServerAddressesFeature>();
            app.UseStaticFiles();
            app.Use(async (context, next) =>
            {
                if (context.Request.Query.TryGetValue("base", out var _))
                {
                    await next.Invoke().ConfigureAwait(false);
                }
                else
                {
                    try
                    {
                        var response = new Response
                        {
                            Method = context.Request.Method,
                            Schema = context.Request.HttpContext.Request.Scheme,
                            Protocol = context.Request.HttpContext.Request.Protocol,
                            Host = context.Request.HttpContext.Request.Host.ToString(),
                            Path = context.Request.Path.ToString(),
                            Headers = context.Request.Headers,
                            Query = context.Request.Query,
                            QueryString = context.Request.QueryString,
                        };
                        context.Response.ContentType = "application/json";
                        if (context.Request.Body != null)
                        {
                            using (var reader = new StreamReader(context.Request.Body))
                            {
                                response.Body = await reader.ReadToEndAsync().ConfigureAwait(false);
                            }
                        }
                        await context.Response.WriteAsync(JsonConvert.SerializeObject(response));
                    }
                    catch (Exception exception)
                    {
                        var result = new
                        {
                            error = exception.Message
                        };
                        context.Response.ContentType = "application/json";
                        context.Response.StatusCode = 500;
                        await context.Response.WriteAsync(JsonConvert.SerializeObject(result));
                    }
                }
            });
            app.Run(async (context) =>
            {
                context.Response.ContentType = "text/html";
                await context.Response
                    .WriteAsync("<!DOCTYPE html><html lang=\"en\"><head>" +
                        "<title></title></head><body><p>Hosted by Kestrel</p>");
                if (serverAddressesFeature != null)
                {
                    await context.Response
                        .WriteAsync("<p>Listening on the following addresses: " +
                            string.Join(", ", serverAddressesFeature.Addresses) +
                            "</p>");
                }
                await context.Response.WriteAsync("<p>Request URL: " +
                    $"{context.Request.GetDisplayUrl()}</p>");
                await context.Response.WriteAsync("<p>Request URL: " +
                    $"{context.Request.Host}</p>");
            });
        }
    }
}
I ran my tests again and got similar results
On Windows both fortio and lsass had huge CPU consumption; lsass peaked at around 40%. |
@surr34 do you have the numbers between http vs https? |
updated the table above to include http numbers |
can you please provide the test code for the |
I created a repository that contains all the code I used, both Kestrel and NodeJS: https://github.com/surr34/https-bench . Initially I measured the numbers locally and was surprised by the result, hence I set up a few Azure VMs to run the measurement again (see numbers above). |
Thanks @surr34. I think I have a repro and I will investigate. It seems like I also have a 2K self-signed RSA certificate. |
Is there a Runtime issue tracking this work? I believe there have been improvements in this area, are there newer perf numbers? |
I don't have any recent numbers, but from what I saw a big part of the difference was that NodeJS used TLS 1.3 and Kestrel did not. There were TLS 1.3 fixes in SslStream, but that still depends on OS support and configuration while NodeJS does not. |
Closing as the work is on the runtime side. |
What is the runtime side work and how is it tracked? |
I don't think there's a specific runtime issue tracking this, but this scenario is captured in the "https" variation of our "ConnectionClose" benchmark which we record in PowerBI. You have to select the right checkboxes yourself. PowerBI doesn't capture which checkboxes are selected when you share a link. The only related runtime issue that I see is dotnet/runtime#27916 which deals with TLS session resumption on linux. It was closed in part because FTPWebRequest wasn't seen as a compelling enough reason to put the significant amount of work required to get TLS session resumption working with OpenSSL (assuming it's not "impossible"). Maybe Kestrel's usage of SslStream for HTTPS is a compelling enough reason to reopen this.
Kestrel currently defaults to TLS1.1 or TLS1.2 regardless of platform support for TLS1.3, but that issue is being tracked by #14997 and should be fixed in .NET 5 preview6. Otherwise, Kestrel's SslStream usage is pretty standard. |
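To rule the protocol version in or out while re-testing, the allowed TLS versions can be set explicitly instead of relying on Kestrel's defaults. The following is a minimal sketch, not the fix tracked in #14997. It assumes .NET Core 3.0 or later and reuses the `Startup` class and ports from the sample posted earlier in this thread; the class name `TlsProtocolSketch` is made up for illustration. Whether TLS 1.3 is actually negotiated still depends on OS support and configuration.

```csharp
using System.Security.Authentication;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Hosting;

namespace KestrelSample
{
    // Hypothetical variant of the sample's Program class.
    public static class TlsProtocolSketch
    {
        public static IHostBuilder CreateHostBuilder(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureWebHostDefaults(webBuilder =>
                {
                    webBuilder.ConfigureKestrel(options =>
                    {
                        // Applies to every HTTPS endpoint that does not
                        // override these settings itself.
                        options.ConfigureHttpsDefaults(https =>
                        {
                            // Allow TLS 1.2 and, where the OS supports it, TLS 1.3.
                            https.SslProtocols = SslProtocols.Tls12 | SslProtocols.Tls13;
                        });
                    });
                    // Same addresses as the sample in this thread.
                    webBuilder.UseUrls("http://0.0.0.0:11115", "https://0.0.0.0:21115");
                    // Startup is the class from the sample's Startup.cs.
                    webBuilder.UseStartup<Startup>();
                });
    }
}
```

Comparing a run with `SslProtocols.Tls12` alone against this configuration would show how much of the gap is due to the negotiated protocol rather than the handshake implementation.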
Maybe a dumb question, but how is TLS session resumption on Linux related here? I thought we were digging into Windows perf degradation. |
Re-opening until we identify the specific runtime work items. |
We should re-test once dotnet/runtime#1720 lands (again) |
The Linux TLS session resumption is only tangentially related. Given that the performance of this scenario is already better on Linux than on Windows, I can understand it still not being a big priority. The reason for closing this was because it comes up in every Server triage, but any work to improve this scenario will need to be done in the runtime. We are already tracking this scenario in our benchmarks, and it seems unlikely that Kestrel will need to react to any runtime changes that might improve this scenario. @karelz should we transfer the issue to the runtime so you have something tracking it? |
@halter73 I would like to see something filed -- I just don't understand what we are trying to track with that work. Maybe offline chat could help? |
@karelz and I talked offline and he asked if our benchmarks confirm the slower HTTPS "Connection: close" performance on Windows. It turns out, they don't:
Linux (~7.3K RPS)
Windows (13.7K RPS)
Our "Connection: close" benchmarks are quite a bit simpler than https://github.com/surr34/https-bench since our benchmark sends empty request bodies and the server app outputs plaintext rather than serialized JSON. This explains the much higher numbers.
I went ahead and tried https://github.com/surr34/https-bench on Azure F4 Windows and Linux VMs and used fortio to drive load to see if anything about this setup causes different results. I still see better RPS (or should I say QPS) numbers with a Windows server:
Ubuntu 16.04 (~720 QPS)
Windows Server 2016 (~1050 RPS)
This is using v3.0.103 of the .NET Core SDK. @surr34 Do you have any idea why I'm not able to replicate your results? |
What version of OpenSSL do you have @halter73? 16.04 is pretty old, and when I ran the benchmarks on Linux & Node on Windows it used TLS 1.3. I probed @sebastienros a while back and it seems like we did not have any good setup to benchmark it. |
The server is running OpenSSL 1.0.2g, so no TLS 1.3 support. The thing is, Kestrel uses I should probably get around to upgrading my VMs anyway. I'll try to do that tomorrow and get updated results. It's still possible that a difference in OpenSSL versions is responsible for the benchmark differences. |
That would be great @halter73. I can resurrect my old setup if needed. I had two old desktop machines and I saw a significant drop. Maybe not as big as the numbers above, but still quite visible. And Node was significantly faster on the same machine. |
The VMs I mentioned upgrading have since been deleted. I can try to recreate the setup, but #22437 has been merged and our normal benchmark infrastructure should catch regressions caused by default protocol changes or dependency updates. |
We have a Kestrel service that has recently been receiving a lot of HTTPS requests containing a "Connection: close" header. It seems the header has a massive impact on the number of requests we can handle. Testing with a simple service locally, the RPS drops from over 30k to under 500.
For testing purposes we ran the same service on Linux and implemented it in NodeJS as well. While the C# version on Windows was fast for kept-alive connections, the performance degradation with "Connection: close" was huge: the C# Windows version was outperformed by both NodeJS and the C# implementation on a Linux system. Based on other GitHub issues I believed the Windows version to be way faster than the Linux one. However, below are the numbers (RPS) we measured.
For the Kestrel Windows version we also noticed a significant spike in the CPU consumption of the lsass process.
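The effect described above can be probed without a load generator. The sketch below is not the fortio setup used for the measurements in this thread. It sends requests sequentially from a single `HttpClient`, once with kept-alive connections and once with "Connection: close", and prints the approximate requests per second for each. It assumes the sample server is listening on https://localhost:21115 with a self-signed certificate; the class name, the request count and the `?base` query (which skips the JSON echo in the sample's middleware) are illustrative choices.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

public static class ConnectionCloseProbe
{
    public static async Task Main()
    {
        // Assumption: the server uses a self-signed certificate, so
        // validation is disabled. Do not do this outside of a test setup.
        var handler = new HttpClientHandler
        {
            ServerCertificateCustomValidationCallback =
                HttpClientHandler.DangerousAcceptAnyServerCertificateValidator
        };
        using var client = new HttpClient(handler);

        // Assumption: port 21115 is the HTTPS port from the sample Program.cs.
        const string url = "https://localhost:21115/?base";
        const int count = 200;

        double keepAlive = await MeasureAsync(client, url, close: false, count);
        double closed = await MeasureAsync(client, url, close: true, count);

        Console.WriteLine($"keep-alive:        {keepAlive:F0} requests/s");
        Console.WriteLine($"Connection: close: {closed:F0} requests/s");
    }

    // Sends `count` sequential GET requests and returns requests per second.
    private static async Task<double> MeasureAsync(
        HttpClient client, string url, bool close, int count)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            using var request = new HttpRequestMessage(HttpMethod.Get, url);
            // Adds "Connection: close", so the connection is not reused and
            // the next request needs a new TCP connection and TLS handshake.
            request.Headers.ConnectionClose = close;
            using var response = await client.SendAsync(request);
            response.EnsureSuccessStatusCode();
            await response.Content.ReadAsByteArrayAsync();
        }
        stopwatch.Stop();
        return count / stopwatch.Elapsed.TotalSeconds;
    }
}
```

Because the requests are sequential, the absolute numbers will be far below those of a concurrent load test. Only the ratio between the two runs is meaningful.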