Faster response Content-Length parsing. #1166
catch (FormatException ex)
const string errorMessage = "Content-Length value must be an integral number.";
var input = value.ToString();
var parsed = (long)0;
nit: just use long parsed = 0
0L
@benaadams might also want to take a look at this. Can we use ParseContentLength in MessageBody.For? This might have to be rewritten to TryParseContentLength so we can have different behavior in the failure cases. Have you run any microbenchmarks comparing this to the previous long.Parse version? I suggest using https://github.com/PerfDotNet/BenchmarkDotNet.
// If done on end of input or char is not a number, input is invalid
if (ch == end || *ch < 0x30 || *ch > 0x39)
{
    throw new InvalidOperationException(errorMessage);
Don't throw inline; move it out to a separate method, e.g. void ThrowInvalidContentLength(), else it will bring in 3 InvalidOperationException ctors and 3 string loads.
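For illustration, a minimal sketch of the throw-helper pattern described in this comment. The class name and the ParseDigit method are hypothetical and only exist to show the shape; the helper name follows the suggestion above.

```csharp
using System;

public static class ThrowHelperSketch
{
    // Hot path: contains no throw statement, only a call to the helper,
    // so the exception construction is not emitted at every check site.
    public static int ParseDigit(char c)
    {
        // Values below '0' wrap to a large unsigned number, so one
        // comparison covers both ends of the '0'..'9' range.
        var digit = (uint)(c - '0');
        if (digit > 9)
        {
            ThrowInvalidContentLength();
        }
        return (int)digit;
    }

    // Cold path: the exception constructor call and the string load live here once.
    private static void ThrowInvalidContentLength()
    {
        throw new InvalidOperationException("Content-Length value must be an integral number.");
    }
}
```

Every failure check in the parser can then call the same helper instead of repeating the throw.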
var end = ptr + input.Length;

// Skip leading whitespace
while (ch < end && ((*ch >= 0x09 && *ch <= 0x0D) || *ch == 0x20))
Could do
var ch = (ushort*)ptr;
var end = ch + input.Length;
while (ch < end && ((uint)(*ch - 0x09) <= (0x0D - 0x09) || *ch == 0x20))
This one didn't make a difference, so I'll leave it as it was.
I mean, the ushort made a difference but not the rewritten while loop.
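As a rough illustration of the rewritten range check discussed in this thread: subtracting the lower bound and comparing as an unsigned value folds the two range comparisons into one. The class and method names below are hypothetical, and the sketch uses char rather than a ushort pointer to stay in safe code.

```csharp
using System;

public static class WhitespaceCheckSketch
{
    // Two comparisons for the 0x09..0x0D range, plus the space check.
    public static bool IsWhitespaceTwoCompares(char c)
    {
        return (c >= 0x09 && c <= 0x0D) || c == 0x20;
    }

    // One comparison for the range: after subtracting 0x09 and casting to
    // uint, every value below 0x09 wraps around to a very large number.
    public static bool IsWhitespaceOneCompare(char c)
    {
        return (uint)(c - 0x09) <= (0x0D - 0x09) || c == 0x20;
    }
}
```

Both methods should agree for every char value; whether the single-comparison form is measurably faster depends on the JIT, and in this PR it made no difference.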
}

// If done on end of input or char is not a number, input is invalid
if (ch == end || *ch < 0x30 || *ch > 0x39)
With the ushort* above, combine this check with the parsing below?
}

// Parse number
while (ch < end && *ch >= 0x30 && *ch <= 0x39)
Something like?
ushort digit = 0;
if (ch == end || (digit = (ushort)(*ch - 0x30)) > 9)
{
    ThrowInvalidContentLength();
}
do
{
    parsed *= 10;
    parsed += digit;
    ch++;
} while (ch < end && (digit = (ushort)(*ch - 0x30)) <= 9);
Edit: fixed the while clause.
This part gave it a significant boost. I was getting ~170ms to parse 10 million longs, with this change it went down to ~140ms.
Nice 👍
@benaadams Thanks for the feedback 🍻
@benaadams Thanks for the feedback! Got some very significant gains with your suggestions 😄
@halter73 Didn't use BenchmarkDotNet but ran my own one. I'm parsing 10 million random positive longs.
Numbers: Initial
Give or take 10ms on each number on each run.
Code:
using System;
using System.Diagnostics;
using System.Globalization;
using Microsoft.Extensions.Primitives;

namespace ParseContentLength
{
    public class Program
    {
        private const int loops = (int)1e7;

        public static void Main(string[] args)
        {
            Console.WriteLine($"ParseContentLength: {Benchmark(value => ParseContentLength(value)).TotalMilliseconds}ms");
            Console.WriteLine($"long.Parse: {Benchmark(value => long.Parse(value, NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite, CultureInfo.InvariantCulture)).TotalMilliseconds}ms");
        }

        public static TimeSpan Benchmark(Func<string, long> parse)
        {
            var bytes = new byte[8];
            var random = new Random();
            long ticks = 0;

            for (var i = 0; i < loops; i++)
            {
                random.NextBytes(bytes);
                var input = Math.Abs(BitConverter.ToInt64(bytes, 0)).ToString();

                var sw = new Stopwatch();
                sw.Start();
                var result = parse(input);
                // Elapsed.Ticks is in TimeSpan ticks (100 ns); ElapsedTicks is in
                // Stopwatch.Frequency units and must not be passed to TimeSpan.FromTicks.
                ticks += sw.Elapsed.Ticks;

                if (result.ToString() != input) throw new Exception();
            }

            return TimeSpan.FromTicks(ticks);
        }

        public static unsafe long ParseContentLength(StringValues value)
        {
            var input = value.ToString();
            var parsed = 0L;

            fixed (char* ptr = input)
            {
                var ch = (ushort*)ptr;
                var end = ch + input.Length;

                // Skip leading whitespace
                while (ch < end && ((*ch >= 0x09 && *ch <= 0x0D) || *ch == 0x20))
                {
                    ch++;
                }

                // Parse number
                ushort digit = 0;
                if (ch == end || (digit = (ushort)(*ch - 0x30)) > 9)
                {
                    ThrowInvalidContentLengthException();
                }

                do
                {
                    parsed *= 10;
                    parsed += digit;
                    ch++;
                } while (ch < end && (digit = (ushort)(*ch - 0x30)) <= 9);

                // If done and there's input and char is not whitespace, input is invalid
                if (ch < end && (*ch < 0x09 || *ch > 0x0D) && *ch != 0x20)
                {
                    ThrowInvalidContentLengthException();
                }

                // Skip trailing whitespace
                while (ch < end && ((*ch >= 0x09 && *ch <= 0x0D) || *ch == 0x20))
                {
                    ch++;
                }

                // If not at end of input, input is invalid
                if (ch != end)
                {
                    ThrowInvalidContentLengthException();
                }
            }

            return parsed;
        }

        private static void ThrowInvalidContentLengthException()
        {
            throw new InvalidOperationException("Content-Length value must be an integral number.");
        }
    }
}
Plaintext still looks about the same: 1157141.668 RPS (average of 10 runs).
@halter73 Alright I'll give it a shot.
@CesarBS I was looking through our tests, and I don't see any that attempt to set
@halter73 We do have tests for that in
I remember the "bad" tests now. I was looking in the wrong place. Now that we implemented our own parsing, we should test more invalid content-lengths than just "bad" though. Control characters, whitespace in the middle of otherwise valid numbers, etc.
Added more tests. Turns out
Don't allow control characters.
@Tratcher Even if they're whitespace?
Yes.
Kestrel should already strip whitespace at the start and end of headers. Legal whitespace at least.
@CesarBS We should compare the before/after perf of this PR on the big iron. I'll forward you an email on how to do it, and you can post the results.
We strip whitespace when reading request headers. But what should we do when setting Content-Length from a string, and it contains leading or trailing whitespace? If we want to keep parity with https://github.com/aspnet/HttpAbstractions/blob/87cd79d6fc54bb4abf07c1e380cd7a9498a78612/src/Microsoft.AspNetCore.Http/Internal/ParsingHelpers.cs#L407, we should allow
Leave ParsingHelpers for now. Kestrel filters out everything for request headers, and Kestrel can be as strict as it wants to for response headers.
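To make the strict option concrete, here is a minimal sketch of a digits-only parser in the Try style floated earlier in the thread. It is an assumption about what "as strict as it wants" could look like, not the code merged in this PR: the class name and signature are hypothetical, it works on a plain string in safe code, and the overflow check is an addition that the thread does not discuss.

```csharp
using System;

public static class StrictContentLengthSketch
{
    // Hypothetical helper: accepts ASCII digits only. Rejects empty input,
    // whitespace, control characters, signs, and values that overflow long.
    public static bool TryParseContentLength(string input, out long parsed)
    {
        parsed = 0;
        if (string.IsNullOrEmpty(input))
        {
            return false;
        }

        long value = 0;
        foreach (var c in input)
        {
            // Anything outside '0'..'9' wraps to a value greater than 9.
            var digit = (uint)(c - '0');
            if (digit > 9)
            {
                return false;
            }

            // Reject values where value * 10 + digit would not fit in a long.
            if (value > (long.MaxValue - digit) / 10)
            {
                return false;
            }

            value = value * 10 + digit;
        }

        parsed = value;
        return true;
    }
}
```

Under these assumptions, inputs such as "", " 12", "12 34", "12\t", "-1" and "!" are all rejected, which covers the kinds of invalid values the tests above are meant to exercise.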
7a8efb6 to cbe89d8
Big iron
public static TheoryData<string> BadContentLengths => new TheoryData<string>
{
    "",
Can we throw in some invalid characters other than space? E.g. "!" and "ü".
"ü" will be rejected by ValidateHeaderCharacters(). But yeah I'll add a few more cases.
e1d4e68 to fff0ade
It seems that this perf. improvement has been undone in 4da06e8.
@ulrichb you're looking at the old version of TryParseInt64.
@Tratcher Oh, I checked the 'master'-branch version instead of 'dev'. Thanks.
Fixes perf regression introduced by f8813a6.
I ran benchmarks before the change, after it, on current dev and on this PR. Averages of 10 runs each:
So most of the perf loss is recovered with this change.
@mikeharder @halter73 @davidfowl @DamianEdwards