Skip to content
This repository was archived by the owner on Dec 18, 2018. It is now read-only.

Faster response Content-Length parsing. #1166

Merged
merged 1 commit into from
Oct 18, 2016
Merged

Conversation

cesarblum
Copy link
Contributor

Fixes perf regression introduced by f8813a6.

I ran benchmarks before the change, after it, on current dev and on this PR. Averages of 10 runs each:

Before change After change dev PR
1170566.1 1081454.708 1089127.197 1153864.439

So most of the perf loss is recovered with this change.

@mikeharder @halter73 @davidfowl @DamianEdwards

catch (FormatException ex)
const string errorMessage = "Content-Length value must be an integral number.";
var input = value.ToString();
var parsed = (long)0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just use long parsed = 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0L

@halter73
Copy link
Member

@benaadams might also want to take a look at this.

Can we use ParseContentLength in MessageBody.For? This might have to be rewritten to TryParseContentLength so we can have different behavior in the failure cases.

Have you run any microbencharks comparing this to the previous long.Parse version? I suggest using https://github.com/PerfDotNet/BenchmarkDotNet.

// If done on end of input or char is not a number, input is invalid
if (ch == end || *ch < 0x30 || *ch > 0x39)
{
throw new InvalidOperationException(errorMessage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't throw inline; move out out a separate method e.g. void ThrowInvalidContentLength() else it will bring in 3 InvalidOperationException ctors and 3 string loads

var end = ptr + input.Length;

// Skip leading whitespace
while (ch < end && ((*ch >= 0x09 && *ch <= 0x0D) || *ch == 0x20))
Copy link
Contributor

@benaadams benaadams Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do

var ch = (ushort*)ptr;
var end = char + input.Length;

while (ch < end && ((*ch - 0x09) <= (0x0D - 0x09)) || *ch == 0x20))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one didn't make a difference, so I'll leave it as it was.

Copy link
Contributor Author

@cesarblum cesarblum Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the ushort made a difference but not the rewritten while loop.

}

// If done on end of input or char is not a number, input is invalid
if (ch == end || *ch < 0x30 || *ch > 0x39)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with ushort* above combine with parsing below?

}

// Parse number
while (ch < end && *ch >= 0x30 && *ch <= 0x39)
Copy link
Contributor

@benaadams benaadams Oct 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like?

ushort digit;
if (ch == end || (digit = (*ch - 0x30)) > 9)
{
    ThrowInvalidContentLength();
}

do
{
    parsed *= 10;
    parsed += digit;
} while (ch < end && (digit = (*ch - 0x30)) <= 9);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit fixed while clause

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part gave it a significant boost. I was getting ~170ms to parse 10 million longs, with this change it went down to ~140ms.

@benaadams
Copy link
Contributor

Nice 👍

@halter73
Copy link
Member

halter73 commented Oct 14, 2016

@benaadams Thanks for the feedback 🍻

@cesarblum
Copy link
Contributor Author

@benaadams Thanks for the feedback! Got some very significant gains with your suggestions 😄

@cesarblum
Copy link
Contributor Author

@halter73 Didn't use BenchmarkDotNet but ran my own one. I'm parsing 10 million random positive longs. Numbers:

Initial ParseContentLength in PR: 200.1526ms
ParseContentLength after @benaadams' suggestions: 150.7072ms
long.Parse: 636.982ms

Give or take 10ms on each number on each run.

Code:

using System;
using System.Diagnostics;
using System.Globalization;
using Microsoft.Extensions.Primitives;

namespace ParseContentLength
{
    public class Program
    {
        private const int loops = (int)1e7;

        public static void Main(string[] args)
        {
            Console.WriteLine($"ParseContentLength: {Benchmark(value => ParseContentLength(value)).TotalMilliseconds}ms");
            Console.WriteLine($"long.Parse: {Benchmark(value => long.Parse(value, NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite, CultureInfo.InvariantCulture)).TotalMilliseconds}ms");
        }

        public static TimeSpan Benchmark(Func<string, long> parse)
        {
            var bytes = new byte[8];
            var random = new Random();
            long ticks = 0;

            for (var i = 0; i < loops; i++)
            {
                random.NextBytes(bytes);
                var input = Math.Abs(BitConverter.ToInt64(bytes, 0)).ToString();
                var sw = new Stopwatch();
                sw.Start();
                var result = parse(input);
                ticks += sw.ElapsedTicks;
                if (result.ToString() != input) throw new Exception();
            }

            return TimeSpan.FromTicks(ticks);
        }

        public static unsafe long ParseContentLength(StringValues value)
        {
            var input = value.ToString();
            var parsed = 0L;

            fixed (char* ptr = input)
            {
                var ch = (ushort*)ptr;
                var end = ch + input.Length;

                // Skip leading whitespace
                while (ch < end && ((*ch >= 0x09 && *ch <= 0x0D) || *ch == 0x20))
                {
                    ch++;
                }

                // Parse number
                ushort digit = 0;
                if (ch == end || (digit = (ushort)(*ch - 0x30)) > 9)
                {
                    ThrowInvalidContentLengthException();
                }

                do
                {
                    parsed *= 10;
                    parsed += digit;
                    ch++;
                } while (ch < end && (digit = (ushort)(*ch - 0x30)) <= 9);

                // If done and there's input and char is not whitespace, input is invalid 
                if (ch < end && (*ch < 0x09 || *ch > 0x0D || *ch != 0x20))
                {
                    ThrowInvalidContentLengthException();
                }

                // Skip trailing whitespace
                while (ch < end && ((*ch >= 0x09 && *ch <= 0x0D) || *ch == 0x20))
                {
                    ch++;
                }

                // If not at end of input, input is invalid
                if (ch != end)
                {
                    ThrowInvalidContentLengthException();
                }
            }

            return parsed;
        }

        private static void ThrowInvalidContentLengthException()
        {
            throw new InvalidOperationException("Content-Length value must be an integral number.");
        }
    }
}

@halter73
Copy link
Member

@CesarBS The nice thing about BenchmarkDotNet is that it does multiple test runs and gives you median and standard deviation. Once you've used it once, it's really easy to set up new scenarios and it gives you nice output (even markdown formatted) like shown here.

@cesarblum
Copy link
Contributor Author

Plaintext still looks about the same: 1157141.668 RPS (average of 10 runs).

@cesarblum
Copy link
Contributor Author

@halter73 Alright I'll give it a shot.

@halter73
Copy link
Member

@CesarBS I was looking through our tests, and I don't see any that attempt to set Response.Headers["Content-Length"] to an invalid string. We should add that, especially now that we have our won parsing logic.

@cesarblum
Copy link
Contributor Author

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Windows
Processor=?, ProcessorCount=4
Frequency=2435876 ticks, Resolution=410.5299 ns, Timer=TSC
CLR=CORE, Arch=64-bit ? [RyuJIT]
GC=Concurrent Workstation
dotnet cli version: 1.0.0-preview2-003121

Type=Benchmarks  Mode=Throughput

             Method |      Median |     StdDev |
------------------- |------------ |----------- |
          ParseLong | 436.6269 ns | 10.3795 ns |
 ParseContentLength | 200.1591 ns | 30.6031 ns |

@cesarblum
Copy link
Contributor Author

@halter73 We do have tests for that in FrameResponseHeaderTests. Do you mean I should add functional tests?

@halter73
Copy link
Member

I remember the "bad" tests now. I was looking in the wrong place. Now that we implemented our own parsing, we should test more invalid content-lengths than just "bad" though. Control characters, whitespace in the middle of otherwise valid numbers, etc...

@cesarblum
Copy link
Contributor Author

Added more tests. Turns out 0x09 through 0x0D don't work as whitespace because the header gets rejected by FrameHeaders.ValidateHeaderCharacters(). So either we:

  1. Change FrameHeaders.ValidateHeaderCharacters() to allow 0x09 through0x0D; or
  2. Make ParseContentLength() even simpler by only allowing 0x20 for whitespace. It wouldn't be the same as NumberStyles.AllowLeadingWhite | NumberStyles.AllowTrailingWhite in long.Parse though.

@Tratcher
Copy link
Member

Don't allow control characters.

@cesarblum
Copy link
Contributor Author

@Tratcher Even if they're whitespace?

@Tratcher
Copy link
Member

Yes.

@halter73
Copy link
Member

Kestrel should already strip whitespace at the start and end of headers. Legal whitespace at least.

@halter73
Copy link
Member

@CesarBS We should compare the before/after perf of this PR on the big iron. I'll forward you an email on how to do it, and you can post the results.

@cesarblum
Copy link
Contributor Author

We strip whitespace when reading request headers. But what should we do when setting Content-Length from a string, and it contains leading or trailing whitespace? If we want to keep parity with https://github.com/aspnet/HttpAbstractions/blob/87cd79d6fc54bb4abf07c1e380cd7a9498a78612/src/Microsoft.AspNetCore.Http/Internal/ParsingHelpers.cs#L407, we should allow 0x09 through 0x0D, as well as 0x20 (space). That's what NumberStyles.AllowLeadingWhite and NumberStyles.AllowTrailingWhite do.

@halter73
Copy link
Member

@CesarBS good point about request vs response. @Tratcher Do you think we should change ParsingHelpers to not strip whitespace? I don't mind being strict about setting a content-length string, but I would like consistency. But then there's back compat...

@Tratcher
Copy link
Member

Leave ParsingHelpers for now. Kestrel filters out everything for request headers, and Kestrel can be as strict as it wants to for response headers.

@cesarblum cesarblum force-pushed the cesarbs/fix-perf-regression branch 2 times, most recently from 7a8efb6 to cbe89d8 Compare October 17, 2016 22:24
@cesarblum
Copy link
Contributor Author

cesarblum commented Oct 17, 2016

Big iron

8c103f0 dev PR
4949442.425 RPS 4887490.302 4922363.127


public static TheoryData<string> BadContentLengths => new TheoryData<string>
{
"",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we throw in some invalid characters other than space? E.g. "!" and "ü".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"ü" will be rejected by ValidateHeaderCharacters(). But yeah I'll add a few more cases.

@cesarblum cesarblum force-pushed the cesarbs/fix-perf-regression branch from e1d4e68 to fff0ade Compare October 18, 2016 22:58
@cesarblum cesarblum merged commit fff0ade into dev Oct 18, 2016
@cesarblum cesarblum deleted the cesarbs/fix-perf-regression branch October 18, 2016 23:19
@ulrichb
Copy link

ulrichb commented Dec 30, 2016

It seems that this perf. improvement has been undone in 4da06e8.

(See HeaderUtilities.TryParseInt64.)

@Tratcher
Copy link
Member

@ulrichb you're looking at the old version of TryParseInt64.
aspnet/HttpAbstractions@d50a241#diff-1d7c1559d9f2fc18636596536f94741cR440

@ulrichb
Copy link

ulrichb commented Dec 30, 2016

@Tratcher Oh, I checked the 'master'-branch version instead of 'dev'. Thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants