This repository was archived by the owner on Dec 18, 2018. It is now read-only.

Reduce GetString allocs and conversions #312

Merged: 1 commit, Nov 17, 2015

benaadams
Contributor

Resolves #291

{
// avoid declaring other local vars, or doing work with stackalloc
// to prevent the .locals init cil flag , see: https://github.com/dotnet/coreclr/issues/1279
char* output = stackalloc char[length];

Member

How big can length be?

Contributor Author

Added 16k max

Contributor Author

It's just a limit on the largest size it will attempt with stackalloc before falling back to new and heap allocating instead; it isn't actually a parsing limit.
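
To make the stack-or-heap cutoff concrete, here is a minimal sketch, not the code in this PR. It assumes a contiguous input and uses `Span<char>`, which postdates this PR, in place of the unsafe `char*` shown in the diff. The names `AsciiDecoder` and `MaxStackAllocChars` are hypothetical; only the 16384 value comes from this thread.

```csharp
using System;

public static class AsciiDecoder
{
    // Hypothetical name; the value mirrors the 16384 limit discussed here.
    // A char is 2 bytes, so this caps the stack buffer at 32 KiB.
    private const int MaxStackAllocChars = 16384;

    public static string GetAsciiString(ReadOnlySpan<byte> input)
    {
        int length = input.Length;
        if (length == 0)
        {
            return string.Empty;
        }

        // Small inputs use the stack; anything larger falls back to the heap.
        // The cutoff is not a parsing limit: both paths return the same string.
        Span<char> output = length <= MaxStackAllocChars
            ? stackalloc char[length]
            : new char[length];

        for (int i = 0; i < length; i++)
        {
            // Widen each byte to one UTF-16 code unit. Input is assumed to be
            // ASCII and is not validated in this sketch.
            output[i] = (char)input[i];
        }

        return new string(output);
    }

    public static void Main()
    {
        byte[] small = { 72, 111, 115, 116 };   // "Host"
        Console.WriteLine(GetAsciiString(small));

        byte[] large = new byte[20000];         // above the cutoff, heap path
        Console.WriteLine(GetAsciiString(large).Length);
    }
}
```

Callers see no difference between the two paths; the constant only bounds how much stack a single call may consume.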

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch from 702e560 to 6282d0b Compare November 1, 2015 09:43
@benaadams benaadams closed this Nov 3, 2015
@benaadams benaadams reopened this Nov 3, 2015
@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch 2 times, most recently from 6b5e761 to 4161301 Compare November 3, 2015 10:20
@benaadams benaadams closed this Nov 3, 2015
@benaadams benaadams reopened this Nov 4, 2015
@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch from 056bcd1 to 5de4169 Compare November 4, 2015 11:51
@@ -10,6 +10,7 @@ namespace Microsoft.AspNet.Server.Kestrel.Infrastructure
{
public struct MemoryPoolIterator2
{
private const int _maxHeaderStackLength = 16384;

Member

Is there a specific reason we don't want to stackalloc a char array larger than this?

Contributor Author

Total headers are generally constrained to < 8k/16k, so individual headers shouldn't be larger than 16k; if you are going larger than that, you probably have other issues to resolve than performance. Added some related comments to: https://github.com/aspnet/KestrelHttpServer/pull/313/files#diff-6b7953b837b6ad7c921259c907520639R14

Can go larger; but that was my reasoning.

For upper levels: total stack size for aspnet4 x86 was 256k, x64 was 512k; .NET is 1MB. Not sure what the current defaults are.

Contributor Author

For ref:
nginx default 8k
Apache default 8190 bytes
IIS default 16384 bytes

Contributor

To add even more reasoning here: HTTP/2 compression will further reduce the average header size. I agree 16k is a good limit given what we see out in the world.

Member

This value is in the wrong layer. MemoryPoolIterator2 doesn't understand what it's looking at. That's the caller's job.

Contributor Author

It's to prevent the stack blowing up; if length is larger than this, it heap allocates instead. Maybe a rename like _maxStackAllocBytes?

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch 13 times, most recently from fee6712 to d88d4c3 Compare November 10, 2015 10:50
@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch from c3ca476 to 68d63ed Compare November 10, 2015 11:21
@halter73
Member

This looks good to me, but since this is a pretty big change I would like someone else to also sign off on it.

@lodejard @davidfowl @troydai thoughts?

@davidfowl
Member

Is there any way we can take advantage of CopyTo here? We already have the memory pool iterator.

@benaadams
Contributor Author

@davidfowl you mean move into MemoryPoolIterator2 rather than have it in Frame? GetNextString kinda thing?

@troydai
Contributor

troydai commented Nov 13, 2015

It looks good to me. The change is straightforward. One thing we may or may not want to do is extract the logic of taking two iterators to generate a string into a separate class. The logic here expands the iterator class, which in my opinion should be kept concise. The logic of creating a string from two iterators is very much self-contained.

@benaadams
Contributor Author

Renamed constant

@benaadams
Contributor Author

I think this can be cleaned up

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch 3 times, most recently from 9d3760e to 53582f2 Compare November 15, 2015 11:42
@benaadams
Contributor Author

Simpler?

@benaadams
Contributor Author

> Is there any way we can take advantage of CopyTo here?

It's the byte -> char conversion that needs to be explicit to get it in the right format for string; CopyTo is byte -> byte.
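
A minimal sketch of that distinction, using plain arrays instead of the pool blocks in this PR (the class and method names are hypothetical): a byte-to-byte copy leaves you with bytes, while a string needs each ASCII byte widened to a 2-byte char.

```csharp
using System;
using System.Text;

public static class WidenDemo
{
    // Explicit byte -> char widening: one ASCII byte becomes one UTF-16 code unit.
    public static string Widen(byte[] ascii)
    {
        var chars = new char[ascii.Length];
        for (int i = 0; i < ascii.Length; i++)
        {
            chars[i] = (char)ascii[i];
        }
        return new string(chars);
    }

    public static void Main()
    {
        byte[] header = Encoding.ASCII.GetBytes("Host");

        // A byte -> byte copy preserves the data but is still not a string.
        var copy = new byte[header.Length];
        Array.Copy(header, copy, header.Length);
        Console.WriteLine(copy.Length);    // 4

        // Widening produces the char data a string needs.
        Console.WriteLine(Widen(header));  // Host
    }
}
```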

> One thing we may or may not want to do is extract the logic of taking two iterators to generate a string into a separate class.

@troydai Moved the string methods from MemoryPoolIterator into extension methods; better?

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch 2 times, most recently from 145e5d1 to 9bdc06a Compare November 15, 2015 12:21
@benaadams
Contributor Author

For the Utf8->char*(string) it can be done faster with intrinsics but I don't think Vector exposes all the needed methods (see http://woboq.com/blog/utf-8-processing-using-simd.html) and probably would want to push that in the coreclr/corefx Encoding.UTF8, so this code would remain the same anyway.

}
}

private static string GetAsciiStringHeap(MemoryPoolBlock2 start, MemoryPoolIterator2 end, int inputOffset, int length)

Member

In the name of code reuse:

private static unsafe string GetAsciiStringHeap(MemoryPoolBlock2 start, MemoryPoolIterator2 end, int inputOffset, int length)
{
    var buffer = new char[length];
    fixed (char* output = buffer)
    {
        return MultiBlockAsciiIter(output, start, end, inputOffset, length);
    }
}

private static unsafe string MultiBlockAsciiIter(char* output, MemoryPoolBlock2 start, MemoryPoolIterator2 end, int inputOffset, int length)
{
  ...
}

Contributor Author

Much better, done

@troydai
Contributor

troydai commented Nov 16, 2015

:shipit:

}
else
{
requestUrlPath = pathBegin.GetAsciiString(pathEnd);

Member

Always use Utf8 for the path.

Contributor Author

Changed and reverted as discussed in outdated diff

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch from 05dea5d to 6769e1e Compare November 16, 2015 23:42
@@ -675,7 +675,7 @@ private bool TakeStartLine(SocketInput input)
pathEnd = UrlPathDecoder.Unescape(pathBegin, pathEnd);
}

- var requestUrlPath = pathBegin.GetString(pathEnd);
+ var requestUrlPath = pathBegin.GetUtf8String(pathEnd);

Contributor Author

@davidfowl just told me to always use Utf8 for the path; that was the change in the last commit.

Member

Lol @benaadams wires crossed

Member

It is utf8 because line 675 is already doing the first level hex un-escape.

Contributor Author

Wasn't in previous commit: benaadams@6769e1e

Member

Oooh, benaadams@6769e1e was better. Sorry about that.

Contributor Author

@davidfowl revert?

Contributor Author

The question is: are paths always URL-encoded?

Contributor Author

https://tools.ietf.org/html/rfc3986#page-11 seems to suggest so; but looking for a more recent one...

Contributor Author

The Internationalized Resource Identifiers (IRIs) RFC, https://www.ietf.org/rfc/rfc3987.txt, in "Mapping of IRIs to URIs", seems to say yes: convert to UTF-8, then URL-encode to ASCII. So reverting would be a safe transformation?
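
A rough sketch of that mapping as I read it, simplified to percent-encode only non-ASCII bytes; the full RFC rules also cover reserved ASCII characters, which are ignored here. The class and method names are hypothetical. It suggests why the raw path on the wire can be read as ASCII, and why UTF-8 decoding only matters once the %XX escapes have been undone.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class IriMappingSketch
{
    // Encode as UTF-8, then percent-encode every non-ASCII byte.
    // The result contains only ASCII characters.
    public static string ToAsciiPath(string path)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(path))
        {
            if (b < 0x80)
            {
                sb.Append((char)b);
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2"));
            }
        }
        return sb.ToString();
    }

    // Reverse direction: undo %XX escapes into raw bytes, then decode as UTF-8.
    // Assumes well-formed escapes and ASCII-only input.
    public static string FromAsciiPath(string escaped)
    {
        var bytes = new List<byte>();
        for (int i = 0; i < escaped.Length; i++)
        {
            if (escaped[i] == '%' && i + 2 < escaped.Length)
            {
                bytes.Add(Convert.ToByte(escaped.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                bytes.Add((byte)escaped[i]);
            }
        }
        return Encoding.UTF8.GetString(bytes.ToArray());
    }

    public static void Main()
    {
        string wire = ToAsciiPath("/caf\u00E9");
        Console.WriteLine(wire);                              // /caf%C3%A9
        Console.WriteLine(FromAsciiPath(wire) == "/caf\u00E9"); // True
    }
}
```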

Contributor Author

Reverted

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch from c18dd0f to 30fdee1 Compare November 17, 2015 01:33
@benaadams
Copy link
Contributor Author

Adding some tests for this

@benaadams benaadams force-pushed the MemoryPoolIterator2-GetString branch from fc8e1fa to 4dc4346 Compare November 17, 2015 03:46
davidfowl added a commit that referenced this pull request Nov 17, 2015
@davidfowl davidfowl merged commit 9c47796 into aspnet:dev Nov 17, 2015
@benaadams benaadams deleted the MemoryPoolIterator2-GetString branch November 17, 2015 04:16