-
Notifications
You must be signed in to change notification settings - Fork 523
Reduce GetString allocs and conversions #312
Conversation
{ | ||
// avoid declaring other local vars, or doing work with stackalloc | ||
// to prevent the .locals init cil flag , see: https://github.com/dotnet/coreclr/issues/1279 | ||
char* output = stackalloc char[length]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How big can length be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added 16k max
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is just a limit on the max it will attempt with stackalloc
before using new
and heap allocating instead; isn't actually a parsing limit.
702e560
to
6282d0b
Compare
6b5e761
to
4161301
Compare
056bcd1
to
5de4169
Compare
@@ -10,6 +10,7 @@ namespace Microsoft.AspNet.Server.Kestrel.Infrastructure | |||
{ | |||
public struct MemoryPoolIterator2 | |||
{ | |||
private const int _maxHeaderStackLength = 16384; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason we don't want to stackalloc a char array larger than this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Total headers are generally constrained < 8k/16k So individual headers shouldn't be larger than 16k; if you are going larger than that you probably have other issues to resolve than performance - Added some related comments to: https://github.com/aspnet/KestrelHttpServer/pull/313/files#diff-6b7953b837b6ad7c921259c907520639R14
Can go larger; but that was my reasoning.
For upper levels; total stack size aspnet4 x86 was 256k, x64 was 512k; .net 1MB - not sure what the current defaults are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ref:
nginx default 8k
Apache default 8190 bytes
IIS default 16384 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add even more reasoning here: HTTP/2 compression will further reduce the average header size. I agree 16k is a good limit given what we see out in the world.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value is in the wrong layer MemoryPoolIterator2 doesn't understand what it's looking at. That's the caller's job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its to prevent the stack blowing up; so if length is larger than this it heap allocates instead. Maybe a rename like _maxStackAllocBytes
?
fee6712
to
d88d4c3
Compare
c3ca476
to
68d63ed
Compare
This looks good to me, but since this is a pretty big change I would like someone else to also sign off on it. @lodejard @davidfowl @troydai thoughts? |
Is there anyway we can take advantage of CopyTo here? We already have the memory pool interator |
@davidfowl you mean move into MemoryPoolIterator2 rather than have it in Frame? GetNextString kinda thing? |
It looks good to me. The change is straightforward. One thing we may or may not to do is to extract the logics of taking two iterators to generate a string to a separate class. The logic here expand the iterator class which in my opinion should be kept concise. The logic of creating string with two iterators is very much self-contained. |
68d63ed
to
7f50736
Compare
Renamed constant |
I think this can be cleaned up |
9d3760e
to
53582f2
Compare
Simpler? |
Its the byte -> char conversion that needs to be explicit to get it in right format for string; CopyTo is byte-> byte
@troydai Changed the string from MemoryPoolIterator to Extenstion methods; better? |
145e5d1
to
9bdc06a
Compare
For the Utf8->char*(string) it can be done faster with intrinsics but I don't think Vector exposes all the needed methods (see http://woboq.com/blog/utf-8-processing-using-simd.html) and probably would want to push that in the coreclr/corefx Encoding.UTF8, so this code would remain the same anyway. |
} | ||
} | ||
|
||
private static string GetAsciiStringHeap(MemoryPoolBlock2 start, MemoryPoolIterator2 end, int inputOffset, int length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the name of code reuse:
private static unsafe string GetAsciiStringHeap(MemoryPoolBlock2 start, MemoryPoolIterator2 end, int inputOffset, int length)
{
var buffer = new char[length];
fixed (char* output = buffer)
{
return MultiBlockAsciiIter(output, start, end, inputOffset, length);
}
}
private static unsafe string MultiBlockAsciiIter(char* output, MemoryPoolBlock2 start, MemoryPoolIterator2 end, int inputOffset, int length)
{
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better, done
|
} | ||
else | ||
{ | ||
requestUrlPath = pathBegin.GetAsciiString(pathEnd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always use Utf8 for the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed and reverted as discussed in outdated diff
05dea5d
to
6769e1e
Compare
@@ -675,7 +675,7 @@ private bool TakeStartLine(SocketInput input) | |||
pathEnd = UrlPathDecoder.Unescape(pathBegin, pathEnd); | |||
} | |||
|
|||
var requestUrlPath = pathBegin.GetString(pathEnd); | |||
var requestUrlPath = pathBegin.GetUtf8String(pathEnd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidfowl just told me to always Utf8 the path; was last commit change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol @benaadams wires crossed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is utf8 because line 675 is already doing the first level hex un-escape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't in previous commit: benaadams@6769e1e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh, benaadams@6769e1e was better. Sorry about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidfowl revert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question is are paths always url encoded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://tools.ietf.org/html/rfc3986#page-11 seems to suggest so; but looking for a more recent one...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internationalized Resource Identifiers (IRIs) https://www.ietf.org/rfc/rfc3987.txt "Mapping of IRIs to URIs"; says yes? Convert to utf8 then url-encode to ascii - so reverting would be a safe transformation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted
c18dd0f
to
30fdee1
Compare
Adding some tests for this |
fc8e1fa
to
4dc4346
Compare
Reduce GetString allocs and conversions
Resolves #291