http: add parser kOnStreamAlloc callback for faster uploads #52176


Open — wants to merge 1 commit into main

Conversation

@guymguym (Contributor) commented Mar 21, 2024

The current parser OnStreamAlloc model uses a static 64KB memory buffer for incoming upload data, which is then copied into a new JS buffer for every onBody event. This copies the data twice and also creates a lot of small buffers that overwhelm the GC.

This PR lets the user provide a callback function that allocates the buffers for receiving the data. These buffers are emitted to the http request, and once the user is done processing the data, it can recycle each buffer through a simple buffer pool.
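The recycling scheme can be sketched with a minimal buffer pool (the class name and default sizes here are illustrative, not part of the PR):

```javascript
'use strict';

// Minimal buffer-pool sketch for the recycling scheme described above.
// The parser-side allocator asks the pool for a buffer of a given length,
// and the request consumer releases it once the 'data' chunk is processed.
class BufferPool {
  constructor(bufSize = 65536, maxBuffers = 256) {
    this.bufSize = bufSize;
    this.maxBuffers = maxBuffers;
    this.buffers = [];
  }

  // Reuse a pooled buffer when the requested length matches the pool's
  // buffer size; otherwise allocate a fresh one.
  alloc(length) {
    if (length === this.bufSize && this.buffers.length > 0) {
      return this.buffers.pop();
    }
    return Buffer.allocUnsafe(length);
  }

  // Called by the consumer when it is done with a chunk, making the
  // memory reusable for the next read instead of garbage for the GC.
  release(buf) {
    if (buf.length === this.bufSize && this.buffers.length < this.maxBuffers) {
      this.buffers.push(buf);
    }
  }
}
```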

This was observed while testing upload performance for https://github.com/noobaa/noobaa-core, where every node process spent a lot of CPU time on GC and memcpy and could not exceed an upload throughput of 3 GB/sec. With this change, a significant boost in upload performance was observed, reaching almost 2x the current throughput.

Here are the results from the provided benchmark/http/server_upload.js, running on a Mac M1:

$ ./node benchmark/http/server_upload.js
http/server_upload.js useBufferPool=0 delay=-1 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 3,495.08
http/server_upload.js useBufferPool=1 delay=-1 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 5,855.73
http/server_upload.js useBufferPool=0 delay=0 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 3,031.98
http/server_upload.js useBufferPool=1 delay=0 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 4,831.23
http/server_upload.js useBufferPool=0 delay=1 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 3,107.22
http/server_upload.js useBufferPool=1 delay=1 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 4,926.17
http/server_upload.js useBufferPool=0 delay=5 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 3,065.13
http/server_upload.js useBufferPool=1 delay=5 backpressure=0 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 5,012.74
http/server_upload.js useBufferPool=0 delay=-1 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 3,343.51
http/server_upload.js useBufferPool=1 delay=-1 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 5,095.58
http/server_upload.js useBufferPool=0 delay=0 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 2,981.24
http/server_upload.js useBufferPool=1 delay=0 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 4,491.13
http/server_upload.js useBufferPool=0 delay=1 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 2,972.44
http/server_upload.js useBufferPool=1 delay=1 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 4,412.07
http/server_upload.js useBufferPool=0 delay=5 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 1,802.44
http/server_upload.js useBufferPool=1 delay=5 backpressure=1 upload=1048576 bufSize=65536 connections=200 duration=10 stats=0 benchmarker="wrk": 2,105.78

I also considered adding an http server option to expose this capability, but wasn't sure whether that would be acceptable as an API change.

It would be great to get your review! Thanks in advance.

@nodejs-github-bot (Collaborator)

Review requested:

  • @nodejs/http
  • @nodejs/net

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. http Issues or PRs related to the http subsystem. http_parser Issues and PRs related to the HTTP Parser dependency or the http_parser binding. needs-ci PRs that need a full CI run. typings labels Mar 21, 2024
@mcollina (Member)

How would you get this exposed in the http server?

@mcollina mcollina requested review from ShogunPanda and targos March 21, 2024 14:44
@guymguym (Contributor, Author) commented Mar 21, 2024

@mcollina I thought of adding an option to createServer which will be passed to the parser as is:

  * `requestBufferAllocator` {Function} If provided, this function is called
    to allocate buffers for the incoming request body on uploads. These buffers
    are emitted to the request's 'data' event, and the stream consumer can
    recycle them to reduce redundant memory copies and the garbage collection
    of many small buffers. This is useful for servers that want to maximize
    their upload throughput and are willing to pay some memory for it.
    The provided function is called with a single number argument `length`
    and should return a buffer of that exact size for maximum efficiency;
    if its return value is not a buffer, it is ignored and the parser falls
    back to the usual buffer copying.

I originally tried to come up with ways to encapsulate this inside the http module, for example by setting the size of a buffer pool to be used internally. However, that doesn't really work, because in order to recycle the JS buffers, the request consumer has to be involved and put the buffers back into the pool only once it has finished processing the data in them.

If you think this is a reasonable API option for http.createServer, I would happily add this to the PR.

@guymguym guymguym force-pushed the guy-node-http-onread branch from a617f84 to 0031663 Compare March 21, 2024 16:19
@guymguym guymguym changed the title http: add parser kOnStreamAlloc callback for less memcpy and gc on uploads http: add parser kOnStreamAlloc callback for faster uploads Mar 21, 2024
@guymguym (Contributor, Author) commented Mar 21, 2024

Other than a custom allocator option like requestBufferAllocator (better name suggestions are welcome), a more convenient API could be a method like http.releaseBuffer(buf) that server code can optionally call when it is done with an incoming buffer, returning it to a global http buffer pool (size controls/configs would be needed). Ideally, that API would hijack the underlying memory, so that any references to the Buffer and its ArrayBuffer taken previously cannot be used accidentally. That would make it a safe API for recycling buffers for the next http allocations. Just an idea, and I don't think it replaces the need for a more basic custom allocator option.
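A sketch of what such a hijacking releaseBuffer could do, assuming the buffer owns its ArrayBuffer (as 64KB allocations do, since they exceed Buffer.poolSize): transferring the ArrayBuffer detaches all existing views, so stale references cannot touch the recycled memory.

```javascript
'use strict';

// Hypothetical sketch of http.releaseBuffer(buf); not a real Node.js API.
// Transferring the underlying ArrayBuffer detaches it, so any previously
// taken references to the Buffer (or its ArrayBuffer) become zero-length
// and cannot accidentally observe the recycled memory.
const pool = [];

function releaseBuffer(buf) {
  // Note: only safe for buffers that own their ArrayBuffer; small
  // Buffer.allocUnsafe() buffers share Node's internal pool and must
  // not have their ArrayBuffer transferred.
  const ab = buf.buffer;
  // Move the memory into a new ArrayBuffer, detaching the old one.
  const owned = structuredClone(ab, { transfer: [ab] });
  pool.push(Buffer.from(owned));
}
```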

@ShogunPanda (Contributor)

I like the custom option for the Server constructor.
Can you try implementing it and see where it brings us?
