http: add parser kOnStreamAlloc callback for faster uploads #52176
Conversation
How would you get this exposed in the http server?
Signed-off-by: Guy Margalit <[email protected]>
@mcollina I thought of adding an option to `createServer` which would be passed to the parser as is:

* `requestBufferAllocator` {Function} If provided, this function is called
  to allocate buffers for the incoming request body on uploads. These buffers
  are emitted to the request 'data' event, and the stream consumer can recycle
  them to reduce redundant memory copies and the garbage collection of many
  small buffers. This is useful for servers that want to maximize their upload
  throughput and are willing to pay some memory for that.
  The provided function is called with a single number argument `length` and
  should return a buffer of that exact size for maximum efficiency; if its
  return value is not a buffer, it is ignored and the parser falls back to the
  usual buffer copying (see the sketch below).

I originally tried to come up with ways to encapsulate this inside the http module, for example by setting the size of the buffer pool to be used internally. However, this doesn't really work out, because in order to recycle the JS buffers, the request consumer has to be involved and put the buffers back into the pool only once it has finished processing the data in them. If you think this is a reasonable API option for `http.createServer`, I would happily add it to the PR.
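For illustration, here is a minimal sketch of how a server might wire this up, assuming the proposed `requestBufferAllocator` option lands as described above. The option name comes from this proposal, and the pool and helper functions are illustrative, not an existing API:

```js
const http = require('http');

// Hypothetical free-list of reusable buffers for the proposed
// `requestBufferAllocator` option. The request handler returns each buffer
// to the pool once it has finished processing that chunk.
const pool = [];

function allocateRequestBuffer(length) {
  // Reuse the most recently recycled buffer if it happens to be the exact
  // size requested; otherwise allocate a fresh buffer of that size.
  const buf = pool.length > 0 ? pool[pool.length - 1] : undefined;
  if (buf !== undefined && buf.length === length) return pool.pop();
  return Buffer.allocUnsafe(length);
}

function recycle(buf) {
  if (pool.length < 64) pool.push(buf); // cap the pool size
}

// `requestBufferAllocator` is the option proposed in this PR; it is not part
// of the current http API.
const server = http.createServer(
  { requestBufferAllocator: allocateRequestBuffer },
  (req, res) => {
    req.on('data', (chunk) => {
      // ... consume `chunk` (e.g. write it to storage) ...
      recycle(chunk); // hand the buffer back only after we are done with it
    });
    req.on('end', () => res.end('ok'));
  }
);

server.listen(8080);
```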
(force-pushed from a617f84 to 0031663)
Other than a custom allocator option like …
I like the custom option for the Server constructor.
The current parser `OnStreamAlloc` implementation uses a static 64 KB memory buffer for the incoming upload data, which is then copied into a new JS buffer for every `onBody` event. This copies the data twice and also creates a lot of small buffers that overwhelm the GC.
This PR lets the user provide a callback function that allocates the buffers used to receive the data. These buffers are emitted to the http request, and once the user is done processing the data, they can recycle the buffer objects through a simple buffer pool (a minimal sketch follows below).
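To make the recycling contract concrete, a "simple buffer pool" of the kind described here could look roughly like the sketch below. The class name, methods, and sizing policy are illustrative and not part of this PR:

```js
// Illustrative consumer-side buffer pool (not part of this PR). The
// allocation callback hands out buffers from the free list, and the request
// consumer returns each buffer once it has finished processing the chunk,
// so the same memory is reused instead of creating a new small buffer for
// every onBody event.
class BufferPool {
  constructor(maxBuffers = 64) {
    this.maxBuffers = maxBuffers;
    this.free = [];
  }

  // Used as the parser allocation callback: return a buffer of exactly
  // `length` bytes, reusing a pooled one when possible.
  alloc(length) {
    const buf = this.free.length > 0 ? this.free[this.free.length - 1] : undefined;
    if (buf !== undefined && buf.length === length) return this.free.pop();
    return Buffer.allocUnsafe(length);
  }

  // Called by the request consumer after the chunk has been processed.
  release(buf) {
    if (this.free.length < this.maxBuffers) this.free.push(buf);
  }
}
```

The request's 'data' handler would call `release(chunk)` once it has finished with the chunk, keeping the pool warm for subsequent reads.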
This was observed while testing upload performance for https://github.com/noobaa/noobaa-core: every node process spent a lot of time on GC and memcpy and could not exceed an upload throughput of 3 GB/sec. With this change, a significant boost in upload performance was observed, reaching almost 2x the current throughput.
Here are the results from the provided `benchmark/http/server_upload.js`, running on a Mac M1:
I also considered adding an http server option to expose this capability, but I wasn't sure whether that would be an acceptable API change.
Would be great to get your review! Thanks in advance.