Performance regression node 16.20.0 -> 18.16.0 #95

Open
LuisFros opened this issue Jun 7, 2023 · 19 comments
@LuisFros

LuisFros commented Jun 7, 2023

I am running a simple HTTP GraphQL server and following the guide at https://nodejs.org/en/docs/guides/simple-profiling

  • What factors or changes could account for this performance regression?
  • Internal profiling shows no significant difference; the regression is mostly visible in request/response times

------------- NODE 16 ----------------

This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Finished 250 requests


Server Software:        
Server Hostname:        localhost
Server Port:            3100

Document Path:          /graphql
Document Length:        34 bytes

Concurrency Level:      20
Time taken for tests:   3.203 seconds
Complete requests:      250
Failed requests:        0
Keep-Alive requests:    250
Total transferred:      75000 bytes
Total body sent:        367750
HTML transferred:       8500 bytes
Requests per second:    78.06 [#/sec] (mean)
Time per request:       256.227 [ms] (mean)
Time per request:       12.811 [ms] (mean, across all concurrent requests)
Transfer rate:          22.87 [Kbytes/sec] received
                        112.13 kb/s sent
                        135.00 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       1
Processing:   186  224  60.2    208     451
Waiting:      186  224  60.2    208     451
Total:        186  224  60.4    208     452

Percentage of the requests served within a certain time (ms)
  50%    208
  66%    213
  75%    215
  80%    217
  90%    224
  95%    424
  98%    440
  99%    447
 100%    452 (longest request)

------------- NODE 18 ----------------

This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Finished 250 requests


Server Software:        
Server Hostname:        localhost
Server Port:            3100

Document Path:          /graphql
Document Length:        34 bytes

Concurrency Level:      20
Time taken for tests:   29.894 seconds
Complete requests:      250
Failed requests:        0
Keep-Alive requests:    250
Total transferred:      75000 bytes
Total body sent:        367750
HTML transferred:       8500 bytes
Requests per second:    8.36 [#/sec] (mean)
Time per request:       2391.480 [ms] (mean)
Time per request:       119.574 [ms] (mean, across all concurrent requests)
Transfer rate:          2.45 [Kbytes/sec] received
                        12.01 kb/s sent
                        14.46 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       1
Processing:   191  345 1682.1    219   26814
Waiting:      191  345 1682.0    219   26813
Total:        191  345 1682.1    219   26814

Percentage of the requests served within a certain time (ms)
  50%    219
  66%    226
  75%    229
  80%    235
  90%    257
  95%    470
  98%    473
  99%    473
 100%  26814 (longest request)
@aduh95

aduh95 commented Jun 7, 2023

Duplicate of #79 and/or #72?

@H4ad
Member

H4ad commented Jun 7, 2023

@LuisFros Can you share some code on Stackblitz or some repo?

Also, are you using the crypto module?

@LuisFros
Author

LuisFros commented Jun 9, 2023

I am using the crypto module and already handled the performance issue from #72.
Unfortunately I can't share the code, but how could I provide more information without the code itself?

@NewEraCracker

For your specific usage scenario, Node.js 18 appears to exhibit a significant performance regression compared to Node.js 16: a throughput drop of roughly 88% (fewer requests processed in the same time frame).

A proof of concept would be useful, nonetheless.

Could you perhaps try a custom build of Node.js 18 linked against shared OpenSSL 1.1.1, and run your benchmark again to compare the results?

@LuisFros
Author

LuisFros commented Jun 12, 2023

I haven't tried running the custom build with OpenSSL 1.1.1, since profiling no longer shows the performance regression being caused by the crypto module.

Unrelated to my issue, I managed to create a reproducible case showing a significant performance regression when a function is passed down as an argument and then called.
Here is the code example: https://stackblitz.com/edit/stackblitz-starters-sb65sg?file=index.js
When running the code in index.js on node 18.16.0, the second benchmark case ('Nested call') is ~3x slower than on 16.20.

EDIT: I forgot to mention that the regression is gone in node 20
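For readers without access to the Stackblitz, here is a minimal sketch of the pattern being benchmarked. The function names and workload are illustrative, not the actual Stackblitz code:

```javascript
// Compare calling a function directly vs. passing it down as an
// argument and calling it through that reference ("nested call").
function work(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += i;
  return sum;
}

function nested(fn, n) {
  return fn(n); // call through the passed-in reference
}

function bench(label, run) {
  const start = process.hrtime.bigint();
  let acc = 0;
  for (let i = 0; i < 1e5; i++) acc += run();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(1)} ms`);
  return acc;
}

const direct = bench('Direct call', () => work(100));
const viaArg = bench('Nested call', () => nested(work, 100));
console.log(direct === viaArg); // both paths compute the same result
```

On an affected Node build, the 'Nested call' timing would be the one showing the slowdown.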

@LuisFros
Author

LuisFros commented Jun 16, 2023

I created a sample case for the issue with a simple HTTP server, comparing response times across Node versions:

Code:

const http = require('http')

const host = 'localhost'
const port = parseInt(process.argv[2]) || 3000

const server = http.createServer((req, res) => {
  if (req.url === '/') {
    res.writeHead(200, { 'Content-Type': 'text/plain' })
    res.end(`Use ${host}:${port}/test`)
  } else if (req.url === '/test') {
    res.writeHead(200, { 'Content-Type': 'text/plain' })
    res.end('test')
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' })
    res.end('Not found')
  }
})

server.listen(port, host, () => {
  console.log(`Server is running on http://${host}:${port}`)
})

The tests were run using

ab -k -c $concurrency -n $count "http://127.0.0.1:$port/test"

The x-axis label is {$concurrency}-{$requests} and the y-axis is requests/sec from the ab output.

[image: requests/sec per Node version for each {$concurrency}-{$requests} combination]

It is clear that at high concurrency levels there is a significant performance regression for a simple HTTP server.
[image: benchmark results at higher concurrency levels]

With 150 concurrent connections and over 200K requests, 18.16.0 is around 5% slower than 16.20.0.

@RafaelGSS
Member

@LuisFros The regression between 18 and 20 is quite different from the ones I measured some time ago (I haven't published them anywhere yet). Would you mind sharing the hardware information? Was it a dedicated machine?

@LuisFros
Author

@RafaelGSS
Yeah it was a dedicated machine:

  • Ubuntu 22.04 LTS
  • Intel Broadwell
  • x86/64
  • 4vCPU 16GB Memory

@LuisFros
Author

LuisFros commented Sep 4, 2023

Hi, are there any updates on this from the performance team?

@LuisFros
Author

LuisFros commented Sep 8, 2023

UPDATE: I ran the tests again with the latest versions of each line (node 20.6.0, node 18.17.1, and node 16.20.2) and with more combinations of {$concurrency}-{$requests}

  • The results still show a performance regression, and node 20 has given even worse results
[image: updated requests/sec comparison across node 16.20.2, 18.17.1, and 20.6.0]

@santigimeno
Member

I've been looking into the v20 regression.

The issue seems to be caused by this libuv commit, which makes libuv stop accepting connections in a loop and instead defers handling of each subsequent incoming connection to the next loop iteration. Reverting it makes v20 on par with v18.

As explained in the commit, the main reason for this change is that in most scenarios we save an extra accept() call. Are we sure the scenario this benchmark shows is representative of real-life usage, or is it more of an academic case? Just asking so we can decide whether to revert that commit.

/cc @bnoordhuis, what are your thoughts?

@bnoordhuis
Member

I'm willing to believe it impacts a single-process hello world server but I'm ambivalent on whether that's something to optimize for.

I'd be interested to see how the cluster module fares with NODE_CLUSTER_SCHED_POLICY=none. A big reason for removing the accept loop is that it results in lopsided load distribution.

Node works around that by doing round robin distribution but I don't want to force other libuv users into working around libuv's shortcomings.

@mcollina
Member

My experience with ab is that it opens way more sockets than needed, stressing those parts.

@bnoordhuis
Member

That's true. ab is kind of horrible, wrk is better in just about every aspect.

@LuisFros
Author

My experience with ab is that it opens way more sockets than needed, stressing those parts.

Do you think it would be worth replicating the tests with another tool? Is wrk the recommended way to test this?

@RafaelGSS
Member

My experience with ab is that it opens way more sockets than needed, stressing those parts.

Do you think it would be worth replicating the tests with another tool? Is wrk the recommended way to test this?

I suggest wrk2.

@violetviolinist

Hello, are there any updates on this? I think I am seeing a similar degradation while moving from v14 to v20.18.0. Does v22 address this in any capacity?

@lemire
Member

lemire commented Nov 6, 2024

@violetviolinist Reading the comments suggests that people are unclear as to whether it is an actual meaningful regression.

@NewEraCracker

In my application I do not see any noticeable impact from this issue. Could this be connected to the fact that Node.js 18 prefers IPv6 by default?

https://x.com/matteocollina/status/1640384245834055680

@matteocollina In Node.js v18 we removed an old-time fix that reordered the DNS results to put IPv4 addresses before IPv6. As IPv6 gets widespread usage, this is the right call to do.

This has exposed so many broken IPv6 networks. Work around it with:

node --dns-result-order=ipv4first

I've actually updated the services I could to version 22 and it is quite good.


10 participants