-
-
Notifications
You must be signed in to change notification settings - Fork 33.1k
src: reduce the nearest parent package JSON cache size #59888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
src: reduce the nearest parent package JSON cache size #59888
Conversation
Review requested:
|
Why not put the cache on the C++ side? cc @nodejs/performance |
@GeoffreyBooth the boundary crossing is adding up to a big overhead in some modules (see current JS cache addition). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #59888 +/- ##
==========================================
- Coverage 88.28% 88.27% -0.01%
==========================================
Files 702 702
Lines 206804 206901 +97
Branches 39793 39808 +15
==========================================
+ Hits 182571 182651 +80
- Misses 16238 16264 +26
+ Partials 7995 7986 -9
🚀 New features to boost your workflow:
|
ebf17f2
to
16d38f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM on green ci.
I think the CI failure is a flake: https://ci.nodejs.org/job/node-test-commit-aix/nodes=aix72-ppc64/59052/console |
Fixes #59733
This PR addresses an increase in heap usage in Node versions containing the change introduced in #59086. That change added a cache to
getNearestParentPackageJSON
inpackage_reader_json
to avoid the overhead of repeatedly calling in to C++ when this function is invoked repeatedly on the same file path. There was a similar cache in the original implementation that existed prior to a change (#50322) which moved thispackage.json
reading to C++. However, the original cache was a map frompackage.json
path to a deserialized object, whereas the new cache is a map from any given module file path to an object representing its parentpackage.json
file. Because it's common for multiple files to share a parentpackage.json
, this cache often contains many more entries than original cache, with many of them pointing to duplicate data.I've addressed the issue here by moving the filesystem traversal which finds the parent
package.json
file for a given file back to the JS side of the fence, and adjusting the cache to be a map frompackage.json
file path to its deserialized object. Doing this makes the cache size manageable again. The traversal code is basically lifted from the original JS-side implementation that got removed.GetNearestParentPackageJSON
on the C++ side isn't called anywhere anymore, so I've also removed that.Memory benchmarks
I measured memory usage with this script.
date-fns
is structured in a way that clearly exhibits the heap usage change:main
Post-change
Runtime benchmarks
I tested runtime impact with two scripts, one of which is the benchmark from the original PR that re-introduced the cache. I will say that I'm not totally confident in the DataDog/CDK reproduction from the original issue. Even without any caching at all, I don't see performance as bad as the original reports.
main
Post-change