Fix unbounded memory usage in pickle environment #912
Merged
In my project we use incremental builds and often build in parallel with `sphinx-build -j auto`. We use sphinx-needs a lot, and in this configuration we've noticed huge memory usage during compilation, sometimes so high (30+ GB) that it crashes my laptop. After some debugging, it turns out this comes from the size of the pickled environment that stores information between incremental builds.

This file is typically found in your build directory as `environment.pickle`. The problem is that it grows after every incremental parallel build. This can be seen by rebuilding our project with something like:

When doing this, we notice that the file size of `environment.pickle` increases every time. Of course, when a build is run, this file is unpickled into RAM by each sphinx-build process.

Looking more closely at the cause, it comes from `env.needs_all_docs["all"]`. In `add_doc()`, this list is correctly appended to only if the element is not already in it:

sphinx-needs/sphinx_needs/utils.py
Lines 575 to 576 in 6d7740f
However, `merge_data()` appends both lists without checking for duplicates. This can result in an unbounded increase in the size of the list in the build environment:

sphinx-needs/sphinx_needs/needs.py
Lines 730 to 731 in 6d7740f
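To illustrate why plain concatenation grows without bound, here is a minimal sketch. It is not the code from `needs.py`: `naive_merge` is a hypothetical stand-in, and it assumes each parallel worker starts from a copy of the main environment's list, so the merge re-adds entries the main list already holds.

```python
def naive_merge(env_docs, worker_docs):
    """Hypothetical stand-in: concatenate without any duplicate check."""
    env_docs.extend(worker_docs)


all_docs = ["index", "usage"]
sizes = []
for build in range(3):
    # Assumption: the worker's list begins as a copy of the main list.
    worker_docs = list(all_docs)
    naive_merge(all_docs, worker_docs)
    sizes.append(len(all_docs))

# The list doubles on every merge although only two documents exist.
print(sizes)
```

Under these assumptions the list length doubles on each merge, which would match the steadily growing `environment.pickle` described above.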
This can also be seen by running this simple script in your build directory:
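The original script is not reproduced here. As a rough sketch of such a check, the following loads the pickled environment and compares the total number of entries in `env.needs_all_docs["all"]` with the number of distinct ones. The `ENV_PATH` value and the `duplicate_report` helper are assumptions for illustration; unpickling also requires Sphinx and sphinx-needs to be importable, and should only be done on a file you trust.

```python
import pickle
from pathlib import Path

# Assumption: adjust this to where your build stores the environment.
ENV_PATH = Path("environment.pickle")


def duplicate_report(docs):
    """Return (total entries, distinct entries) for a list of doc names."""
    return len(docs), len(set(docs))


def main():
    if not ENV_PATH.exists():
        print(f"{ENV_PATH} not found; run this from the directory that contains it")
        return
    with ENV_PATH.open("rb") as f:
        env = pickle.load(f)  # needs sphinx and sphinx-needs installed
    total, unique = duplicate_report(env.needs_all_docs["all"])
    print(f"{total} entries, {unique} unique, {total - unique} duplicates")


if __name__ == "__main__":
    main()
```

A duplicate count that keeps rising across incremental parallel builds would point to the behavior described in this PR.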
This PR fixes this issue by merging lists without duplicates in `merge_data()`.
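One way such a duplicate-free merge could look is sketched below. This is not the actual patch: `merge_unique` is a hypothetical helper name, and it assumes the list holds hashable document names whose order should be preserved.

```python
def merge_unique(target, incoming):
    """Append items from incoming to target, skipping ones already present."""
    seen = set(target)  # set lookup keeps the merge linear in list size
    for item in incoming:
        if item not in seen:
            target.append(item)
            seen.add(item)


main_docs = ["index", "usage"]
merge_unique(main_docs, ["usage", "api", "api"])
print(main_docs)
```

With a merge like this, repeating the same merge leaves the list unchanged, so its size is bounded by the number of documents in the project.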