-
-
Notifications
You must be signed in to change notification settings - Fork 9.1k
[JENKINS-75675]: Refactor classloader logic in order to reduce memory consumption #10659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Yay, your first pull request towards Jenkins core was created successfully! Thank you so much! |
Would this not be better reported/patched against OpenJDK's |
@basil, thank you for the quick review. I understand your concern, but I don’t believe OpenJDK is the right place for this optimization. This isn’t an issue with URLClassLoader itself — it’s designed to work correctly in typical Java applications, where an extra 10MB per class loader is a reasonable overhead in exchange for a simpler and faster implementation. The problem becomes significant in pluggable systems that use separate class loading. So, in my view, this should be addressed within the Jenkins plugin framework. I’ve implemented a 10-line optimization that is easy to maintain or revert if it causes any issues. This is feasible because I used Guava’s MapMaker. Guava cannot be used in the OpenJDK codebase, however. If you consider the ~5–10MB additional memory overhead per plugin acceptable, then let’s just reject the PR. In my production Jenkins cluster, the heap usage is around 4GB when idle — 2GB of which is consumed by 50 billion unnecessary lock objects. It works fine, but the purist in me feels that this is not how it should be. |
Alternative: It is possible to make jenkins.util.URLClassLoader2 not capable for parallel class loading by removing
In this case, class loading will become serial — URLClassLoader will use a single lock instead of a dedicated lock per class name. |
I don't see anything in the I hate wasting memory and I am not against this PR as a short-term solution, but only after we have a discussion with the OpenJDK folks about whether or not this is an upstream problem and, if so, what the long-term plan would be. |
Right, this is just my understanding. I’ve reviewed the existing standard ClassLoader implementation and I don’t have strong arguments to ask the OpenJDK team to change a standard mechanism that already meets the needs of 99% of Java applications — just to benefit Jenkins and perhaps a few other (~10) Java-based pluggable platforms. That said, talking to the OpenJDK folks definitely makes sense if there’s a feasible path forward. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry forgot to submit this earlier.
I Agree with Basil, this appears to be something for the JDK.
Anything here should be a temporary workaround until we could pickup a JDK with a fix (should the JDK team agree this is an issue)
@@ -1,7 +1,9 @@ | |||
package jenkins.util; | |||
|
|||
import com.google.common.collect.MapMaker; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are trying to reduce the use of guava in core not increase it :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood
return super.getClassLoadingLock(className); | ||
return parallelLockWeakValuesMap.computeIfAbsent(className, k -> new Object()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why the JDK does not simply return an intern
ed String and needs to keep the lock in a map.
For sure interning will put everything in the string pool, but if there are no references then the a String in the string pool should be eligible to be GC'ed (normally a Sting literal in a class would cause a reference for as long as the class was loaded, meaning it would never actually be GCed, but for explicitly interned Strings this is different)
e.g
return className.intern();
There may be some valid reasons here - interning is not free and could cause a slowdown on classloading.
Or maybe this is just a hangover from older JDK versions before the string pool was moved to the heap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
className.intern() is jvm unique for class name. they need per classloader uniqueness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Javadoc:
* <p> Class loaders that support concurrent loading of classes are known as
* <em>{@linkplain #isRegisteredAsParallelCapable() parallel capable}</em> class
* loaders and are required to register themselves at their class initialization
* time by invoking the {@link
* #registerAsParallelCapable ClassLoader.registerAsParallelCapable}
* method. Note that the {@code ClassLoader} class is registered as parallel
* capable by default. However, its subclasses still need to register themselves
* if they are parallel capable.
* In environments in which the delegation model is not strictly
* hierarchical, class loaders need to be parallel capable, otherwise class
* loading can lead to deadlocks because the loader lock is held for the
* duration of the class loading process (see {@link #loadClass
* loadClass} methods).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They need per class loader + class name unique lock to avoid possible deadlocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyone know if we need the parallel capable class loader?
If I remember correctly, enabling parallel loading on at least some of the Jenkins class loaders resulted in a significant performance improvement. So I went ahead and enabled it everywhere. If disabling it on some of the classloaders results in a big improvement in memory usage without a significant performance regression, I would be OK with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to find out by calling parentClassLoader.getResource(className.replace('.','/') + ".class") to do exactly that
This sounds like it should work, and would be preferable to using third-party libraries from our perspective. We try very hard not to add new third-party libraries to Jenkins core for a variety of reasons (including the fact that they become part of the public API we expose to Jenkins plugins).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case you haven't read this yet: https://www.jenkins.io/doc/developer/plugin-development/dependencies-and-class-loading/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
jenkins/core/src/main/java/hudson/PluginManager.java
Lines 2416 to 2418 in 608d83c
if (name.startsWith("SimpleTemplateScript")) { // cf. groovy.text.SimpleTemplateEngine
throw new ClassNotFoundException("ignoring " + name);
}
already contains exclusions for Groovy garbage. I wonder if we need to expand what is in that list to include the classes you discovered?
groovy.tmp.*
can be added.
It might be possible to skip all classes that don't have a specified package (is this feasible?).
Skipping java.util.jenkins.model.params$IS_INTERNAL_REGISTER
and similar entries is complicated.
This happens because using params.IS_INTERNAL_REGISTER
in my pipeline causes the Groovy compiler to try all possible combinations of default packages: java.lang, java.util, java.net, groovy.lang, groovy.util, hudson.model, jenkins.model,
etc. This behavior depends on the configuration of the GroovyCompiler. I don't think jenkins-core is the right place to hardcode this list.
I'm not sure how to properly skip such entries without using ClassGraph, but perhaps we can live with them. In my setup, there are around 150k of these entries, but the number is limited and does not grow over time, unlike groovy.tmp.*.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like it should work, and would be preferable to using third-party libraries from our perspective. We try very hard not to add new third-party libraries to Jenkins core for a variety of reasons (including the fact that they become part of the public API we expose to Jenkins plugins).
Ok will redo the PR using this approach
25e29b6
to
b9045b6
Compare
fc1fa28
to
feb64f2
Compare
3609275
to
9d97d2c
Compare
9d97d2c
to
695d8ad
Compare
It definitely looks useful and it does look good to me. |
@dukhlov I have been travelling this week and have not had enough time to work on this PR. I want to get this shipped, but this PR is still a bit too big and difficult for me to understand. Even after asking for explanations, it seems that this PR trying to solve several problems at once. I would prefer if it were split into two PRs, one being the minimal changes needed to directly address the memory usage issue, and another PR with any other cleanups. Describing the change(s) with the template here would also help me process this more quickly. If these things aren't done, that isn't necessarily a blocker for this work. I will still get to it eventually, but the lack of the above would place more work on me and it would take longer. |
2db3d51
to
7a232ea
Compare
Reverted changes in UberClassloader. They mostly related to improving performance (thanks to including webapp classes into cache) and not related to decreasing memory consumption |
0c5d7f1
to
08d7509
Compare
ea731d0
to
268823e
Compare
@basil, I pushed my temp branch to this pr's repo a few days ago by mistake. I am sorry about that. Now fixed and ready for review |
4c4cb68
to
c61a84a
Compare
…onsumption Override URLClassloader2.getClassLoadingLock to use lock objects that can be GCed when not needed, Wrap WebAppClassLoader with CheckingExistenceClassLoader to avoid missing loadClass() calls not to crate unnecessary persistent lock object, Refactor delegating classloaders not to use locking: * DelegatingClassloader - non-locking base implementation and use it for classloaders which delegate functionality to other classloaders, * CheckingExistenceClassLoader - check that class exists before loadClass() calls.
c61a84a
to
0976e4e
Compare
@basil, It looks like the patch review might take longer than I initially expected. Could you roughly let me know how long it might take? That would help me plan my upcoming work. |
I am working on some performance improvements for job loading, but this is next on my TODO list after that. Hopefully by next week. |
Ah, great, thank you! |
See JENKINS-75675.
Background
Our Jenkins installation relies heavily on Groovy pipelines, shared libraries, and similar mechanisms. Groovy uses dynamic invocation via MetaClass, CallSite, etc. During introspection, it attempts to load many classes that don’t exist, simply to determine whether a given token is a class, a variable, or something else.
My investigation shows that this behavior leads to memory leakage.
Problem
The base ClassLoader implementation in Java (which all other class loaders inherit from) supports two modes:
parallelCapable (currently used by all Jenkins core class loaders, including WebAppClassLoader and PlatformClassLoader)
non-parallel (legacy mode)
Parallel-capable class loaders create and retain a lock object per class name, indefinitely. So, if we have 1,000 class loading misses, every parallel-capable class loader in the hierarchy keeps 1,000 unused lock objects forever.
In the case of Jenkins's UberClassLoader, a typical setup with ~200 plugins results in around 500 class loaders. In our Jenkins instance, about 2,000 classes are loaded successfully, while we observe over 200,000 class loading misses.
As a result, the class loaders retain:
500 class loaders × 200,000 lock objects each
Plus the internal ConcurrentHashMap.Node objects used to store them
This results in roughly 2 GB of unnecessary memory consumption.
Solution
DelegatingClassLoader
implementation which overrides baseClassLoaders.loadClass
to remove not needed base locking mechanism (it don't load class itself just delegates loading to another class loader and locking will be done here)DelegatingClassLoader
instead ofClassLoader
if possible.getClassLoadingLock()
forUrlClassLoader2
for use GCed locking object which won't be kept forevergetResource()
call) and don't useloadClass()
call not to create locking Object at allWebAppClassLoader
with FilteringClassLoader to avoid not needed lock files creationTesting done
This PR tested by running 2-hour long proprietary automation tests based on jenkins scripted pipeline jobs
class loading logic is broadly used, so it definitely covered by this testing
Also I made
jcmd GC.class_histogram
. Amount ofjava.lang.Object
andjava.util.concurrent.ConcurrentHashMap$Node
instances was decreased dramaticallyBefore fix:

After fix:

Proposed changelog entries
Proposed changelog category
/label internal
Proposed upgrade guidelines
N/A
Submitter checklist
@Restricted
or have@since TODO
Javadocs, as appropriate.@Deprecated(since = "TODO")
or@Deprecated(forRemoval = true, since = "TODO")
, if applicable.eval
to ease future introduction of Content Security Policy (CSP) directives (see documentation).Desired reviewers
@jenkinsci/core-pr-reviewers
Before the changes are marked as
ready-for-merge
:Maintainer checklist
upgrade-guide-needed
label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).lts-candidate
to be considered (see query).