Optimize Levenberg Marquardt elimination by caching JunctionTree. #2340
Conversation
Add a JunctionIndexEliminationTree class that stores cluster roots using factor indices instead of full factor objects, enabling efficient reuse of the elimination tree structure across multiple Levenberg-Marquardt iterations.

Key changes:
- Add JunctionIndexEliminationTree to cache the cluster-roots structure
- Modify GaussianFactorGraph::optimize() to use the cached elimination tree
- Cache the damped-system elimination tree in LevenbergMarquardtOptimizer
- Add comprehensive unit tests for JunctionIndexEliminationTree

Performance improvement: reduces runtime from ~120s to ~90s (~25% faster) in a specific test case by avoiding repeated elimination tree construction.
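Pieced together from the diff hunks quoted later in this thread, the intended wiring looks roughly like the sketch below. The class, method, and constructor names are taken from the PR, but the surrounding helper function and its exact signatures are assumptions, not code from the diff.

```cpp
// Sketch only: mirrors how the PR's tryLambda() appears to use the cache; the helper
// function is hypothetical and the constructor signature is assumed from the diff.
#include <gtsam/linear/GaussianFactorGraph.h>
#include <gtsam/linear/JunctionIndexEliminationTree.h>  // header added by this PR

using namespace gtsam;

VectorValues solveDamped(
    GaussianFactorGraph& dampedSystem, const Ordering& ordering,
    std::shared_ptr<internal::JunctionIndexEliminationTree>& cachedTree) {
  // The elimination tree depends only on which factors touch which keys and on the
  // ordering, so it is built once and reused while the factor values change each iteration.
  if (!cachedTree) {
    cachedTree = std::make_shared<internal::JunctionIndexEliminationTree>(dampedSystem, ordering);
  }
  dampedSystem.setCachedClusterRoots(cachedTree);  // optimize() will reuse the cached roots
  return dampedSystem.optimize();
}
```

The key point is that the cached object never stores factors themselves, only indices, so each optimize() call sees the freshly damped factors.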
```cpp
namespace internal {

/// IndexedCluster: mirrors ClusterTree::Cluster but stores factor indices
struct IndexedCluster {
```
In this file I tried to keep the code as close as possible to the existing code; I just changed it to use indices instead of pointers to factors (this way it can be reused when the factors do not change and only the values and linearization change).
woohoo, that's a big win! Will review Thursday.
Pull request overview
This PR introduces an optimization to Levenberg-Marquardt iterations by caching the elimination tree structure across iterations. A new JunctionIndexEliminationTree class stores cluster roots using factor indices instead of full factor objects, avoiding repeated elimination tree construction.
Key changes:
- New `JunctionIndexEliminationTree` class that caches elimination tree structure using factor indices
- Modified `GaussianFactorGraph::optimize()` to use the cached elimination tree when available
- Added a caching mechanism in `LevenbergMarquardtOptimizer` for the damped system elimination tree
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| gtsam/linear/JunctionIndexEliminationTree.h | Header defining the new JunctionIndexEliminationTree class for cached elimination |
| gtsam/linear/JunctionIndexEliminationTree-inl.h | Internal IndexedCluster data structure implementation |
| gtsam/linear/JunctionIndexEliminationTree.cpp | Implementation of index-based elimination tree construction and elimination |
| gtsam/nonlinear/LevenbergMarquardtOptimizer.h | Added cached elimination tree member variable |
| gtsam/nonlinear/LevenbergMarquardtOptimizer.cpp | Cache creation and usage in tryLambda method |
| gtsam/linear/GaussianFactorGraph.h | Added setCachedClusterRoots method and cache member |
| gtsam/linear/GaussianFactorGraph.cpp | Modified optimize methods to use cached elimination tree |
| gtsam/linear/tests/testJunctionIndexEliminationTree.cpp | Comprehensive unit tests for JunctionIndexEliminationTree |
```cpp
// Build damped system for this lambda (adds prior factors that make it like gradient descent)
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
```
Copilot AI (Jan 8, 2026)
The comment states the damped system "will never change" but this is only true within the context of a single optimization iteration, not across all iterations. The cache is created once and reused across iterations, but the damped system itself is rebuilt each time with different lambda values. Consider clarifying that it's the elimination tree structure (cluster roots) that doesn't change, not the damped system itself.
Suggested change:
```diff
-// once we have dampped system it will never change so we can cache the cluster roots
+// The damped system is rebuilt each iteration with different lambda values, but its elimination
+// tree structure (cluster roots) depends only on the graph and ordering, so we cache the roots.
```
gtsam/linear/GaussianFactorGraph.h (Outdated)
```cpp
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};
```
Copilot AI (Jan 8, 2026)
Missing documentation for the setCachedClusterRoots method. According to the coding guidelines, all functions in header files should have doxygen-style API docs explaining parameters, return values, and the method's purpose.
```cpp
/// @}
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};

private:
```
Copilot AI (Jan 8, 2026)
The cachedJunctionElimination_ member variable is added to the private section but is not documented. Consider adding documentation to explain its purpose, especially since it affects the behavior of the optimize methods.
Suggested change:
```diff
 /// @}
+/// Set a cached elimination structure (cluster roots) to reuse during optimize.
 void setCachedClusterRoots(
     std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
   cachedJunctionElimination_ = clusterRoots;
 };

 private:
+/// Cached junction/elimination tree used to speed up repeated optimize calls.
```
```cpp
if (cachedJunctionElimination_ != nullptr) {
  return cachedJunctionElimination_->eliminate(*this, function)->optimize();
}
```
Copilot AI (Jan 8, 2026)
The check for cachedJunctionElimination_ being non-null doesn't verify that the cached elimination tree matches the current graph structure. If the graph structure has changed (different factors or connectivity), the cached tree will be incorrect. The cache should be invalidated when the graph changes, or the code should verify compatibility before using the cache.
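One possible shape of such a guard, sketched under the assumption that the cached tree could expose a compatibility check against the current graph; `isCompatibleWith` does not exist in this PR and is purely illustrative.

```cpp
// Hypothetical guard inside GaussianFactorGraph::optimize(); isCompatibleWith() is not
// part of this PR and only illustrates the suggested structure check (e.g. same number
// of factors and the same keys per factor as when the tree was built).
if (cachedJunctionElimination_ &&
    cachedJunctionElimination_->isCompatibleWith(*this)) {
  return cachedJunctionElimination_->eliminate(*this, function)->optimize();
}
// Fall back to the normal path when there is no cache or the graph structure changed.
return BaseEliminateable::eliminateMultifrontal(Ordering::COLAMD, function)->optimize();
```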
```cpp
// once we have dampped system it will never change so we can cache the cluster roots
if (cachedDampedJunctionElimination_ == nullptr) {
  cachedDampedJunctionElimination_ = std::make_shared<internal::JunctionIndexEliminationTree>(
      dampedSystem, *params_.ordering);
}

dampedSystem.setCachedClusterRoots(cachedDampedJunctionElimination_);
```
Copilot AI (Jan 8, 2026)
The new caching functionality in LevenbergMarquardtOptimizer lacks test coverage. While JunctionIndexEliminationTree has comprehensive unit tests, there are no tests that verify the caching mechanism works correctly in the actual optimization context, including edge cases like when params_.ordering is null or when the cache is reused across multiple optimization iterations.
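A sketch of the kind of test this asks for, in GTSAM's CppUnitLite style; the factors, keys, and tolerance below are illustrative assumptions, not code from this PR. It checks that running LM with an explicit ordering (which exercises the cached damped-system tree) gives the same answer as the default path.

```cpp
// Illustrative test sketch; the small pose graph below is made up for the example.
#include <CppUnitLite/TestHarness.h>
#include <gtsam/geometry/Pose2.h>
#include <gtsam/nonlinear/LevenbergMarquardtOptimizer.h>
#include <gtsam/slam/BetweenFactor.h>

using namespace gtsam;

TEST(LevenbergMarquardtOptimizer, cachedDampedTreeMatchesDefault) {
  NonlinearFactorGraph graph;
  graph.addPrior(0, Pose2(0, 0, 0), noiseModel::Isotropic::Sigma(3, 0.1));
  graph.emplace_shared<BetweenFactor<Pose2>>(
      0, 1, Pose2(1, 0, 0), noiseModel::Isotropic::Sigma(3, 0.2));

  Values initial;
  initial.insert(0, Pose2(0.1, -0.1, 0.05));
  initial.insert(1, Pose2(1.2, 0.1, -0.05));

  // With an explicit ordering the optimizer builds and reuses the cached tree.
  LevenbergMarquardtParams withOrdering;
  Ordering ordering;
  ordering.push_back(0);
  ordering.push_back(1);
  withOrdering.ordering = ordering;
  Values cached = LevenbergMarquardtOptimizer(graph, initial, withOrdering).optimize();

  // Default parameters for comparison.
  Values plain = LevenbergMarquardtOptimizer(graph, initial).optimize();

  EXPECT(assert_equal(plain, cached, 1e-6));
}

int main() {
  TestResult tr;
  return TestRegistry::runAllTests(tr);
}
```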
```cpp
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
if (cachedDampedJunctionElimination_ == nullptr) {
```
Copilot AI (Jan 8, 2026)
The cache creation dereferences params_.ordering without checking if it's null. If params_.ordering is null, this will cause a segmentation fault. Add a null check before dereferencing, or document that ordering must be provided for the cache optimization to work.
Suggested change:
```diff
-  if (cachedDampedJunctionElimination_ == nullptr) {
+  if (cachedDampedJunctionElimination_ == nullptr && params_.ordering) {
```
gtsam/linear/GaussianFactorGraph.h (Outdated)
```cpp
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};
```
Copilot AI (Jan 8, 2026)
Unnecessary semicolon after the closing brace of the method definition. Method definitions in C++ should not have a semicolon after the closing brace.
Suggested change:
```diff
-  };
+  }
```
```cpp
// Build damped system for this lambda (adds prior factors that make it like gradient descent)
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
```
Copilot AI (Jan 8, 2026)
Spelling error: "dampped" should be "damped".
Suggested change:
```diff
-// once we have dampped system it will never change so we can cache the cluster roots
+// once we have damped system it will never change so we can cache the cluster roots
```
```cpp
/**
 * @brief A class that encapsulates indexed cluster roots for elimination
 */
class JunctionIndexEliminationTree {
```
Copilot AI (Jan 8, 2026)
Missing GTSAM_EXPORT macro. According to the coding guidelines, on Windows it is necessary to explicitly export all functions from the library which should be externally accessible. The JunctionIndexEliminationTree class should include the GTSAM_EXPORT macro in its definition to ensure proper visibility across DLL boundaries on Windows.
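For reference, the usual GTSAM pattern puts the macro directly on the class definition; a sketch (not the PR's code):

```cpp
#include <gtsam/dllexport.h>  // defines GTSAM_EXPORT

/**
 * @brief A class that encapsulates indexed cluster roots for elimination
 */
class GTSAM_EXPORT JunctionIndexEliminationTree {
  // ... body unchanged ...
};
```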
```cpp
  if (cachedJunctionElimination_ != nullptr) {
    return cachedJunctionElimination_->eliminate(*this, function)->optimize();
  }
  return BaseEliminateable::eliminateMultifrontal(Ordering::COLAMD, function)
      ->optimize();
}

/* ************************************************************************* */
VectorValues GaussianFactorGraph::optimize(const Ordering& ordering, const Eliminate& function) const {
  gttic(GaussianFactorGraph_optimize);
  if (cachedJunctionElimination_ != nullptr) {
    return cachedJunctionElimination_->eliminate(*this, function)->optimize();
  }
```
Copilot AI (Jan 8, 2026)
The setCachedClusterRoots method and the caching behavior in optimize methods lack test coverage. The test file testJunctionIndexEliminationTree.cpp has a test called "cachedJunctionElimination" but it doesn't actually test that the cache is being used properly to improve performance or that the cached optimization produces the same results as non-cached optimization.
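A minimal sketch of the missing equality check, comparing cached and uncached optimize() on the same graph; `exampleGaussianFactorGraph()` is a hypothetical helper standing in for whatever fixture the test file already uses.

```cpp
// Illustrative: the cached path should reproduce the uncached multifrontal result.
GaussianFactorGraph gfg = exampleGaussianFactorGraph();  // hypothetical fixture
Ordering ordering = Ordering::Colamd(gfg);

VectorValues expected = gfg.optimize(ordering);  // uncached path

auto tree = std::make_shared<internal::JunctionIndexEliminationTree>(gfg, ordering);
gfg.setCachedClusterRoots(tree);
VectorValues actual = gfg.optimize(ordering);    // should hit the cached cluster roots

EXPECT(assert_equal(expected, actual, 1e-9));
```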
@tzvist I have not read the code in detail yet; it would help me to know the philosophy a little bit better. How do the new values of the damped system enter into the equation? Am I right to assume the system is still built as usual, but because you’re using indices, the new factors are accessed?
Yes, the damped system is built exactly the same as before (the buildDampedSystem call at line 143), but instead of recomputing the elimination structure every time, we now cache a JunctionIndexEliminationTree that only stores the indices into the factor graph.
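Put differently, each eliminate() call resolves the stored indices against whatever graph it is handed, so the freshly damped factors are picked up automatically. A toy sketch of that lookup follows; `IndexedCluster` appears in the diff above, but the `factorIndices` member and this helper are assumptions, not the PR's implementation.

```cpp
// Illustrative only: shows the index-based lookup idea, not the PR's actual code.
GaussianFactorGraph gatherClusterFactors(const internal::IndexedCluster& cluster,
                                         const GaussianFactorGraph& currentGraph) {
  GaussianFactorGraph clusterFactors;
  // The indices were fixed when the tree was built; the factors they point at carry
  // the current damping and linearization, so no structure needs to be recomputed.
  for (size_t i : cluster.factorIndices)  // factorIndices: assumed member name
    clusterFactors.push_back(currentGraph[i]);
  return clusterFactors;
}
```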
I think you have the right idea here. I’ve actually used this idea recently in the new multifrontal solver. By the way, in case you’re not aware of it, here is the link to the last PR that I just submitted: #2343. That new solver is on track to be integrated with the nonlinear optimizers, hopefully getting much more speed-up than even 75%. But I think your idea can be used to speed up any eliminatable factor graph, including hybrid, so I’m interested in exploring this regardless. Also, only linear systems without constraints can be sped up by the new multifrontal solver. That being said, I’d prefer if we could discuss the API and make it a little bit more functional. Would you be available to meet some time next week?