Optimize Levenberg Marquardt elimination by caching JunctionTree. #2340
Conversation
Add a JunctionIndexEliminationTree class that stores cluster roots using factor indices instead of full factor objects, enabling efficient reuse of the elimination tree structure across multiple Levenberg-Marquardt iterations.

Key changes:
- Add JunctionIndexEliminationTree to cache the cluster-roots structure
- Modify GaussianFactorGraph::optimize() to use the cached elimination tree
- Cache the damped-system elimination tree in LevenbergMarquardtOptimizer
- Add comprehensive unit tests for JunctionIndexEliminationTree

Performance improvement: reduces runtime from ~120s to ~90s (~25% faster) in a specific test case by avoiding repeated elimination tree construction.
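Pieced together from the diff hunks quoted later in this thread, the intended wiring looks roughly like the sketch below. The class, method, and constructor names are taken from the PR, but the surrounding helper function and its exact signatures are assumptions, not code from the diff.

```cpp
// Sketch only: mirrors how the PR's tryLambda() appears to use the cache; the helper
// function is hypothetical and the constructor signature is assumed from the diff.
#include <gtsam/linear/GaussianFactorGraph.h>
#include <gtsam/linear/JunctionIndexEliminationTree.h>  // header added by this PR

using namespace gtsam;

VectorValues solveDamped(
    GaussianFactorGraph& dampedSystem, const Ordering& ordering,
    std::shared_ptr<internal::JunctionIndexEliminationTree>& cachedTree) {
  // The elimination tree depends only on which factors touch which keys and on the
  // ordering, so it is built once and reused while the factor values change each iteration.
  if (!cachedTree) {
    cachedTree = std::make_shared<internal::JunctionIndexEliminationTree>(dampedSystem, ordering);
  }
  dampedSystem.setCachedClusterRoots(cachedTree);  // optimize() will reuse the cached roots
  return dampedSystem.optimize();
}
```

The key point is that the cached object never stores factors themselves, only indices, so each optimize() call sees the freshly damped factors.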
```cpp
namespace internal {

/// IndexedCluster: mirrors ClusterTree::Cluster but stores factor indices
struct IndexedCluster {
```
In this file I tried to keep the code as close as possible to the existing code; I just changed it to use indices instead of pointers to factors (this way it can be reused when the factors do not change and only the values and linearization change).
woohoo, that's a big win! Will review Thursday.
Pull request overview
This PR introduces an optimization to Levenberg-Marquardt iterations by caching the elimination tree structure across iterations. A new JunctionIndexEliminationTree class stores cluster roots using factor indices instead of full factor objects, avoiding repeated elimination tree construction.
Key changes:
- New `JunctionIndexEliminationTree` class that caches elimination tree structure using factor indices
- Modified `GaussianFactorGraph::optimize()` to use the cached elimination tree when available
- Added a caching mechanism in `LevenbergMarquardtOptimizer` for the damped system elimination tree
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| gtsam/linear/JunctionIndexEliminationTree.h | Header defining the new JunctionIndexEliminationTree class for cached elimination |
| gtsam/linear/JunctionIndexEliminationTree-inl.h | Internal IndexedCluster data structure implementation |
| gtsam/linear/JunctionIndexEliminationTree.cpp | Implementation of index-based elimination tree construction and elimination |
| gtsam/nonlinear/LevenbergMarquardtOptimizer.h | Added cached elimination tree member variable |
| gtsam/nonlinear/LevenbergMarquardtOptimizer.cpp | Cache creation and usage in tryLambda method |
| gtsam/linear/GaussianFactorGraph.h | Added setCachedClusterRoots method and cache member |
| gtsam/linear/GaussianFactorGraph.cpp | Modified optimize methods to use cached elimination tree |
| gtsam/linear/tests/testJunctionIndexEliminationTree.cpp | Comprehensive unit tests for JunctionIndexEliminationTree |
```cpp
// Build damped system for this lambda (adds prior factors that make it like gradient descent)
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
```
Copilot AI (Jan 8, 2026)
The comment states the damped system "will never change" but this is only true within the context of a single optimization iteration, not across all iterations. The cache is created once and reused across iterations, but the damped system itself is rebuilt each time with different lambda values. Consider clarifying that it's the elimination tree structure (cluster roots) that doesn't change, not the damped system itself.
Suggested change:
```diff
-// once we have dampped system it will never change so we can cache the cluster roots
+// The damped system is rebuilt each iteration with different lambda values, but its elimination
+// tree structure (cluster roots) depends only on the graph and ordering, so we cache the roots.
```
gtsam/linear/GaussianFactorGraph.h (Outdated)
```cpp
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};
```
Copilot AI (Jan 8, 2026)
Missing documentation for the setCachedClusterRoots method. According to the coding guidelines, all functions in header files should have doxygen-style API docs explaining parameters, return values, and the method's purpose.
```cpp
/// @}
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};

private:
```
Copilot AI (Jan 8, 2026)
The cachedJunctionElimination_ member variable is added to the private section but is not documented. Consider adding documentation to explain its purpose, especially since it affects the behavior of the optimize methods.
Suggested change:
```diff
 /// @}
+/// Set a cached elimination structure (cluster roots) to reuse during optimize.
 void setCachedClusterRoots(
     std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
   cachedJunctionElimination_ = clusterRoots;
 };

 private:
+/// Cached junction/elimination tree used to speed up repeated optimize calls.
```
```cpp
if (cachedJunctionElimination_ != nullptr) {
  return cachedJunctionElimination_->eliminate(*this, function)->optimize();
}
```
Copilot AI (Jan 8, 2026)
The check for cachedJunctionElimination_ being non-null doesn't verify that the cached elimination tree matches the current graph structure. If the graph structure has changed (different factors or connectivity), the cached tree will be incorrect. The cache should be invalidated when the graph changes, or the code should verify compatibility before using the cache.
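One possible shape of such a guard, sketched under the assumption that the cached tree could expose a compatibility check against the current graph; `isCompatibleWith` does not exist in this PR and is purely illustrative.

```cpp
// Hypothetical guard inside GaussianFactorGraph::optimize(); isCompatibleWith() is not
// part of this PR and only illustrates the suggested structure check (e.g. same number
// of factors and the same keys per factor as when the tree was built).
if (cachedJunctionElimination_ &&
    cachedJunctionElimination_->isCompatibleWith(*this)) {
  return cachedJunctionElimination_->eliminate(*this, function)->optimize();
}
// Fall back to the normal path when there is no cache or the graph structure changed.
return BaseEliminateable::eliminateMultifrontal(Ordering::COLAMD, function)->optimize();
```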
```cpp
// once we have dampped system it will never change so we can cache the cluster roots
if (cachedDampedJunctionElimination_ == nullptr) {
  cachedDampedJunctionElimination_ = std::make_shared<internal::JunctionIndexEliminationTree>(
      dampedSystem, *params_.ordering);
}

dampedSystem.setCachedClusterRoots(cachedDampedJunctionElimination_);
```
Copilot AI (Jan 8, 2026)
The new caching functionality in LevenbergMarquardtOptimizer lacks test coverage. While JunctionIndexEliminationTree has comprehensive unit tests, there are no tests that verify the caching mechanism works correctly in the actual optimization context, including edge cases like when params_.ordering is null or when the cache is reused across multiple optimization iterations.
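A sketch of the kind of test this asks for, in GTSAM's CppUnitLite style; the factors, keys, and tolerance below are illustrative assumptions, not code from this PR. It checks that running LM with an explicit ordering (which exercises the cached damped-system tree) gives the same answer as the default path.

```cpp
// Illustrative test sketch; the small pose graph below is made up for the example.
#include <CppUnitLite/TestHarness.h>
#include <gtsam/geometry/Pose2.h>
#include <gtsam/nonlinear/LevenbergMarquardtOptimizer.h>
#include <gtsam/slam/BetweenFactor.h>

using namespace gtsam;

TEST(LevenbergMarquardtOptimizer, cachedDampedTreeMatchesDefault) {
  NonlinearFactorGraph graph;
  graph.addPrior(0, Pose2(0, 0, 0), noiseModel::Isotropic::Sigma(3, 0.1));
  graph.emplace_shared<BetweenFactor<Pose2>>(
      0, 1, Pose2(1, 0, 0), noiseModel::Isotropic::Sigma(3, 0.2));

  Values initial;
  initial.insert(0, Pose2(0.1, -0.1, 0.05));
  initial.insert(1, Pose2(1.2, 0.1, -0.05));

  // With an explicit ordering the optimizer builds and reuses the cached tree.
  LevenbergMarquardtParams withOrdering;
  Ordering ordering;
  ordering.push_back(0);
  ordering.push_back(1);
  withOrdering.ordering = ordering;
  Values cached = LevenbergMarquardtOptimizer(graph, initial, withOrdering).optimize();

  // Default parameters for comparison.
  Values plain = LevenbergMarquardtOptimizer(graph, initial).optimize();

  EXPECT(assert_equal(plain, cached, 1e-6));
}

int main() {
  TestResult tr;
  return TestRegistry::runAllTests(tr);
}
```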
```cpp
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
if (cachedDampedJunctionElimination_ == nullptr) {
```
Copilot AI (Jan 8, 2026)
The cache creation dereferences params_.ordering without checking if it's null. If params_.ordering is null, this will cause a segmentation fault. Add a null check before dereferencing, or document that ordering must be provided for the cache optimization to work.
Suggested change:
```diff
-  if (cachedDampedJunctionElimination_ == nullptr) {
+  if (cachedDampedJunctionElimination_ == nullptr && params_.ordering) {
```
gtsam/linear/GaussianFactorGraph.h (Outdated)
```cpp
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};
```
Copilot AI (Jan 8, 2026)
Unnecessary semicolon after the closing brace of the method definition. Method definitions in C++ should not have a semicolon after the closing brace.
Suggested change:
```diff
-  };
+  }
```
```cpp
// Build damped system for this lambda (adds prior factors that make it like gradient descent)
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
```
Copilot AI (Jan 8, 2026)
Spelling error: "dampped" should be "damped".
Suggested change:
```diff
-// once we have dampped system it will never change so we can cache the cluster roots
+// once we have damped system it will never change so we can cache the cluster roots
```
```cpp
/**
 * @brief A class that encapsulates indexed cluster roots for elimination
 */
class JunctionIndexEliminationTree {
```
Copilot AI (Jan 8, 2026)
Missing GTSAM_EXPORT macro. According to the coding guidelines, on Windows it is necessary to explicitly export all functions from the library which should be externally accessible. The JunctionIndexEliminationTree class should include the GTSAM_EXPORT macro in its definition to ensure proper visibility across DLL boundaries on Windows.
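For reference, the usual GTSAM pattern puts the macro directly on the class definition; a sketch (not the PR's code):

```cpp
#include <gtsam/dllexport.h>  // defines GTSAM_EXPORT

/**
 * @brief A class that encapsulates indexed cluster roots for elimination
 */
class GTSAM_EXPORT JunctionIndexEliminationTree {
  // ... body unchanged ...
};
```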
```cpp
  if (cachedJunctionElimination_ != nullptr) {
    return cachedJunctionElimination_->eliminate(*this, function)->optimize();
  }
  return BaseEliminateable::eliminateMultifrontal(Ordering::COLAMD, function)
      ->optimize();
}

/* ************************************************************************* */
VectorValues GaussianFactorGraph::optimize(const Ordering& ordering, const Eliminate& function) const {
  gttic(GaussianFactorGraph_optimize);
  if (cachedJunctionElimination_ != nullptr) {
    return cachedJunctionElimination_->eliminate(*this, function)->optimize();
  }
```
Copilot AI (Jan 8, 2026)
The setCachedClusterRoots method and the caching behavior in optimize methods lack test coverage. The test file testJunctionIndexEliminationTree.cpp has a test called "cachedJunctionElimination" but it doesn't actually test that the cache is being used properly to improve performance or that the cached optimization produces the same results as non-cached optimization.
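A minimal sketch of the missing equality check, comparing cached and uncached optimize() on the same graph; `exampleGaussianFactorGraph()` is a hypothetical helper standing in for whatever fixture the test file already uses.

```cpp
// Illustrative: the cached path should reproduce the uncached multifrontal result.
GaussianFactorGraph gfg = exampleGaussianFactorGraph();  // hypothetical fixture
Ordering ordering = Ordering::Colamd(gfg);

VectorValues expected = gfg.optimize(ordering);  // uncached path

auto tree = std::make_shared<internal::JunctionIndexEliminationTree>(gfg, ordering);
gfg.setCachedClusterRoots(tree);
VectorValues actual = gfg.optimize(ordering);    // should hit the cached cluster roots

EXPECT(assert_equal(expected, actual, 1e-9));
```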
@tzvist I have not read the code in detail yet; it would help me to know the philosophy a little bit better. How do the new values of the damped system enter into the equation? Am I right to assume the system is still built as usual, but because you’re using indices, the new factors are accessed?
Yes, the damped system is built exactly the same as before (the buildDampedSystem call at line 143), but instead of recomputing the elimination structure every time, we now cache a JunctionIndexEliminationTree that only stores the indices into the factor graph.
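Put differently, each eliminate() call resolves the stored indices against whatever graph it is handed, so the freshly damped factors are picked up automatically. A toy sketch of that lookup follows; `IndexedCluster` appears in the diff above, but the `factorIndices` member and this helper are assumptions, not the PR's implementation.

```cpp
// Illustrative only: shows the index-based lookup idea, not the PR's actual code.
GaussianFactorGraph gatherClusterFactors(const internal::IndexedCluster& cluster,
                                         const GaussianFactorGraph& currentGraph) {
  GaussianFactorGraph clusterFactors;
  // The indices were fixed when the tree was built; the factors they point at carry
  // the current damping and linearization, so no structure needs to be recomputed.
  for (size_t i : cluster.factorIndices)  // factorIndices: assumed member name
    clusterFactors.push_back(currentGraph[i]);
  return clusterFactors;
}
```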
I think you have the right idea here. I’ve actually used this idea recently in the new multifrontal solver. By the way, in case you’re not aware of it, here is the link to the last PR that I just submitted: #2343. That new solver is on track to be integrated with the nonlinear optimizers, hopefully getting much more speed-up than even 75%. But I think your idea can be used to speed up any eliminatable factor graph, including hybrid, so I’m interested in exploring this regardless. Also, only linear systems without constraints can be sped up by the new multifrontal solver. That being said, I’d prefer if we could discuss the API and make it a little bit more functional. Would you be available to meet some time next week?