Conversation


@tzvist tzvist commented Jan 6, 2026

Add JunctionIndexEliminationTree class that stores cluster roots using factor indices instead of full factor objects, enabling efficient reuse of elimination tree structure across multiple LevenbergMarquardt iterations.

Key changes:

  • Add JunctionIndexEliminationTree to cache cluster roots structure
  • Modify GaussianFactorGraph::optimize() to use cached elimination tree
  • Cache damped system elimination tree in LevenbergMarquardtOptimizer
  • Add comprehensive unit tests for JunctionIndexEliminationTree

Performance improvement: reduces runtime from ~120s to ~90s (~25% faster) in specific test cases by avoiding repeated elimination tree construction.
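The reuse pattern this description refers to can be sketched as follows. This is a simplified stand-in (all types and names here are illustrative, not the actual GTSAM API) showing the split between a one-time structural analysis and repeated cheap numeric passes:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Illustrative stand-in for a factor's structure: which variables it touches.
struct FactorStructure {
  std::vector<std::size_t> keys;
};

// Stand-in for the cached tree: cluster roots hold factor *indices*,
// so the layout survives relinearization as long as structure is unchanged.
struct CachedStructure {
  std::vector<std::vector<std::size_t>> clusterFactorIndices;
};

// Expensive structural analysis, done once per graph structure.
// (Toy grouping: one cluster per factor.)
std::shared_ptr<CachedStructure> buildStructure(
    const std::vector<FactorStructure>& graph) {
  auto cache = std::make_shared<CachedStructure>();
  for (std::size_t i = 0; i < graph.size(); ++i)
    cache->clusterFactorIndices.push_back({i});
  return cache;
}

// Cheap numeric pass: fresh factor values are looked up through the
// cached indices, so no structural work is repeated.
double solveWithCache(const std::vector<double>& factorValues,
                      const CachedStructure& cache) {
  double total = 0.0;
  for (const auto& cluster : cache.clusterFactorIndices)
    for (std::size_t idx : cluster) total += factorValues[idx];
  return total;
}
```

In this toy model, buildStructure runs once per graph structure, and solveWithCache can then be called with new numeric values on every iteration.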

namespace internal {

/// IndexedCluster: mirrors ClusterTree::Cluster but stores factor indices
struct IndexedCluster {
Author

In this file I tried to keep the code as close as possible to the existing code; I just changed it to use indices instead of pointers to factors (this way it can be reused if the factors do not change and only the values and linearization change).
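The index-vs-pointer distinction described here can be illustrated with a minimal sketch (the struct below is a simplified assumption for illustration, not the PR's actual IndexedCluster):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Simplified sketch: a cluster that stores positions in the factor graph
// rather than shared_ptrs to factors. Because only indices are kept, the
// same tree can be replayed against a relinearized graph whose factors
// have the same keys but new numeric values.
struct IndexedCluster {
  std::vector<std::size_t> factorIndices;
  std::vector<std::unique_ptr<IndexedCluster>> children;
};

// Resolving the indices against the *current* graph picks up fresh factors.
template <typename Factor>
std::vector<const Factor*> gatherFactors(const IndexedCluster& cluster,
                                         const std::vector<Factor>& graph) {
  std::vector<const Factor*> out;
  for (std::size_t i : cluster.factorIndices) out.push_back(&graph[i]);
  return out;
}
```

Pointers would pin the clusters to one particular linearization; indices stay valid for any graph with the same layout.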

Member

dellaert commented Jan 8, 2026

woohoo, that's a big win! Will review Thursday.


Copilot AI left a comment


Pull request overview

This PR introduces an optimization to Levenberg-Marquardt iterations by caching the elimination tree structure across iterations. A new JunctionIndexEliminationTree class stores cluster roots using factor indices instead of full factor objects, avoiding repeated elimination tree construction.

Key changes:

  • New JunctionIndexEliminationTree class that caches elimination tree structure using factor indices
  • Modified GaussianFactorGraph::optimize() to use cached elimination tree when available
  • Added caching mechanism in LevenbergMarquardtOptimizer for the damped system elimination tree

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 19 comments.

Summary per file:

  • gtsam/linear/JunctionIndexEliminationTree.h: Header defining the new JunctionIndexEliminationTree class for cached elimination
  • gtsam/linear/JunctionIndexEliminationTree-inl.h: Internal IndexedCluster data structure implementation
  • gtsam/linear/JunctionIndexEliminationTree.cpp: Implementation of index-based elimination tree construction and elimination
  • gtsam/nonlinear/LevenbergMarquardtOptimizer.h: Added cached elimination tree member variable
  • gtsam/nonlinear/LevenbergMarquardtOptimizer.cpp: Cache creation and usage in tryLambda method
  • gtsam/linear/GaussianFactorGraph.h: Added setCachedClusterRoots method and cache member
  • gtsam/linear/GaussianFactorGraph.cpp: Modified optimize methods to use cached elimination tree
  • gtsam/linear/tests/testJunctionIndexEliminationTree.cpp: Comprehensive unit tests for JunctionIndexEliminationTree

// Build damped system for this lambda (adds prior factors that make it like gradient descent)
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots

Copilot AI Jan 8, 2026

The comment states the damped system "will never change" but this is only true within the context of a single optimization iteration, not across all iterations. The cache is created once and reused across iterations, but the damped system itself is rebuilt each time with different lambda values. Consider clarifying that it's the elimination tree structure (cluster roots) that doesn't change, not the damped system itself.

Suggested change
Before:
// once we have dampped system it will never change so we can cache the cluster roots
After:
// The damped system is rebuilt each iteration with different lambda values, but its elimination
// tree structure (cluster roots) depends only on the graph and ordering, so we cache the roots.

Comment on lines 405 to 408
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};

Copilot AI Jan 8, 2026


Missing documentation for the setCachedClusterRoots method. According to the coding guidelines, all functions in header files should have doxygen-style API docs explaining parameters, return values, and the method's purpose.

Comment on lines 404 to 410
/// @}
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};

private:

Copilot AI Jan 8, 2026


The cachedJunctionElimination_ member variable is added to the private section but is not documented. Consider adding documentation to explain its purpose, especially since it affects the behavior of the optimize methods.

Suggested change
Before:
/// @}
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};
private:
After:
/// @}
/// Set a cached elimination structure (cluster roots) to reuse during optimize.
void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};
private:
/// Cached junction/elimination tree used to speed up repeated optimize calls.

Comment on lines +312 to +314
if (cachedJunctionElimination_ != nullptr) {
  return cachedJunctionElimination_->eliminate(*this, function)->optimize();
}

Copilot AI Jan 8, 2026


The check for cachedJunctionElimination_ being non-null doesn't verify that the cached elimination tree matches the current graph structure. If the graph structure has changed (different factors or connectivity), the cached tree will be incorrect. The cache should be invalidated when the graph changes, or the code should verify compatibility before using the cache.
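One way to address this concern, sketched under the assumption that a cheap structural fingerprint is acceptable (the helpers below are hypothetical, not part of the PR), is to record a signature of the graph structure when the cache is built and compare it before reuse:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical validity check: fingerprint the structure (factor count and
// the keys each factor touches) so a stale cache can be detected instead of
// silently producing a wrong elimination.
std::size_t structuralSignature(
    const std::vector<std::vector<std::size_t>>& factorKeys) {
  std::size_t h = factorKeys.size();
  for (const auto& keys : factorKeys) {
    h = h * 1000003u + keys.size();
    for (std::size_t k : keys) h = h * 1000003u + (k + 1);
  }
  return h;
}

bool cacheIsValid(std::size_t cachedSignature,
                  const std::vector<std::vector<std::size_t>>& factorKeys) {
  return cachedSignature == structuralSignature(factorKeys);
}
```

The cache holder would store the signature at build time and fall back to a full rebuild whenever cacheIsValid returns false.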

Comment on lines 145 to 151
// once we have dampped system it will never change so we can cache the cluster roots
if (cachedDampedJunctionElimination_ == nullptr) {
  cachedDampedJunctionElimination_ = std::make_shared<internal::JunctionIndexEliminationTree>(
      dampedSystem, *params_.ordering);
}

dampedSystem.setCachedClusterRoots(cachedDampedJunctionElimination_);

Copilot AI Jan 8, 2026


The new caching functionality in LevenbergMarquardtOptimizer lacks test coverage. While JunctionIndexEliminationTree has comprehensive unit tests, there are no tests that verify the caching mechanism works correctly in the actual optimization context, including edge cases like when params_.ordering is null or when the cache is reused across multiple optimization iterations.

auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots
if (cachedDampedJunctionElimination_ == nullptr) {

Copilot AI Jan 8, 2026


The cache creation dereferences params_.ordering without checking if it's null. If params_.ordering is null, this will cause a segmentation fault. Add a null check before dereferencing, or document that ordering must be provided for the cache optimization to work.

Suggested change
Before:
if (cachedDampedJunctionElimination_ == nullptr) {
After:
if (cachedDampedJunctionElimination_ == nullptr && params_.ordering) {

void setCachedClusterRoots(
    std::shared_ptr<internal::JunctionIndexEliminationTree> clusterRoots) {
  cachedJunctionElimination_ = clusterRoots;
};

Copilot AI Jan 8, 2026


Unnecessary semicolon after the closing brace of the method definition. Method definitions in C++ should not have a semicolon after the closing brace.

Suggested change
Before:
};
After:
}

// Build damped system for this lambda (adds prior factors that make it like gradient descent)
auto dampedSystem = buildDampedSystem(linear, sqrtHessianDiagonal);

// once we have dampped system it will never change so we can cache the cluster roots

Copilot AI Jan 8, 2026


Spelling error: "dampped" should be "damped".

Suggested change
Before:
// once we have dampped system it will never change so we can cache the cluster roots
After:
// once we have damped system it will never change so we can cache the cluster roots

/**
* @brief A class that encapsulates indexed cluster roots for elimination
*/
class JunctionIndexEliminationTree {

Copilot AI Jan 8, 2026


Missing GTSAM_EXPORT macro. According to the coding guidelines, on Windows it is necessary to explicitly export all functions from the library which should be externally accessible. The JunctionIndexEliminationTree class should include the GTSAM_EXPORT macro in its definition to ensure proper visibility across DLL boundaries on Windows.

Comment on lines +312 to +324
  if (cachedJunctionElimination_ != nullptr) {
    return cachedJunctionElimination_->eliminate(*this, function)->optimize();
  }
  return BaseEliminateable::eliminateMultifrontal(Ordering::COLAMD, function)
      ->optimize();
}

/* ************************************************************************* */
VectorValues GaussianFactorGraph::optimize(const Ordering& ordering, const Eliminate& function) const {
  gttic(GaussianFactorGraph_optimize);
  if (cachedJunctionElimination_ != nullptr) {
    return cachedJunctionElimination_->eliminate(*this, function)->optimize();
  }

Copilot AI Jan 8, 2026


The setCachedClusterRoots method and the caching behavior in optimize methods lack test coverage. The test file testJunctionIndexEliminationTree.cpp has a test called "cachedJunctionElimination" but it doesn't actually test that the cache is being used properly to improve performance or that the cached optimization produces the same results as non-cached optimization.

Member

dellaert commented Jan 8, 2026

@tzvist I have not read the code in detail yet; it would help me to know the philosophy a little bit better. How do the new values of the damped system enter into the equation? Am I right to assume the system is still built as usual, but because you’re using indices, the new factors are accessed?

Author

tzvist commented Jan 8, 2026

@tzvist I have not read the code in detail yet; it would help me to know the philosophy a little bit better. How do the new values of the damped system enter into the equation? Am I right to assume the system is still built as usual, but because you’re using indices, the new factors are accessed?

Yes, the damped system is built exactly the same as before (the buildDampedSystem call at line 143), but instead of recomputing the elimination structure every time, we now cache a JunctionIndexEliminationTree that only stores indices into the factor graph.
The key insight is that across different lambda iterations, the damped system's structure doesn't change; only the damping values change. So we compute the JunctionIndexEliminationTree once (lines 147-149) from the first damped system's structure, and it stores the precalculated indices.
Then in each tryLambda call, we build the new damped system with the updated lambda value, pass it to the cached elimination tree via setCachedClusterRoots (line 151), and the elimination process accesses the new factors through the precalculated indices. This way we reuse the expensive elimination structure computation across all lambda search iterations.
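A toy model of this flow (all names and types below are illustrative only; the real code builds a GaussianFactorGraph and eliminates it) makes the once-per-structure versus once-per-lambda split explicit:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy model of the tryLambda flow described above (illustrative names):
// the damped system is rebuilt for every lambda, while the structural
// cache is built only on the first trial and reused afterwards.
struct DampedSystem {
  std::vector<double> diagonalDamping;  // changes with each lambda
  int numFactors = 0;                   // structure: fixed across trials
};

struct StructureCache {
  int numFactors = 0;  // structure captured when the cache was built
};

class Optimizer {
 public:
  int cacheBuilds = 0;  // counts how often the expensive step ran

  double tryLambda(double lambda) {
    // Rebuilt every call, like buildDampedSystem in the PR.
    DampedSystem damped{{lambda, lambda}, 2};
    if (!cache_) {  // expensive structural analysis happens only once
      cache_ = std::make_shared<StructureCache>();
      cache_->numFactors = damped.numFactors;
      ++cacheBuilds;
    }
    // "Eliminate" using the cached structure plus fresh damping values.
    double sum = 0.0;
    for (double d : damped.diagonalDamping) sum += d;
    return sum;
  }

 private:
  std::shared_ptr<StructureCache> cache_;
};
```

Repeated tryLambda calls with different lambdas hit the cache after the first trial, which is the behavior the PR relies on.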

Member

dellaert commented Jan 9, 2026

I think you have the right idea here. I’ve actually used this idea as well recently in the new multifrontal solver. By the way, if you’re not aware of it, here is the link to the last PR that I just submitted: #2343

That new solver is on track to be integrated with the nonlinear optimizers, hopefully to get even more speed-up than 75%. But I think your idea can be used to speed up any eliminatable factor graph, including hybrid, so I'm interested in exploring this regardless. Also, the new multifrontal solver can only speed up linear systems without constraints.

That being said, I prefer if we could discuss the API and make it a little bit more functional. Would you be available to meet some time next week?
