`ClusterCondition::last_update_time` is updated on no-ops, causing infinite reconciles (in the worst case) #1032

nightkr · 2025-05-14T14:06:08Z

Affected version

Yes. (Still an issue on trunk, introduced in #571, rolled out around SDP 23.4.)

Current and expected behavior

Reconciling a cluster where nothing has changed should be a no-op.

`ClusterCondition::last_update_time` breaks this expectation, since it is set unconditionally to the current time, rounded to the second:

operator-rs/crates/stackable-operator/src/status/condition/mod.rs

Lines 350 to 355 in 61596d6

    if old_condition.status == new_condition.status {
        ClusterCondition {
            last_update_time: Some(now),
            last_transition_time: old_condition.last_transition_time,
            ..new_condition
        }

This is registered as another object modification (which in turn triggers another reconcile) whenever the new reconcile does not fall within the same wall-clock second as the previous one. Depending on how long one reconcile takes, that can cause (up to) an infinite re-reconciliation loop while the object is trying to settle down (which is likely to be an indication that the cluster is struggling to begin with!).
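To make the failure mode concrete, here is a minimal, dependency-free sketch of the behavior described above. The struct is a simplified stand-in, not the real stackable-operator type: timestamps are plain `u64` seconds, `merge_unconditional` is a hypothetical name, and the `else` branch is an assumption (only the `if` branch is quoted in this issue).

```rust
// Simplified stand-in for illustration only; NOT the real stackable-operator type.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterCondition {
    pub status: String,
    pub message: Option<String>,
    // Seconds since the epoch (assumption: the real fields use a Kubernetes time type).
    pub last_update_time: Option<u64>,
    pub last_transition_time: Option<u64>,
}

// Hypothetical helper mirroring the quoted snippet: even when the status is
// unchanged, `last_update_time` is overwritten with `now`.
pub fn merge_unconditional(
    old: &ClusterCondition,
    new: ClusterCondition,
    now: u64,
) -> ClusterCondition {
    if old.status == new.status {
        ClusterCondition {
            last_update_time: Some(now),
            last_transition_time: old.last_transition_time,
            ..new
        }
    } else {
        // Assumption: on a status change both timestamps move to `now`.
        ClusterCondition {
            last_update_time: Some(now),
            last_transition_time: Some(now),
            ..new
        }
    }
}
```

With this sketch, merging an unchanged condition within the same second yields a value equal to the old one, but merging it one second later yields a different value, so the status would be written again even though nothing meaningful changed.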

Possible solution

1. Drop `last_update_time` completely (for compat: either stub it out or make it equivalent to `last_transition_time`)
2. Take the value from whenever the data source for the condition was updated, rather than the current wall time (if it makes sense/is possible for that condition)
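As a rough sketch of the compat variant of the first option (keeping the field but making it equivalent to `last_transition_time`), something like the following would make repeated no-op reconciles produce identical conditions. The types are simplified stand-ins, `merge_stable` is a hypothetical name, and timestamps are plain `u64` seconds; this is not the actual fix.

```rust
// Simplified stand-in for illustration only; NOT the real stackable-operator type.
#[derive(Clone, Debug, PartialEq)]
pub struct ClusterCondition {
    pub status: String,
    pub message: Option<String>,
    // Seconds since the epoch (assumption: the real fields use a Kubernetes time type).
    pub last_update_time: Option<u64>,
    pub last_transition_time: Option<u64>,
}

// Hypothetical helper: the transition time only moves when the status changes,
// and `last_update_time` simply mirrors it, so a no-op merge is stable over time.
pub fn merge_stable(
    old: &ClusterCondition,
    new: ClusterCondition,
    now: u64,
) -> ClusterCondition {
    let last_transition_time = if old.status == new.status {
        old.last_transition_time
    } else {
        Some(now)
    };
    ClusterCondition {
        last_update_time: last_transition_time,
        last_transition_time,
        ..new
    }
}
```

One caveat with this sketch: if an existing object carries a `last_update_time` that differs from its `last_transition_time`, the first merge changes it once, after which the condition stays stable.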

Additional context

Discovered by @siegfriedweber, discussed at https://stackable-workspace.slack.com/archives/C02FZ581UCD/p1747230004370629

Environment

No response

Would you like to work on fixing this bug?

None

maltesander · 2025-06-11T15:41:22Z

The approach back then was to follow the OpenShift ClusterOperatorStatusCondition, see https://github.com/openshift/api/blob/b1bcdbc3/config/v1/types_cluster_operator.go#L101.

There, `last_update_time` does not even appear, so I am not sure why it was introduced here. It would only ever hold the timestamp of the operator's last reconcile, which does not provide much value.

My suggestion is to go with solution 1 and just drop it.

nightkr added the type/bug label May 14, 2025

lfrancke added the scheduled-for/25.3.0 label May 31, 2025

lfrancke added this to Stackable End-to-End Coordination May 31, 2025

lfrancke moved this to Proposed in Stackable End-to-End Coordination May 31, 2025

lfrancke added the refinement-needed label Jun 11, 2025

maltesander moved this from Proposed to In Refinement in Stackable End-to-End Coordination Jun 11, 2025

maltesander self-assigned this Jun 11, 2025

maltesander moved this from In Refinement to In Progress in Stackable End-to-End Coordination Jun 12, 2025

maltesander linked a pull request Jun 12, 2025 that will close this issue

Remove last_update_time from ClusterCondition #1054

Open
