Skip to content

Conversation

@sitole
Copy link
Member

@sitole sitole commented Dec 22, 2025


Note

Introduce 5s per-node connect timeouts and unify context/trace usage so slow node connections don’t stall the sync loop.

  • Orchestrator node sync (packages/api/internal/orchestrator/cache.go)
    • Add nodeConnectTimeout (5s) and wrap per-node connects with context.WithTimeout in syncLocalDiscoveredNodes and syncClusterDiscoveredNodes to avoid blocking the sync cycle.
    • Normalize tracing/context handling: use a consistent ctx from tracer.Start(...) across syncNodes, syncLocalDiscoveredNodes, syncClusterDiscoveredNodes, and syncing existing nodes; update error logs to use the same ctx.
    • When sync errors occur, close gRPC connections with context.WithoutCancel(ctx) and deregister the node.

Written by Cursor Bugbot for commit f396e39. This will update automatically on new commits. Configure here.

@sitole sitole changed the title Fix ctx timeout for node connect Fix ctx timeout for node connect during nodes sync pipeline Dec 22, 2025
@sitole sitole marked this pull request as ready for review December 22, 2025 09:38
For local and cluster nodes we have now context created for each async
run. This will not timeout whole sync pipeline when node connect takes
20 seconds (pipeline ctx timeout). Limit is set for each async
execution.
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

We were creating new instance of ctx with span called spanCtx. This is
not needed as original ctx is not ment to be used again so we can just
re-assign already existing ctx var.
@sitole sitole force-pushed the fix/ctx-timeout-for-node-connect branch from 379899f to 1879a28 Compare December 22, 2025 09:42
@sitole sitole enabled auto-merge (squash) December 22, 2025 10:00
@sitole sitole merged commit 858d670 into main Dec 22, 2025
27 checks passed
@sitole sitole deleted the fix/ctx-timeout-for-node-connect branch December 22, 2025 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants