Open
Summary
Applications using the JS SDK in Node.js environments have recently hit HTTP 504 Gateway Timeout errors when subscribing to topics through Mirror Node gRPC endpoints.
The SDK logs multiple retry attempts (e.g., 20 retries) and ultimately fails after extended delays (e.g., 40+ seconds).
This issue tracks the investigation and improvement of 504 error handling behavior in the SDK to ensure predictable retry logic, consistent node rotation, and better developer visibility when connecting to Mirror Nodes.
Observed Behavior
Example logs:
Error subscribing to topic 0.0.1013 during attempt 17. Waiting 8000 ms before next attempt: Received HTTP status code 504
Error subscribing to topic 0.0.1013 during attempt 18. Waiting 8000 ms before next attempt: 14 UNAVAILABLE: Received HTTP status code 504
Error subscribing to topic 0.0.1013 during attempt 19. Waiting 8000 ms before next attempt: Received HTTP status code 504
ERROR HCS Topic subscription error message 14 UNAVAILABLE: Received HTTP status code 504
ERROR Error: 14 UNAVAILABLE: Received HTTP status code 504 at TopicMessageQuery._handleError
In these cases:
- The SDK retries multiple times but continues attempting the same Mirror Node.
- Each retry waits several seconds, leading to long perceived hangs and poor UX.
- The issue is observed in Node.js environments (not browser or gRPC-web proxy layers).
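To make the perceived hang concrete, here is a minimal sketch of how waiting time accumulates under a capped exponential backoff. Only the 8000 ms ceiling comes from the logs above; the 250 ms starting delay, the doubling factor, and the function names are assumptions for illustration, not confirmed SDK behavior.

```typescript
// Hypothetical parameters: only the 8000 ms ceiling appears in the logs above.
const BASE_DELAY_MS = 250; // assumed starting delay
const MAX_DELAY_MS = 8000; // matches "Waiting 8000 ms" in the logs

// Delay applied after a failed attempt (zero-based), doubling until capped.
function backoffDelayMs(attempt: number): number {
    return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

// Total time spent waiting across `attempts` consecutive failed attempts.
function totalWaitMs(attempts: number): number {
    let total = 0;
    for (let attempt = 0; attempt < attempts; attempt++) {
        total += backoffDelayMs(attempt);
    }
    return total;
}
```

Under these assumed parameters the delay reaches the 8000 ms ceiling by the sixth attempt, so every later retry against the same failing node costs a further 8 seconds.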
Goals
- Investigate
  - Identify how 504s are surfaced in Mirror Node gRPC (as gRPC `UNAVAILABLE` or network-level HTTP status).
  - Determine whether the SDK’s retry/backoff logic treats 504 as transient or fatal.
  - Verify whether the SDK reuses the same Mirror Node after consecutive 504s or properly rotates to the next available one.
- Define Expected Behavior
  - On receiving 504:
    - Mark the Mirror Node as temporarily unhealthy.
    - Rotate to the next available Mirror Node (similar to the gRPC deadline-exceeded logic).
    - Apply exponential backoff only if all nodes fail.
  - Ensure consistent handling across all SDK environments (Node, browser, etc.).
- Improve Error Messaging
  - Clearly differentiate between:
    - gRPC errors (e.g., `UNAVAILABLE`)
    - HTTP proxy or Mirror Node 504 errors
    - SDK-level timeouts
  - Provide actionable context to developers (e.g., “Mirror Node timeout — switching to next node”).
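As a rough illustration of the error-messaging goal, the sketch below sorts an observed error into the three categories listed above. The `FailureKind` type, the `classifyFailure` and `describeFailure` functions, and the error shape are hypothetical. Matching on the message text mirrors the log lines in this issue, and treating gRPC `DEADLINE_EXCEEDED` (code 4) as an SDK-level timeout is an assumption.

```typescript
// Hypothetical names throughout; this is not an existing SDK API.
type FailureKind = "MIRROR_NODE_HTTP_504" | "GRPC_UNAVAILABLE" | "SDK_TIMEOUT" | "OTHER";

interface ObservedError {
    code?: number; // gRPC status code, when the transport provides one
    message: string;
}

const GRPC_DEADLINE_EXCEEDED = 4;
const GRPC_UNAVAILABLE = 14;

function classifyFailure(err: ObservedError): FailureKind {
    // Check the 504 text first: in the logs it arrives both with and
    // without the "14 UNAVAILABLE" prefix.
    if (/HTTP status code 504/.test(err.message)) return "MIRROR_NODE_HTTP_504";
    // Assumption: a deadline expiry is reported as an SDK-level timeout.
    if (err.code === GRPC_DEADLINE_EXCEEDED) return "SDK_TIMEOUT";
    if (err.code === GRPC_UNAVAILABLE || err.message.startsWith("14 UNAVAILABLE")) {
        return "GRPC_UNAVAILABLE";
    }
    return "OTHER";
}

// Builds a log line that tells the developer what happened and what comes next.
function describeFailure(kind: FailureKind, node: string): string {
    switch (kind) {
        case "MIRROR_NODE_HTTP_504":
            return `Mirror Node timeout (HTTP 504) from ${node}: switching to next node`;
        case "GRPC_UNAVAILABLE":
            return `gRPC UNAVAILABLE from ${node}: retrying`;
        case "SDK_TIMEOUT":
            return `SDK-level timeout while waiting for ${node}`;
        default:
            return `Unclassified error from ${node}`;
    }
}
```

Checking for the 504 text before the gRPC code matters because the same underlying failure appears in both forms in the logs; the ordering keeps them in one category.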
Proposed Changes
- Review `TopicMessageQuery` and other Mirror Node communication paths.
- Update internal node management logic to:
  - Recognize HTTP 504 responses from Mirror Nodes as timeout indicators.
  - Trigger Mirror Node rotation instead of repeatedly retrying the same node.
- Improve logging and error handling to surface clearer feedback.
- Validate behavior in Node.js (Mirror Node gRPC) builds.
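The rotation behavior proposed here could look roughly like the following sketch. `MirrorNodePool` and its methods are hypothetical and do not exist in the SDK. The 30-second quarantine and 250 ms base delay are assumed values; only the 8000 ms ceiling comes from the logs. Timestamps are passed in explicitly so the logic can be exercised without real timers.

```typescript
// Hypothetical sketch: MirrorNodePool and its methods are not SDK APIs.
interface NodeState {
    address: string;
    unhealthyUntil: number; // timestamp in ms; the node is usable once now >= this value
}

class MirrorNodePool {
    private readonly nodes: NodeState[];
    private allFailedRounds = 0;

    constructor(
        addresses: string[],
        private readonly quarantineMs: number = 30_000, // assumed cool-down
        private readonly baseBackoffMs: number = 250, // assumed base delay
        private readonly maxBackoffMs: number = 8_000, // ceiling seen in the logs
    ) {
        if (addresses.length === 0) throw new Error("at least one Mirror Node is required");
        this.nodes = addresses.map((address) => ({ address, unhealthyUntil: 0 }));
    }

    // First healthy node in configured order; if every node is quarantined,
    // fall back to the one whose quarantine ends soonest.
    pick(now: number): string {
        const healthy = this.nodes.find((n) => n.unhealthyUntil <= now);
        if (healthy) return healthy.address;
        return this.nodes.reduce((a, b) => (b.unhealthyUntil < a.unhealthyUntil ? b : a)).address;
    }

    // Call on HTTP 504: quarantine the node so the next pick() rotates away from it.
    reportTimeout(address: string, now: number): void {
        const node = this.nodes.find((n) => n.address === address);
        if (node) node.unhealthyUntil = now + this.quarantineMs;
    }

    // Call on a valid response: the node returns to normal routing immediately.
    reportSuccess(address: string): void {
        const node = this.nodes.find((n) => n.address === address);
        if (node) node.unhealthyUntil = 0;
    }

    // Zero while any node is healthy; exponential backoff only when all have failed.
    nextBackoffMs(now: number): number {
        if (this.nodes.some((n) => n.unhealthyUntil <= now)) {
            this.allFailedRounds = 0;
            return 0;
        }
        const delay = Math.min(this.baseBackoffMs * 2 ** this.allFailedRounds, this.maxBackoffMs);
        this.allFailedRounds += 1;
        return delay;
    }
}
```

A caller would ask `pick()` for a node, report the outcome, and sleep for `nextBackoffMs()` before the next attempt. With this shape a single 504 moves traffic to the next node with no delay, and waiting starts only once every node has been quarantined.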
Test Plan
- Simulated Mirror Node Timeout
  - Mock a Mirror Node returning 504 responses for topic subscriptions.
  - Verify the SDK moves to the next Mirror Node after the first or second 504 (not after many retries).
- Mirror Node Recovery
  - When the node starts returning valid responses again, confirm the SDK resumes normal routing.
- Cross-SDK Consistency
  - Compare handling in JS SDK vs Java SDK to align per-node timeout and rotation behavior.
- Developer Feedback
  - Ensure logs and thrown errors clearly state that a Mirror Node timeout occurred and rotation was triggered.
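The first two test scenarios could be prototyped with a mock along these lines. This is a simplified, synchronous stand-in: `subscribeWithRotation`, `makeMock`, and the status-code-returning `Attempt` callback are hypothetical and do not reflect how the SDK's subscription call is actually shaped.

```typescript
// Hypothetical stand-in for one subscription attempt; returns an HTTP-like status.
type Attempt = (node: string) => number;

interface RotationResult {
    connectedTo: string | null; // null when every node kept returning 504
    tried: string[]; // every attempt, in order
}

// Tries each node at most `maxFailuresPerNode` times before rotating to the next.
function subscribeWithRotation(
    nodes: string[],
    attempt: Attempt,
    maxFailuresPerNode: number = 1,
): RotationResult {
    const tried: string[] = [];
    for (const node of nodes) {
        for (let failures = 0; failures < maxFailuresPerNode; failures++) {
            tried.push(node);
            if (attempt(node) !== 504) return { connectedTo: node, tried };
        }
    }
    return { connectedTo: null, tried };
}

// Mock transport: nodes in `timingOut` answer 504, all others answer 200.
// Mutating the set between calls simulates a node recovering.
function makeMock(timingOut: Set<string>): Attempt {
    return (node) => (timingOut.has(node) ? 504 : 200);
}
```

A test for the timeout scenario would put the first node in the `timingOut` set and assert that `tried` contains at most two entries for it. For the recovery scenario, remove the node from the set and assert that the next call connects to it again.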
Notes
- This issue is specific to Mirror Node gRPC connectivity in Node.js and not related to gRPC-web or browser CORS errors.
- Findings from this investigation should inform unified node rotation and retry policy across SDKs.
Status
📋 Backlog