Support multi-endpoint failover #47

@killme2008

Summary

The current multi-endpoint client configuration provides client-side load balancing across ready subchannels, but it does not guarantee per-RPC failover semantics. When a request is routed to an endpoint that becomes unavailable or returns a transport error, the failed RPC is not retried on another endpoint automatically.

Current behavior

  • Single endpoint uses a direct GrpcChannel.
  • Multi-endpoint uses a static resolver plus random or round_robin load balancing.
  • Endpoint selection happens across ready subchannels.
  • There is no explicit retry policy or request-level failover.

Expected behavior

Multi-endpoint configuration should support failover for transient transport-level endpoint failures, so a failed RPC can be retried against another healthy endpoint when it is safe to do so.
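
A minimal sketch of what request-level failover could look like, written in Python purely for illustration and not tied to the client's actual API. `TransientTransportError`, `call_with_failover`, `send`, and `safe_to_retry` are all hypothetical names; the sketch assumes the caller can tell whether an RPC is safe to send more than once.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class TransientTransportError(Exception):
    """Hypothetical stand-in for a transport-level failure (e.g. gRPC UNAVAILABLE)."""


def call_with_failover(
    endpoints: Sequence[str],
    send: Callable[[str], T],
    *,
    safe_to_retry: bool,
    max_attempts: int = 3,
) -> T:
    """Try `send` on successive endpoints until one succeeds.

    Only transient transport errors trigger failover; any other exception
    propagates immediately. If the RPC is not safe to retry, exactly one
    attempt is made, which matches the current behavior described above.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")

    # Never try the same endpoint twice within one logical call.
    attempts = max(1, min(max_attempts, len(endpoints))) if safe_to_retry else 1

    last_error: Optional[TransientTransportError] = None
    for endpoint in endpoints[:attempts]:
        try:
            return send(endpoint)
        except TransientTransportError as exc:
            last_error = exc  # remember the failure and move to the next endpoint

    assert last_error is not None
    raise last_error
```

The `safe_to_retry` flag is the important design choice here: failover is opt-in per call, so a write that may already have been applied on the failed endpoint is not silently sent a second time.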

Scope

  • Define the desired failover contract clearly.
  • Evaluate whether this should rely on gRPC retry policy, client-side retry logic, or another mechanism.
  • Clarify interaction with idempotency / write semantics.
  • Add tests that cover endpoint failure scenarios in multi-endpoint mode.
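
For the "gRPC retry policy" option, the sketch below builds a gRPC service config with a `retryPolicy` scoped to an explicit list of idempotent methods. The field names (`methodConfig`, `retryPolicy`, `maxAttempts`, `initialBackoff`, `maxBackoff`, `backoffMultiplier`, `retryableStatusCodes`) come from the gRPC service config format; the helper name and the `example.v1.Database` service are hypothetical. It is written in Python only to produce the JSON. In grpc-dotnet the same settings are normally supplied through typed configuration objects on the channel options, and whether a retried attempt actually lands on a different endpoint depends on the load-balancing policy, so this would need to be verified rather than assumed.

```python
import json
from typing import Iterable, Tuple


def build_retry_service_config(
    idempotent_methods: Iterable[Tuple[str, str]],
    max_attempts: int = 3,
) -> dict:
    """Return a service config that retries only the listed (service, method) pairs."""
    if max_attempts < 2:
        raise ValueError("a retry policy needs at least 2 attempts")
    return {
        "methodConfig": [
            {
                # Scope retries to methods known to be safe to repeat.
                "name": [
                    {"service": service, "method": method}
                    for service, method in idempotent_methods
                ],
                "retryPolicy": {
                    "maxAttempts": max_attempts,
                    "initialBackoff": "0.1s",
                    "maxBackoff": "1s",
                    "backoffMultiplier": 2,
                    # Retry only transport-level unavailability.
                    "retryableStatusCodes": ["UNAVAILABLE"],
                },
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical service and method names, for illustration only.
    config = build_retry_service_config([("example.v1.Database", "Query")])
    print(json.dumps(config, indent=2))
```

Listing methods explicitly, instead of applying one policy to every method, keeps the idempotency decision visible in configuration and leaves non-idempotent writes on the current single-attempt path.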

Notes

This is distinct from load balancing. The current implementation can route new calls to other ready endpoints, but it does not provide request-level automatic failover for a call that already failed.
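
The difference can be illustrated with a toy simulation. This is a simplified model, not the client's real picker: `run_calls` and the endpoint names are hypothetical, and it assumes an endpoint is dropped from the ready set at the moment a call to it fails.

```python
from typing import List, Sequence


def run_calls(endpoints: Sequence[str], goes_down: str, n: int) -> List[str]:
    """Simulate n calls balanced round-robin over ready endpoints, with no failover."""
    ready = list(endpoints)
    results: List[str] = []
    for i in range(n):
        if not ready:
            results.append("failed:no-ready-endpoint")
            continue
        endpoint = ready[i % len(ready)]
        if endpoint == goes_down:
            # The endpoint leaves the ready set, so later calls avoid it...
            ready.remove(endpoint)
            # ...but the call that hit it has already failed and is not retried.
            results.append(f"failed:{endpoint}")
        else:
            results.append(f"ok:{endpoint}")
    return results


if __name__ == "__main__":
    print(run_calls(["a", "b"], goes_down="a", n=3))
    # → ['failed:a', 'ok:b', 'ok:b']
```

Load balancing protects the second and third calls; only request-level failover would have rescued the first.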
