
Ingress returning 503s when using Topology Aware Routing and the controller has no endpoints in the zone #11342

@LAMRobinson

Description

What happened:

The Ingress returns 503s when run in a multi-zone setup where the backend EndpointSlice has no endpoints in the same zone as the ingress controller.

What you expected to happen:

Like kube-proxy, the ingress controller should fall back to a random endpoint, since topology hints are meant to fail open, not closed (unlike xTP, i.e. external/internalTrafficPolicy).

My impression is that all the testing and thinking about this feature has assumed people are using the topology-aware-routing `auto` mode, which doesn't get you into this situation. However, the hints feature is explicitly designed to separate the responsibility of deciding whether a Service should use topology routing from the responsibility of implementing it, so the dataplane that consumes the hints shouldn't make decisions based on assumptions about what set them.
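
For concreteness, here is a hedged sketch, using the `k8s.io/api/discovery/v1` Go types, of what an EndpointSlice with topology hints looks like to a consumer. The slice name, zone, and IP are made up for illustration; the point is that hints are plain per-endpoint data, and nothing in them records which component set them or why:

```go
// Illustrative only: the names, zone, and IP below are not taken from the
// cluster in this report.
package main

import (
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	ready := true
	zone := "dc2"

	slice := discoveryv1.EndpointSlice{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "service-backend-abc12", // hypothetical slice name
			Labels: map[string]string{"kubernetes.io/service-name": "service-backend"},
		},
		AddressType: discoveryv1.AddressTypeIPv4,
		Endpoints: []discoveryv1.Endpoint{{
			Addresses:  []string{"10.0.2.15"},
			Conditions: discoveryv1.EndpointConditions{Ready: &ready},
			Zone:       &zone,
			// The hint says "route traffic for zone dc2 here". A consumer in dc1
			// that honours this literally ends up with zero usable endpoints.
			Hints: &discoveryv1.EndpointHints{
				ForZones: []discoveryv1.ForZone{{Name: "dc2"}},
			},
		}},
	}

	fmt.Printf("endpoint %s hinted for %+v\n",
		slice.Endpoints[0].Addresses[0], slice.Endpoints[0].Hints.ForZones)
}
```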

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
  Release:       v1.8.4
  Build:         05adfe3ee56fab8e4aded7ae00eed6630e43b458
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Note that this is still the current behavior at the latest commit of this repo; see this snippet:
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/endpointslices.go#L144
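
A fail-open variant could look something like the sketch below. This is not the controller's actual code, just an illustration of the fallback this issue asks for: filter by the local zone's hints, but if that leaves nothing, use every ready endpoint, the way kube-proxy does.

```go
package main

import (
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
)

// filterByHints keeps ready endpoints hinted for localZone; if nothing matches,
// it falls back to all ready endpoints so the proxy always has an upstream.
func filterByHints(eps []discoveryv1.Endpoint, localZone string) []discoveryv1.Endpoint {
	var hinted []discoveryv1.Endpoint
	for _, ep := range eps {
		if ep.Conditions.Ready == nil || !*ep.Conditions.Ready {
			continue
		}
		if ep.Hints == nil {
			// No hints on this endpoint: topology routing is not in effect for it.
			hinted = append(hinted, ep)
			continue
		}
		for _, fz := range ep.Hints.ForZones {
			if fz.Name == localZone {
				hinted = append(hinted, ep)
				break
			}
		}
	}
	if len(hinted) > 0 {
		return hinted
	}
	// Fail open: nothing is hinted for this zone, so use every ready endpoint
	// rather than returning an empty upstream list (which is what causes 503s).
	var ready []discoveryv1.Endpoint
	for _, ep := range eps {
		if ep.Conditions.Ready != nil && *ep.Conditions.Ready {
			ready = append(ready, ep)
		}
	}
	return ready
}

func main() {
	ready := true
	hintFor := func(zone string) *discoveryv1.EndpointHints {
		return &discoveryv1.EndpointHints{ForZones: []discoveryv1.ForZone{{Name: zone}}}
	}
	eps := []discoveryv1.Endpoint{{
		Addresses:  []string{"10.0.2.15"},
		Conditions: discoveryv1.EndpointConditions{Ready: &ready},
		Hints:      hintFor("dc2"),
	}}
	// A controller replica whose node is in dc1 still gets the dc2 endpoint
	// instead of an empty list.
	fmt.Println(filterByHints(eps, "dc1"))
}
```

With that kind of fallback, controllers 1 and 2 in the example below would still route to pod-1 instead of returning 503s.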

Kubernetes version (use kubectl version):
1.27

Environment:
A multi-zone cluster, e.g.:

DC1
  NodeA
  NodeB

DC2
  NodeC
  NodeD

Then:

ingress-nginx-controller-1 NodeA
ingress-nginx-controller-2 NodeB
ingress-nginx-controller-3 NodeC
ingress-nginx-controller-4 NodeD

service-backend-pod-1 NodeD

Ingress controllers 3 and 4 on NodeC/NodeD populate their endpoint lists (including pod-1) and work.

Ingress controllers 1 and 2 on NodeA/NodeB do not populate their endpoint lists, because pod-1 is hinted for a different zone in the EndpointSlice, and so they return 503s.

**Workaround**

Setting the `nginx.ingress.kubernetes.io/service-upstream` annotation and delegating the decision to kube-proxy makes this work, as kube-proxy handles this situation properly (it sends you to a random endpoint regardless of topology). It would still be nice if ingress-nginx handled this itself, though, since service-upstream has plenty of downsides, as I'm sure you folks know.
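
For completeness, this is roughly how that workaround can be applied with client-go. It is only a sketch: the namespace `default` and Ingress name `example` are placeholders, and the usual route is simply adding `nginx.ingress.kubernetes.io/service-upstream: "true"` to the Ingress manifest.

```go
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// In-cluster credentials; outside a cluster you'd build the config from a kubeconfig instead.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Merge-patch the service-upstream annotation onto the Ingress so
	// ingress-nginx proxies to the Service's ClusterIP and lets kube-proxy
	// pick an endpoint (ignoring topology hints entirely).
	patch := []byte(`{"metadata":{"annotations":{"nginx.ingress.kubernetes.io/service-upstream":"true"}}}`)
	if _, err := client.NetworkingV1().Ingresses("default").Patch(
		context.TODO(), "example", types.MergePatchType, patch, metav1.PatchOptions{},
	); err != nil {
		log.Fatal(err)
	}
}
```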
