
Ingress returning 503s when using Topology Aware Routing and the controller has no endpoints in the zone #11342

@LAMRobinson

Description

What happened:

The Ingress returns 503s when run in a multi-zone setup where the backend EndpointSlice has no endpoints in the same zone as the ingress controller.

What you expected to happen:

Like kube-proxy, the ingress controller should fall back to a random endpoint, since topology hints are meant to fail open, not closed (unlike xTP, i.e. external/internalTrafficPolicy).

My impression is that all the testing and thinking about this feature has assumed people are using the topology-aware-routing `auto` mode, which doesn't get you into this situation. However, the hints feature is explicitly designed to separate the responsibility of deciding whether a Service should use topology routing from the responsibility of implementing it, so the dataplane that consumes the hints shouldn't make decisions based on assumptions about what set them.
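
For concreteness, here is a hedged sketch, using the `k8s.io/api/discovery/v1` Go types, of what an EndpointSlice with topology hints looks like to a consumer. The slice name, zone, and IP are made up for illustration; the point is that hints are plain per-endpoint data, and nothing in them records which component set them or why:

```go
// Illustrative only: the names, zone, and IP below are not taken from the
// cluster in this report.
package main

import (
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	ready := true
	zone := "dc2"

	slice := discoveryv1.EndpointSlice{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "service-backend-abc12", // hypothetical slice name
			Labels: map[string]string{"kubernetes.io/service-name": "service-backend"},
		},
		AddressType: discoveryv1.AddressTypeIPv4,
		Endpoints: []discoveryv1.Endpoint{{
			Addresses:  []string{"10.0.2.15"},
			Conditions: discoveryv1.EndpointConditions{Ready: &ready},
			Zone:       &zone,
			// The hint says "route traffic for zone dc2 here". A consumer in dc1
			// that honours this literally ends up with zero usable endpoints.
			Hints: &discoveryv1.EndpointHints{
				ForZones: []discoveryv1.ForZone{{Name: "dc2"}},
			},
		}},
	}

	fmt.Printf("endpoint %s hinted for %+v\n",
		slice.Endpoints[0].Addresses[0], slice.Endpoints[0].Hints.ForZones)
}
```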

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
  Release:       v1.8.4
  Build:         05adfe3ee56fab8e4aded7ae00eed6630e43b458
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Note that this is still the current behavior at the latest commit of this repo; see this snippet:
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/endpointslices.go#L144
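
A fail-open variant could look something like the sketch below. This is not the controller's actual code, just an illustration of the fallback this issue asks for: filter by the local zone's hints, but if that leaves nothing, use every ready endpoint, the way kube-proxy does.

```go
package main

import (
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
)

// filterByHints keeps ready endpoints hinted for localZone; if nothing matches,
// it falls back to all ready endpoints so the proxy always has an upstream.
func filterByHints(eps []discoveryv1.Endpoint, localZone string) []discoveryv1.Endpoint {
	var hinted []discoveryv1.Endpoint
	for _, ep := range eps {
		if ep.Conditions.Ready == nil || !*ep.Conditions.Ready {
			continue
		}
		if ep.Hints == nil {
			// No hints on this endpoint: topology routing is not in effect for it.
			hinted = append(hinted, ep)
			continue
		}
		for _, fz := range ep.Hints.ForZones {
			if fz.Name == localZone {
				hinted = append(hinted, ep)
				break
			}
		}
	}
	if len(hinted) > 0 {
		return hinted
	}
	// Fail open: nothing is hinted for this zone, so use every ready endpoint
	// rather than returning an empty upstream list (which is what causes 503s).
	var ready []discoveryv1.Endpoint
	for _, ep := range eps {
		if ep.Conditions.Ready != nil && *ep.Conditions.Ready {
			ready = append(ready, ep)
		}
	}
	return ready
}

func main() {
	ready := true
	hintFor := func(zone string) *discoveryv1.EndpointHints {
		return &discoveryv1.EndpointHints{ForZones: []discoveryv1.ForZone{{Name: zone}}}
	}
	eps := []discoveryv1.Endpoint{{
		Addresses:  []string{"10.0.2.15"},
		Conditions: discoveryv1.EndpointConditions{Ready: &ready},
		Hints:      hintFor("dc2"),
	}}
	// A controller replica whose node is in dc1 still gets the dc2 endpoint
	// instead of an empty list.
	fmt.Println(filterByHints(eps, "dc1"))
}
```

With that kind of fallback, controllers 1 and 2 in the example below would still route to pod-1 instead of returning 503s.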

Kubernetes version (use kubectl version):
1.27

Environment:
A multi-zone cluster, e.g.:

DC1
  NodeA
  NodeB

DC2
  NodeC
  NodeD

Then:

ingress-nginx-controller-1 NodeA
ingress-nginx-controller-2 NodeB
ingress-nginx-controller-3 NodeC
ingress-nginx-controller-4 NodeD

service-backend-pod-1 NodeD

Ingress controllers 3 and 4 on NodeC/NodeD populate their endpoint lists (including pod-1) and work.

Ingress controllers 1 and 2 on NodeA/NodeB do not populate their endpoint lists, because pod-1 is hinted for a different zone in the EndpointSlice, and so they return 503s.

**Workaround**

Setting the `nginx.ingress.kubernetes.io/service-upstream` annotation and delegating the decision to kube-proxy makes this work, as kube-proxy handles this situation properly (it sends you to a random endpoint regardless of topology). It would still be nice if ingress-nginx handled this itself, though, since service-upstream has plenty of downsides, as I'm sure you folks know.
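
For completeness, this is roughly how that workaround can be applied with client-go. It is only a sketch: the namespace `default` and Ingress name `example` are placeholders, and the usual route is simply adding `nginx.ingress.kubernetes.io/service-upstream: "true"` to the Ingress manifest.

```go
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// In-cluster credentials; outside a cluster you'd build the config from a kubeconfig instead.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Merge-patch the service-upstream annotation onto the Ingress so
	// ingress-nginx proxies to the Service's ClusterIP and lets kube-proxy
	// pick an endpoint (ignoring topology hints entirely).
	patch := []byte(`{"metadata":{"annotations":{"nginx.ingress.kubernetes.io/service-upstream":"true"}}}`)
	if _, err := client.NetworkingV1().Ingresses("default").Patch(
		context.TODO(), "example", types.MergePatchType, patch, metav1.PatchOptions{},
	); err != nil {
		log.Fatal(err)
	}
}
```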
