Skip to content

Commit b23a65b

Browse files
committed
fix(sandbox-router-extproc): close validation gaps + lock down dst-host trust
Mirror the parity fixes that just landed on PR kubernetes-sigs#838, in the shapes that fit the ext_proc design. Together with the per-PR ingress strip, this closes the four-finding security review from @aditya-shantanu. 1. X-Sandbox-ID DNS-label validation. The Python router runs ID through ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ (max 63 chars) so a caller can't interpolate extra DNS components into "<id>.<ns>.svc. <cluster-domain>" via the DNS-form fallback. We validated namespace and port but not ID. Now applied to both ID and namespace through a shared validDNSLabel helper. 2. Tightened namespace validation. The previous validNamespace allowed uppercase, leading/trailing hyphens, and unbounded length. K8s itself rejects all three, so the practical risk was just routing to FQDNs that can't exist — but tightening to the same DNS-1123 rule used for ID keeps the validation surface uniform and matches Python exactly. validNamespace removed in favor of validDNSLabel. 3. Authorization header strip. Always add "authorization" to the HeaderMutation.RemoveHeaders so the sandbox never sees the caller's bearer credential. Without this, a sandbox could impersonate the caller against the K8s API or any other Bearer-protected service. Matches the Python router and the same fix on kubernetes-sigs#838. 4. x-envoy-original-dst-host ingress strip (defense in depth). A new envoy.filters.http.header_mutation filter runs BEFORE ext_proc in the HCM chain and removes any client-supplied x-envoy-original-dst-host. ext_proc still always sets it via HeaderMutation, so after the filter chain the value reaching the ORIGINAL_DST cluster is provably the one ext_proc wrote — or absent if a future route disables ext_proc, in which case the cluster fails closed with 503 rather than dispatching to whatever the client asked for. Without this, the security of the data path would rest on "ext_proc is enabled on every route", which the existing /healthz route already demonstrates is not a load- bearing assumption. Tests: TestHandle_InvalidIDRejected covers the six classes of DNS-injection / traversal inputs (dot, slash, underscore, uppercase, leading hyphen, trailing hyphen). TestHandle_AlwaysStripsAuthorization asserts the RemoveHeaders mutation contains "authorization" on both upgrade and non-upgrade paths. TestValidDNSLabel replaces the old TestValidNamespace with the stricter table (accepts 1abc per RFC 1123, rejects MY-NS, -x, x-, length > 63). README documents the new validation surface, the headers we strip before forwarding (Authorization, Origin-on-upgrade, and the listener-level x-envoy-original-dst-host strip), and the rationale for each.
1 parent d4fd268 commit b23a65b

4 files changed

Lines changed: 168 additions & 31 deletions

File tree

sandbox-router/README.md

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -38,10 +38,12 @@ Envoy expects these headers from the client (preserved from the Python router fo
3838

3939
| Header | Required? | Default | Notes |
4040
|---|---|---|---|
41-
| `X-Sandbox-ID` | yes || Sandbox name. Used for DNS-form fallback when UID is missing or uncached. |
41+
| `X-Sandbox-ID` | yes || Sandbox name. Used for DNS-form fallback when UID is missing or uncached. Must be a valid DNS-1123 label (`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`, max 63 chars). |
4242
| `X-Sandbox-UID` | recommended || Sandbox CR UID. Primary key for cache lookup; non-guessable. |
43-
| `X-Sandbox-Namespace` | no | `default` | ASCII letters/digits/hyphens, ≥1 alphanumeric. |
44-
| `X-Sandbox-Port` | no | `8888` | Numeric. |
43+
| `X-Sandbox-Namespace` | no | `default` | Same DNS-1123 label check as ID. |
44+
| `X-Sandbox-Port` | no | `8888` | Integer in `[1, 65535]`. |
45+
46+
DNS-label validation on both ID and namespace prevents DNS injection (e.g. `foo.evil.com` interpolating extra components into the upstream FQDN) and traversal-style inputs (e.g. `foo/bar`). Matches the Python router's `_is_valid_dns_label` check.
4547

4648
Routing precedence:
4749

@@ -51,6 +53,14 @@ Routing precedence:
5153

5254
The ext_proc handler returns Python-router-compatible JSON error bodies (`{"detail":"..."}`) for validation failures so existing clients keep working.
5355

56+
### Headers stripped before forwarding
57+
58+
The `HeaderMutation` returned to Envoy always removes:
59+
60+
- **`Authorization`** — identity-checking authorizers in front of ext_proc (TokenReview, JWT, etc.) consume the bearer credential. Forwarding it to the sandbox would let the sandbox impersonate the caller against the K8s API or any other Bearer-protected service. Matches the Python router, which strips `Authorization` right next to `Host`.
61+
- **`x-envoy-original-dst-host`** on the listener (defense in depth) — an `envoy.filters.http.header_mutation` filter runs *before* ext_proc and removes any client-supplied value, so ext_proc is provably the only writer of that header. Without this, a future route that disables ext_proc and uses the ORIGINAL_DST cluster would dispatch to whatever the client asked for.
62+
- **`Origin` on upgrade requests** — see the WebSockets section below.
63+
5464
### WebSockets and X-Forwarded-* headers
5565

5666
Two small carve-outs make browser-facing backends (vscode-server, Jupyter, anything that holds an Origin/Host CSRF check) work without changes:

sandbox-router/deploy/envoy/configmap.yaml

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -93,6 +93,22 @@ data:
9393
timeout: 180s
9494
9595
http_filters:
96+
# Strip any inbound x-envoy-original-dst-host BEFORE
97+
# ext_proc runs. ext_proc always sets this header via
98+
# HeaderMutation, so after this filter chain the value
99+
# in the header is provably the one ext_proc wrote (or
100+
# absent if ext_proc was bypassed for the route, in
101+
# which case the ORIGINAL_DST cluster won't find a
102+
# target and returns 503 — fail-closed). Defense-in-
103+
# depth: keeps the security of the data path from
104+
# resting on "ext_proc is enabled on every route".
105+
- name: envoy.filters.http.header_mutation
106+
typed_config:
107+
"@type": type.googleapis.com/envoy.extensions.filters.http.header_mutation.v3.HeaderMutation
108+
mutations:
109+
request_mutations:
110+
- remove: x-envoy-original-dst-host
111+
96112
# ext_proc must run before the router filter so we get the
97113
# header mutation applied + route cache cleared before the
98114
# routing decision is made.

sandbox-router/internal/extproc/server.go

Lines changed: 39 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -208,7 +208,15 @@ func (s *Server) onRequestHeaders(_ context.Context, hdrs *extprocv3.HttpHeaders
208208
if r.id == "" {
209209
return immediate(400, `{"detail":"X-Sandbox-ID header is required."}`)
210210
}
211-
if !validNamespace(r.namespace) {
211+
// DNS-label validation on ID + namespace prevents DNS injection
212+
// (e.g. id="foo.evil.com" interpolating extra components into
213+
// "<id>.<ns>.svc.<cluster-domain>") and traversal-style inputs
214+
// (e.g. id="foo/bar"). Matches the Python router's
215+
// _is_valid_dns_label check and the same fix on PR #838.
216+
if !validDNSLabel(r.id) {
217+
return immediate(400, `{"detail":"Invalid sandbox ID format."}`)
218+
}
219+
if !validDNSLabel(r.namespace) {
212220
return immediate(400, `{"detail":"Invalid namespace format."}`)
213221
}
214222
// TCP port range is [1, 65535]. Reject anything outside it (and the
@@ -246,6 +254,17 @@ func (s *Server) onRequestHeaders(_ context.Context, hdrs *extprocv3.HttpHeaders
246254
},
247255
AppendAction: corev3.HeaderValueOption_OVERWRITE_IF_EXISTS_OR_ADD,
248256
}},
257+
// Always strip Authorization on the way to the upstream.
258+
// Identity-checking authorizers (TokenReview, JWT, etc.) sit in
259+
// front of ext_proc and consume the bearer credential there;
260+
// forwarding it to the sandbox would let the sandbox
261+
// impersonate the caller against the K8s API or any other
262+
// Bearer-protected service. Envoy normalizes header keys to
263+
// lowercase, so "authorization" is the key to remove. Matches
264+
// the same fix on the from-scratch Go router (PR #838) and the
265+
// Python router, which strips Authorization right next to Host
266+
// before forwarding.
267+
RemoveHeaders: []string{"authorization"},
249268
}
250269
if r.upgrade {
251270
// Strip Origin on upgrade so WebSocket backends that validate
@@ -369,29 +388,37 @@ func headerString(h *corev3.HeaderValue) string {
369388
return h.Value
370389
}
371390

372-
// validNamespace mirrors the Python router's namespace check:
391+
// validDNSLabel reports whether s is a syntactically valid DNS-1123
392+
// label. Mirrors the Python router's
373393
//
374-
// namespace.replace("-", "").isalnum()
394+
// ^[a-z0-9](?:[-a-z0-9]*[a-z0-9])?$
375395
//
376-
// At least one ASCII alphanumeric, only ASCII letters/digits/hyphens
377-
// otherwise. Empty input and hyphen-only input both rejected.
378-
func validNamespace(s string) bool {
379-
if s == "" {
396+
// regex with a 63-character cap. Applied to both X-Sandbox-ID and
397+
// X-Sandbox-Namespace before either gets interpolated into the DNS-
398+
// form upstream FQDN, so callers can't inject extra DNS components or
399+
// traversal sequences ("foo.evil.com", "foo/bar", etc.). Stricter
400+
// than the previous validNamespace: rejects uppercase, leading/
401+
// trailing hyphens, and anything over 63 chars — same shape K8s
402+
// itself enforces on Pod and Namespace names.
403+
func validDNSLabel(s string) bool {
404+
if s == "" || len(s) > 63 {
380405
return false
381406
}
382-
hasAlphanum := false
383407
for i := 0; i < len(s); i++ {
384408
c := s[i]
385409
switch {
386-
case c >= '0' && c <= '9', c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z':
387-
hasAlphanum = true
410+
case c >= '0' && c <= '9':
411+
case c >= 'a' && c <= 'z':
388412
case c == '-':
389-
// allowed
413+
// Hyphen is allowed only in the interior — not first or last.
414+
if i == 0 || i == len(s)-1 {
415+
return false
416+
}
390417
default:
391418
return false
392419
}
393420
}
394-
return hasAlphanum
421+
return true
395422
}
396423

397424
// joinHostPort formats "host:port", bracketing IPv6 literals per

sandbox-router/internal/extproc/server_test.go

Lines changed: 100 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -198,6 +198,35 @@ func TestHandle_InvalidNamespaceRejected(t *testing.T) {
198198
}
199199
}
200200

201+
func TestHandle_InvalidIDRejected(t *testing.T) {
202+
// Mirrors the Python router's DNS-label check on sandbox_id. Without
203+
// this, a caller could interpolate extra DNS components into the
204+
// upstream FQDN via the DNS form ("<id>.<ns>.svc.<cluster-domain>").
205+
cases := map[string]string{
206+
"dot in id (would inject extra DNS components)": "foo.evil.com",
207+
"slash in id (traversal flavor)": "foo/bar",
208+
"underscore in id": "foo_bar",
209+
"uppercase": "FooBar",
210+
"leading hyphen": "-foo",
211+
"trailing hyphen": "foo-",
212+
}
213+
s, _ := newServer(t)
214+
for name, id := range cases {
215+
t.Run(name, func(t *testing.T) {
216+
resp := s.handle(context.Background(), reqHeaders(map[string]string{
217+
HeaderSandboxID: id,
218+
}))
219+
ir := resp.GetImmediateResponse()
220+
if ir == nil || ir.Status.Code != 400 {
221+
t.Fatalf("id=%q: expected 400 immediate; got: %+v", id, resp)
222+
}
223+
if !strings.Contains(string(ir.Body), "Invalid sandbox ID") {
224+
t.Errorf("id=%q: body should mention Invalid sandbox ID: %s", id, ir.Body)
225+
}
226+
})
227+
}
228+
}
229+
201230
func TestHandle_InvalidPortRejected(t *testing.T) {
202231
// Port must fall in [1, 65535]. Anything else gets rejected with a
203232
// 400 immediate before we hand the value to net.JoinHostPort, so a
@@ -350,26 +379,42 @@ func TestHandle_UnknownPhaseReturnsNil(t *testing.T) {
350379
}
351380
}
352381

353-
func TestValidNamespace(t *testing.T) {
382+
func TestValidDNSLabel(t *testing.T) {
383+
// Stricter than the previous validNamespace: rejects uppercase,
384+
// leading/trailing hyphens, and labels > 63 chars. Same shape as
385+
// RFC 1123 / K8s' own DNS-label rules and the Python router's
386+
// _is_valid_dns_label regex.
354387
cases := map[string]bool{
355-
"default": true,
356-
"prod": true,
357-
"my-ns": true,
358-
"my-ns-1": true,
359-
"MY-NS": true,
360-
"a": true,
361-
"": false,
362-
"-": false,
363-
"---": false,
364-
"my_ns": false,
365-
"my.ns": false,
366-
" ns": false,
367-
"ns ": false,
388+
// accepted
389+
"default": true,
390+
"prod": true,
391+
"my-ns": true,
392+
"my-ns-1": true,
393+
"a": true,
394+
"abc123": true,
395+
"1abc": true, // leading digit OK in RFC 1123 (vs the older 952)
396+
strings.Repeat("a", 63): true,
397+
398+
// rejected — character class
399+
"MY-NS": false, // uppercase
400+
"my_ns": false, // underscore
401+
"my.ns": false, // dot would inject extra DNS components
402+
"foo/bar": false, // slash, traversal flavor
403+
" ns": false, // leading space
404+
"ns ": false, // trailing space
368405
"bad!": false,
406+
407+
// rejected — structure
408+
"": false, // empty
409+
"-": false, // single hyphen
410+
"---": false, // hyphens only
411+
"-x": false, // leading hyphen
412+
"x-": false, // trailing hyphen
413+
strings.Repeat("a", 64): false, // exceeds 63-char cap
369414
}
370415
for in, want := range cases {
371-
if got := validNamespace(in); got != want {
372-
t.Errorf("validNamespace(%q) = %v, want %v", in, got, want)
416+
if got := validDNSLabel(in); got != want {
417+
t.Errorf("validDNSLabel(%q) = %v, want %v", in, got, want)
373418
}
374419
}
375420
}
@@ -509,6 +554,45 @@ func TestHandle_NonUpgradePreservesOrigin(t *testing.T) {
509554
}
510555
}
511556

557+
// TestHandle_AlwaysStripsAuthorization is the regression test for the
558+
// privilege-escalation hazard the Python router avoids by stripping
559+
// Authorization. When the router is fronted by an identity-checking
560+
// authorizer (TokenReview, JWT, etc.) the caller's bearer credential
561+
// is meant for the router, not the sandbox. Forwarding it would let
562+
// the sandbox impersonate the caller against the K8s API or any
563+
// other Bearer-protected service. The strip applies to both upgrade
564+
// and non-upgrade paths, so we assert it on each.
565+
func TestHandle_AlwaysStripsAuthorization(t *testing.T) {
566+
s, stub := newServer(t)
567+
uid := types.UID("auth-uid")
568+
stub[uid] = cache.Entry{PodIP: "10.0.0.9"}
569+
570+
for _, name := range []string{"non-upgrade", "upgrade"} {
571+
t.Run(name, func(t *testing.T) {
572+
hdrs := map[string]string{
573+
HeaderSandboxID: "alpha",
574+
HeaderSandboxUID: string(uid),
575+
"Authorization": "Bearer should-not-leak",
576+
}
577+
if name == "upgrade" {
578+
hdrs["Connection"] = "Upgrade"
579+
hdrs["Upgrade"] = "websocket"
580+
}
581+
resp := s.handle(context.Background(), reqHeaders(hdrs))
582+
found := false
583+
for _, h := range removeHeadersFrom(resp) {
584+
if h == "authorization" {
585+
found = true
586+
}
587+
}
588+
if !found {
589+
t.Fatalf("%s: expected RemoveHeaders to contain \"authorization\"; got %v",
590+
name, removeHeadersFrom(resp))
591+
}
592+
})
593+
}
594+
}
595+
512596
func TestReadHeaders_UpgradeDetection(t *testing.T) {
513597
// Locks in the predicate: an upgrade is recognized iff BOTH
514598
// Connection contains an upgrade token AND Upgrade is non-empty.

0 commit comments

Comments
 (0)