Environment
- Higress Gateway version: 2.1.11 (also reproducible on latest code in
main branch)
- Plugin:
ai-load-balancer
- Affected lb_policy:
global_least_request, endpoint_metrics, prefix_cache
- Not affected:
cluster_metrics
Description
When ai-load-balancer plugin is enabled on a route, any request without a body (e.g. GET, HEAD, DELETE) will hang indefinitely until idle timeout, resulting in a 504 or downstream disconnect (DC).
The root cause is that 3 out of 4 lb_policy implementations return types.HeaderStopIteration in HandleHttpRequestHeaders, deferring all load-balancing logic (upstream host selection + proxywasm.ResumeHttpRequest()) to HandleHttpRequestBody. However, for requests with no body, the OnHttpRequestBody callback is never invoked by Envoy, so the request is paused forever.
Affected Code
global_least_request/lb_policy.go → HandleHttpRequestHeaders returns HeaderStopIteration
endpoint_metrics/lb_policy.go → HandleHttpRequestHeaders returns HeaderStopIteration
prefix_cache/lb_policy.go → HandleHttpRequestHeaders returns HeaderStopIteration
All three rely on HandleHttpRequestBody to call proxywasm.ResumeHttpRequest(), which never happens for bodiless requests.
cluster_metrics/lb_policy.go is not affected because it completes host selection in HandleHttpRequestHeaders and returns ActionContinue.
Steps to Reproduce
- Configure a route with
ai-load-balancer plugin enabled (any of the 3 affected policies)
- Send a GET request through Higress to that route:
curl -i "http://<higress-gateway>/some-path?param=value" -H "host: example.com"
- The request hangs until timeout
Evidence from Envoy Debug Logs
# Request arrives with end_stream=true in headers (no body)
request headers complete (end_stream=true):
':path', '/sink?messageID=8c13992ee64e4dcaaf8f496c1387a9b5'
':method', 'GET'
# ~10 seconds later, stream reset due to client disconnect
stream reset: reset reason: connection termination
# ai-load-balancer reports error because HandleHttpRequestBody was never called
wasm log higress-system.ai-load-balancer-1.0.0: [ai-load-balancer] get host_selected failed
Access log confirms:
{
"method": "GET",
"path": "/sink?messageID=8c13992ee64e4dcaaf8f496c1387a9b5",
"response_code": "0",
"response_flags": "DC",
"upstream_host": "-",
"response_code_details": "downstream_remote_disconnect",
"duration": "10320"
}
Suggested Fix
In HandleHttpRequestHeaders, detect whether the request has a body. If not, perform the load-balancing logic directly in the headers phase and return ActionContinue instead of HeaderStopIteration.
For example, check the :method header or use the endOfStream flag (if available via the wrapper framework):
func (lb GlobalLeastRequestLoadBalancer) HandleHttpRequestHeaders(ctx wrapper.HttpContext) types.Action {
method, _ := proxywasm.GetHttpRequestHeader(":method")
if method == "GET" || method == "HEAD" || method == "DELETE" || method == "OPTIONS" {
// No body expected, perform LB logic here and continue
// ... (same logic as HandleHttpRequestBody)
return types.ActionContinue
}
return types.HeaderStopIteration
}
Workaround
Disable the ai-load-balancer plugin for routes that receive bodiless requests (GET/HEAD/DELETE).
Environment
mainbranch)ai-load-balancerglobal_least_request,endpoint_metrics,prefix_cachecluster_metricsDescription
When
ai-load-balancerplugin is enabled on a route, any request without a body (e.g. GET, HEAD, DELETE) will hang indefinitely until idle timeout, resulting in a 504 or downstream disconnect (DC).The root cause is that 3 out of 4 lb_policy implementations return
types.HeaderStopIterationinHandleHttpRequestHeaders, deferring all load-balancing logic (upstream host selection +proxywasm.ResumeHttpRequest()) toHandleHttpRequestBody. However, for requests with no body, theOnHttpRequestBodycallback is never invoked by Envoy, so the request is paused forever.Affected Code
global_least_request/lb_policy.go→HandleHttpRequestHeadersreturnsHeaderStopIterationendpoint_metrics/lb_policy.go→HandleHttpRequestHeadersreturnsHeaderStopIterationprefix_cache/lb_policy.go→HandleHttpRequestHeadersreturnsHeaderStopIterationAll three rely on
HandleHttpRequestBodyto callproxywasm.ResumeHttpRequest(), which never happens for bodiless requests.cluster_metrics/lb_policy.gois not affected because it completes host selection inHandleHttpRequestHeadersand returnsActionContinue.Steps to Reproduce
ai-load-balancerplugin enabled (any of the 3 affected policies)Evidence from Envoy Debug Logs
Access log confirms:
{ "method": "GET", "path": "/sink?messageID=8c13992ee64e4dcaaf8f496c1387a9b5", "response_code": "0", "response_flags": "DC", "upstream_host": "-", "response_code_details": "downstream_remote_disconnect", "duration": "10320" }Suggested Fix
In
HandleHttpRequestHeaders, detect whether the request has a body. If not, perform the load-balancing logic directly in the headers phase and returnActionContinueinstead ofHeaderStopIteration.For example, check the
:methodheader or use theendOfStreamflag (if available via the wrapper framework):Workaround
Disable the
ai-load-balancerplugin for routes that receive bodiless requests (GET/HEAD/DELETE).