In what area(s)?
/area autoscale
/area networking
What version of Knative?
HEAD
Description
Today, two components inform the autoscaler about the state of the system: the activator and the queue-proxy (one per pod). With activator throttling enabled, the activator and the queue-proxies can now send valid concurrency data at the same time. The autoscaler's algorithms need some adjustments to account for that (see #3289), but we also need to make sure that requests are not counted twice.
If a request is proxied by the activator today, both the activator and the receiving queue-proxy will increase their respective concurrency counters to report that.
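As a rough illustration of the double counting, the sketch below sums the concurrency reported by both components for a single proxied request. The `report` type and `totalConcurrency` function are hypothetical stand-ins, not the autoscaler's actual types or aggregation logic.

```go
package main

import "fmt"

// report is a hypothetical stand-in for the concurrency stat that a
// component (activator or queue-proxy) sends to the autoscaler.
type report struct {
	source      string
	concurrency int
}

// totalConcurrency naively sums every report, which is the behavior
// that leads to double counting when both components see one request.
func totalConcurrency(reports []report) int {
	total := 0
	for _, r := range reports {
		total += r.concurrency
	}
	return total
}

func main() {
	// One request proxied by the activator: both components count it.
	reports := []report{
		{source: "activator", concurrency: 1},
		{source: "queue-proxy", concurrency: 1},
	}
	fmt.Println(totalConcurrency(reports)) // a single request shows up as 2
}
```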
We discussed two ways of fixing that:
Proposal 1: Discount in the activator
We can discount a request in the activator as soon as it hits the proxyRequest method (i.e. is handed off to Go's ReverseProxy). However, that method retries internally, so we cannot know when the request has actually been proxied. That blurs the autoscaler signal.
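A minimal sketch of Proposal 1, assuming the activator tracks in-flight requests with a simple counter. `discountingHandler` and `activatorCountDuringProxy` are hypothetical names, and the `next` handler stands in for the ReverseProxy hand-off; the retries mentioned above are not modeled here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// discountingHandler is a hypothetical activator-side handler. It counts a
// request while the activator holds it and discounts it at hand-off.
type discountingHandler struct {
	inFlight int64        // concurrency the activator would report
	next     http.Handler // stand-in for the reverse proxy hand-off
}

func (h *discountingHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	atomic.AddInt64(&h.inFlight, 1)
	// Throttling or waiting for capacity would happen here.
	atomic.AddInt64(&h.inFlight, -1) // discount as soon as the request is handed off
	h.next.ServeHTTP(w, r)
}

// activatorCountDuringProxy sends one request through the handler and returns
// the activator's counter as seen while the request is being proxied.
func activatorCountDuringProxy() int64 {
	observed := int64(-1)
	h := &discountingHandler{}
	h.next = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		observed = atomic.LoadInt64(&h.inFlight)
	})
	req := httptest.NewRequest(http.MethodGet, "http://activator.example/", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	return observed
}

func main() {
	fmt.Println(activatorCountDuringProxy()) // 0: only the queue-proxy counts the request now
}
```

Because the discount happens before the hand-off completes, a request that is still being retried inside the proxy is counted by neither component during that window, which is the blurriness described above.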
Proposal 2: Discount in the queue-proxy
The second proposal was to discount the request in the queue-proxy. The activator would set a header that causes the queue-proxy to leave that request out of its observed concurrency. It has been pointed out that this is a potential attack vector: arbitrary users could set this header as well, which would make autoscaling of that revision inaccurate and could potentially choke the application.
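A minimal sketch of Proposal 2, assuming the queue-proxy wraps the user container's handler with a counter. The header name `X-Hypothetical-Counted-By-Activator` and the `countingHandler` type are made up for illustration; the issue does not name a header.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// skipCountHeader is a hypothetical header name chosen for this sketch.
const skipCountHeader = "X-Hypothetical-Counted-By-Activator"

// countingHandler is a hypothetical queue-proxy-side handler that skips
// concurrency accounting for requests carrying skipCountHeader.
type countingHandler struct {
	inFlight int64        // concurrency the queue-proxy would report
	next     http.Handler // stand-in for the user container
}

func (h *countingHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get(skipCountHeader) == "" {
		atomic.AddInt64(&h.inFlight, 1)
		defer atomic.AddInt64(&h.inFlight, -1)
	}
	h.next.ServeHTTP(w, r)
}

// observedConcurrency sends one request through a fresh countingHandler and
// returns the counter value seen while the request is being served.
func observedConcurrency(setHeader bool) int64 {
	var observed int64
	h := &countingHandler{}
	h.next = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		observed = atomic.LoadInt64(&h.inFlight)
	})
	req := httptest.NewRequest(http.MethodGet, "http://revision.example/", nil)
	if setHeader {
		req.Header.Set(skipCountHeader, "true")
	}
	h.ServeHTTP(httptest.NewRecorder(), req)
	return observed
}

func main() {
	fmt.Println(observedConcurrency(false)) // 1: direct request is counted
	fmt.Println(observedConcurrency(true))  // 0: request with the header is skipped
}
```

The sketch also shows the concern raised above: nothing in the handler distinguishes the activator from any other client, so whoever sets the header makes their requests invisible to the queue-proxy's concurrency count.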