Dynamic forwarding load balancing policy #38757
Signed-off-by: Yan Avlasov <[email protected]>
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to
@yanavlasov If I got this PR correctly, this PR provides a way to allow the downstream/filters to force the LB to select a specific host (by header or metadata), and if there is no target host, then the fallback LB will be used? If that's your goal, I think we have provided the
/wait-any
The goal is broader than the downstream filter overriding the upstream endpoint. The first part is to allow selection of endpoints with the ext_proc filter - the external endpoint picker. It needs to provide primary and retry hosts. The eventual goal is to make ext_lb - a protocol for external picking of endpoints - as the initial approach is not very effective for retries: it can end up using stale retry endpoints.

This is part of the ai-gateway work to support load balancing to inference workloads, where the load balancing policy can be more complicated than what is supported by Envoy today and where there is often operator-specific business logic for picking endpoints. Initially this LB supports this proposal in the k8s inference gateway extensions: https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/004-endpoint-picker-protocol

I did not think the
```proto
// If neither header nor metadata is present, or there were errors parsing header or metadata
// values, the specified fallback load balancing policy is used. This allows load balancing to
// degrade to a built-in policy (e.g. Round Robin) in case the external endpoint picker fails.
```
sorry if I am missing something and I am no expert in the LB code, but does this mean that the fallback happens only when the header or metadata parsing failed, not when an HTTP request failed or received a 5xx? cc @yuzisun @wengyao04
ah I guess the fallback here != the fallback/subsetting in the proposal ?
What happens when the request failed or received an unsuccessful HTTP status is determined by the route retry policy; it is not up to the LB policy. If the request has to be retried per the policy, the router filter will call chooseHost again to get an endpoint for the retry attempt.
The fallback LB may be called in this case if there are no retry endpoints specified in the metadata or headers (note that this PR does not implement this functionality - it will be added in the next PR), or if there are more retry attempts than endpoints in the metadata or headers.
thank you for clarifying, Yan... I am still trying to get my head around exactly when this fallback via x-gateway-destination-endpoint-fallback happens. Since I read that both x-gateway-destination-endpoint-fallback and x-gateway-destination-endpoint are set by the ext_proc, what's the point of having two separate metadata keys? The proposal doc doesn't mention exactly when the fallback happens (sorry if I missed anything). I think I should wait for the next PR as well as the reference impl...
There are two fallback things: fallback endpoints and a fallback load balancing policy.
The fallback endpoints are set by the endpoint picking extension. For example, it can select one primary endpoint and one fallback based on their queue length. If the primary endpoint is not reachable or returned an error status, the fallback endpoint will be used for the retry.
The fallback load balancing policy is used when the endpoint picker has failed altogether and set neither primary nor fallback endpoints. In this case Envoy will use the potentially less efficient fallback LB policy, but will still be able to serve the request.
ok, now it has all started making sense, and I see the clear distinction between the fallback LB policy vs the fallback endpoints in the metadata. I think I would love to see the actual implementation of the fallback endpoints logic then, which will clear up all my remaining questions.
@LiorLieberman PR with the LB policy
I also did some similar work. From the implementation of the code, I didn't find anything that setUpstreamOverrideHost cannot do (except the retry). In the current implementation, all complex AI-related selection will be handled by the ext_proc server (like polling metrics and comparing the load); finally the ext_proc server will return some endpoint candidates, and the ext_proc filter will set these endpoint candidates in the metadata and let the LB use the metadata to make the final decision. But the story could be much simpler: the ext_proc filter could set these endpoint candidates via setUpstreamOverrideHost. The only enhancement necessary is making setUpstreamOverrideHost accept multiple addresses to support retry. The asynchronous load balancing and external scheduler is a great point (and is helpful for making accurate retries) and I am also happy to see it. If this PR concentrates on that, I am fine. But from the code and the API, it seems it doesn't?
Adding functionality to set upstream hosts to ext_proc is a move in the wrong direction. It will not work for https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/004-endpoint-picker-protocol, but also conceptually I do not think it belongs there. Using an LB policy is, I think, a better choice as it fits better with the abstraction model, where the LB policy is responsible for selecting the endpoint with request attributes as one of the inputs. And while it does not right now have a remote callout, it would be a natural next step to implement it there.
A long time ago, when @htuch and I designed this API, we hoped it would be a common way to let the downstream affect the LB selection, and we made it LB_POLICY-independent by design. The setUpstreamOverrideHost() API makes the whole LB selection a 2-ply structure: the first layer selects the host based on the downstream's explicit input (see

I think the

I removed some comments, because after re-thinking, some of them may have been one-sided.

In conclusion, I think what is actually important is the 2-ply structure for host selection: one layer for the downstream selection/decision, and one for the upstream load balancing. These two layers should be orthogonal and independent. That is to say, we actually could add an extension point at the cluster (we have added a

This extension will be evaluated first at the

This approach could meet your current requirements for the AI gateway without adding a completely new LB policy. And it is still fine if you want to support async load balancing in the future. This approach has the following advantages:
Sorry I started a long discussion about the PR, and even after you have done the work. 😞 I am pretty happy to see more AI-related things in Envoy (because I am also doing some similar things). But if we add this new extension/API, then it basically exists forever, so I think it deserves more discussion.
I do not see the benefit of adding an extension to an extension over just adding an extension. Why would the size of the code be any different - it still needs to do the same thing? The LB extension itself is 500 lines of code, not thousands. There are multiple ways of achieving the same goal; this is a very good way, which fits with the current abstraction model for LBs in Envoy, so there is no reason to be concerned about its longevity. The approaches you are proposing do not seem to offer any particular benefits.
/lgtm api |
@wbpcode I have addressed all comments, please take a look. |
got it, thanks, I will take a look today. |
wbpcode left a comment:
Thanks for this great contribution. I'm happy to see things landing gradually 🌹 ❤️ I added some comments on the API and the implementation.
/wait
```cpp
absl::StatusOr<std::unique_ptr<SelectedHosts>>
OverrideHostLoadBalancer::LoadBalancerImpl::getSelectedHostsFromMetadata(
    const ::envoy::config::core::v3::Metadata& metadata, const Config::MetadataKey& metadata_key) {
  std::unique_ptr<SelectedHosts> selected_hosts;
  const ProtobufWkt::Value& metadata_value =
      Config::Metadata::metadataValue(&metadata, metadata_key);
  if (metadata_value.has_string_value() && !metadata_value.string_value().empty()) {
    auto selected_hosts_result = SelectedHosts::make(metadata_value.string_value());
    if (!selected_hosts_result.ok()) {
      ENVOY_LOG(trace, "Failed to parse SelectedEndpoints OSS {} with error {}",
                metadata_value.string_value(), selected_hosts_result.status().message());
      return selected_hosts_result.status();
    }
    selected_hosts = std::move(selected_hosts_result.value());
  }
  return selected_hosts;
}

absl::StatusOr<std::unique_ptr<SelectedHosts>>
OverrideHostLoadBalancer::LoadBalancerImpl::getSelectedHostsFromHeader(
    const Envoy::Http::RequestHeaderMap* header_map, const Http::LowerCaseString& header_name) {
  std::unique_ptr<SelectedHosts> selected_hosts;
  if (!header_map) {
    return selected_hosts;
  }
  HeaderMap::GetResult result = header_map->get(header_name);
  if (result.empty()) {
    return selected_hosts;
  }

  // Use only the first value of the header, if it happens to have multiple.
  const std::string primary_host_address(result[0]->value().getStringView());
  Envoy::Network::Address::InstanceConstSharedPtr primary_host =
      Envoy::Network::Utility::parseInternetAddressAndPortNoThrow(primary_host_address, false);
  if (!primary_host || primary_host->type() != Envoy::Network::Address::Type::Ip) {
    ENVOY_LOG(debug, "Invalid primary host in header {}: {}", header_name, primary_host_address);
    return absl::InvalidArgumentError("Invalid primary host in header");
  }

  // TODO(yanavlasov): implement parsing of fallback headers
  selected_hosts = std::make_unique<SelectedHosts>(
      SelectedHosts{{{primary_host->ip()->addressAsString(), primary_host->ip()->port()}}, {}});
  return selected_hosts;
}
```
I do think what we need here is to get the first valid string (or even a string_view, I think) from the header or metadata source. Then we use the string to find a host in the cross-host map. If the string is an invalid address, then we get nothing from the map and we will fall back to the default policy.
So, IMO, the parsing here makes the code more complex and brings some performance overhead, but only brings trivial benefit.
And from the definition of SelectedHosts, you may want to get the primary host and the fallback host at the same time in the future. But you could actually do that after the first choice has failed. We can use an
absl::StatusOr<absl::string_view> getSelectedHosts(LoadBalancerContext* context, const std::vector<OverrideSource>& sources) to share the search logic.
By the way, I actually think we don't need fallback_host_sources. The header/metadata value could be a list of addresses separated by `,` (for example, 1.2.3.4:90,1.2.3.5:90); then we try the first address on the first attempt and the second address on the second attempt. This would make things much simpler.
I have reworked the code.
```cpp
if (!metadata.filter_metadata().contains(kEndpointsFallbackIndexKey)) {
  // Use the primary endpoint.
  ENVOY_LOG(trace, "Selecting primary endpoint {}", selected_hosts.primary.address.address);

  // Endpoint extracted from the header does not have locality.
  HostConstSharedPtr host = findHost(selected_hosts.primary);
  // If the primary endpoint was found in the current host set, use it.
  // Otherwise try to see if one of the failover endpoints is available. This
  // is possible when the cluster received an EDS update while the request to
  // the endpoint picker was in flight.
  if (host) {
    // Save the first index into fallback hosts, so that subsequent calls to
    // the chooseHost method will use the fallback hosts.
    updateFallbackIndexMetadata(metadata, 0);
    return host;
  }
```
Prefer filter state to store the index. Then in the future we could also store the address list in that filter state.
Done
```proto
// A list of host sources to get the fallback host addresses from. The first one that provides
// the addresses will be used, whether or not the addresses are valid.
// [#not-implemented-hide:]
repeated OverrideHostSource fallback_host_sources = 2;
```
I didn't see the implementation, but I guess the fallback host source will contain a list of addresses, like 1.2.3.4:90,1.2.3.5:90. Then I think we can use a single host source which contains multiple addresses for both primary and fallback? Then we don't need primary_host_sources and fallback_host_sources; we only need one host_sources.
I saw in the proposal at https://github.com/kubernetes-sigs/gateway-api-inference-extension/ ... only a single fallback address could be used.....
Okay, I will try to create a PR there to see if it is possible to use a single source for multiple addresses. Before that, I am inclined to use only one host_sources field first in our API.
You still need two configurations for cases where an implementation that uses original_dst is transitioning to the override_host policy.
> You still need two configurations for cases where an implementation that uses original_dst is transitioning to the override_host policy.
Yeah, the migration is another problem. The EPP server may need to be aware of whether the proxy provides the fallback support and use a different header/metadata value format.
But although the EPP is one of our design targets, I still hope we can treat this new extension as a general feature of Envoy, with the EPP as only one usage of the new extension. That is to say, we may not need to consider the backward compatibility of the EPP itself here; we should let the EPP server resolve the compatibility problem, and the new extension in Envoy only needs to support the required features like fallback, retrying, etc.
cc @yanavlasov WDYT?
And no matter how the EPP implements the fallback, we can always treat the header value or metadata value (from the specified host_sources) as a list of addresses and iterate over them when finding a valid host. (The single-address case can be treated as a list which contains only one element.)
So, I still think we can keep only one host_sources field here to parse an address list (from the value of the specified host_sources), and remove fallback_host_sources in this initial version.
If we eventually prove that fallback_host_sources is the only choice, we can add it in the future.
Ok, sounds good
Are these other requirements captured in another issue or design doc somewhere?
No, I do not have them captured anywhere else.
wbpcode left a comment:
Thanks for the update. Some more comments are added.
```cpp
struct SelectedHosts {
  struct Endpoint {
  public:
    struct Address {
      const std::string address;
      const uint32_t port;
    };

    const Address address;
  };

  const Endpoint primary;
  const std::vector<Endpoint> failover;
```
It's unnecessary to split the primary address and fallback addresses into two fields; we will iterate over them in order anyway. And it's also unnecessary to use the Address struct.
A simple std::vector<std::string> could be used as the SelectedHosts.
wbpcode left a comment:
LGTM overall with only one minor comment, thanks for the update.
wbpcode left a comment:
LGTM. Thanks.
Commit Message:
Additional Description:
Risk Level: Low, new extension
Testing: Unit Tests
Docs Changes: Yes
Release Notes: Yes
Platform Specific Features: N/A
Signed-off-by: Yan Avlasov <[email protected]>
Signed-off-by: Ting Pan <[email protected]>