[Subnet Prioritization] Support capacity-optimized-prioritized and prioritized Allocation Strategy #671

Open
wants to merge 12 commits into base: develop

@Allenz5 Allenz5 commented Jun 17, 2025

Description of changes

  • Add Priority to subnets when creating a fleet with the capacity-optimized-prioritized or prioritized Allocation Strategy
  • Add the SingleAvailabilityZone flag when creating a fleet if EnableSingleAvailabilityZone is true
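
As a rough illustration of the first bullet, the sketch below builds one launch template override per (instance type, subnet) pair and attaches a Priority only for the two prioritized strategies. The function name `build_overrides`, the constant `PRIORITIZED_STRATEGIES`, and the rule that a subnet's position in SubnetIds becomes its Priority are assumptions made for this example, not code taken from the PR.

```python
from typing import Any, Dict, List

# Strategies named in the PR title that make use of per-override priorities.
PRIORITIZED_STRATEGIES = {"prioritized", "capacity-optimized-prioritized"}


def build_overrides(compute_resource_config: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one override per (instance type, subnet) pair (hypothetical helper)."""
    strategy = compute_resource_config.get("AllocationStrategy")
    subnet_ids = compute_resource_config.get("Networking", {}).get("SubnetIds", [])
    overrides = []
    for instance in compute_resource_config.get("Instances", []):
        for index, subnet_id in enumerate(subnet_ids):
            override = {"InstanceType": instance["InstanceType"], "SubnetId": subnet_id}
            if strategy in PRIORITIZED_STRATEGIES:
                # Assumption: subnets listed first get the lowest number,
                # which EC2 Fleet treats as the highest priority.
                override["Priority"] = float(index)
            overrides.append(override)
    return overrides
```

With any other strategy, such as lowest-price, the overrides in this sketch carry no Priority key at all.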

Tests

  • Extended test_fleet_manager.py::test_fleet_manager.py to test that Priority is appended in overrides when using capacity-optimized-prioritized and prioritized Allocation Strategy
  • Extended test_fleet_manager.py::test_fleet_manager.py to test that SingleAvailabilityZone flag is correctly set when EnableSingleAvailabilityZone is true

References

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Allenz5 and others added 5 commits June 11, 2025 11:03
…rams launching with EnableSingleAvailabilityZone and prioritized|capacity-optimized-prioritized AllocationStrategy

Signed-off-by: Hanxuan Zhang <[email protected]>
@Allenz5 Allenz5 requested review from a team as code owners June 17, 2025 17:09
# set SingleAvailabilityZone to False
"SingleAvailabilityZone": (
self._compute_resource_config["Networking"]["SingleAvailabilityZone"]
if self._compute_resource_config["Networking"]["SingleAvailabilityZone"] is not None
Contributor

I would suggest adding a condition here that checks whether the allocation strategy is one of the two we want, and only then applies SingleAvailabilityZone. We don't want to use this parameter with the lowest-price AllocationStrategy.

for instance_type in self._compute_resource_config["Instances"]:
    subnet_ids = self._compute_resource_config["Networking"]["SubnetIds"]
    for subnet_id in subnet_ids:
        overrides.update({"InstanceType": instance_type["InstanceType"], "SubnetId": subnet_id})
        if (
            self._compute_resource_config.get("AllocationStrategy") == "prioritized"
Contributor

We should also check that the CapacityType is the one we want, along with the allocation strategy. Even though the CLI validator already covers this, a customer can suppress or ignore the validator and move forward.

CHANGELOG.md Outdated
- There were no changes for this version.
- Support prioritized|capacity-optimized-prioritized Allocation Strategy and EnableSingleAvailabilityZone
Contributor

The change should be in 3.14.0

Comment on lines 331 to 332
self._compute_resource_config["Networking"]["SingleAvailabilityZone"]
if self._compute_resource_config["Networking"]["SingleAvailabilityZone"] is not None
Contributor

Are you sure self._compute_resource_config always has a "Networking" section?
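
One defensive way to address this question is to read the nested value with chained `.get` calls, which return None instead of raising KeyError when "Networking" is absent. The helper name `get_single_az_setting` is hypothetical and used only for this sketch.

```python
from typing import Any, Dict, Optional


def get_single_az_setting(compute_resource_config: Dict[str, Any]) -> Optional[bool]:
    """Return Networking.SingleAvailabilityZone, or None when it is not configured."""
    # "or {}" also covers the case where the key exists but holds None.
    networking = compute_resource_config.get("Networking") or {}
    return networking.get("SingleAvailabilityZone")
```

The `_uses_single_az` method quoted later in this conversation uses the same `.get("Networking", {})` pattern for SubnetIds.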

self._compute_resource_config["Networking"]["SingleAvailabilityZone"]
if self._compute_resource_config["Networking"]["SingleAvailabilityZone"] is not None
and (
self._compute_resource_config.get("AllocationStrategy") == "prioritized"
Contributor

Nitpick: Can we move these conditions into a function, since we check them both above and here?

(5, "queue-single-az", "fleet1", False, {}, None),
# Use "prioritized" Allocation Strategy AND Launch Override with Priority
(5, "queue-prioritized", "fleet1", False, {}, None),
# Use "capacity-optimized-prioritized" Allocation Strategy AND Launch Override with Priority
Contributor

Can we have one of them set all_or_nothing to True?

@@ -312,13 +319,26 @@ def _uses_single_az(self):
        subnet_ids = self._compute_resource_config.get("Networking", {}).get("SubnetIds", [])
        return len(subnet_ids) == 1

    def _uses_subnet_prioritization(self):
        return (
            self._compute_resource_config.get("AllocationStrategy") == "prioritized"
Contributor

Can we add a validity check that prioritized is used with the On-Demand capacity type and capacity-optimized-prioritized is used with the Spot capacity type? Please update the unit tests accordingly, as they are currently passing without this change.
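
A minimal sketch of the pairing requested here, written as a standalone function rather than the method in the diff. The lowercase capacity type strings "on-demand" and "spot" are assumed from the test fixture style in this PR, and `VALID_PRIORITIZATION_PAIRS` is a name invented for the example.

```python
from typing import Any, Dict

# Assumption: each prioritized strategy is valid with exactly one capacity type.
VALID_PRIORITIZATION_PAIRS = {
    ("prioritized", "on-demand"),
    ("capacity-optimized-prioritized", "spot"),
}


def uses_subnet_prioritization(compute_resource_config: Dict[str, Any]) -> bool:
    """Return True only for a strategy paired with its matching capacity type."""
    strategy = compute_resource_config.get("AllocationStrategy")
    capacity_type = compute_resource_config.get("CapacityType")
    return (strategy, capacity_type) in VALID_PRIORITIZATION_PAIRS
```

Under this check, a fixture that combines capacity-optimized-prioritized with on-demand would not enable prioritization, so a unit test relying on that combination would need to change.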

tests/common.py Outdated
"Api": "create-fleet",
"Instances": [{"InstanceType": "t2.medium"}, {"InstanceType": "t2.large"}],
"AllocationStrategy": "capacity-optimized-prioritized",
"CapacityType": "on-demand",
Contributor

@himani2411 himani2411 Jun 24, 2025

…zed' is used with On-Demand Capacity and 'capacity-optimized-prioritized' is used with Spot Capacity type.

Signed-off-by: Hanxuan Zhang <[email protected]>
"SingleAvailabilityZone", None
)
if enable_single_availability_zone is None or (
enable_single_availability_zone and self._uses_subnet_prioritization() is False
Contributor

If the CLI validator is suppressed and the customer sets EnableSingleAvailabilityZone to False while using all-or-nothing with one AZ and multiple instance types, the EC2 CreateFleet call will fail, because we set neither SingleInstanceType nor SingleAvailabilityZone, and EC2 requires one of them.
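
One possible way to order the checks so that the failure described above cannot occur is sketched below. The function `resolve_single_availability_zone` and all of its parameters are hypothetical, and the precedence shown (the all-or-nothing requirement first, then the explicit setting, then the single-subnet fallback) is an assumption for discussion, not the behavior of the PR.

```python
from typing import Optional


def resolve_single_availability_zone(
    enable_single_az: Optional[bool],
    uses_subnet_prioritization: bool,
    all_or_nothing: bool,
    single_instance_type: bool,
    single_subnet: bool,
) -> bool:
    """Decide the SingleAvailabilityZone value for a fleet request (sketch)."""
    # Per the comment above, an all-or-nothing request needs SingleInstanceType
    # or SingleAvailabilityZone; force the latter when the former is not set.
    if all_or_nothing and not single_instance_type:
        return True
    # Honor the explicit setting only together with a prioritized strategy.
    if enable_single_az and uses_subnet_prioritization:
        return True
    # Otherwise fall back to the single-subnet check.
    return single_subnet
```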

@himani2411
Contributor

Update the PR description to mention what is no longer relevant

3 participants