Commit 6887380 (parent cd5bb29)

feat: add EC2NodeClass placement group support

File tree

31 files changed: +2614 −1592 lines

charts/karpenter-crd/templates/karpenter.k8s.aws_ec2nodeclasses.yaml

Lines changed: 929 additions & 794 deletions
Large diffs are not rendered by default.

cmd/controller/main.go

Lines changed: 1 addition & 0 deletions

```diff
@@ -76,6 +76,7 @@ func main() {
 		cloudProvider,
 		op.SubnetProvider,
 		op.SecurityGroupProvider,
+		op.PlacementGroupProvider,
 		op.InstanceProfileProvider,
 		op.InstanceProvider,
 		op.PricingProvider,
```

designs/placement-groups.md

Lines changed: 84 additions & 0 deletions
# Placement Group Support

## Context

Amazon EC2 placement groups let operators influence instance placement for low-latency (`cluster`), failure-domain isolation (`partition`), and small critical workloads (`spread`). The long-standing request in https://github.com/aws/karpenter-provider-aws/issues/3324 is to make these groups usable from `EC2NodeClass`.

Karpenter already treats `EC2NodeClass` as launch configuration for existing AWS resources such as subnets, security groups, AMIs, and instance profiles. Placement groups fit best when modeled the same way.

## Problem

Users can launch Karpenter-managed nodes into subnets, security groups, and capacity reservations, but cannot direct those nodes into an existing placement group. This blocks workloads that already rely on EC2 placement-group semantics, for example:

- tightly-coupled clusters that need cluster placement-group networking
- replicated systems that want partition placement-group isolation
- small critical workloads that want spread placement-group separation

The previously proposed design in #5389 focused on Karpenter creating placement groups. That adds a new EC2 resource lifecycle to reconcile and exposes strategy-specific creation APIs that users may rely on long term.

## Options

### Option 1: Karpenter creates and owns placement groups

Pros:

- users can describe strategy directly in `EC2NodeClass`
- Karpenter could validate strategy-specific configuration at reconciliation time

Cons:

- introduces new lifecycle ownership for EC2 resources outside the current launch path
- expands the stable API surface with strategy creation details such as `cluster`, `spread`, `partition`, partition count, and spread level
- complicates shared placement groups and future AWS-specific variants
- makes rollback and drift semantics harder because the placement group becomes a controller-managed dependency

### Option 2: Karpenter references an existing placement group

Pros:

- matches how `EC2NodeClass` already models other AWS launch dependencies
- keeps the API small: identify the group and optionally pin a partition
- works for user-managed, shared, and externally tagged placement groups
- avoids inventing a placement-group controller lifecycle before demand is proven

Cons:

- users must provision the placement group out of band
- Karpenter cannot configure placement-group strategy on behalf of the user

## Recommendation

Add an optional `spec.placementGroup` field on `EC2NodeClass`:

```yaml
spec:
  placementGroup:
    name: analytics-partition
    partition: 2
```

Behavior:

- `name` or `id` identifies the existing placement group; the fields are mutually exclusive
- `id` supports shared placement groups, which require `GroupId` during launch
- `partition` is optional and only meaningful for partition placement groups
- Karpenter resolves the configured group into `status.placementGroup`
- launch templates include the placement-group reference so both `CreateFleet` and `RunInstances` honor it
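The mutual-exclusion and partition rules above are straightforward to express in code. A minimal sketch follows; the `PlacementGroup` struct and `Validate` method are illustrative stand-ins, not the repo's actual types (real enforcement would likely live in CRD CEL rules):

```go
package main

import (
	"errors"
	"fmt"
)

// PlacementGroup mirrors the proposed spec.placementGroup shape.
// Field names here are illustrative, not the repo's actual Go API.
type PlacementGroup struct {
	Name      string // mutually exclusive with ID
	ID        string // required form for shared placement groups
	Partition int32  // optional; only meaningful for partition strategy
}

// Validate enforces the documented constraints: exactly one of
// name/id must be set, and partition may not be negative.
func (p PlacementGroup) Validate() error {
	if (p.Name == "") == (p.ID == "") {
		return errors.New("exactly one of name or id must be set")
	}
	if p.Partition < 0 {
		return errors.New("partition must not be negative")
	}
	return nil
}

func main() {
	cases := []PlacementGroup{
		{Name: "analytics-partition", Partition: 2}, // valid
		{Name: "a", ID: "pg-0123456789abcdef0"},     // invalid: both set
		{},                                          // invalid: neither set
	}
	for _, c := range cases {
		fmt.Println(c.Validate() == nil)
	}
}
```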
## Key Decisions

- Karpenter does not create, tag, delete, or mutate placement groups in this design
- placement-group strategy remains an operator concern because it belongs to the EC2 placement-group resource, not the instance launch request
- partition selection is the only launch-time knob worth exposing initially because AWS applies it at instance launch and it is useful even when the placement group is created elsewhere

## User Guidance

- Use `name` for placement groups in the same account and `id` for shared placement groups
- Pair cluster placement groups with subnet or topology constraints that keep launches in a single Availability Zone
- Omit `partition` to let EC2 distribute instances across partitions, or set it when the workload needs explicit partition affinity

## Future Work

- richer status surfacing for placement-group strategy and readiness
- strategy-aware validation and scheduling hints
- a separate proposal for Karpenter-managed placement-group lifecycle if real demand justifies the larger API

examples/v1/placement-group.yaml

Lines changed: 38 additions & 0 deletions
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: placement-group
spec:
  template:
    spec:
      requirements:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
            - us-west-2a
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: placement-group
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: placement-group
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  amiSelectorTerms:
    - alias: al2023@latest
  placementGroup:
    # Use `name` for placement groups in the same account.
    # Use `id` instead when launching into a shared placement group.
    name: analytics-partition
    # Optional, only valid for partition placement groups.
    partition: 2
```
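Because Karpenter only references an existing group in this design, the `analytics-partition` placement group above has to be provisioned out of band, for instance with the AWS CLI (group name and partition count here are illustrative):

```shell
# Create the partition placement group out of band; Karpenter never creates it.
aws ec2 create-placement-group \
  --group-name analytics-partition \
  --strategy partition \
  --partition-count 3

# Look up the GroupId if you need to reference a shared placement group by `id`.
aws ec2 describe-placement-groups \
  --group-names analytics-partition \
  --query 'PlacementGroups[0].GroupId'
```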

kwok/main.go

Lines changed: 1 addition & 0 deletions

```diff
@@ -91,6 +91,7 @@ func main() {
 		cloudProvider,
 		op.SubnetProvider,
 		op.SecurityGroupProvider,
+		op.PlacementGroupProvider,
 		op.InstanceProfileProvider,
 		op.InstanceProvider,
 		op.PricingProvider,
```

kwok/operator/operator.go

Lines changed: 4 additions & 0 deletions

```diff
@@ -61,6 +61,7 @@ import (
 	"github.com/aws/karpenter-provider-aws/pkg/providers/instanceprofile"
 	"github.com/aws/karpenter-provider-aws/pkg/providers/instancetype"
 	"github.com/aws/karpenter-provider-aws/pkg/providers/launchtemplate"
+	"github.com/aws/karpenter-provider-aws/pkg/providers/placementgroup"
 	"github.com/aws/karpenter-provider-aws/pkg/providers/pricing"
 	"github.com/aws/karpenter-provider-aws/pkg/providers/securitygroup"
 	ssmp "github.com/aws/karpenter-provider-aws/pkg/providers/ssm"
@@ -83,6 +84,7 @@ type Operator struct {
 	RecreationCache         *cache.Cache
 	SubnetProvider          subnet.Provider
 	SecurityGroupProvider   securitygroup.Provider
+	PlacementGroupProvider  placementgroup.Provider
 	InstanceProfileProvider instanceprofile.Provider
 	AMIProvider             amifamily.Provider
 	AMIResolver             amifamily.Resolver
@@ -138,6 +140,7 @@ func NewOperator(ctx context.Context, operator *operator.Operator) (context.Cont
 		cfg.Region,
 		false,
 	)
+	placementGroupProvider := placementgroup.NewDefaultProvider(ec2api, cache.New(awscache.DefaultTTL, awscache.DefaultCleanupInterval))
 	versionProvider := version.NewDefaultProvider(operator.KubernetesInterface, eksapi)
 	// Ensure we're able to hydrate the version before starting any reliant controllers.
 	// Version updates are hydrated asynchronously after this, in the event of a failure
@@ -205,6 +208,7 @@ func NewOperator(ctx context.Context, operator *operator.Operator) (context.Cont
 		RecreationCache:         recreationCache,
 		SubnetProvider:          subnetProvider,
 		SecurityGroupProvider:   securityGroupProvider,
+		PlacementGroupProvider:  placementGroupProvider,
 		InstanceProfileProvider: instanceProfileProvider,
 		AMIProvider:             amiProvider,
 		AMIResolver:             amiResolver,
```
