Instance profile reconciler reuses static profile when switching from spec.instanceProfile to spec.role #9028
Description
Hi! We ran into an issue during a migration from spec.instanceProfile to spec.role on an EC2NodeClass. After the switch, karpenter continued using the old static instance profile instead of creating a dynamic one. This affected all newly-launched nodes, including drift replacements.
We traced it through the source and believe we understand the mechanism — writing it up here in case it's helpful.
Version
v1.8.6
What We Observed
- EC2NodeClass originally had spec.instanceProfile: OurStaticProfile (with role OurNodeRole attached)
- Changed to spec.role: OurNodeRole (removed spec.instanceProfile)
- Karpenter detected drift on existing NodeClaims (hash changed), which is correct
- Replacement nodes were launched with the old static profile, not a new dynamic one
- status.instanceProfile continued to show the old static profile name
- Restarting karpenter didn't help; the behavior was the same
- Only when we deleted the static profile from IAM did karpenter create a dynamic profile
What We Think Is Happening
In pkg/controllers/nodeclass/instanceprofile.go, when spec.role is set:
- status.instanceProfile still holds the old static profile name from the spec.instanceProfile era
- The reconciler calls GetInstanceProfile() on it; the profile exists and has the correct role attached
- currentRole != nodeClass.Spec.Role evaluates to false, so the reconciler skips creating a dynamic profile
- Launch templates read from status.instanceProfile, so all new nodes get the old static profile
The reconciler checks whether the attached role matches, but doesn't distinguish between karpenter-managed profiles (under the /karpenter/ IAM path) and user-provided static ones.
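
To make the suspected mechanism concrete, here is a minimal Go sketch of the decision as we understand it. It is not the actual karpenter source: the profile type and the needsDynamicProfile function are hypothetical names we made up for illustration, and the real reconciler does considerably more. The point is only that the check looks at the attached role and never at the profile's IAM path.

```go
package main

import "fmt"

// profile is a hypothetical, simplified stand-in for an IAM instance profile.
type profile struct {
	Name string
	Path string // "/" for a typical user-created profile
	Role string // name of the attached role
}

// needsDynamicProfile models the check as we read it: only the attached role
// is compared against spec.role. The Path field is never consulted, so a
// static profile with a matching role is treated as good enough.
func needsDynamicProfile(existing *profile, specRole string) bool {
	if existing == nil {
		// The profile recorded in status no longer exists in IAM.
		return true
	}
	return existing.Role != specRole
}

func main() {
	static := &profile{Name: "MyStaticProfile", Path: "/", Role: "MyNodeRole"}

	// Role matches, so no dynamic profile is created and the static one is reused.
	fmt.Println(needsDynamicProfile(static, "MyNodeRole"))

	// Once the static profile is deleted from IAM, a dynamic one is created.
	fmt.Println(needsDynamicProfile(nil, "MyNodeRole"))
}
```

Under these assumptions the first call returns false and the second returns true, which matches what we observed: the static profile was reused until it was deleted from IAM.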
Steps to Reproduce
- Create a static IAM instance profile with a role (e.g., MyStaticProfile with MyNodeRole)
- Create an EC2NodeClass with spec.instanceProfile: MyStaticProfile
- Let karpenter launch nodes; they use MyStaticProfile correctly
- Change the EC2NodeClass to spec.role: MyNodeRole (remove spec.instanceProfile)
- Karpenter detects drift and launches replacements; the replacements still use MyStaticProfile
- Delete MyStaticProfile from IAM; karpenter now creates a dynamic profile
Reproduced on a dev cluster (2026-03-23).
Impact
In our case, the old static profile was later deleted via Terraform, which caused nodes to lose IMDS credentials. STS credential caching delayed the visible impact by several hours.
Possible Fix
One approach might be to check whether the profile in status.instanceProfile is karpenter-managed (e.g., by IAM path prefix) when spec.role is set. If it's a static/user-provided profile, the reconciler could create a new dynamic one regardless of whether the role matches.
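
A rough Go sketch of that idea follows. It is only an illustration under our assumptions, not a patch against the karpenter codebase: the profile type, isManaged, and shouldCreateDynamicProfile are hypothetical names, and the /karpenter/ prefix is taken from our reading of the source rather than from documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// managedPathPrefix is an assumption: we believe karpenter-managed profiles
// are created under this IAM path.
const managedPathPrefix = "/karpenter/"

// profile is a hypothetical, simplified stand-in for an IAM instance profile.
type profile struct {
	Name string
	Path string
	Role string
}

// isManaged reports whether the profile looks karpenter-managed by IAM path.
func isManaged(p *profile) bool {
	return p != nil && strings.HasPrefix(p.Path, managedPathPrefix)
}

// shouldCreateDynamicProfile sketches the proposed check for the case where
// spec.role is set: a missing or user-provided profile is always replaced,
// and a managed profile is replaced only when its role no longer matches.
func shouldCreateDynamicProfile(existing *profile, specRole string) bool {
	if !isManaged(existing) {
		return true
	}
	return existing.Role != specRole
}

func main() {
	static := &profile{Name: "MyStaticProfile", Path: "/", Role: "MyNodeRole"}
	managed := &profile{Name: "generated-profile", Path: "/karpenter/example/", Role: "MyNodeRole"}

	fmt.Println(shouldCreateDynamicProfile(static, "MyNodeRole"))  // static profile: replace
	fmt.Println(shouldCreateDynamicProfile(managed, "MyNodeRole")) // managed, role matches: keep
}
```

With a check like this, the static profile in our reproduction would be replaced as soon as spec.role is set, while an existing managed profile with the right role would still be reused.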