Instance profile reconciler reuses static profile when switching from spec.instanceProfile to spec.role #9028

@gilad-aperio

Description

Hi! We ran into an issue while migrating an EC2NodeClass from spec.instanceProfile to spec.role. After the switch, Karpenter kept using the old static instance profile instead of creating a dynamic one. This affected all newly launched nodes, including drift replacements.

We traced it through the source and believe we understand the mechanism — writing it up here in case it's helpful.

Version

v1.8.6

What We Observed

  1. EC2NodeClass originally had spec.instanceProfile: OurStaticProfile (with role OurNodeRole attached)
  2. Changed to spec.role: OurNodeRole (removed spec.instanceProfile)
  3. Karpenter detected drift on existing NodeClaims (hash changed) — correct
  4. Replacement nodes were launched with the old static profile, not a new dynamic one
  5. status.instanceProfile continued to show the old static profile name
  6. Restarting Karpenter didn't help; the behavior was the same
  7. Karpenter created a dynamic profile only after we deleted the static profile from IAM

What We Think Is Happening

In pkg/controllers/nodeclass/instanceprofile.go, when spec.role is set:

  • status.instanceProfile still holds the old static profile name from the spec.instanceProfile era
  • The reconciler calls GetInstanceProfile() on it — the profile exists and has the correct role attached
  • currentRole != nodeClass.Spec.Role evaluates to false, so the reconciler skips creating a dynamic profile
  • Launch templates read from status.instanceProfile, so all new nodes get the old static profile

The reconciler checks whether the attached role matches, but doesn't distinguish between Karpenter-managed profiles (under the /karpenter/ IAM path) and user-provided static ones.
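To make the mechanism concrete, here is a minimal Go sketch of the decision as we understand it. It is a simplified model, not the actual code in instanceprofile.go: the `profile` type and the `reusesExisting` function are hypothetical names we made up for illustration, and the real reconciler talks to IAM rather than taking a struct.

```go
package main

import "fmt"

// profile is a hypothetical, simplified stand-in for what IAM returns
// for an instance profile. It is not a Karpenter or AWS SDK type.
type profile struct {
	Name string
	Path string // "/" is the default path for a user-created profile
	Role string // name of the attached role, "" if none
}

// reusesExisting models the behavior described in this issue: the profile
// named in status.instanceProfile is kept whenever it exists and its
// attached role equals spec.role. The profile's path is never consulted.
func reusesExisting(statusProfile *profile, specRole string) bool {
	if statusProfile == nil {
		// Profile is gone from IAM (or was never recorded):
		// a dynamic profile gets created.
		return false
	}
	return statusProfile.Role == specRole
}

func main() {
	static := &profile{Name: "MyStaticProfile", Path: "/", Role: "MyNodeRole"}

	// After switching to spec.role: MyNodeRole, the static profile still matches.
	fmt.Println(reusesExisting(static, "MyNodeRole")) // true

	// Once the static profile is deleted from IAM, nothing is left to reuse.
	fmt.Println(reusesExisting(nil, "MyNodeRole")) // false
}
```

The first call corresponds to steps 4 and 5 of what we observed, and the second to step 7.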

Steps to Reproduce

  1. Create a static IAM instance profile with a role (e.g., MyStaticProfile with MyNodeRole)
  2. Create an EC2NodeClass with spec.instanceProfile: MyStaticProfile
  3. Let Karpenter launch nodes; they use MyStaticProfile correctly
  4. Change the EC2NodeClass to spec.role: MyNodeRole (remove spec.instanceProfile)
  5. Karpenter detects drift, launches replacements — replacements still use MyStaticProfile
  6. Delete MyStaticProfile from IAM; Karpenter now creates a dynamic profile

Reproduced on a dev cluster (2026-03-23).

Impact

In our case, the old static profile was later deleted via Terraform, which caused nodes to lose IMDS credentials. STS credential caching delayed the visible impact by several hours.

Possible Fix

One approach might be to check, when spec.role is set, whether the profile in status.instanceProfile is Karpenter-managed (e.g., by IAM path prefix). If it's a static/user-provided profile, the reconciler could create a new dynamic one regardless of whether the role matches.
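A rough Go sketch of that idea follows. Everything in it is an assumption for illustration: `profile`, `isManaged`, and `needsNewProfile` are hypothetical names, the example path under /karpenter/ is made up, and we have not checked how the real reconciler would obtain the profile's path.

```go
package main

import (
	"fmt"
	"strings"
)

// managedPathPrefix is the IAM path prefix that, per this issue,
// Karpenter-managed profiles live under.
const managedPathPrefix = "/karpenter/"

// profile is a hypothetical, simplified stand-in for an IAM instance profile.
type profile struct {
	Name string
	Path string
	Role string
}

// isManaged reports whether the profile sits under the Karpenter IAM path.
func isManaged(p profile) bool {
	return strings.HasPrefix(p.Path, managedPathPrefix)
}

// needsNewProfile sketches the proposed check for the spec.role case:
// create a dynamic profile when none exists, when the existing one is
// user-provided, or when the attached role differs from spec.role.
func needsNewProfile(existing *profile, specRole string) bool {
	if existing == nil {
		return true
	}
	if !isManaged(*existing) {
		return true // static profile: never reuse it for spec.role
	}
	return existing.Role != specRole
}

func main() {
	static := &profile{Name: "MyStaticProfile", Path: "/", Role: "MyNodeRole"}
	// The path and name below are invented examples, not real Karpenter values.
	dynamic := &profile{Name: "example-dynamic", Path: "/karpenter/example/", Role: "MyNodeRole"}

	fmt.Println(needsNewProfile(static, "MyNodeRole"))  // true
	fmt.Println(needsNewProfile(dynamic, "MyNodeRole")) // false
	fmt.Println(needsNewProfile(dynamic, "OtherRole"))  // true
}
```

With a check along these lines, the static profile from our migration would no longer be carried over, while an existing managed profile with the right role would still be reused.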

Metadata

Assignees

No one assigned

Labels

triage/accepted: Indicates that the issue has been accepted as a valid issue
