Using References Triggers Drift for GitOps Reconcilers #2361

Open
@jaguer0

Describe the bug
When using Flux CD, the first apply is successful, and the ECS ACK controller correctly updates references (e.g., taskDefinitionRef and targetGroupRef). However, when Flux later performs a Server-Side Apply (SSA) reconcile, it sets these references back to match the Git repository state. This creates a difference between the current ECS manifest and the Git state, which triggers an unnecessary ECS deployment.

I have already tried setting Flux’s kustomize.toolkit.fluxcd.io/ssa: merge annotation, but the issue persists.

Current Behavior:
Flux reconciles successfully, but an unnecessary redeployment occurs when the AWS ACK controller reconciles. The controller reconciles on its default interval of 10 hours, so ECS tasks are redeployed multiple times per day even when no changes have been made.

Steps to Reproduce:

  1. Set up Flux CD with the ECS ACK controller.
  2. Apply the initial configuration (first apply works fine).
  3. Wait for Flux to perform a reconcile.
  4. The ECS task redeploys unnecessarily after the AWS ACK controller reconciles, which happens every 10 hours by default.

Added context: while debugging, I shortened my Flux reconcile interval to 5 minutes. The ECS controller appears to see the changes/diffs on each Flux reconcile but does not act on them, which I think is expected because the ECS service has already reached a healthy state. The extra deployment seems to occur only when the ACK reconciler kicks in.

The same behavior also appears with other references, for example an elbv2.services.k8s.aws/v1alpha1 Rule that uses targetGroupRef, so it seems to affect references in general, since resolving a reference updates the manifest.
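
To check which fields in a manifest are exposed to this behavior, it can help to list every reference field before committing it. The following is a minimal sketch in Python, assuming reference fields can be recognized by a key ending in `Ref` or `Refs` (a convention inferred from taskDefinitionRef and targetGroupRef, not a rule confirmed by the controller). `find_reference_paths` is a hypothetical helper, and the dictionary is a trimmed copy of the Service spec from this report.

```python
def find_reference_paths(node, prefix=""):
    """Return dotted paths of all keys that look like reference fields."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            # Assumption: reference fields are named *Ref or *Refs.
            if key.endswith(("Ref", "Refs")):
                paths.append(path)
            else:
                paths.extend(find_reference_paths(value, path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            paths.extend(find_reference_paths(item, f"{prefix}[{index}]"))
    return paths


# Trimmed copy of the Service spec from this issue.
spec = {
    "name": "foo-bar",
    "loadBalancers": [
        {
            "containerName": "foo-bar",
            "containerPort": 8080,
            "targetGroupRef": {"from": {"name": "foo-bar-tg-staging"}},
        }
    ],
    "taskDefinitionRef": {"from": {"name": "foo-bar-staging"}},
}

reference_paths = find_reference_paths({"spec": spec})
print(reference_paths)
```

Each path printed is a field whose resolved counterpart (for example targetGroupARN or taskDefinition) may be written by the controller and then show up in a diff.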

Steps to reproduce

apiVersion: ecs.services.k8s.aws/v1alpha1
kind: Service
metadata:
  name: foo-bar
spec:
  name: foo-bar
  capacityProviderStrategy:
  - base: 0
    capacityProvider: FARGATE
    weight: 1
  cluster: staging
  deploymentConfiguration:
    alarms:
      alarmNames:
      - none
      enable: false
      rollback: false
    deploymentCircuitBreaker:
      enable: true
      rollback: true
    maximumPercent: 200
    minimumHealthyPercent: 100
  deploymentController:
    type: ECS
  desiredCount: 1
  enableECSManagedTags: true
  enableExecuteCommand: false
  healthCheckGracePeriodSeconds: 0
  loadBalancers:
  - containerName: foo-bar
    containerPort: 8080
    targetGroupRef:
      from:
        name: foo-bar-tg-staging
  networkConfiguration:
    awsVPCConfiguration:
      assignPublicIP: DISABLED
      securityGroups:
      - sg-xxxxx
      subnets:
      - subnet-xxxxxx
      - subnet-xxxxxx
  platformVersion: 1.4.0
  propagateTags: NONE
  schedulingStrategy: REPLICA
  taskDefinitionRef:
    from:
      name: foo-bar-staging

As Flux reconciles, the controller logs this diff:

{
  "level": "info",
  "ts": "2025-02-26T14:21:05.618Z",
  "logger": "ackrt",
  "msg": "desired resource state has changed",
  "kind": "Service",
  "namespace": "foo-bar",
  "name": "foo-bar-staging",
  "account": "xxxxxx",
  "role": "",
  "region": "us-west-2",
  "is_adopted": false,
  "generation": 4134,
  "diff": [
    {
      "Path": {
        "Parts": [
          "Spec",
          "LoadBalancers"
        ]
      },
      "A": [
        {
          "containerName": "foo-bar",
          "containerPort": 8080,
          "targetGroupARN": "arn:aws:elasticloadbalancing:us-west-2:xxxxxx:targetgroup/foo-bar/cbeebab4exxxx",
          "targetGroupRef": {
            "from": {
              "name": "foo-bar-tg-staging"
            }
          }
        }
      ],
      "B": [
        {
          "containerName": "foo-bar",
          "containerPort": 8080,
          "targetGroupARN": "arn:aws:elasticloadbalancing:us-west-2:xxxxxx:targetgroup/foo-bar/cbeebab4exxxx"
        }
      ]
    },
    {
      "Path": {
        "Parts": [
          "Spec",
          "TaskDefinition"
        ]
      },
      "A": "foo-bar-staging",
      "B": "arn:aws:ecs:us-west-2:xxxxxx:task-definition/foo-bar-staging:32"
    }
  ]
}
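
Both entries in this diff look like the same resource described two ways: once by reference or name, once by resolved ARN. A rough way to triage such log lines is sketched below in Python. It assumes the `diff` array has the shape shown in the log above, that reference fields end in `Ref` or `Refs`, and that a bare name matches an ARN when it equals the text after the first `/` in the ARN's resource part with any `:revision` suffix dropped. `strip_refs`, `arn_resource_name`, and `is_reference_only` are hypothetical helpers, not part of the ACK runtime.

```python
import json


def strip_refs(value):
    """Recursively drop keys that look like reference fields (*Ref / *Refs)."""
    if isinstance(value, dict):
        return {k: strip_refs(v) for k, v in value.items()
                if not k.endswith(("Ref", "Refs"))}
    if isinstance(value, list):
        return [strip_refs(v) for v in value]
    return value


def arn_resource_name(arn):
    """Return the text after the first '/' of an ARN's resource part,
    without a trailing ':revision', or None if the input is not an ARN."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        return None
    resource = parts[5]  # e.g. "task-definition/foo-bar-staging:32"
    name = resource.split("/", 1)[-1]
    return name.split(":", 1)[0]


def is_reference_only(entry):
    """True if sides A and B differ only by reference fields or name vs. ARN."""
    a, b = entry["A"], entry["B"]
    if strip_refs(a) == strip_refs(b):
        return True
    if isinstance(a, str) and isinstance(b, str):
        return a == arn_resource_name(b) or b == arn_resource_name(a)
    return False


# Trimmed copy of the log entry from this issue.
log_entry = json.loads("""
{"diff": [
  {"Path": {"Parts": ["Spec", "LoadBalancers"]},
   "A": [{"containerName": "foo-bar", "containerPort": 8080,
          "targetGroupARN": "arn:aws:elasticloadbalancing:us-west-2:xxxxxx:targetgroup/foo-bar/cbeebab4exxxx",
          "targetGroupRef": {"from": {"name": "foo-bar-tg-staging"}}}],
   "B": [{"containerName": "foo-bar", "containerPort": 8080,
          "targetGroupARN": "arn:aws:elasticloadbalancing:us-west-2:xxxxxx:targetgroup/foo-bar/cbeebab4exxxx"}]},
  {"Path": {"Parts": ["Spec", "TaskDefinition"]},
   "A": "foo-bar-staging",
   "B": "arn:aws:ecs:us-west-2:xxxxxx:task-definition/foo-bar-staging:32"}
]}
""")

results = {".".join(d["Path"]["Parts"]): is_reference_only(d)
           for d in log_entry["diff"]}
print(results)
```

If every path maps to `True`, the logged diff contains no real configuration change under these assumptions. Whether the controller should skip the update in that case is the question this issue raises.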

Expected outcome
The controller should either not detect any differences or should properly work with the Flux reconciler to prevent unwanted updates.
The ECS task should not be redeployed unless actual changes have occurred.

Environment

  • Using EKS: Yes v1.29.13-eks-8cce635
  • AWS service targeted: ECS


Labels: area/resource-references, help wanted, kind/bug
