
Karpenter Nodes Unable to Consolidate #3927

@jaydeep-pf

Description


Version

Karpenter Version: v0.26.0

Kubernetes Version: v1.23.17

Expected Behavior

  • Launch a node and utilize it fully
  • If there are more unschedulable pods, schedule them on a new node but consolidate them later (this works on staging because HPA activity and traffic are low, but in production each node runs 100+ pods and Karpenter is later unable to consolidate, so nodes are underutilized)

Actual Behavior

The issue is as follows:

We recently migrated from Cluster Autoscaler (CA) to Karpenter. We have a custom Provisioner and AWSNodeTemplate for our workloads. Each node is expected to run at most 500 pods (maxPods is set in the Karpenter kubelet configuration). Since HPA is enabled and a lot of traffic is coming in, Karpenter launches multiple nodes and schedules 100+ pods on each, but is then unable to consolidate them. We previously ran 3 nodes on average; after migrating to Karpenter we are running 7. We see the following events:

Events:
  Type    Reason            Age                    From       Message
  ----    ------            ----                   ----       -------
  Normal  Unconsolidatable  31m (x7 over 2d20h)    karpenter  not all pods would schedule
  Normal  Unconsolidatable  68s (x351 over 3d18h)  karpenter  can't remove without creating 3 nodes

Steps to Reproduce the Problem

  • Set up HPA with a deployment
  • The Provisioner should contain 4xlarge instance types and a kubelet configuration with maxPods: 500
  • Launch 1000+ pods and watch the instances being provisioned
  • Once everything settles, nodes running 100+ pods are unable to consolidate; check the events.
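A Deployment plus HPA pair along these lines can drive the scale-up described above. This is a minimal sketch: the names, image, replica counts, and resource requests are illustrative assumptions, not taken from the original report; the nodeSelector and toleration match the Provisioner's worker label and taint.

```yaml
# Hypothetical reproduction manifest; names, image, and requests are illustrative
apiVersion: apps/v1
kind: Deployment
metadata:
  name: load-test
spec:
  replicas: 10
  selector:
    matchLabels:
      app: load-test
  template:
    metadata:
      labels:
        app: load-test
    spec:
      # Target the Karpenter-managed worker nodes from the Provisioner below
      nodeSelector:
        node.kubernetes.io/role: worker
      tolerations:
      - key: node.kubernetes.io/role
        operator: Equal
        value: worker
        effect: NoSchedule
      containers:
      - name: app
        image: nginx:1.25
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: load-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: load-test
  minReplicas: 10
  maxReplicas: 1000
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
```

Scaling the HPA toward its maximum should force Karpenter to provision several 4xlarge nodes; the Unconsolidatable events then appear on those nodes once traffic subsides.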

Resource Specs and Logs

Provisioner Configs

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: worker-provisioner
spec:
  consolidation:
    enabled: true
  kubeletConfiguration:
    maxPods: 500
  labels:
    node.kubernetes.io/role: worker
    nodegroup: worker
  limits:
    resources:
      cpu: "400"
      memory: 2800Gi
  providerRef:
    name: worker-awsnodetemplate
  requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values:
    - spot
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - r4.4xlarge
    - r5.4xlarge
    - r5a.4xlarge
    - m6i.4xlarge
    - r5ad.4xlarge
    - r5b.4xlarge
    - r5d.4xlarge
    - r5dn.4xlarge
    - r5n.4xlarge
    - r6i.4xlarge
  - key: kubernetes.io/os
    operator: In
    values:
    - linux
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64
  taints:
  - effect: NoSchedule
    key: node.kubernetes.io/role
    value: worker
  ttlSecondsUntilExpired: 2592000

AWSNodetemplate Configs

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: worker-awsnodetemplate
spec:
  amiFamily: AL2
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      deleteOnTermination: true
      volumeSize: 200Gi
      volumeType: gp3
  instanceProfile: eks-nodes-production-worker
  metadataOptions:
    httpTokens: optional
  securityGroupSelector:
    karpenter.sh/discovery: eks-production
  subnetSelector:
    karpenter.sh/discovery: eks-production
  tags:
    Environment: production
    Name: production-worker-karpenter-eks-node
  userData: |
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="BOUNDARY"

    --BOUNDARY
    Content-Type: text/x-shellscript; charset="us-ascii"

    #!/bin/bash -xe
    # Mount ephemeral volume if it exists
    if [[ -e /dev/nvme1n1 ]] && [[ ! $(grep /dev/nvme1n1 /proc/mounts) ]]; then
      mkfs.xfs /dev/nvme1n1
      mount /dev/nvme1n1 /var/lib/docker
      systemctl restart docker
    fi

    # Custom supplied userdata code to install ssm-agent
    yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
    systemctl enable amazon-ssm-agent
    systemctl start amazon-ssm-agent

    # Custom userdata to increase socket max connections
    echo "net.core.somaxconn=4096" >> /etc/sysctl.conf
    sysctl -p
    --BOUNDARY--

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
