[Bug]: bind() to unix:/var/lib/nginx/nginx-status.sock failed (98: Address already in use)

### Version

3.6.2

### What Kubernetes platforms are you running on?

EKS Amazon

### Steps to reproduce

k8s EKS version: 1.31

Describe the bug:
Sometimes, the nginx-ingress-controller restarts the process without cleaning the socket files.
At first time we meet this problem during massive node restarting in the k8s cluster.
Then it happens randomly on weekends.

The problem was noticed in version 3.6.2. Before we used app version 3.0.2 and never had this problem

Manual Pod deletion solves the problem, but it can happen again.

Here is deployment yaml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    meta.helm.sh/release-name: nginx-inc-ingress-controller
    meta.helm.sh/release-namespace: nginx-ingress
  labels:
    app: nginx-inc-ingress-controller
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: 3.6.1
    helm.sh/chart: nginx-ingress-1.3.1
  name: nginx-inc-ingress-controller
  namespace: nginx-ingress
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx-inc-ingress-controller
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        logs.improvado.io/app: nginx-ingress
        logs.improvado.io/format: json
        logs.improvado.io/ingress-class: nginx-stable
        prometheus.io/port: "9113"
        prometheus.io/scheme: http
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        app: nginx-inc-ingress-controller
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: nginx-inc-ingress-controller
            topologyKey: kubernetes.io/hostname
      automountServiceAccountToken: true
      containers:
      - args:
        - -nginx-plus=false
        - -nginx-reload-timeout=60000
        - -enable-app-protect=false
        - -enable-app-protect-dos=false
        - -nginx-configmaps=$(POD_NAMESPACE)/nginx-inc-ingress-controller
        - -default-server-tls-secret=$(POD_NAMESPACE)/nginx-inc-ingress-controller-default-server-tls
        - -ingress-class=nginx-stable
        - -health-status=true
        - -health-status-uri=/-/health/lb
        - -nginx-debug=false
        - -v=1
        - -nginx-status=true
        - -nginx-status-port=8080
        - -nginx-status-allow-cidrs=127.0.0.1
        - -report-ingress-status
        - -enable-leader-election=true
        - -leader-election-lock-name=nginx-inc-ingress-controller-leader
        - -enable-prometheus-metrics=true
        - -prometheus-metrics-listen-port=9113
        - -prometheus-tls-secret=
        - -enable-service-insight=false
        - -service-insight-listen-port=9114
        - -service-insight-tls-secret=
        - -enable-custom-resources=true
        - -enable-snippets=true
        - -include-year=false
        - -disable-ipv6=false
        - -enable-tls-passthrough=false
        - -enable-cert-manager=false
        - -enable-oidc=false
        - -enable-external-dns=false
        - -default-http-listener-port=80
        - -default-https-listener-port=443
        - -ready-status=true
        - -ready-status-port=8081
        - -enable-latency-metrics=false
        - -ssl-dynamic-reload=true
        - -enable-telemetry-reporting=false
        - -weight-changes-dynamic-reload=false
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: 627003544259.dkr.ecr.us-east-1.amazonaws.com/nginx-inc-ingress:master-3.6.2-2-1
        imagePullPolicy: IfNotPresent
        name: ingress-controller
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
        - containerPort: 9113
          name: prometheus
          protocol: TCP
        - containerPort: 8081
          name: readiness-port
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /nginx-ready
            port: readiness-port
            scheme: HTTP
          periodSeconds: 1
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 1500m
            memory: 1500Mi
          requests:
            cpu: 100m
            memory: 1500Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 101
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/nginx
          name: nginx-etc
        - mountPath: /var/cache/nginx
          name: nginx-cache
        - mountPath: /var/lib/nginx
          name: nginx-lib
        - mountPath: /var/log/nginx
          name: nginx-log
      dnsPolicy: ClusterFirst
      initContainers:
      - command:
        - cp
        - -vdR
        - /etc/nginx/.
        - /mnt/etc
        image: 627003544259.dkr.ecr.us-east-1.amazonaws.com/nginx-inc-ingress:master-3.6.2-2-1
        imagePullPolicy: IfNotPresent
        name: init-ingress-controller
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 101
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /mnt/etc
          name: nginx-etc
      nodeSelector:
        kubernetes.io/arch: amd64
      priorityClassName: cluster-application-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      serviceAccount: nginx-inc-ingress-controller
      serviceAccountName: nginx-inc-ingress-controller
      terminationGracePeriodSeconds: 60
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 300
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 300
      - effect: NoSchedule
        key: node.kubernetes.io/memory-pressure
        operator: Exists
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: nginx-inc-ingress-controller
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
      volumes:
      - emptyDir: {}
        name: nginx-etc
      - emptyDir: {}
        name: nginx-cache
      - emptyDir: {}
        name: nginx-lib
      - emptyDir: {}
        name: nginx-log
```
Logs with error:
```
2024-11-02 10:02:58.543	2024/11/02 06:02:56 [emerg] 18#18: bind() to unix:/var/lib/nginx/nginx-status.sock failed (98: Address already in use)
2024-11-02 10:02:58.543	2024/11/02 06:02:56 [emerg] 18#18: bind() to unix:/var/lib/nginx/nginx-config-version.sock failed (98: Address already in use)
2024-11-02 10:02:58.543	2024/11/02 06:02:56 [emerg] 18#18: bind() to unix:/var/lib/nginx/nginx-502-server.sock failed (98: Address already in use)
2024-11-02 10:02:58.543	2024/11/02 06:02:56 [emerg] 18#18: bind() to unix:/var/lib/nginx/nginx-418-server.sock failed (98: Address already in use)
2024-11-02 10:02:58.543	2024/11/02 06:02:56 [notice] 18#18: try again to bind() after 500ms
2024-11-02 10:02:59.043	2024/11/02 06:02:56 [emerg] 18#18: still could not bind()
```
Here are logs, containing 1 signal reconfiguring and then a crash loop with socket busy error
[Explore-logs-2024-11-05 18_40_57.txt](https://github.com/user-attachments/files/17634341/Explore-logs-2024-11-05.18_40_57.txt)

**Expected behavior**
nginx-ingress controller pod is working.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: bind() to unix:/var/lib/nginx/nginx-status.sock failed (98: Address already in use) #6752

Version

What Kubernetes platforms are you running on?

Steps to reproduce

Sub-issues

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Bug]: bind() to unix:/var/lib/nginx/nginx-status.sock failed (98: Address already in use) #6752

Description

Version

What Kubernetes platforms are you running on?

Steps to reproduce

Sub-issues

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions