Issue#138 add probes and resource limits #139

Open
amaanx86 wants to merge 9 commits into oracle:main from amaanx86:138-add-probes-and-resource-limits

Conversation

@amaanx86

Fix Cloud Guard Container Security Findings by Adding Health Probes

Subject

Resolve OCI Cloud Guard findings (missing health probes) in OCI Native Ingress Controller by adding readiness and liveness probes to the deployment template.

Problem Statement

Cloud Guard was flagging multiple Container Security findings against the OCI Native Ingress Controller deployed as an OKE managed add-on, specifically:

  • Container without readiness probe (Medium risk)
  • Container without liveness probe (Medium risk)

This issue persisted across our OCI environment and could not be resolved by end users, because the controller is deployed as a managed add-on whose Deployment spec cannot be safely modified. The same findings would also affect Helm-based deployments.

Solution

Added TCP socket-based health probes to the Helm deployment template:

Readiness Probe:

  • Port: webhook-server (9443)
  • Initial Delay: 30 seconds (allows controller startup time)
  • Period: 10 seconds (frequent health checks)
  • Timeout: 5 seconds
  • Failure Threshold: 3 attempts

Liveness Probe:

  • Port: webhook-server (9443)
  • Initial Delay: 60 seconds (allows stabilization)
  • Period: 20 seconds
  • Timeout: 5 seconds
  • Failure Threshold: 3 attempts

Implementation Details

  • Probes use TCP socket checks on the webhook server port (simpler, more reliable than HTTP for control plane components)
  • Conservative timing prevents flapping while ensuring quick failure detection
  • No breaking changes; existing deployments will inherit health probes from updated Helm chart

Testing

  • Verified probes are correctly configured in deployment template
  • Tested with probe timings to ensure no false positives
  • Cloud Guard findings should resolve after deployment update

Relates To

Closes #138

Commits

  1. Fix #138: Add readiness probe to OCI Native Ingress Controller
  2. Fix #138: Add liveness probe to OCI Native Ingress Controller
  3. Fix #138: Document health probes configuration in values.yaml

Add TCP socket readiness probe on webhook-server port (9443) for CloudGuard compliance and operational reliability.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>

Add TCP socket liveness probe on webhook-server port (9443) for CloudGuard compliance and operational reliability. Ensures container restarts automatically if unhealthy.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>

Add comments explaining readiness and liveness probe behavior for Cloud Guard compliance.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>

oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on Feb 14, 2026
AkarshES previously approved these changes Feb 16, 2026
@amaanx86
Author

Thank you, @AkarshES, for approving the changes. Let me know what else is required to finalize this pull request!

nirpai previously approved these changes Feb 18, 2026
@amaanx86 amaanx86 dismissed stale reviews from nirpai and AkarshES via d433363 February 18, 2026 17:10
… server

Replace TCP socket probes on webhook-server with HTTP GET endpoints
(/healthz/ready for readiness, /healthz/live for liveness) that connect
to the metrics server port.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>
Signal to the health checker that all informer caches have been synced
after setup, enabling readiness checks to report ready status.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>
…racking

Register /healthz/ready and /healthz/live HTTP endpoints on the metrics
server and mark controllers as ready after initialization for proper
health probe support.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>
…iveness probes

Implement HealthChecker with endpoints for tracking cache synchronization
and controller readiness status. Provides /healthz/ready and /healthz/live
handlers for Kubernetes probe support.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>
@amaanx86 amaanx86 force-pushed the 138-add-probes-and-resource-limits branch from d433363 to 0373bae Compare February 18, 2026 17:19
@amaanx86
Author

Hi @nirpai, @AkarshES,
As requested, I’ve replaced the webhook TCP probes with dedicated HTTP health check endpoints and controller readiness logic.

Helm Deployment Test

Controller deployed from my branch image:
https://hub.docker.com/r/amaanx86/oci-native-ingress-controller

image

Health endpoints verified inside the pod

image

Ingress Functional Test

image

Document HTTP readiness and liveness endpoints on the metrics server.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>
AkarshES previously approved these changes Feb 20, 2026
@AkarshES AkarshES left a comment

Thanks for adding the HTTP health check as Niranjan requested.

@amaanx86 amaanx86 requested a review from nirpai February 20, 2026 07:50
nirpai previously approved these changes Feb 20, 2026
# maxUnavailable: 1

# The TCP port the Webhook server binds to. (default 9443)
# Health probes for operational reliability and Cloud Guard compliance
Contributor

The comment is misplaced on webhook Port.

Author

I have moved it to the metrics server section, @nirpai.

Requesting approval on the PR, @nirpai @AkarshES.

Relocate health probe documentation from webhookBindPort to the metrics
section where the probes actually connect.

Signed-off-by: Amaan Ul Haq Siddiqui <amaanulhaq.s@outlook.com>
@amaanx86 amaanx86 dismissed stale reviews from nirpai and AkarshES via 11faf4a February 20, 2026 10:09
@amaanx86 amaanx86 requested review from AkarshES and nirpai February 20, 2026 10:13
@amaanx86
Author

Hi @nirpai @AkarshES, can we merge this now?

@AkarshES

We are doing some validation in an OKE environment so that we can go ahead and merge this change. Thank you for your effort and patience on this.

@amaanx86
Author

@AkarshES Thank you for keeping me updated.

Labels

OCA Verified All contributors have signed the Oracle Contributor Agreement.

Development

Successfully merging this pull request may close these issues.

OCI Native Ingress Controller (OKE managed add-on) flagged by Cloud Guard for missing probes and resource limits

3 participants