fix(cncf-install): use Falco chart 8.0.1, disable falcoctl downloader by bmvinay7 · Pull Request #2306 · kubestellar/console-kb

bmvinay7 · 2026-05-21T16:02:08Z

Description

Rewrites fixes/cncf-install/install-falco.json so it actually installs Falco end to end, including on clusters with restricted egress (kind, k3d, air-gapped, corporate networks blocking github.io). The previous mission was unrunnable on any current chart and crash-looped pods on the most common kind setup.

Bugs in the previous mission

1. `helm install ... --version 0.43.0` fails with `chart not found`

$ helm install falco falcosecurity/falco --namespace falco --create-namespace --version 0.43.0
Error: INSTALLATION FAILED: chart "falco" matching 0.43.0 not found in falcosecurity index ...

0.43.0 is the Falco app version. The Helm --version flag takes the chart version. Chart 8.0.1 maps to app v0.43.0:

$ helm search repo falcosecurity/falco --versions | head -5
NAME                CHART VERSION  APP VERSION
falcosecurity/falco 8.0.5          0.43.1
falcosecurity/falco 8.0.1          0.43.0
falcosecurity/falco 7.2.1          0.42.1

Same problem in the upgrade step (--version 0.44.0 is also an app version, not a chart version).
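The app-to-chart mapping can be looked up from the repo index instead of hard-coded. This is a minimal sketch, assuming the default table layout printed by helm search repo (NAME, CHART VERSION, APP VERSION, DESCRIPTION). The helper name chart_for_app_version is hypothetical and not part of the mission.

```shell
# Hypothetical helper: read `helm search repo <chart> --versions` output on
# stdin and print every chart version whose APP VERSION column equals $1.
# Assumes the default table layout, where column 2 is the chart version
# and column 3 is the app version on each data row.
chart_for_app_version() {
  awk -v app="$1" '$1 != "NAME" && $3 == app { print $2 }'
}
```

Typical use would be `helm search repo falcosecurity/falco --versions | chart_for_app_version 0.43.0`, passing the printed value to `--version`.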

2. `falcoctl-artifact-install` init container crash-loops on restricted egress

By default the chart enables the falcoctl artifact downloader, which runs as the falcoctl-artifact-install init container and fetches https://falcosecurity.github.io/falcoctl/index.yaml at pod startup. On clusters with restricted or flaky egress the request hangs and the init container errors out:

{"level":"ERROR","msg":"unable to fetch index \"falcosecurity\" with URL \"https://falcosecurity.github.io/falcoctl/index.yaml\": ... dial tcp 185.199.108.153:443: i/o timeout"}

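The pod stays in Init:CrashLoopBackOff. Falco ships its rules baked into the image, so the runtime fetch is optional. The upstream-documented fix (Rule basics for the Falco 3.0.0 Helm chart, https://falco.org/blog/rules-helm-chart-3-0-0/) is to disable both falcoctl artifact flags. Applied to the chart 8.0.1 install, that is:

helm install falco falcosecurity/falco --namespace falco --create-namespace --version 8.0.1 \
  --set falcoctl.artifact.install.enabled=false \
  --set falcoctl.artifact.follow.enabled=false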

This is upstream's documented "use the embedded ruleset" path and works on any cluster regardless of egress.
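The same two overrides can live in a values file so they are not lost when someone retypes the command. This is a sketch only: the YAML nesting simply mirrors the two --set paths above, and the helper name write_falco_offline_values and the file name in the usage note are hypothetical.

```shell
# Hypothetical helper: write the two falcoctl overrides to the file named
# by $1. The nesting mirrors falcoctl.artifact.install.enabled and
# falcoctl.artifact.follow.enabled from the --set flags.
write_falco_offline_values() {
  cat > "$1" <<'EOF'
falcoctl:
  artifact:
    install:
      enabled: false
    follow:
      enabled: false
EOF
}
```

It could then be used as `write_falco_offline_values falco-values.yaml` followed by the install command with `-f falco-values.yaml` in place of the two `--set` flags.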

3. `kubectl delete crd falcosecurity.org` is a no-op

The previous uninstall step ran kubectl delete crd falcosecurity.org. Falco does not install any CRDs at that name (or any name). The chart cleans up its own resources on helm uninstall. Removed.

What this PR ships

Install (5 steps)

1. Add the falcosecurity Helm repository.
2. Install with chart 8.0.1 and both falcoctl artifact flags disabled so the embedded ruleset is used and the pod does not need github.io egress to start.
3. Wait for the DaemonSet rollout (kubectl rollout status daemonset/falco -n falco) instead of just listing pods.
4. Tail logs to confirm the "Falco initialized with configuration files" line and the BPF engine messages.
5. Trigger a sample event by reading /etc/shadow inside a busybox pod, which fires the built-in "Sensitive file opened for reading by non-trusted program" rule. This confirms the BPF probe is attached and the embedded rules are loaded.
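The log check in the last step can be reduced to a single grep. This is a minimal sketch, assuming the alert text appears verbatim in the Falco container logs; saw_shadow_alert is a hypothetical helper name, and the rule text is the one quoted in this description.

```shell
# Rule text as quoted in the PR description.
SHADOW_RULE='Sensitive file opened for reading by non-trusted program'

# Hypothetical helper: read Falco log text on stdin and exit 0 only if
# the sensitive-file alert is present.
saw_shadow_alert() {
  grep -q "$SHADOW_RULE"
}
```

After the busybox pod has read /etc/shadow, a check could look like `kubectl logs -n falco -l app.kubernetes.io/name=falco -c falco --tail=200 | saw_shadow_alert && echo "rule fired"`. The `--tail=200` value is an arbitrary choice for illustration.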

Uninstall (3 steps)

helm uninstall -> namespace delete -> verify no Falco pods, CRDs, or RBAC remain.
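The final verification step can be sketched as a name filter over cluster-scoped resources. This assumes that any leftover would carry "falco" in its name, which is an assumption for illustration and not something the chart guarantees; falco_leftovers is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get ... -o name` output on stdin and
# print only the entries whose name mentions "falco".
# Prints nothing (and still exits 0) when the cluster is clean.
falco_leftovers() {
  grep -i 'falco' || true
}
```

For example, `kubectl get crd,clusterrole,clusterrolebinding -o name | falco_leftovers` should print nothing after `helm uninstall falco -n falco` and `kubectl delete namespace falco` have completed.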

Upgrade (3 steps)

Backup DaemonSet -> helm upgrade --version 8.0.1 --reuse-values (so the falcoctl flags persist across upgrades) -> verify rollout.
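The three upgrade steps can be chained so that a failed backup or upgrade stops the sequence. This is a sketch under stated assumptions: the function name upgrade_falco and the default backup file name are hypothetical, and the 300s timeout is borrowed from the kind run in the Validation section.

```shell
# Hypothetical wrapper around the three upgrade steps.
#   $1 = chart version (not app version), e.g. 8.0.1
#   $2 = backup file path (optional; the default name is an assumption)
upgrade_falco() {
  chart_version="$1"
  backup_file="${2:-falco-daemonset-backup.yaml}"

  # 1. Back up the live DaemonSet manifest.
  kubectl get daemonset falco -n falco -o yaml > "$backup_file" &&
  # 2. Upgrade while keeping the values already set on the release,
  #    so the falcoctl flags stay disabled.
  helm upgrade falco falcosecurity/falco --namespace falco \
    --version "$chart_version" --reuse-values &&
  # 3. Block until the new pods are rolled out.
  kubectl rollout status daemonset/falco -n falco --timeout=300s
}
```

Calling `upgrade_falco 8.0.1` runs the steps in order and returns non-zero at the first failure.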

Troubleshooting (6 entries)

chart not found when passing the app version
falcoctl-artifact-install CrashLoopBackOff with the documented fix
Image-pull stalls on slow networks
Falco running but no events (sample syscall + log grep)
Falco pod OOMKilled (memory limit bump)
BPF probe fails to attach (kmod fallback)
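As a rough triage aid, the STATUS column from kubectl get pods can be mapped to the entries above. This is only a sketch: the mapping is a heuristic (for example, Init:CrashLoopBackOff can come from an init container other than falcoctl-artifact-install), and falco_hint_for_status is a hypothetical helper name.

```shell
# Hypothetical helper: map a pod STATUS value to the troubleshooting entry
# most likely to apply. Heuristic only; always confirm with
# `kubectl logs` / `kubectl describe pod` before acting on the hint.
falco_hint_for_status() {
  case "$1" in
    Init:CrashLoopBackOff)
      echo "check falcoctl-artifact-install logs; disable both falcoctl artifact flags" ;;
    ImagePullBackOff|ErrImagePull)
      echo "image pull stalled; check registry reachability and retry" ;;
    OOMKilled)
      echo "raise the Falco container memory limit" ;;
    Running)
      echo "if no events appear, trigger a sample syscall and grep the logs" ;;
    *)
      echo "no matching entry; inspect pod events and logs" ;;
  esac
}
```

For instance, `falco_hint_for_status "$(kubectl get pods -n falco --no-headers | awk 'NR == 1 { print $3 }')"` would print a hint for the first pod listed.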

Metadata fixes

containerImages switched from a single placeholder ref to the three real images chart 8.0.1 actually pulls: docker.io/falcosecurity/falco:0.43.0, docker.io/falcosecurity/falco-driver-loader:0.43.0, docker.io/falcosecurity/falcoctl:0.12.2. Verified with docker manifest inspect.
metadata.sourceUrls.helm added.
Author/authorGithub switched to Vinay B M / bmvinay7 for the rewrite, matching precedent from PRs #2253 (Thanos), #2299 (argocd-operator), and the auto-merged #2305 (Kyverno).
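The docker manifest inspect check on the three images can be repeated in a loop. This is a minimal sketch: the image list is copied from the containerImages entry above, while the variable FALCO_IMAGES and the helper missing_images are hypothetical names.

```shell
# Images listed in the PR description for chart 8.0.1.
FALCO_IMAGES="docker.io/falcosecurity/falco:0.43.0
docker.io/falcosecurity/falco-driver-loader:0.43.0
docker.io/falcosecurity/falcoctl:0.12.2"

# Hypothetical helper: print every image whose manifest cannot be
# inspected. Prints nothing when all three references resolve.
missing_images() {
  for image in $FALCO_IMAGES; do
    docker manifest inspect "$image" >/dev/null 2>&1 || echo "$image"
  done
}
```

Running `missing_images` with registry access should produce no output if the references are still valid.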

Validation

Local CI (mirror of every PR-blocking workflow)

== validate-schema ==
[PASS] validate-schema

== kb-quality-enforcement ==
Score: 100/100 ([PASS] OK)
Breakdown:
  - clarity: 100
  - completeness: 100
  - correctness: 100
  - structure: 100
  - observability: 100
[PASS] kb-quality-enforcement

== scan-missions ==
[PASS] scan-missions   (Schema clean, no sensitive data, no malicious content)

== mission-safety-scan ==
[PASS] mission-safety-scan   (all 14 grep rules clean)

== mission-content-validation (per-step) ==
[PASS] mission-content-validation (per-step)

== mission-content-validation (live URL + crane) ==
[PASS] mission-content-validation (live)
  https://falcosecurity.github.io/charts/index.yaml -> HTTP 200

== pr-verifier (conventional commit subject) ==
[PASS] pr-verifier

== copilot-dco (Signed-off-by trailer) ==
[PASS] copilot-dco

ALL LOCAL CI GATES PASSED. Safe to push.

End-to-end on kind

Cluster: kb-test (kind v0.31.0, Kubernetes 1.36.0).

# Reproduce the falcoctl bug with the chart version corrected but the falcoctl defaults left enabled
$ helm install falco falcosecurity/falco --namespace falco --create-namespace --version 8.0.1
$ kubectl get pods -n falco
falco-c79rt   0/2   Init:CrashLoopBackOff   17 (2m9s ago)
$ kubectl logs -n falco falco-c79rt -c falcoctl-artifact-install --tail=5
{"level":"ERROR","msg":"unable to fetch index ... dial tcp 185.199.108.153:443: i/o timeout"}

# Apply the documented fix
$ helm upgrade falco falcosecurity/falco --namespace falco --version 8.0.1 \
    --set falcoctl.artifact.install.enabled=false \
    --set falcoctl.artifact.follow.enabled=false
Release "falco" has been upgraded.

$ kubectl rollout status daemonset/falco -n falco --timeout=300s
daemon set "falco" successfully rolled out

$ kubectl get pods -n falco -l app.kubernetes.io/name=falco
NAME          READY   STATUS    RESTARTS   AGE
falco-v4x2g   1/1     Running   0          81s

$ kubectl logs -n falco -l app.kubernetes.io/name=falco -c falco --tail=5
[libs]: Trying to open the right engine!
Falco initialized with configuration files
Starting health webserver with threadiness 1, listening on 0.0.0.0:8765

Type of Change

Bug fix (non-breaking change that fixes an issue)
Documentation update

Checklist

I have signed off my commits (git commit -s)
I have updated documentation as needed
I have added tests that prove my fix/feature works
All new and existing tests pass

The "tests that prove my fix/feature works" box stays unticked because the repo has no unit-test framework for missions. The CI validators (schema, scan, quality, safety, content) ARE the tests; they're already covered by "All new and existing tests pass". The kind end-to-end run above is the practical equivalent.

The previous install-falco.json was unrunnable end to end on any cluster the helm CLI could parse, and its install path crash-looped pods on every cluster with restricted egress. Two correctness bugs and one polish gap:

1. helm install falco falcosecurity/falco --version 0.43.0 fails with "no chart version found for falco-0.43.0". 0.43.0 is the Falco application version. The Helm --version flag takes the chart version. The chart version that maps to app v0.43.0 is 8.0.1 (helm search repo falcosecurity/falco --versions). Same problem in the upgrade step (--version 0.44.0 is also an app version).

2. The default chart enables the falcoctl artifact downloader, which runs as the falcoctl-artifact-install init container and fetches https://falcosecurity.github.io/falcoctl/index.yaml at pod startup. On clusters with restricted or flaky egress (kind, k3d, air-gapped, corporate networks blocking github.io) the request hangs and the init container errors with "dial tcp ...:443: i/o timeout". The pod stays in Init:CrashLoopBackOff. Falco ships its rules baked into the image, so the runtime fetch is optional. The upstream-documented fix (https://falco.org/blog/rules-helm-chart-3-0-0/) is to set falcoctl.artifact.install.enabled=false and falcoctl.artifact.follow.enabled=false. Reproduced and fixed end-to-end on a kind cluster.

3. The uninstall step ran "kubectl delete crd falcosecurity.org", which fails because Falco does not install any CRDs. The chart cleans up its own resources on helm uninstall. Removed.

This rewrite addresses all three:

- Step 2 installs chart 8.0.1 (app v0.43.0) with both falcoctl artifact flags disabled, so the embedded ruleset is used and the pod does not need github.io egress to start.
- Step 3 waits for the DaemonSet rollout instead of just listing pods, so the verification fails fast if the BPF probe cannot attach.
- Step 5 generates a known-noisy syscall (cat /etc/shadow inside a busybox pod) and greps for the corresponding "Sensitive file opened for reading by non-trusted program" Warning, which proves the BPF probe is wired up and the embedded ruleset is loaded.
- Uninstall is split into helm uninstall, namespace delete, and a final pods/CRDs/RBAC verification.
- Upgrade uses --reuse-values so the falcoctl flags stay disabled across upgrades (otherwise the next chart bump silently re-enables the runtime downloader and the pod starts crash-looping again).
- Six concrete troubleshooting entries: chart-version mismatch, falcoctl artifact-install crash loop with the documented fix, image-pull stalls on slow networks, no-events sanity test, OOMKilled, and the kmod fallback when modern-bpf cannot attach.
- metadata.containerImages now lists the three real images the chart 8.0.1 actually pulls (falco, falco-driver-loader, falcoctl) instead of the single placeholder reference.

Validated end to end on a kind cluster (kb-test, kind v0.31.0, Kubernetes 1.36.0):

- helm install ... --version 8.0.1 with both falcoctl artifact flags disabled: STATUS deployed
- kubectl rollout status daemonset/falco -n falco: rolled out
- kubectl get pods -n falco: falco-... 1/1 Running 0 (was Init:CrashLoopBackOff before the falcoctl flags were added)
- kubectl logs ... -c falco --tail=50: "Falco initialized with configuration files" + libbpf engine messages + "Events detected: N"
- kubectl run falco-test ... cat /etc/shadow: triggered the "Sensitive file opened for reading by non-trusted program" Warning event in the Falco logs

Local CI parity (scripts/local-ci.sh):

- validate-schema -> Valid kc-mission-v1
- kb-quality-enforcement -> 100/100 (clarity, completeness, correctness, structure, observability all 100)
- scan-missions -> Schema clean, no sensitive data, no malicious content
- mission-safety-scan -> all 14 grep rules clean
- mission-content-validation (per-step) -> every step has a code block, no orphan kubectl edit deployment, no orphan kubectl apply -f local-file
- mission-content-validation (live) -> Helm repo https://falcosecurity.github.io/charts/index.yaml -> HTTP 200

Signed-off-by: bmvinay7 <vinaybm1234@gmail.com>

kubestellar-prow · 2026-05-21T16:02:21Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign clubanderson for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kubestellar-hive

LGTM — thorough fix for the Falco install mission.

kubestellar-prow · 2026-05-21T16:09:57Z

@kubestellar-hive[bot]: changing LGTM is restricted to collaborators

Details

In response to this:

LGTM — thorough fix for the Falco install mission.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kubestellar-prow Bot added the dco-signoff: yes Indicates the PR's author has signed the DCO. label May 21, 2026

kubestellar-prow Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 21, 2026

kubestellar-hive Bot approved these changes May 21, 2026

kubestellar-hive Bot merged commit 4d74e56 into kubestellar:master May 21, 2026
11 of 12 checks passed

kubestellar-hive[bot] merged 1 commit into kubestellar:master from bmvinay7:fix/install-falco-chart-version-and-egress
