Collect shoot cluster events via k8sobjects receiver #77
Open
nickytd wants to merge 3 commits into gardener:main
- Reconcile a shoot access secret (NewShootAccessSecret) so the OTel Collector can authenticate to the shoot API server.
- Deploy a ClusterRole + ClusterRoleBinding into the shoot cluster via a dedicated ManagedResource (shootManagedResourceName) granting get/list/watch on events.k8s.io/events.
- Mount the generic shoot kubeconfig volume on the Collector pod and set KUBECONFIG env var so the k8sobjects receiver can use it.
- Add k8sobjects/events receiver (auth_type: kubeConfig, watch mode) and a transform/events processor that strips managedFields from the event body.
- Wire a logs/events pipeline: k8sobjects/events → resource, memoryLimiter, transform/events, batch → exporters.
- Delete/wait for shoot ManagedResource before removing shoot access secret in Delete to avoid orphaned RBAC in the shoot cluster.
- Fix Migrate: call SetKeepObjects on the shoot ManagedResource before delegating to Delete, so shoot RBAC objects are preserved during control-plane migration and the target seed can reconcile them cleanly.

- Tighten excludePath to '*_shoot--*_event-logger_*.log' to avoid accidentally excluding unrelated containers in non-shoot namespaces.
- Bump fluent-bit-plugin image from v1.3.0 to v1.4.0.

Add a routing connector that splits incoming logs by body type:
- IsMap(body) → logs/objects pipeline (k8s object events)
- default → logs/string pipeline (plain string logs)

Each pipeline runs a transform processor to set the _msg attribute before batching. Remove redundant processors: [] from logs/fanin.
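
The routing described in the last commit might look roughly like the Collector configuration below. This is a sketch under stated assumptions, not the contents of the example file: the processor names transform/objects and transform/string, the otlp receiver, the debug exporter, and the use of the event's note field as the message are illustrative choices, and the context/condition form of the routing table assumes a reasonably recent routing connector.

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}

connectors:
  routing:
    # Anything that matches no table entry goes to the string pipeline.
    default_pipelines: [logs/string]
    table:
      - context: log
        condition: IsMap(body)          # structured k8s object records
        pipelines: [logs/objects]

processors:
  batch: {}
  transform/objects:                    # hypothetical name
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Assumption: watch-mode records keep the Kubernetes object under
          # body["object"], and the event text lives in its "note" field.
          - set(attributes["_msg"], body["object"]["note"])
  transform/string:                     # hypothetical name
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - set(attributes["_msg"], body)

exporters:
  debug: {}                             # placeholder exporter

service:
  pipelines:
    logs/fanin:
      receivers: [otlp]
      exporters: [routing]
    logs/objects:
      receivers: [routing]
      processors: [transform/objects, batch]
      exporters: [debug]
    logs/string:
      receivers: [routing]
      processors: [transform/string, batch]
      exporters: [debug]
```

Because the connector acts as an exporter for logs/fanin and as a receiver for the two downstream pipelines, logs/fanin needs no processors of its own, which is consistent with dropping the empty processors list.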
/kind enhancement
What this PR does / why we need it:
Shoot cluster events (`events.k8s.io/v1`) are currently collected by the proprietary `event-logger` component, which tails container logs and emits events in a non-standard, Gardener-specific log format. This format is opaque to upstream observability tooling and makes it impossible to forward shoot events to external OTel-native backends without custom parsing.

This PR replaces that approach by adding a `k8sobjects` receiver to the OTel Collector that watches `events.k8s.io/v1` events directly from the shoot API server in real time (watch mode). Events are emitted as structured OTel log records, enriched with standard resource attributes, and forwarded through the existing exporter pipeline, so any upstream OTel-native receiver can consume them without custom parsing or format translation.

As a consequence, the Fluent Bit `ClusterInput` is updated to explicitly exclude `event-logger` container logs from collection, since events are now sourced directly via the `k8sobjects` receiver.

Changes:
- Shoot access secret: A `ShootAccessSecret` (`shoot-access-otelcol`) is reconciled in the shoot namespace. Gardener's token projector keeps this secret's kubeconfig up to date with a short-lived token, giving the Collector a secure, auto-rotating credential to the shoot API server.
- Shoot RBAC (`ManagedResource: <namespace>-shoot`): A `ClusterRole` and a `ClusterRoleBinding` are deployed into the shoot cluster, granting `get/list/watch` on `events.k8s.io/events` to the Collector's service account. Only the minimum required permissions are requested.
- Kubeconfig volume: The shoot generic kubeconfig (projected secret + access token) is mounted into the Collector pod at `gardenerutils.VolumeMountPathGenericKubeconfig`, and `KUBECONFIG` is set to `gardenerutils.PathGenericKubeconfig` so the `k8sobjects` receiver picks it up automatically via `auth_type: kubeConfig`.
- `k8sobjects/events` receiver: Configured with `auth_type: kubeConfig`, watching `events.k8s.io/events` in watch mode for low-latency delivery.
- `transform/events` processor: Strips `managedFields` from the event body before forwarding, reducing payload size and noise.
- `logs/events` pipeline: `k8sobjects/events → resource → memoryLimiter → transform/events → batch → exporters`. The resource processor enriches events with `k8s.cluster.name`, `gardener.project.name`, and `gardener.shoot.name`, the same attributes already applied to all other signals.
- Lifecycle correctness:
  - `Delete`: the shoot `ManagedResource` is deleted and waited on before the shoot access secret is removed, ensuring the `ManagedResource` controller can still authenticate to the shoot while cleaning up the RBAC objects.
  - `Migrate`: calls `SetKeepObjects(true)` on the shoot `ManagedResource` before delegating to `Delete`, so the shoot RBAC objects are preserved on the shoot cluster during control-plane migration and the target seed can take ownership without re-creating them.
- Fluent Bit (separate commit): excludes `event-logger` container logs from the `ClusterInput` (pattern `*_shoot--*_event-logger_*.log`) since events are now collected via the `k8sobjects` receiver. Also bumps the plugin image to `v1.4.0`.
- Example (separate commit): adds a routing connector to `examples/opentelemetry-receiver.yaml` that splits incoming logs by body type (structured k8s object logs vs plain string logs), with dedicated `transform` processors setting `_msg`.

Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
- `events.k8s.io/v1` events are collected; core `v1.Event` (legacy API group `""`) is intentionally excluded, as modern Kubernetes components write exclusively to `events.k8s.io`.
- The `k8sobjects` receiver uses `auth_type: kubeConfig`, which reads `$KUBECONFIG`. This is the standard Gardener pattern for shoot-cluster access from seed workloads.
- The `ManagedResource` uses `keepObjects: false` during normal operation; `SetKeepObjects(true)` is set only transiently during `Migrate` before the resource is deleted from the old seed.

```bash
kubectl -n shoot--local--local port-forward statefulset/otelcol 8888:8888
curl -s localhost:8888/metrics | grep -E 'receiver="k8sobjects/events"'
```
A non-zero `otelcol_receiver_accepted_log_records_total{receiver="k8sobjects/events"}` confirms events are being received.
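
For orientation, the part of the Collector configuration that this check exercises could be sketched as the fragment below. It is reconstructed from the description above rather than copied from the PR. The `resource`, `memory_limiter`, and `batch` processors and the exporter are assumed to be defined elsewhere; `memory_limiter` is the upstream component type, which the description refers to as `memoryLimiter`; and the `body["object"]` path assumes the receiver's watch-mode record layout.

```yaml
receivers:
  k8sobjects/events:
    auth_type: kubeConfig          # uses the kubeconfig referenced by $KUBECONFIG
    objects:
      - name: events
        group: events.k8s.io       # events.k8s.io/v1 only, not the legacy core group
        mode: watch                # stream changes instead of polling

processors:
  transform/events:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          # Assumption: in watch mode the Kubernetes object sits under body["object"].
          - delete_key(body["object"]["metadata"], "managedFields")

service:
  pipelines:
    logs/events:
      receivers: [k8sobjects/events]
      processors: [resource, memory_limiter, transform/events, batch]
      exporters: [debug]           # placeholder; the PR reuses the existing exporters
```

If the counter stays at zero, the usual suspects under this sketch are a missing or expired kubeconfig at the `$KUBECONFIG` path and missing `get/list/watch` permissions on `events.k8s.io/events` in the shoot.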
Release note:
```feature operator
Replace proprietary event-logger-based shoot event collection with the OTel
k8sobjects receiver. Events are now collected directly from the shoot API server
as structured OTel log records (events.k8s.io/v1), enriched with cluster,
project, and shoot name attributes, and forwarded in standard OTel format to
any configured upstream receiver without custom parsing.
```