Operator crashloops when trying to access a restricted namespace for label resolving #2429

@yildizozgur

Describe the bug
We use watchNamespaceSelector, so the operator watches resources across namespaces in the cluster. While a RoleBinding for the operator exists in a watched namespace, the operator works. After we undeploy all resources from that namespace (including the RoleBinding for the operator), the operator logs errors and its pods go into CrashLoopBackOff. With the older version, v5.19.4, this worked without errors. With the current version, the pods fail constantly.

2026-01-08T08:21:34Z	error	controller-runtime.cache.UnhandledError	Failed to watch	{"reflector": "k8s.io/client-go@v0.34.3/tools/cache/reflector.go:290", "type": "*v1.Secret", "error": "failed to list *v1.Secret: secrets is forbidden: User \"system:serviceaccount:obs-grafana-operator-qa:grafana-operator-sa\" cannot list resource \"secrets\" in API group \"\" in the namespace \"obs-monitoring-qa\""}
k8s.io/apimachinery/pkg/util/runtime.logError
	k8s.io/apimachinery@v0.35.0/pkg/util/runtime/runtime.go:221
k8s.io/apimachinery/pkg/util/runtime.handleError
	k8s.io/apimachinery@v0.35.0/pkg/util/runtime/runtime.go:212
k8s.io/apimachinery/pkg/util/runtime.HandleErrorWithContext
	k8s.io/apimachinery@v0.35.0/pkg/util/runtime/runtime.go:198
k8s.io/client-go/tools/cache.DefaultWatchErrorHandler
	k8s.io/client-go@v0.34.3/tools/cache/reflector.go:205
k8s.io/client-go/tools/cache.(*Reflector).RunWithContext.func1
	k8s.io/client-go@v0.34.3/tools/cache/reflector.go:361
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/backoff.go:233
k8s.io/apimachinery/pkg/util/wait.BackoffUntilWithContext.func1
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/backoff.go:255
k8s.io/apimachinery/pkg/util/wait.BackoffUntilWithContext
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/backoff.go:256
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/backoff.go:233
k8s.io/client-go/tools/cache.(*Reflector).RunWithContext
	k8s.io/client-go@v0.34.3/tools/cache/reflector.go:359
k8s.io/client-go/tools/cache.(*controller).RunWithContext.(*Group).StartWithContext.func3
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/wait.go:63
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/wait.go:72

2026-01-08T08:35:48Z	error	error received after stop sequence was engaged	{"error": "failed to wait for grafanadashboard caches to sync kind source: *v1.ConfigMap: timed out waiting for cache to be synced for Kind *v1.ConfigMap"}
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/manager/internal.go:517
2026-01-08T08:35:48Z	info	Starting Controller	{"controller": "grafanamutetiming", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaMuteTiming"}
2026-01-08T08:35:48Z	info	Starting workers	{"controller": "grafanamutetiming", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaMuteTiming", "worker count": 1}
2026-01-08T08:35:48Z	info	Shutdown signal received, waiting for all workers to finish	{"controller": "grafanamutetiming", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaMuteTiming"}
2026-01-08T08:35:48Z	info	All workers finished	{"controller": "grafanamutetiming", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaMuteTiming"}
2026-01-08T08:35:48Z	error	controller-runtime.source.Kind	failed to get informer from cache	{"error": "Timeout: failed waiting for *v1.Secret Informer to sync"}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:80
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/loop.go:53
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/loop.go:54
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:68
2026-01-08T08:35:48Z	error	controller-runtime.source.Kind	failed to get informer from cache	{"error": "Timeout: failed waiting for *v1.Secret Informer to sync"}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:80
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/loop.go:53
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/loop.go:54
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
	k8s.io/apimachinery@v0.35.0/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/internal/source/kind.go:68
2026-01-08T08:35:48Z	error	error received after stop sequence was engaged	{"error": "failed to wait for grafanadatasource caches to sync kind source: *v1.ConfigMap: timed out waiting for cache to be synced for Kind *v1.ConfigMap"}
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/manager/internal.go:517
2026-01-08T08:35:48Z	info	Starting Controller	{"controller": "grafananotificationpolicy", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationPolicy"}
2026-01-08T08:35:48Z	info	Starting workers	{"controller": "grafananotificationpolicy", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationPolicy", "worker count": 1}
2026-01-08T08:35:48Z	info	Shutdown signal received, waiting for all workers to finish	{"controller": "grafananotificationpolicy", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationPolicy"}
2026-01-08T08:35:48Z	info	All workers finished	{"controller": "grafananotificationpolicy", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationPolicy"}
2026-01-08T08:35:48Z	error	error received after stop sequence was engaged	{"error": "failed to wait for grafana caches to sync kind source: *v1.HTTPRoute: timed out waiting for cache to be synced for Kind *v1.HTTPRoute"}
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1
	sigs.k8s.io/controller-runtime@v0.22.4/pkg/manager/internal.go:517
2026-01-08T08:35:48Z	info	Starting Controller	{"controller": "grafanacontactpoint", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaContactPoint"}
2026-01-08T08:35:48Z	info	Starting workers	{"controller": "grafanacontactpoint", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaContactPoint", "worker count": 1}
2026-01-08T08:35:48Z	info	Shutdown signal received, waiting for all workers to finish	{"controller": "grafanacontactpoint", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaContactPoint"}
2026-01-08T08:35:48Z	info	All workers finished	{"controller": "grafanacontactpoint", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaContactPoint"}
2026-01-08T08:35:48Z	info	Starting Controller	{"controller": "grafanalibrarypanel", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaLibraryPanel"}
2026-01-08T08:35:48Z	info	Starting workers	{"controller": "grafanalibrarypanel", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaLibraryPanel", "worker count": 1}
2026-01-08T08:35:48Z	info	Starting Controller	{"controller": "grafananotificationtemplate", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationTemplate"}
2026-01-08T08:35:48Z	info	Shutdown signal received, waiting for all workers to finish	{"controller": "grafanalibrarypanel", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaLibraryPanel"}
2026-01-08T08:35:48Z	info	Starting Controller	{"controller": "grafanaserviceaccount", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaServiceAccount"}
2026-01-08T08:35:48Z	info	Starting workers	{"controller": "grafananotificationtemplate", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationTemplate", "worker count": 1}
2026-01-08T08:35:48Z	info	All workers finished	{"controller": "grafanalibrarypanel", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaLibraryPanel"}
2026-01-08T08:35:48Z	info	Shutdown signal received, waiting for all workers to finish	{"controller": "grafananotificationtemplate", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationTemplate"}
2026-01-08T08:35:48Z	info	All workers finished	{"controller": "grafananotificationtemplate", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaNotificationTemplate"}
2026-01-08T08:35:48Z	info	Starting workers	{"controller": "grafanaserviceaccount", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaServiceAccount", "worker count": 1}
2026-01-08T08:35:48Z	info	Shutdown signal received, waiting for all workers to finish	{"controller": "grafanaserviceaccount", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaServiceAccount"}
2026-01-08T08:35:48Z	info	All workers finished	{"controller": "grafanaserviceaccount", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaServiceAccount"}
2026-01-08T08:35:48Z	info	Stopping and waiting for caches
2026-01-08T08:35:48Z	info	Stopping and waiting for webhooks
2026-01-08T08:35:48Z	info	Stopping and waiting for HTTP servers
2026-01-08T08:35:48Z	info	shutting down server	{"name": "health probe", "addr": "[::]:8081"}
2026-01-08T08:35:48Z	info	shutting down server	{"name": "pprof", "addr": "[::]:8888"}
2026-01-08T08:35:48Z	info	controller-runtime.metrics	Shutting down metrics server with timeout of 1 minute
2026-01-08T08:35:48Z	info	Wait completed, proceeding to shutdown the manager
2026-01-08T08:35:48Z	error	setup	problem running operator	{"version": "v5.21.4", "error": "failed to wait for grafanafolder caches to sync kind source: *v1beta1.GrafanaFolder: timed out waiting for cache to be synced for Kind *v1beta1.GrafanaFolder"}
main.main
	github.com/grafana/grafana-operator/v5/main.go:436
runtime.main
	runtime/proc.go:285
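
For triage, the first error in the log above already names the service account, verb, resource, and namespace that RBAC rejected. The following is a minimal sketch, not part of the operator, that extracts those fields from the unescaped error message and builds a `kubectl auth can-i` command to confirm the missing permission. The `Denial` type and the `ParseDenial` and `CanICommand` helpers are hypothetical names introduced here for illustration, and the regular expression assumes the message wording shown in this log.

```go
package main

import (
	"fmt"
	"regexp"
)

// forbiddenRe matches the RBAC denial text as it appears in the log above
// (after JSON unescaping). The wording is taken from this issue's log only.
var forbiddenRe = regexp.MustCompile(
	`User "system:serviceaccount:([^:"]+):([^"]+)" cannot (\w+) resource "([^"]+)" in API group "([^"]*)" in the namespace "([^"]+)"`)

// Denial holds the fields of one denial message (hypothetical helper type).
type Denial struct {
	SANamespace string // namespace of the service account
	SAName      string // name of the service account
	Verb        string // denied verb, e.g. "list"
	Resource    string // denied resource, e.g. "secrets"
	APIGroup    string // empty for the core API group
	Namespace   string // namespace where access was denied
}

// ParseDenial extracts the denial fields from msg; ok is false if msg
// does not contain a service account denial in the expected form.
func ParseDenial(msg string) (d Denial, ok bool) {
	m := forbiddenRe.FindStringSubmatch(msg)
	if m == nil {
		return Denial{}, false
	}
	return Denial{
		SANamespace: m[1], SAName: m[2], Verb: m[3],
		Resource: m[4], APIGroup: m[5], Namespace: m[6],
	}, true
}

// CanICommand builds a kubectl command that checks the same permission
// while impersonating the service account.
func CanICommand(d Denial) string {
	res := d.Resource
	if d.APIGroup != "" {
		res += "." + d.APIGroup // resource.group form for non-core groups
	}
	return fmt.Sprintf("kubectl auth can-i %s %s --as=system:serviceaccount:%s:%s -n %s",
		d.Verb, res, d.SANamespace, d.SAName, d.Namespace)
}

func main() {
	msg := `failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:obs-grafana-operator-qa:grafana-operator-sa" cannot list resource "secrets" in API group "" in the namespace "obs-monitoring-qa"`
	if d, ok := ParseDenial(msg); ok {
		fmt.Println(CanICommand(d))
	}
}
```

Running the printed command against the cluster should answer `no` while the RoleBinding is absent from the watched namespace, which matches the state in which the crash loop occurs.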

Version
5.21.4

To Reproduce
Steps to reproduce the behavior:

  1. Enable watchNamespaceSelector with a specific label, and set that label on a namespace other than the one the operator is deployed in.
  2. Redeploy or restart the operator pods.
  3. Check the operator pods' state and logs.
  4. See the error.

Labels
triage/accepted
