Description
When using memberlist, the ring service starts before the ring information has propagated, so queries fail with "empty ring" right after a querier has started.
The ring service tries to make sure the ring data is populated before the other modules start (see #4068), but this does not work well with memberlist, because its service only joins the cluster once it is already running:
cortex/pkg/ring/kv/memberlist/memberlist_client.go
Lines 445 to 450 in 7bc21e5
func (m *KV) running(ctx context.Context) error {
	// Join the cluster, if configured. We want this to happen in Running state, because started memberlist
	// is good enough for usage from Client (which checks for Running state), even before it connects to the cluster.
	if len(m.cfg.JoinMembers) > 0 {
		// Lookup SRV records for given addresses to discover members.
The comment claiming that a started memberlist "is good enough for usage from Client (which checks for Running state), even before it connects to the cluster" does not seem to hold, because the service state flips to Running before the running function is called:
cortex/pkg/util/services/basic_service.go
Lines 182 to 191 in 7bc21e5
b.mustSwitchState(Starting, Running, func() {
	// unblock waiters waiting for Running state
	close(b.runningWaitersCh)
	b.notifyListeners(func(l Listener) { l.Running() }, false)
})
stoppingFrom = Running
if b.runningFn != nil {
	err = b.runningFn(b.serviceContext)
}
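The ordering in the excerpt above can be illustrated with a small, self-contained model. This is only a sketch and not the Cortex code: `toyService`, `onRunning`, `joined`, and `observeAtRunning` are hypothetical names, and the "join" is simulated by setting a boolean. It shows that anything notified on the switch to Running observes the service before the running function has done its work.

```go
package main

import "fmt"

// State is a simplified stand-in for the service states (assumption, not the real type).
type State int

const (
	Starting State = iota
	Running
)

// toyService is a hypothetical single-goroutine model of the ordering in the
// excerpt: the state switches to Running first, then the running function runs.
type toyService struct {
	state     State
	joined    bool   // set once the simulated cluster join has finished
	onRunning func() // called on the Starting -> Running switch
}

// run mirrors the order of operations in the excerpt.
func (s *toyService) run() {
	// 1. Switch state and notify waiters/listeners.
	s.state = Running
	if s.onRunning != nil {
		s.onRunning()
	}
	// 2. Only now does the running function execute (here: the simulated join).
	s.joined = true
}

// observeAtRunning records what a listener sees at the moment it is told
// that the service is Running.
func observeAtRunning() (stateAtNotify State, joinedAtNotify bool) {
	s := &toyService{state: Starting}
	s.onRunning = func() {
		stateAtNotify = s.state
		joinedAtNotify = s.joined
	}
	s.run()
	return stateAtNotify, joinedAtNotify
}

func main() {
	st, joined := observeAtRunning()
	fmt.Printf("running=%v joined=%v\n", st == Running, joined)
}
```

In this model the listener sees a Running service that has not joined anything yet, which matches the behavior described in this issue.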
When this happens we can also see this log line (which should not happen):
Line 269 in 7bc21e5
level.Info(r.logger).Log("msg", "ring doesn't exist in KV store yet")
To Reproduce
Use memberlist and restart queriers.
Expected behavior
Queriers should start processing requests only after the ring information has been propagated.
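One way to picture the expected behavior is a readiness gate that polls until the ring holds at least one instance. The following is a minimal sketch under stated assumptions, not a proposed patch: `waitForRing`, `instances`, `errRingStillEmpty`, and the demo helpers are hypothetical names, and the ring is modeled as a plain function returning an instance count.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errRingStillEmpty is a hypothetical error for a gate that timed out.
var errRingStillEmpty = errors.New("ring still empty")

// waitForRing blocks until instances() reports at least one instance, or
// until ctx is done. instances is an assumed callback, not a Cortex API.
func waitForRing(ctx context.Context, instances func() int, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if instances() > 0 {
			return nil
		}
		select {
		case <-ctx.Done():
			return errRingStillEmpty
		case <-ticker.C:
			// Poll again on the next iteration.
		}
	}
}

// demoPopulated simulates a ring that is empty for two checks and then
// receives three instances. It returns how many checks were needed.
func demoPopulated() (int, error) {
	calls := 0
	instances := func() int {
		calls++
		if calls < 3 {
			return 0
		}
		return 3
	}
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	err := waitForRing(ctx, instances, time.Millisecond)
	return calls, err
}

// demoTimeout simulates a ring that never gets populated.
func demoTimeout() error {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return waitForRing(ctx, func() int { return 0 }, time.Millisecond)
}

func main() {
	calls, err := demoPopulated()
	fmt.Println("checks:", calls, "err:", err)
	fmt.Println("timeout err:", demoTimeout())
}
```

A gate like this only helps if it runs after the memberlist join has had a chance to complete; otherwise it would simply time out on an empty ring.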
Environment:
- Infrastructure: [e.g., Kubernetes, bare-metal, laptop]
- Deployment tool: [e.g., helm, jsonnet]
Storage Engine
- Blocks
- Chunks
Additional Context