Samples can get out of order between distributor and ingester #670


Open
bboreham opened this issue Jan 25, 2018 · 8 comments

Comments

@bboreham
Contributor

For some instances we see "sample timestamp out of order for series" in our logs, with a gap between the previous and new timestamps of 15 or 30 seconds.
If this were going wrong in the sending Prometheus we would see the same error on all ingester replicas. We do not: the errors are reported sporadically, on one ingester at a time. From this I deduce the reordering is happening inside Cortex.

Here is my best theory: suppose some client Prometheus has hundreds of samples queued up for remote write. Then the following can happen (sketched in code after the list):

  • Prometheus sends 100 samples to the distributor.
  • The distributor replicates the data three times and fires up three goroutines to deliver it.
  • Once two of the calls have returned from ingesters, the distributor returns success to Prometheus.
  • The third call continues, on its goroutine.
  • Prometheus sends the next 100 samples; the distributor (likely on another node) fires up another three goroutines.
  • One of those goroutines can overtake the third one from the previous call.
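A minimal Go sketch of that race, assuming a quorum-of-two write. The names and signatures here are illustrative, not the actual Cortex code:

```go
package main

import "fmt"

// pushToIngester stands in for the real gRPC Push call to one ingester
// (illustrative name and signature, not the Cortex API).
func pushToIngester(ingester int, batch string) error {
	fmt.Printf("ingester %d got %s\n", ingester, batch)
	return nil
}

// replicatePush fires one goroutine per replica and returns to the caller
// (i.e. reports success to Prometheus) as soon as two of the three replicas
// acknowledge. The slowest goroutine keeps running after the function
// returns, so the next batch, possibly handled by a different distributor,
// can overtake it on the shared ingester and trigger
// "sample timestamp out of order" there.
func replicatePush(batch string) {
	done := make(chan error, 3)
	for i := 0; i < 3; i++ {
		go func(ingester int) {
			done <- pushToIngester(ingester, batch)
		}(i)
	}
	// Wait for a quorum of two acknowledgements, then return; the third
	// call is still in flight in the background.
	for acks := 0; acks < 2; acks++ {
		<-done
	}
}

func main() {
	replicatePush("batch-1")
	replicatePush("batch-2") // may reach the straggler's ingester first
}
```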
@tomwilkie
Contributor

This can indeed happen; is it a problem? This error shouldn't get through to the user.

@gouthamve
Contributor

Hrm, I'd leave this open: it just hit me that the scrape interval is 15s, so unless the goroutine is hanging around for 15s, this should never happen.

@bboreham
Contributor Author

@gouthamve the context is "suppose some client Prometheus has hundreds of samples queued up for remote write", e.g. the sender is catching up after a network outage. Consecutive batches then go out back-to-back, so the straggler goroutine doesn't need to hang around for 15s for the overtake to happen.

We don't see this so much now, maybe because I put a cap on the number of shards in the sender.
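For reference, the shard cap mentioned above lives in the sending Prometheus's remote_write queue configuration. A minimal sketch, with an illustrative endpoint and value:

```yaml
remote_write:
  - url: http://cortex.example/api/prom/push   # illustrative endpoint
    queue_config:
      max_shards: 10   # cap the number of concurrent remote-write shards
```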

@weeco
Contributor

weeco commented Mar 23, 2020

I do see this as well. I just noticed that distributors reported hundreds of "sample out of order" errors after one or two ingesters were consuming significantly more CPU than usual. Sometimes I also observe this behaviour during rolling updates of ingesters.

[screenshot attached: 2020-03-23 at 10:45:52]

@bboreham
Contributor Author

@weeco do you have the other conditions described?
Does the error message give two timestamps close to one scrape interval apart?
Does your sending Prometheus have a lot of data queued up?

@garo
Contributor

garo commented Apr 17, 2020

I have what looks to be a similar situation. I have three Prometheus clusters (all with just one replica) sending data to the same Cortex (all sharing the same tenant/orgid), but just one of them is seeing "sample out of order" errors. This Prometheus has 133157 series in total, yet it's seeing errors from just a handful of metrics: scrape_duration_seconds, scrape_samples_post_metric_relabeling, scrape_samples_scraped, scrape_series_added and "up".

Here's an example error message:
ts=2020-04-17T09:19:10.407113092Z caller=grpc_logging.go:38 method=/cortex.Ingester/Push duration=399.64µs err="rpc error: code = Code(400) desc = user=services: sample timestamp out of order; last timestamp: 1587115147.329, incoming timestamp: 1587115141.08 for series {__name__=\"scrape_duration_seconds\", __prom=\"eks-services-prod\", endpoint=\"https-metrics\", instance=\"10.107.166.33:10250\", job=\"kubelet\", namespace=\"kube-system\", node=\"ip-10-107-166-33.ec2.internal\", service=\"mon-prometheus-operator-kubelet\"}" msg="gRPC\n"

If I compare the timestamps in the error messages, the difference is either 0.036 or 6.25 seconds. The sending Prometheus version is 2.17.1.

@garo
Contributor

garo commented Apr 17, 2020

I am actually also seeing "sample with repeated timestamp but different value" from another tenant. This error comes from the same file as the "sample timestamp out of order" error: https://github.com/cortexproject/cortex/blob/master/pkg/ingester/series.go#L71

@pracucci
Contributor

I am actually also seeing sample with repeated timestamp but different value from another tenant.

@garo This should be unrelated. The most common cause is a tenant remote-writing clashing series. This can happen when you have relabelling rules in Prometheus which remove some labels from series, leading to clashes: e.g. if you have series_1{a="1",b="2"} and series_1{a="1",b="3"} and a relabelling rule that removes the label b, you end up with two clashing copies of series_1{a="1"}. I'm mentioning it because this is an issue I've already seen a few times with our customers.
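For illustration, a write_relabel_configs rule of the kind that can produce this, matching the hypothetical example above (endpoint and label names are illustrative):

```yaml
remote_write:
  - url: http://cortex.example/api/prom/push   # illustrative endpoint
    write_relabel_configs:
      # Dropping label "b" collapses series_1{a="1",b="2"} and
      # series_1{a="1",b="3"} into the same series series_1{a="1"},
      # so the ingester sees repeated timestamps with different values.
      - action: labeldrop
        regex: b
```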
