Skip to content

Query the entire ingester range for series. #3347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

gouthamve
Copy link
Contributor

With prometheus/prometheus#8050 we shouldn't be
paging in chunks for the series API lookups. This would mean much lower
memory utilisation and is similar to doing a range query for that
period with the same query.

With that optimisation, we should allow series API to query all of the
ingester instead of just head block.

Signed-off-by: Goutham Veeramachaneni [email protected]

What this PR does:

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@gouthamve
Copy link
Contributor Author

Depending on #3345

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit dubious about always querying 0, math.MaxInt64. We have clusters with several days retention on ingesters and we would end up querying all ingesters blocks every time, even when the start/end time range is way smaller (think about "Last 1h" dashboards).

What if we try to do something smarter, like:

  • If start/end time range is within the retention period, honor start/end time range
  • Otherwise, query only the head (we would return series/labels different than what the user expect anyway)

@gouthamve gouthamve force-pushed the query-all-ingester-for-series branch from 2a9abd1 to 302b226 Compare October 29, 2020 12:02
@pull-request-size pull-request-size bot added size/M and removed size/S labels Oct 29, 2020
@gouthamve gouthamve force-pushed the query-all-ingester-for-series branch 2 times, most recently from 35062b8 to 1824eed Compare October 29, 2020 16:13
@pull-request-size pull-request-size bot added size/L and removed size/M labels Oct 30, 2020
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, thanks. LGTM! Could you add a CHANGELOG entry, please?

Note: @pstibrany also plan to review it by the end of the day. Do you mind waiting for his review too, please?

@@ -414,6 +414,291 @@ func TestQuerierWithBlocksStorageRunningInSingleBinaryMode(t *testing.T) {
}
}

func TestMetadataQueriesWithBlocksStorage(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test has been copied from the above ones. I think you could just add your assertions to the two tests above. You could create a function to assert on metadata queries, and call it from the two tests above.

@pstibrany
Copy link
Contributor

Note: @pstibrany also plan to review it by the end of the day. Do you mind waiting for his review too, please?

By my initial quick look it looks good, but haven't fully checked the integration test yet. I will do that later today.

gouthamve and others added 5 commits November 2, 2020 15:40
With prometheus/prometheus#8050 we shouldn't be
paging in chunks for the series API lookups. This would mean much lower
memory utilisation and is similar to doing a range query for that
period with the same query.

With that optimisation, we should allow series API to query all of the
ingester instead of just head block.

Signed-off-by: Goutham Veeramachaneni <[email protected]>
Signed-off-by: Goutham Veeramachaneni <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
@pracucci pracucci force-pushed the query-all-ingester-for-series branch from dc3755d to ab92eaf Compare November 2, 2020 14:40
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks!

}
}

func testMetadataQueriesWithBlocksStorage(t *testing.T, c *e2ecortex.Client, series1, series2, series3 []prompb.TimeSeries, blockRangePeriod time.Duration) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I've found it bit confusing to express from and to in terms of series, and would suggest passing three points in time instead, relevant to the test:

  • minT of oldest block in ingester
  • minT of head
  • maxT of head

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason I passed the timeseries is because we also need labels (eg. series3[0].Labels)

@pracucci pracucci merged commit 0f80974 into cortexproject:master Nov 3, 2020
@gouthamve gouthamve deleted the query-all-ingester-for-series branch November 3, 2020 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants