Query store for series lookups #3461

Merged
merged 2 commits into from
Nov 9, 2020
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
* [ENHANCEMENT] Added `cortex_alertmanager_config_hash` metric to expose hash of Alertmanager Config loaded per user. #3388
* [ENHANCEMENT] Query-Frontend / Query-Scheduler: New component called "Query-Scheduler" has been introduced. Query-Scheduler is simply a queue of requests, moved outside of Query-Frontend. This allows Query-Frontend to be scaled separately from number of queues. To make Query-Frontend and Querier use Query-Scheduler, they need to be started with `-frontend.scheduler-address` and `-querier.scheduler-address` options respectively. #3374
* [ENHANCEMENT] Query-frontend / Querier / Ruler: added `-querier.max-query-lookback` to limit how long back data (series and metadata) can be queried. This setting can be overridden on a per-tenant basis and is enforced in the query-frontend, querier and ruler. #3452 #3458
+ * [ENHANCEMENT] Querier: added `-querier.query-store-for-labels-enabled` to query store for series API. Only works with blocks storage engine. #3461
* [BUGFIX] Blocks storage ingester: fixed some cases leading to a TSDB WAL corruption after a partial write to disk. #3423
* [BUGFIX] Blocks storage: Fix the race between ingestion and `/flush` call resulting in overlapping blocks. #3422
* [BUGFIX] Querier: fixed `-querier.max-query-into-future` which wasn't correctly enforced on range queries. #3452
Expand Down
2 changes: 1 addition & 1 deletion docs/api/_index.md
Expand Up @@ -279,7 +279,7 @@ GET,POST <prometheus-http-prefix>/api/v1/series
GET,POST <legacy-http-prefix>/api/v1/series
```

- Find series by label matchers. Differently than Prometheus and due to scalability and performances reasons, Cortex currently ignores the `start` and `end` request parameters and always fetches the series from in-memory data stored in the ingesters.
+ Find series by label matchers. Differently than Prometheus and due to scalability and performances reasons, Cortex currently ignores the `start` and `end` request parameters and always fetches the series from in-memory data stored in the ingesters. There is experimental support to query the long-term store with the *blocks* storage engine when `-querier.query-store-for-labels-enabled` is set.

_For more information, please check out the Prometheus [series endpoint](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers) documentation._

Expand Down
5 changes: 5 additions & 0 deletions docs/blocks-storage/querier.md
Expand Up @@ -110,6 +110,11 @@ querier:
# CLI flag: -querier.query-ingesters-within
[query_ingesters_within: <duration> | default = 0s]

+ # Query long-term store for series, label values and label names APIs. Works
+ # only with blocks engine.
+ # CLI flag: -querier.query-store-for-labels-enabled
+ [query_store_for_labels_enabled: <boolean> | default = false]
+
# The time after which a metric should only be queried from storage and not
# just ingesters. 0 means all queries are sent to store. When running the
# blocks storage, if this option is enabled, the time range of the query sent
Expand Down
5 changes: 5 additions & 0 deletions docs/configuration/config-file-reference.md
Expand Up @@ -751,6 +751,11 @@ The `querier_config` configures the Cortex querier.
# CLI flag: -querier.query-ingesters-within
[query_ingesters_within: <duration> | default = 0s]

+ # Query long-term store for series, label values and label names APIs. Works
+ # only with blocks engine.
+ # CLI flag: -querier.query-store-for-labels-enabled
+ [query_store_for_labels_enabled: <boolean> | default = false]
+
# The time after which a metric should only be queried from storage and not just
# ingesters. 0 means all queries are sent to store. When running the blocks
# storage, if this option is enabled, the time range of the query sent to the
Expand Down
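As a rough illustration of how a boolean option like this one behaves on the command line, the sketch below registers `-querier.query-store-for-labels-enabled` on a standalone `flag.FlagSet` using only the Go standard library. The `config` type and the `parse` helper are hypothetical stand-ins, not Cortex's `querier.Config`; only the flag name, its default and its help text are taken from this PR.

```go
package main

import (
	"flag"
	"fmt"
)

// config is a hypothetical, trimmed-down stand-in for the querier config.
type config struct {
	QueryStoreForLabels bool
}

// registerFlags registers the flag with the name, default and help text
// shown in the diff.
func (c *config) registerFlags(f *flag.FlagSet) {
	f.BoolVar(&c.QueryStoreForLabels, "querier.query-store-for-labels-enabled", false,
		"Query long-term store for series, label values and label names APIs. Works only with blocks engine.")
}

// parse is a hypothetical helper that parses args into a fresh config.
func parse(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("querier", flag.ContinueOnError)
	cfg.registerFlags(fs)
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	off, _ := parse(nil)
	on, _ := parse([]string{"-querier.query-store-for-labels-enabled=true"})
	fmt.Println(off.QueryStoreForLabels, on.QueryStoreForLabels)
}
```

With no arguments the option stays at its default of `false`, so the existing ingester-only behaviour is unchanged unless the flag is set explicitly.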
3 changes: 2 additions & 1 deletion docs/configuration/v1-guarantees.md
Expand Up @@ -24,7 +24,7 @@ The Cortex maintainers commit to ensuring future version of Cortex can read data
Cortex strives to be 100% API compatible with Prometheus (under `/api/prom/*`); any deviation from this is considered a bug, except:

- Requiring the `__name__` label on queries when querying the [chunks storage](../chunks-storage/_index.md) (queries to ingesters or clusters running the blocks storage are not affected).
- - For queries to the `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/{name}/values` endpoints, query's time range is ignored and the data is always fetched from ingesters.
+ - For queries to the `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/{name}/values` endpoints, query's time range is ignored and the data is always fetched from ingesters. There is experimental support to query the long-term store with the *blocks* storage engine when `-querier.query-store-for-labels-enabled` is set.
- Additional API endpoints for creating, removing and modifying alerts and recording rules.
- Additional API around pushing metrics (under `/api/push`).
- Additional API endpoints for management of Cortex itself, such as the ring. These APIs are not part of the any compatibility guarantees.
Expand Down Expand Up @@ -56,3 +56,4 @@ Currently experimental features are:
- OpenStack Swift storage support.
- Metric relabeling in the distributor.
- Scalable query-frontend (when using query-scheduler)
+ - Querying store for series, labels APIs (`-querier.query-store-for-labels-enabled`)
2 changes: 1 addition & 1 deletion docs/guides/limitations.md
Expand Up @@ -42,4 +42,4 @@ The Cortex chunks storage doesn't support queries without a metric name, like `c

## Query series and labels

- When running queries to the `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/{name}/values` endpoints, query's time range is ignored and the data is always fetched from ingesters.
+ When running queries to the `/api/v1/series`, `/api/v1/labels` and `/api/v1/label/{name}/values` endpoints, query's time range is ignored and the data is always fetched from ingesters. There is experimental support to query the long-term store with the *blocks* storage engine when `-querier.query-store-for-labels-enabled` is set.
27 changes: 24 additions & 3 deletions integration/querier_test.go
Expand Up @@ -82,6 +82,7 @@ func TestQuerierWithBlocksStorageRunningInMicroservicesMode(t *testing.T) {
"-store-gateway.sharding-strategy": testCfg.blocksShardingStrategy,
"-store-gateway.tenant-shard-size": fmt.Sprintf("%d", testCfg.tenantShardSize),
"-querier.ingester-streaming": strconv.FormatBool(testCfg.ingesterStreamingEnabled),
+ "-querier.query-store-for-labels-enabled": "true",
})

// Start dependencies.
Expand Down Expand Up @@ -293,6 +294,7 @@ func TestQuerierWithBlocksStorageRunningInSingleBinaryMode(t *testing.T) {
"-blocks-storage.bucket-store.index-cache.backend": testCfg.indexCacheBackend,
"-blocks-storage.bucket-store.index-cache.memcached.addresses": "dns+" + memcached.NetworkEndpoint(e2ecache.MemcachedPort),
"-querier.ingester-streaming": strconv.FormatBool(testCfg.ingesterStreamingEnabled),
+ "-querier.query-store-for-labels-enabled": "true",
// Ingester.
"-ring.store": "consul",
"-consul.hostname": consul.NetworkHTTPEndpoint(),
Expand Down Expand Up @@ -432,6 +434,7 @@ func testMetadataQueriesWithBlocksStorage(
var (
lastSeriesInIngesterBlocksName = getMetricName(lastSeriesInIngesterBlocks.Labels)
firstSeriesInIngesterHeadName = getMetricName(firstSeriesInIngesterHead.Labels)
+ lastSeriesInStorageName = getMetricName(lastSeriesInStorage.Labels)

lastSeriesInStorageTs = util.TimeFromMillis(lastSeriesInStorage.Samples[0].Timestamp)
lastSeriesInIngesterBlocksTs = util.TimeFromMillis(lastSeriesInIngesterBlocks.Samples[0].Timestamp)
Expand Down Expand Up @@ -471,6 +474,10 @@ func testMetadataQueriesWithBlocksStorage(
lookup: lastSeriesInIngesterBlocksName,
ok: false,
},
+ {
+ lookup: lastSeriesInStorageName,
+ ok: false,
+ },
},
labelValuesTests: []labelValuesTest{
{
Expand All @@ -493,6 +500,10 @@ func testMetadataQueriesWithBlocksStorage(
ok: true,
resp: lastSeriesInIngesterBlocks.Labels,
},
+ {
+ lookup: lastSeriesInStorageName,
+ ok: false,
+ },
},
labelValuesTests: []labelValuesTest{
{
Expand All @@ -502,7 +513,7 @@ func testMetadataQueriesWithBlocksStorage(
},
labelNames: []string{labels.MetricName, lastSeriesInIngesterBlocksName},
},
- "query metadata partially inside the ingester range should return the head + local disk data": {
+ "query metadata partially inside the ingester range": {
from: lastSeriesInStorageTs.Add(-blockRangePeriod),
to: firstSeriesInIngesterHeadTs.Add(blockRangePeriod),
seriesTests: []seriesTest{
Expand All @@ -516,6 +527,11 @@ func testMetadataQueriesWithBlocksStorage(
ok: true,
resp: lastSeriesInIngesterBlocks.Labels,
},
+ {
+ lookup: lastSeriesInStorageName,
+ ok: true,
+ resp: lastSeriesInStorage.Labels,
+ },
},
labelValuesTests: []labelValuesTest{
{
Expand All @@ -525,9 +541,9 @@ func testMetadataQueriesWithBlocksStorage(
},
labelNames: []string{labels.MetricName, lastSeriesInIngesterBlocksName, firstSeriesInIngesterHeadName},
},
- "query metadata entirely outside the ingester range should return the head data only": {
+ "query metadata entirely outside the ingester range should return the head data as well": {
from: lastSeriesInStorageTs.Add(-2 * blockRangePeriod),
- to: lastSeriesInStorageTs.Add(-blockRangePeriod),
+ to: lastSeriesInStorageTs,
seriesTests: []seriesTest{
{
lookup: firstSeriesInIngesterHeadName,
Expand All @@ -538,6 +554,11 @@ func testMetadataQueriesWithBlocksStorage(
lookup: lastSeriesInIngesterBlocksName,
ok: false,
},
+ {
+ lookup: lastSeriesInStorageName,
+ ok: true,
+ resp: lastSeriesInStorage.Labels,
+ },
},
labelValuesTests: []labelValuesTest{
{
Expand Down
14 changes: 11 additions & 3 deletions pkg/querier/blocks_store_queryable.go
Expand Up @@ -382,7 +382,7 @@ func (q *blocksStoreQuerier) selectSorted(sp *storage.SelectHints, matchers ...*

// Fetch series from stores. If an error occur we do not retry because retries
// are only meant to cover missing blocks.
- seriesSets, queriedBlocks, warnings, numChunks, err := q.fetchSeriesFromStores(spanCtx, clients, minT, maxT, matchers, convertedMatchers, maxChunksLimit, leftChunksLimit)
+ seriesSets, queriedBlocks, warnings, numChunks, err := q.fetchSeriesFromStores(spanCtx, sp, clients, minT, maxT, matchers, convertedMatchers, maxChunksLimit, leftChunksLimit)
if err != nil {
return storage.ErrSeriesSet(err)
}
Expand Down Expand Up @@ -433,6 +433,7 @@ func (q *blocksStoreQuerier) selectSorted(sp *storage.SelectHints, matchers ...*

func (q *blocksStoreQuerier) fetchSeriesFromStores(
ctx context.Context,
+ sp *storage.SelectHints,
clients map[BlocksStoreClient][]ulid.ULID,
minT int64,
maxT int64,
Expand All @@ -459,7 +460,13 @@ func (q *blocksStoreQuerier) fetchSeriesFromStores(
blockIDs := blockIDs

g.Go(func() error {
- req, err := createSeriesRequest(minT, maxT, convertedMatchers, blockIDs)
+ // See: https://github.com/prometheus/prometheus/pull/8050
+ // TODO(goutham): we should ideally be passing the hints down to the storage layer
+ // and let the TSDB return us data with no chunks as in prometheus#8050.
+ // But this is an acceptable workaround for now.
+ skipChunks := sp != nil && sp.Func == "series"
+
+ req, err := createSeriesRequest(minT, maxT, convertedMatchers, skipChunks, blockIDs)
if err != nil {
return errors.Wrapf(err, "failed to create series request")
}
Expand Down Expand Up @@ -546,7 +553,7 @@ func (q *blocksStoreQuerier) fetchSeriesFromStores(
return seriesSets, queriedBlocks, warnings, int(numChunks.Load()), nil
}

- func createSeriesRequest(minT, maxT int64, matchers []storepb.LabelMatcher, blockIDs []ulid.ULID) (*storepb.SeriesRequest, error) {
+ func createSeriesRequest(minT, maxT int64, matchers []storepb.LabelMatcher, skipChunks bool, blockIDs []ulid.ULID) (*storepb.SeriesRequest, error) {
// Selectively query only specific blocks.
hints := &hintspb.SeriesRequestHints{
BlockMatchers: []storepb.LabelMatcher{
Expand All @@ -569,6 +576,7 @@ func createSeriesRequest(minT, maxT int64, matchers []storepb.LabelMatcher, bloc
Matchers: matchers,
PartialResponseStrategy: storepb.PartialResponseStrategy_ABORT,
Hints: anyHints,
+ SkipChunks: skipChunks,
}, nil
}

Expand Down
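The `skipChunks` decision in this file can be sketched in isolation as follows. This is only an illustration under stated assumptions: `selectHints` and `seriesRequest` are hypothetical stand-ins for `storage.SelectHints` and `storepb.SeriesRequest`, reduced to the fields the condition touches, and `newSeriesRequest` is an invented helper name.

```go
package main

import "fmt"

// selectHints is a hypothetical stand-in for Prometheus' storage.SelectHints.
type selectHints struct {
	Func string
}

// seriesRequest is a hypothetical stand-in for the store-gateway request.
type seriesRequest struct {
	MinTime, MaxTime int64
	SkipChunks       bool
}

// newSeriesRequest applies the same condition as the diff: chunks are
// skipped only when hints are present and mark the call as a series lookup.
func newSeriesRequest(sp *selectHints, minT, maxT int64) seriesRequest {
	skipChunks := sp != nil && sp.Func == "series"
	return seriesRequest{MinTime: minT, MaxTime: maxT, SkipChunks: skipChunks}
}

func main() {
	fmt.Println(newSeriesRequest(nil, 0, 10).SkipChunks)                          // nil hints: chunks are fetched
	fmt.Println(newSeriesRequest(&selectHints{Func: "series"}, 0, 10).SkipChunks) // series lookup: chunks are skipped
	fmt.Println(newSeriesRequest(&selectHints{Func: "rate"}, 0, 10).SkipChunks)   // regular query: chunks are fetched
}
```

Note that nil hints do not skip chunks here, so only callers that pass `Func` set to `"series"` get the labels-only response from the store.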
26 changes: 15 additions & 11 deletions pkg/querier/querier.go
Expand Up @@ -39,6 +39,7 @@ type Config struct {
IngesterStreaming bool `yaml:"ingester_streaming"`
MaxSamples int `yaml:"max_samples"`
QueryIngestersWithin time.Duration `yaml:"query_ingesters_within"`
+ QueryStoreForLabels bool `yaml:"query_store_for_labels_enabled"`

// QueryStoreAfter the time after which queries should also be sent to the store and not just ingesters.
QueryStoreAfter time.Duration `yaml:"query_store_after"`
Expand Down Expand Up @@ -84,6 +85,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.BoolVar(&cfg.IngesterStreaming, "querier.ingester-streaming", true, "Use streaming RPCs to query ingester.")
f.IntVar(&cfg.MaxSamples, "querier.max-samples", 50e6, "Maximum number of samples a single query can load into memory.")
f.DurationVar(&cfg.QueryIngestersWithin, "querier.query-ingesters-within", 0, "Maximum lookback beyond which queries are not sent to ingester. 0 means all queries are sent to ingester.")
+ f.BoolVar(&cfg.QueryStoreForLabels, "querier.query-store-for-labels-enabled", false, "Query long-term store for series, label values and label names APIs. Works only with blocks engine.")
f.DurationVar(&cfg.MaxQueryIntoFuture, "querier.max-query-into-future", 10*time.Minute, "Maximum duration into the future you can query. 0 to disable.")
f.DurationVar(&cfg.DefaultEvaluationInterval, "querier.default-evaluation-interval", time.Minute, "The default evaluation interval or step size for subqueries.")
f.DurationVar(&cfg.QueryStoreAfter, "querier.query-store-after", 0, "The time after which a metric should only be queried from storage and not just ingesters. 0 means all queries are sent to store. When running the blocks storage, if this option is enabled, the time range of the query sent to the store will be manipulated to ensure the query end is not more recent than 'now - query-store-after'.")
Expand Down Expand Up @@ -218,13 +220,14 @@ func NewQueryable(distributor QueryableWithFilter, stores []QueryableWithFilter,
}

q := querier{
- ctx: ctx,
- mint: mint,
- maxt: maxt,
- chunkIterFn: chunkIterFn,
- tombstonesLoader: tombstonesLoader,
- limits: limits,
- maxQueryIntoFuture: cfg.MaxQueryIntoFuture,
+ ctx: ctx,
+ mint: mint,
+ maxt: maxt,
+ chunkIterFn: chunkIterFn,
+ tombstonesLoader: tombstonesLoader,
+ limits: limits,
+ maxQueryIntoFuture: cfg.MaxQueryIntoFuture,
+ queryStoreForLabels: cfg.QueryStoreForLabels,
}

dqr, err := distributor.Querier(ctx, mint, maxt)
Expand Down Expand Up @@ -266,9 +269,10 @@ type querier struct {
ctx context.Context
mint, maxt int64

- tombstonesLoader *purger.TombstonesLoader
- limits *validation.Overrides
- maxQueryIntoFuture time.Duration
+ tombstonesLoader *purger.TombstonesLoader
+ limits *validation.Overrides
+ maxQueryIntoFuture time.Duration
+ queryStoreForLabels bool
}

// Select implements storage.Querier interface.
Expand All @@ -287,7 +291,7 @@ func (q querier) Select(_ bool, sp *storage.SelectHints, matchers ...*labels.Mat
// querying the long-term storage.
// Also, in the recent versions of Prometheus, we pass in the hint but with Func set to "series".
// See: https://github.com/prometheus/prometheus/pull/8050
- if sp == nil || sp.Func == "series" {
+ if (sp == nil || sp.Func == "series") && !q.queryStoreForLabels {
// In this case, the query time range has already been validated when the querier has been
// created.
return q.metadataQuerier.Select(true, sp, matchers...)
Expand Down
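The changed condition in `Select` can be read as a small truth table. The sketch below reproduces it with a hypothetical `selectHints` type and an invented `useMetadataQuerierOnly` function; it is not the Cortex implementation, only the boolean expression from the diff evaluated for a few inputs.

```go
package main

import "fmt"

// selectHints is a hypothetical stand-in for Prometheus' storage.SelectHints.
type selectHints struct {
	Func string
}

// useMetadataQuerierOnly mirrors the new condition: metadata requests (nil
// hints, or Func == "series") are served by the ingester-backed metadata
// querier only while querying the store for labels is disabled.
func useMetadataQuerierOnly(sp *selectHints, queryStoreForLabels bool) bool {
	return (sp == nil || sp.Func == "series") && !queryStoreForLabels
}

func main() {
	cases := []struct {
		name    string
		sp      *selectHints
		enabled bool
	}{
		{"nil hints, flag off", nil, false},
		{"series hints, flag off", &selectHints{Func: "series"}, false},
		{"series hints, flag on", &selectHints{Func: "series"}, true},
		{"rate hints, flag off", &selectHints{Func: "rate"}, false},
	}
	for _, c := range cases {
		fmt.Printf("%-24s metadata querier only: %v\n", c.name, useMetadataQuerierOnly(c.sp, c.enabled))
	}
}
```

With the flag off the behaviour matches the pre-PR code; with it on, metadata requests fall through to the regular path that also consults the store queriers.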