Prevent OOMs in the chunk store. #873

Merged: 6 commits into cortexproject:master, Jul 12, 2018

Conversation

tomwilkie
Contributor

@tomwilkie tomwilkie commented Jul 10, 2018

This change moves the parsing and instantiation of Chunk structs to after the intersection of their IDs. On very high cardinality timeseries (~8m), we were finding that queriers would OOM just building the array of chunks from the index, even when the intersection of the label matchers was relatively small.

We also add a limit to the number of chunks fetched in a single query.

The big casualty here is the removal of support for metadataInIndex chunks. A code comment says these were last used in Nov 2016, so this is probably safe.
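
A minimal sketch of the reordering described above, under stated assumptions: `Chunk`, `intersectSorted`, `chunksForQuery`, `errTooManyChunks` and the `maxChunks` parameter are hypothetical names chosen for illustration and are not taken from the PR. It assumes each matcher yields a sorted, deduplicated slice of chunk IDs. The ID sets are intersected as plain strings first, and `Chunk` values are built only for the IDs that survive and fit under the limit.

```go
package main

import (
	"errors"
	"fmt"
)

// Chunk is a stand-in for the real chunk struct; only the ID matters here.
type Chunk struct {
	ID string
}

// errTooManyChunks is a hypothetical error for the per-query chunk limit.
var errTooManyChunks = errors.New("query would fetch too many chunks")

// intersectSorted returns the IDs present in both inputs.
// Both inputs are assumed to be sorted and free of duplicates.
func intersectSorted(a, b []string) []string {
	var out []string
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

// chunksForQuery intersects the per-matcher ID sets first, then
// instantiates Chunk values only for the IDs that survive.
func chunksForQuery(idSets [][]string, maxChunks int) ([]Chunk, error) {
	if len(idSets) == 0 {
		return nil, nil
	}
	ids := idSets[0]
	for _, set := range idSets[1:] {
		ids = intersectSorted(ids, set)
	}
	// Enforce the limit before allocating any Chunk values.
	if len(ids) > maxChunks {
		return nil, errTooManyChunks
	}
	chunks := make([]Chunk, 0, len(ids))
	for _, id := range ids {
		// Real code would parse the ID into a full chunk here.
		chunks = append(chunks, Chunk{ID: id})
	}
	return chunks, nil
}

func main() {
	sets := [][]string{
		{"a", "b", "c", "d"},
		{"b", "d", "e"},
	}
	chunks, err := chunksForQuery(sets, 10)
	fmt.Println(chunks, err)
}
```

In this sketch the expensive step (building `Chunk` values) scales with the size of the intersection rather than with the size of the largest per-matcher result.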

Signed-off-by: Tom Wilkie [email protected]

- Merge and dedupe sets of strings, not sets of parsed chunks.
- Limit number of chunks fetched in a single query.
- Add lots more debug logging to the querier to help track this all down.

Signed-off-by: Tom Wilkie <[email protected]>
@tomwilkie
Contributor Author

After deploying this we're seeing about a 2x reduction in steady-state memory usage, and almost complete elimination of the spikes that were causing us problems. NB this was deployed alongside #713, but when #713 was deployed on its own we didn't see the same reductions.

Contributor

@bboreham bboreham left a comment

Looking good - some small points

}

var chunkSet ByKey
func (c *Store) parseIndexEntries(ctx context.Context, entries []IndexEntry, matcher *labels.Matcher) ([]string, error) {
Contributor

I think it's worth retaining the comment that entries are returned in order, and here seems to be the right place for it.

Contributor Author

Done

func (cs ByKey) Less(i, j int) bool { return lessByKey(cs[i], cs[j]) }

// This comparison uses all the same information as Chunk.ExternalKey()
func lessByKey(a, b chunk.Chunk) bool {
Contributor

This function was a good idea when it saved the creation of millions of temp strings in queries; as part of a test it looks over-complicated.

@@ -21,6 +21,12 @@ import (
"github.com/weaveworks/common/user"
)

func init() {
var al util.AllowedLevel
al.Set("debug")
Contributor

< cough >

@@ -168,6 +170,10 @@ func (c *Store) calculateDynamoWrites(userID string, chunks []Chunk) (WriteBatch

// Get implements ChunkStore
func (c *Store) Get(ctx context.Context, from, through model.Time, allMatchers ...*labels.Matcher) (model.Matrix, error) {

logger := util.WithContext(ctx, util.Logger)
level.Debug(logger).Log("msg", "ChunkStore.Get", "from", from, "through", through, "matchers", len(allMatchers))
Contributor

This log line is very similar to the tracing log line below; perhaps we should have a wrapper so we don't repeat very similar code?
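
One possible shape for such a wrapper, as a sketch only: `kvLogger`, `memLogger` and `teeLogger` are hypothetical names, and both destinations are modelled with the same key/value `Log` method rather than the real logging and tracing libraries. The point is that a single call site can feed both the debug log and the trace span.

```go
package main

import (
	"fmt"
	"strings"
)

// kvLogger is a hypothetical minimal interface: anything that accepts
// alternating key/value pairs. Both the debug logger and the trace
// span are modelled with it in this sketch.
type kvLogger interface {
	Log(keyvals ...interface{}) error
}

// memLogger is an in-memory stand-in that records one line per call.
type memLogger struct {
	lines []string
}

func (m *memLogger) Log(keyvals ...interface{}) error {
	parts := make([]string, 0, len(keyvals)/2)
	for i := 0; i+1 < len(keyvals); i += 2 {
		parts = append(parts, fmt.Sprintf("%v=%v", keyvals[i], keyvals[i+1]))
	}
	m.lines = append(m.lines, strings.Join(parts, " "))
	return nil
}

// teeLogger forwards every call to both destinations, so a call site
// writes its key/value pairs once instead of twice.
type teeLogger struct {
	debug kvLogger
	span  kvLogger
}

func (t teeLogger) Log(keyvals ...interface{}) error {
	if err := t.debug.Log(keyvals...); err != nil {
		return err
	}
	return t.span.Log(keyvals...)
}

func main() {
	debug, span := &memLogger{}, &memLogger{}
	log := teeLogger{debug: debug, span: span}
	_ = log.Log("msg", "ChunkStore.Get", "matchers", 2)
	fmt.Println(debug.lines, span.lines)
}
```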

filters, matchers := util.SplitFiltersAndMatchers(allMatchers)
chunks, err := c.lookupChunksByMetricName(ctx, from, through, matchers, metricName)
if err != nil {
return nil, err
}

level.Debug(logger).Log("func", "ChunkStore.getMetricNameChunks", "msg", "Chunks in index", "n", len(chunks))
Contributor

"msg" and "n" seem like an unnecessary level of indirection

}

// Merge entries in order because we wish to keep label series together consecutively
chunkIDs := nWayIntersectStrings(chunkIDSets)
Contributor

Rather than assembling a list then breaking it down to intersect, I'd think it is cheaper to do the intersecting as the results come in. But that could be a subsequent change.
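
A rough sketch of that alternative, assuming each per-matcher lookup delivers a sorted, deduplicated slice of chunk IDs over a channel. `intersectSorted` and `intersectAsTheyArrive` are hypothetical helpers written for this illustration, not code from the PR. Only the running intersection is retained, so memory is bounded by the smallest result seen so far rather than by the sum of all results.

```go
package main

import "fmt"

// intersectSorted returns the IDs present in both inputs.
// Both inputs are assumed to be sorted and free of duplicates.
func intersectSorted(a, b []string) []string {
	var out []string
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

// intersectAsTheyArrive consumes ID sets from the channel and keeps
// only the running intersection instead of collecting every set first.
func intersectAsTheyArrive(results <-chan []string) []string {
	var running []string
	first := true
	for set := range results {
		if first {
			running = set
			first = false
			continue
		}
		running = intersectSorted(running, set)
	}
	return running
}

func main() {
	results := make(chan []string)
	go func() {
		defer close(results)
		results <- []string{"a", "b", "c"}
		results <- []string{"b", "c", "d"}
		results <- []string{"c", "e"}
	}()
	fmt.Println(intersectAsTheyArrive(results))
}
```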

@tomwilkie
Contributor Author

@bboreham PTAL?

Contributor

@bboreham bboreham left a comment

LGTM!

@@ -166,20 +169,44 @@ func (c *Store) calculateDynamoWrites(userID string, chunks []Chunk) (WriteBatch
return writeReqs, nil
}

type spanLogger struct {
Contributor

Could we add a comment describing the intent of this struct?

@tomwilkie tomwilkie merged commit 45dc92f into cortexproject:master Jul 12, 2018
@tomwilkie tomwilkie deleted the chunk-store-oom branch July 12, 2018 10:01
@tomwilkie
Contributor Author

Thanks!
