Cache older index entries #1130

Merged 4 commits on Jan 7, 2019

Conversation

gouthamve
Contributor

Fixes #964

Tests pending

Signed-off-by: Goutham Veeramachaneni <[email protected]>
@gouthamve gouthamve changed the title [WIP] Cache older index entries Cache older index entries Nov 26, 2018
@gouthamve
Contributor Author

This is good for review now. I'll need to fix the flag naming (cache-older-than), but ideas there appreciated.

I'll put this in dev and report back.

Signed-off-by: Goutham Veeramachaneni <[email protected]>
@@ -60,6 +61,8 @@ type seriesStore struct {
	cardinalityCache *cache.FifoCache

	writeDedupeCache cache.Cache

	cacheLookupsOlderThan time.Duration
Contributor

Is this unused?

Contributor Author

Yup, will remove it.

@@ -58,6 +58,9 @@ type IndexQuery struct {

	// Filters for querying
	ValueEqual []byte

	// If the result of this lookup can be cached or not.
	Cacheable bool
Contributor

This might be cleaner if it's a duration and not just a bool? I.e. the schema tells us (by virtue of returning queries) how long their results are cacheable for?

Contributor Author

Yes, but the only possible answers are 0 and infinite, because how long we can cache the active entries is determined by how long we keep chunks, which the schema has no information about.

Contributor

Right, false doesn't mean "don't cache"; it means "cache based on something else", so the name is misleading. Maybe this should be called Mutable, indicating that the result will change and therefore shouldn't be cached for too long?

	cfg.memcacheClient.RegisterFlagsWithPrefix("index.", "Deprecated: Use -store.index-cache-read.*;", f)

	cfg.indexQueriesCacheConfig.RegisterFlagsWithPrefix("store.index-cache-read.", "Cache config for index entry reading. ", f)
	f.DurationVar(&cfg.IndexCacheValidity, "store.index-cache-validity", 5*time.Minute, "Cache validity for active index entries. Should be no higher than -ingester.max-chunk-idle.")
Contributor

Looks like this is no longer used?

Contributor Author

This is how long we want to cache the active entries.

Contributor

I can't see where it is used either

Contributor Author

It is being used here: https://github.com/cortexproject/cortex/pull/1130/files#diff-d479a87a51735dca31797a0bc4af42caL95 to set the valid duration for caching mutable entries.

@tomwilkie
Contributor

I was wondering if, instead of adding a caching schema wrapper, we might extend the bucket type (https://github.com/cortexproject/cortex/blob/master/pkg/chunk/schema_config.go#L233) to expose whether it's the most recent active bucket or not? Then the schemas can propagate that into their IndexQueries.

@gouthamve
Contributor Author

Yes, but a bucket's entries can be cached not based on whether it's the most recent, but rather on how stale our writes can be. Having said that, I think it can still result in a cleaner abstraction. Will check.

@tomwilkie
Contributor

a bucket's entries can be cached not based on whether it's the most recent, but rather on how stale our writes can be

Agreed, in other places (table manager) there is a grace period to deal with this.

Contributor

@bboreham bboreham left a comment

I was not expecting this to be so complicated.

	"github.com/prometheus/common/model"
)

type cachingSchema struct {
Contributor

The type is cachingSchema but the filename is schema_caching?

	return mergeCacheableAndActiveQueries(cacheableQueries, activeQueries), nil
}

func splitTimesByCacheability(from, through model.Time, cacheBefore model.Time) (model.Time, model.Time, model.Time, model.Time) {
Contributor

Is this really valuable in the presence of the caching front-end which will shard by day?

Contributor Author

Not really, if using the frontend, but the frontend is an optional component, I guess?

Contributor

Plus I'd say yes, it is: the caching frontend only matches exact queries, whereas this will match individual labels, which is useful across multiple different queries.
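
For context, the split under discussion divides a query's time range at a cutoff so that the older part can be cached for longer than the recent part. The body of splitTimesByCacheability is not shown in this thread, so the following is only a sketch of one way such a split could work. Time stands in for model.Time (a millisecond timestamp), and timeRange, splitByCacheability, and the boolean results are hypothetical.

```go
package main

// Time is a millisecond Unix timestamp, standing in for model.Time.
type Time int64

// timeRange is an inclusive range [from, through].
type timeRange struct {
	from, through Time
}

// splitByCacheability splits [from, through] at cacheBefore. Everything
// strictly before cacheBefore is treated as cacheable; the rest is active.
// The booleans report whether each part is non-empty.
func splitByCacheability(from, through, cacheBefore Time) (cacheable, active timeRange, hasCacheable, hasActive bool) {
	switch {
	case through < cacheBefore:
		// The whole range is old enough to cache.
		return timeRange{from, through}, timeRange{}, true, false
	case from >= cacheBefore:
		// The whole range is still active.
		return timeRange{}, timeRange{from, through}, false, true
	default:
		// The range straddles the cutoff: split it in two.
		return timeRange{from, cacheBefore - 1}, timeRange{cacheBefore, through}, true, true
	}
}
```

Queries generated for the cacheable part can then be shared across different user queries that touch the same labels and time buckets.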

Signed-off-by: Goutham Veeramachaneni <[email protected]>
@tomwilkie
Contributor

@gouthamve needs rebasing

Signed-off-by: Goutham Veeramachaneni <[email protected]>
@tomwilkie
Contributor

LGTM!

@tomwilkie tomwilkie merged commit b692c5f into cortexproject:master Jan 7, 2019
@gouthamve gouthamve deleted the cache-old-entries branch January 8, 2019 07:54