Diskcache: querier-local SSD backed chunk cache #685
- Caches don't know about chunks anymore, it's all just []byte.
- Deal with empty memcached host by returning a noopCache, not special casing nil client.
- Make background writes a cache 'middleware'.
- s/memcache/memcached/.
- Refactor tests.
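To illustrate the shape of this refactoring, here is a minimal sketch, not the code from the PR. The `Cache` interface, its method signatures, `mapCache` (an in-memory stand-in for a memcached client), `newCache`, `newBackground`, and `runDemo` are all hypothetical names chosen for the example. It shows a byte-oriented cache, a noop cache returned for an empty host, and background writes layered on as a wrapper.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Cache is a hypothetical byte-oriented interface: it never sees a Chunk type.
type Cache interface {
	Store(ctx context.Context, key string, buf []byte) error
	Fetch(ctx context.Context, key string) (buf []byte, ok bool)
	Stop()
}

// noopCache stores nothing and misses on every fetch.
type noopCache struct{}

func (noopCache) Store(context.Context, string, []byte) error  { return nil }
func (noopCache) Fetch(context.Context, string) ([]byte, bool) { return nil, false }
func (noopCache) Stop()                                        {}

// mapCache is an in-memory stand-in for a real memcached client.
type mapCache struct {
	mu sync.Mutex
	m  map[string][]byte
}

func (c *mapCache) Store(_ context.Context, key string, buf []byte) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key] = buf
	return nil
}

func (c *mapCache) Fetch(_ context.Context, key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	buf, ok := c.m[key]
	return buf, ok
}

func (c *mapCache) Stop() {}

// newCache returns a noopCache for an empty host, so callers never nil-check.
func newCache(host string) Cache {
	if host == "" {
		return noopCache{}
	}
	return &mapCache{m: map[string][]byte{}}
}

type write struct {
	key string
	buf []byte
}

// backgroundCache is middleware: it queues writes and applies them on a goroutine.
type backgroundCache struct {
	Cache
	writes chan write
	wg     sync.WaitGroup
}

func newBackground(next Cache, queueLen int) *backgroundCache {
	c := &backgroundCache{Cache: next, writes: make(chan write, queueLen)}
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		for w := range c.writes {
			_ = c.Cache.Store(context.Background(), w.key, w.buf)
		}
	}()
	return c
}

// Store enqueues without blocking; when the queue is full the write is dropped.
func (c *backgroundCache) Store(_ context.Context, key string, buf []byte) error {
	select {
	case c.writes <- write{key, buf}:
	default:
	}
	return nil
}

// Stop drains queued writes, then stops the wrapped cache.
// Store must not be called after Stop.
func (c *backgroundCache) Stop() {
	close(c.writes)
	c.wg.Wait()
	c.Cache.Stop()
}

// runDemo stores one value through the middleware and reads it back
// after the write queue has drained.
func runDemo(host string) (string, bool) {
	ctx := context.Background()
	c := newBackground(newCache(host), 8)
	_ = c.Store(ctx, "chunk-1", []byte("payload"))
	c.Stop()
	buf, ok := c.Fetch(ctx, "chunk-1")
	return string(buf), ok
}

func main() {
	fmt.Println(runDemo("localhost:11211"))
	fmt.Println(runDemo(""))
}
```

Because the wrapper embeds `Cache`, fetches pass straight through to the wrapped cache and only `Store` and `Stop` are intercepted. Dropping writes when the queue is full is a choice made for this sketch; the PR may handle a full queue differently.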
- mmap a large file, treat it as a series of 2KB buckets.
- Use FNV hash to place key and chunk in buckets.
- Use existing memcached tests.
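The bucket scheme can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: a plain byte slice stands in for the mmap'd file, the 32-bit FNV-1a variant is assumed (the PR only says "FNV"), and the in-bucket layout (two `uint16` length prefixes followed by key and value) as well as the names `diskcache`, `bucketFor`, `store`, and `fetch` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

const bucketSize = 2048 // 2KB buckets, as described above

// diskcache uses a plain byte slice as a stand-in for the mmap'd file.
type diskcache struct {
	buf     []byte
	buckets uint32
}

func newDiskcache(buckets uint32) *diskcache {
	return &diskcache{buf: make([]byte, int(buckets)*bucketSize), buckets: buckets}
}

// bucketFor hashes the key with 32-bit FNV-1a (an assumed variant) and
// returns the slice of the file covering that key's bucket.
func (d *diskcache) bucketFor(key string) []byte {
	h := fnv.New32a()
	h.Write([]byte(key)) // hash.Hash writes never return an error
	off := int(h.Sum32()%d.buckets) * bucketSize
	return d.buf[off : off+bucketSize]
}

// store writes [keyLen uint16][valLen uint16][key][value] into the bucket,
// overwriting whatever was there. It returns false if the entry cannot fit.
func (d *diskcache) store(key string, val []byte) bool {
	if key == "" || 4+len(key)+len(val) > bucketSize {
		return false
	}
	b := d.bucketFor(key)
	binary.LittleEndian.PutUint16(b[0:2], uint16(len(key)))
	binary.LittleEndian.PutUint16(b[2:4], uint16(len(val)))
	copy(b[4:], key)
	copy(b[4+len(key):], val)
	return true
}

// fetch returns a copy of the value only if the bucket holds exactly this key;
// an empty bucket or a bucket owned by a colliding key is a miss.
func (d *diskcache) fetch(key string) ([]byte, bool) {
	b := d.bucketFor(key)
	kl := int(binary.LittleEndian.Uint16(b[0:2]))
	vl := int(binary.LittleEndian.Uint16(b[2:4]))
	if kl == 0 || 4+kl+vl > bucketSize || string(b[4:4+kl]) != key {
		return nil, false
	}
	out := make([]byte, vl)
	copy(out, b[4+kl:4+kl+vl])
	return out, true
}

func main() {
	d := newDiskcache(1024)
	d.store("chunk-1", []byte("payload"))
	v, ok := d.fetch("chunk-1")
	fmt.Println(string(v), ok)
}
```

Storing the key next to the value lets a fetch detect that a colliding key has since taken over the bucket. In a scheme like this, collisions evict earlier entries and anything larger than a bucket cannot be stored at all, so the hit rate depends on the number of buckets and on the size of the entries.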
We've had this running for about a week without problems. The performance impact is hard to measure: we see the diskcache taking about 0.5ms vs 33ms for memcache (99th percentile fetch), but with a relatively low hit rate of 70% vs memcache, which is always at 100%. Given that this is disabled by default, I think the PR is worth it just for the refactoring.
I would be ok with merging after nits are fixed.
pkg/chunk/chunk_store.go
// ProcessCacheResponse decodes the chunks coming back from the cache, separating
// hits and misses.
func ProcessCacheResponse(chunks []Chunk, keys []string, bufs [][]byte) (found []Chunk, missing []Chunk, err error) {
	ctx := NewDecodeContext()
ctx jars: I would expect that to be a context.Context.
pkg/chunk/chunk_store.go
cacheCorrupt = prometheus.NewCounter(prometheus.CounterOpts{
	Namespace: "cortex",
	Name: "cache_corrupt_chunks_total",
	Help: "Total count of corrupt chunks found in memcache.",
nit: not memcache
pkg/chunk/cache/instrumented.go
fetchedKeys = prometheus.NewCounterVec(prometheus.CounterOpts{
	Namespace: "cortex",
	Name: "cache_fetched_keys",
	Help: "Total count of chunks requested from memcache.",
nit: not memcache
pkg/chunk/cache/instrumented.go
hits = prometheus.NewCounterVec(prometheus.CounterOpts{
	Namespace: "cortex",
	Name: "cache_hits",
	Help: "Total count of chunks found in memcache.",
nit: not memcache
Thanks @bboreham
An on-disk cache for caching chunks on local SSD in queriers. It uses a large mmap'd file, treated as a series of 2KB buckets, and an FNV hash to place each key and chunk in a bucket.
Also includes a bunch of refactoring of the memcache module; see commits.
Still a work in progress; will update with results.