
RFC: stop Store.Put() writing to memcached #611


Closed
bboreham opened this issue Nov 23, 2017 · 3 comments

Comments

@bboreham
Contributor

Every chunk written to the chunk store is also written to memcached.
Since we expect to write many times more chunks than we read, this activity is largely pointless.
It doesn't appear to take long, but I suspect it evicts data that has been read, and may be read again.

(Marked "RFC" because I don't have data to back up this idea)
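The two policies under discussion can be sketched roughly as follows (Python sketch with hypothetical names; the actual Cortex `Store` is Go and its API differs). A write-through store populates the cache on every `Put`; a read-through store populates it only when a chunk is actually fetched:

```python
# Sketch of write-through vs. read-through caching (hypothetical
# names, not the real Cortex API). `backend` stands in for the
# chunk store (e.g. DynamoDB/BigTable), `cache` for memcached.

class Store:
    def __init__(self, backend, cache, write_through):
        self.backend = backend
        self.cache = cache
        self.write_through = write_through

    def put(self, key, chunk):
        self.backend[key] = chunk
        if self.write_through:
            self.cache[key] = chunk  # current behaviour: every write hits the cache

    def get(self, key):
        if key in self.cache:
            return self.cache[key]   # cache hit
        chunk = self.backend[key]
        self.cache[key] = chunk      # proposed: populate only on read
        return chunk
```

With `write_through=False`, only chunks that are actually read compete for cache space, which is the behaviour this RFC proposes.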

@tomwilkie
Contributor

It's probably worth putting some numbers on it. Let's say:

  • we have 3GB of memcache, which should be able to store 3x10^6 chunks.
  • Chunks on average contain ~600 samples, so a single host at 500 samples/s would be flushing 2.5 chunks/s with 3x replication.
  • So 3GB of memcache is ~300 host-hours of samples.

My original aim for memcache was to store the last day or two of chunks; a 3GB memcache will hold two weeks for one host, or one day for 14 hosts. We see >>80% of queries hitting only the last hour, for which only the odd chunk has to be fetched, and we get pretty much a 100% hit rate for those.
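The back-of-envelope arithmetic above can be checked in a few lines (the ~1KB average chunk size is an assumption, implied by 3GB holding 3x10^6 chunks):

```python
# Back-of-envelope check of the figures above. The ~1 KB average
# chunk size is assumed (3 GB / 3e6 chunks).

cache_bytes = 3e9
chunk_bytes = 1e3                                   # assumed average chunk size
chunks_cached = cache_bytes / chunk_bytes           # 3e6 chunks

samples_per_chunk = 600
samples_per_sec = 500                               # one host
replication = 3
chunks_per_sec = samples_per_sec / samples_per_chunk * replication  # 2.5 chunks/s

host_hours = chunks_cached / chunks_per_sec / 3600  # ~333 host-hours
host_days = host_hours / 24                         # ~14 host-days

print(round(chunks_per_sec, 1), round(host_hours), round(host_days))
```

So 3GB of memcache works out to roughly 300 host-hours, i.e. about two weeks of chunks for a single host or about a day for 14 hosts, matching the figures quoted above.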

I'd say the cache was super important back when chunks were in S3 (which was slow), but now that they're in DynamoDB/BigTable it's less of an issue. OTOH, with the in-process caches from #685, which are only populated on reads, we see a much lower hit rate (60-70%).

So I'd suggest memcache is probably useful as-is, and if you're seeing lower hit rates we might need to increase its size to store the last day or so. I agree that for chunks in the distant past it's probably not going to be useful, and some extra, alternative caching schemes could help there.

The problem of heavy write load masking caching effectiveness for small reads was something ARC was supposed to solve (https://en.wikipedia.org/wiki/Adaptive_replacement_cache), but it's patent-encumbered. There are some patches floating around for ARC in memcached (https://groups.google.com/forum/#!topic/memcached/doid1sLL6BA)
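The eviction problem ARC targets can be demonstrated with a toy LRU cache: a burst of write traffic larger than the cache wipes out the read working set entirely (illustrative numbers, not from a real deployment):

```python
# Toy demonstration of heavy writes evicting read data from a
# plain LRU cache -- the problem ARC was designed to mitigate.
from collections import OrderedDict

class LRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)              # mark as most recently used
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)       # evict least recently used

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)
            return self.data[key]
        return None

cache = LRU(capacity=100)

# Populate a read working set that fits comfortably in the cache.
working_set = [f"read-{i}" for i in range(50)]
for k in working_set:
    cache.put(k, "chunk")

# A burst of write-through traffic larger than the cache...
for i in range(200):
    cache.put(f"write-{i}", "chunk")

# ...evicts the entire read working set.
hits = sum(1 for k in working_set if cache.get(k) is not None)
print(hits)
```

ARC avoids this by balancing recency against frequency, so entries that have actually been re-read are protected from a one-shot write flood.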

@bboreham
Contributor Author

Queries in the last hour will predominantly hit the ingester memory. Yes, we really need numbers on this.

should be able to store 3x10^6 chunks.

Are you forgetting metadata?

if you're seeing lower hit rates it might need to increase its size to store the last day or so

I think our mode of using monitoring data must be different: once I see something interesting, I need to look back a few days or weeks to see how it compares, and then the slow performance really sucks. The cache won't help the first time some data is read, but it's common to have viewers auto-refresh or to switch back and forth between views.

In our multi-tenant installation it's impractical to cache days of incoming data in RAM.

@bboreham
Contributor Author

The current set of cache flags allows this to be configured on or off on the command-line; I turned it off in the systems I work with. No need for code changes; closing.
