RFC: stop Store.Put() writing to memcached #611
Comments
It's probably worth putting some numbers on it - let's say:
My original aim for memcache was to store the last day or two of chunks, which a 3GB memcache will do: two weeks for one host, or one day for 14 hosts. We see >>80% of queries only hitting the last hour, for which only the odd chunk has to be fetched, and we get pretty much a 100% hit rate for this. I'd say the cache was super important back when chunks were in S3 (which was slow), but now they're in DynamoDB/BigTable it's less of an issue.

OTOH, with the in-process cache from #685, which is only populated on reads, we see a much lower hit rate (60-70%). So I'd suggest memcache is probably useful as is, and if you're seeing lower hit rates you might need to increase its size to store the last day or so. I agree that for chunks in the distant past it's probably not going to be useful, and some extra, alternate caching schemes could be useful there.

The problem of a heavy write load masking caching effectiveness for small reads is something ARC was supposed to solve (https://en.wikipedia.org/wiki/Adaptive_replacement_cache), but patents. There are some patches floating around for ARC in memcached (https://groups.google.com/forum/#!topic/memcached/doid1sLL6BA)
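For concreteness, here is a minimal Go sketch (hypothetical types and names, not Cortex's actual code) contrasting the two population strategies being compared here: a write-through cache that is filled on every Put, versus a cache that is only populated after a read miss, as with the in-process cache from #685.

```go
// Hypothetical sketch: Cache and Backend are illustrative interfaces,
// not Cortex's real types.
package cachesketch

type Cache interface {
	Get(key string) ([]byte, bool)
	Set(key string, value []byte)
}

type Backend interface {
	Write(key string, value []byte) error
	Read(key string) ([]byte, error)
}

// writeThroughPut fills the cache on every write. This is roughly what
// Store.Put() does today: under a heavy write load the cache contents are
// dominated by chunks nobody has asked for, which can evict read-hot entries.
func writeThroughPut(c Cache, b Backend, key string, value []byte) error {
	if err := b.Write(key, value); err != nil {
		return err
	}
	c.Set(key, value)
	return nil
}

// readPopulatedGet only fills the cache after a miss, so its contents track
// what is actually being read (the behaviour of the in-process cache from
// #685), at the cost of always missing on first access.
func readPopulatedGet(c Cache, b Backend, key string) ([]byte, error) {
	if v, ok := c.Get(key); ok {
		return v, nil
	}
	v, err := b.Read(key)
	if err != nil {
		return nil, err
	}
	c.Set(key, v)
	return v, nil
}
```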
Queries in the last hour will predominantly hit the ingester memory. Yes, we really need numbers on this.
Are you forgetting metadata?
I think our mode of use of monitoring data must be different: once I see something interesting I need to look back a few days or weeks to see how it compares, and then the slow performance really sucks. The cache won't help the first time some data is read, but it's common to have viewers auto-refresh or to switch back and forth between views. In our multi-tenant installation it's impractical to cache days of incoming data in RAM.
The current set of cache flags allows this to be configured on or off on the command line; I turned it off in the systems I work with. No need for code changes; closing.
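As a hedged illustration of that flag-based approach (the flag name and types below are invented for this sketch, not the real Cortex flags), the on/off switch could be wired up like this:

```go
// Hypothetical flag wiring; -store.cache-chunks-on-write is an invented
// name used only to illustrate a command-line on/off switch.
package storesketch

import "flag"

type CacheConfig struct {
	// CacheChunksOnWrite controls whether Put also writes each chunk to
	// memcached, or leaves the cache to be populated by reads instead.
	CacheChunksOnWrite bool
}

func (cfg *CacheConfig) RegisterFlags(f *flag.FlagSet) {
	f.BoolVar(&cfg.CacheChunksOnWrite, "store.cache-chunks-on-write", true,
		"Write chunks to the chunk cache as well as the chunk store on Put.")
}
```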
Every chunk written to the chunk store is also written to memcached.
Since we expect to write many times more chunks than we read, this activity is largely pointless.
It doesn't appear to take long, but I suspect it evicts data that has been read and may be read again.
(Marked "RFC" because I don't have data to back up this idea)
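To make the proposal concrete, here is a rough sketch of how Put could skip the cache write when so configured. The names are again invented for illustration and are not the actual Cortex chunk store types.

```go
// Hypothetical sketch of a Store whose Put only mirrors chunks into the
// cache when cacheOnWrite is set; all types here are illustrative.
package storesketch

type chunkStore interface {
	Write(key string, data []byte) error
}

type chunkCache interface {
	Set(key string, data []byte)
}

type Store struct {
	store        chunkStore
	cache        chunkCache
	cacheOnWrite bool // e.g. driven by a flag like the sketch above
}

// Put always writes the chunk to the backing store, and only writes it to
// the cache when cacheOnWrite is enabled.
func (s *Store) Put(key string, data []byte) error {
	if err := s.store.Write(key, data); err != nil {
		return err
	}
	if s.cacheOnWrite {
		s.cache.Set(key, data)
	}
	return nil
}
```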