Long-term index caching misses chunks on lookups #1698

gouthamve · 2019-09-25T15:19:49Z

In GetChunksForSeries, we do a RangeValueStart on bucket.from. I assumed that if bucket.hashKey is the same, bucket.from is also the same. It turns out that is not true.

bucket.from is the relative milliseconds from the actual bucket start time.

I did not realise this while building the long-term caching approach; I assumed that if the hashKey and RangeKey match, everything else matches:

cortex/pkg/chunk/schema_caching.go

Lines 99 to 100 in 78e0607

    
    // When deduping, the bucket values only influence TableName and HashValue
    // and just checking those is enough.

But when we split the ranges, we split the actual from and through into "cacheable" and "active" ranges here:

cortex/pkg/chunk/schema_caching.go

Line 66 in 78e0607

    
           cFrom, cThrough, from, through := splitTimesByCacheability(from, through, model.TimeFromUnix(mtime.Now().Add(-s.cacheOlderThan).Unix()))

and end up picking only the "active" range query on merge. This means bucket.from is higher than it should be, so we filter out chunks that should have matched. This causes gaps in query results: entire chunks get dropped on the floor.
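To make the mismatch concrete, here is a minimal sketch in Go. It is not the Cortex schema code: the `bucket` struct, the `dailyBuckets` helper, and the `user:d<day>` hash key format are simplified assumptions made up for illustration. It only shows that two halves of a split query can land in the same bucket (same hash key) while carrying different relative `from` values.

```go
package main

import "fmt"

const msPerDay = int64(24 * 60 * 60 * 1000)

// bucket is a simplified, hypothetical stand-in for the schema's bucket type.
type bucket struct {
	hashKey string // identifies the bucket, e.g. "user:d18164"
	from    int64  // query start, in ms relative to the bucket start
	through int64  // query end, in ms relative to the bucket start
}

// dailyBuckets splits [from, through] (ms since epoch) into per-day buckets.
// Hypothetical helper; the real schema logic is more involved.
func dailyBuckets(userID string, from, through int64) []bucket {
	var out []bucket
	for day := from / msPerDay; day <= through/msPerDay; day++ {
		start := day * msPerDay
		relFrom := int64(0)
		if from > start {
			relFrom = from - start
		}
		relThrough := msPerDay
		if through < start+msPerDay {
			relThrough = through - start
		}
		out = append(out, bucket{
			hashKey: fmt.Sprintf("%s:d%d", userID, day),
			from:    relFrom,
			through: relThrough,
		})
	}
	return out
}

func main() {
	day := int64(18164) // arbitrary day number
	from := day*msPerDay + 1*3600*1000
	split := day*msPerDay + 13*3600*1000 // assumed cacheability boundary
	through := day*msPerDay + 20*3600*1000

	cacheable := dailyBuckets("user", from, split)
	active := dailyBuckets("user", split, through)

	// Same hash key, different relative from.
	fmt.Println(cacheable[0].hashKey == active[0].hashKey)
	fmt.Println(cacheable[0].from, active[0].from)
}
```

If a dedupe step compares only the hash key and keeps the "active" entry, the surviving `from` is the later one, and anything between the two offsets is never looked up.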


An attempt to fix cortexproject#1698. We don't mix things when the time-range for the query overlaps the "active" time-range; in that case we consider all index entries as active. This is because the `IndexQuery` fields depend on the `from` value and changing it might mess things up. This is only really effective when paired with the query-frontend: most queries issued fall in the active range, but the query-frontend with its splitting makes sure the queriers actually see some queries that are entirely in the non-active range. Signed-off-by: Goutham Veeramachaneni <[email protected]>


gouthamve added component/querier type/bug labels Sep 25, 2019

gouthamve mentioned this issue Sep 25, 2019

Simplify long-term caching #1699

Merged

bboreham closed this as completed in #1699 Oct 17, 2019

gouthamve added a commit to gouthamve/cortex that referenced this issue Oct 18, 2019

Remove log line added for debugging cortexproject#1698

bc4a1d3

Signed-off-by: Goutham Veeramachaneni <[email protected]>

gouthamve added a commit that referenced this issue Oct 18, 2019

Remove log line added for debugging #1698 (#1743)

da442a0

Signed-off-by: Goutham Veeramachaneni <[email protected]>
