Don't convert chunks to matrixes and then merge them... #713
Conversation
Force-pushed from 723cfc7 to 71707d3
Having deployed this to our dev env I didn't see the kind of speed-up I was expecting. On investigation, the code is very sensitive to "chunk density" - how many chunks there are per 48 hours. I originally modelled it at 48 (one per hour, each 3 hours long), but modelling it as one every 6 mins more closely matches the performance I see in dev. The good news is it's still an improvement, just not as much as before:
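(For illustration, here's a minimal, self-contained sketch of what "chunk density" means for the workload - this is not the actual Cortex benchmark, and all names in it are hypothetical. The same 48h window holds 48 overlapping 3h chunks at one-per-hour, but 480 at one-every-6-minutes.)

```go
package main

import (
	"fmt"
	"time"
)

// chunkDesc is a stand-in for a real chunk descriptor: just a time range.
type chunkDesc struct {
	from, through time.Time
}

// syntheticChunks lays out overlapping 3h chunks, one starting every
// `spacing`, across the 48h window ending at `now`.
func syntheticChunks(now time.Time, spacing time.Duration) []chunkDesc {
	const window = 48 * time.Hour
	const chunkLen = 3 * time.Hour
	var chunks []chunkDesc
	for t := now.Add(-window); t.Before(now); t = t.Add(spacing) {
		chunks = append(chunks, chunkDesc{from: t, through: t.Add(chunkLen)})
	}
	return chunks
}

func main() {
	now := time.Now()
	fmt.Println(len(syntheticChunks(now, time.Hour)))     // 48 chunks over 48h
	fmt.Println(len(syntheticChunks(now, 6*time.Minute))) // 480 chunks: 10x denser
}
```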
We're seeing some more issues with this in testing. Worth reviewing, but don't merge yet.
Force-pushed from c00829b to 1b4861c
I've resurrected this and fixed the correctness issues (by expanding the tests). Current performance:
Old path (matrixes) is faster, but uses 60x more memory. Going to see if I can improve this a bit.
Force-pushed from 3bcb1d9 to 008a714
Found an optimisation in the ChunksToMatrix function too, so results are now:
I suspect it might make sense to pick the strategy (iterate or merge) based on the length of the query (or the number of chunks in the series).
Needed to update the vendored Prometheus for prometheus/prometheus@78efdc6d
- Refactor chunk store to return chunks.
- Refactor querier package to use new Prometheus 2.0 interfaces.
- Have separate querier for ingesters, chunk store and metadata.
- Make remote read handler take a Queryable.

Signed-off-by: Tom Wilkie <[email protected]>
… PromQL query. Signed-off-by: Tom Wilkie <[email protected]>
…and the upstream heap-based merging code. Signed-off-by: Tom Wilkie <[email protected]>
Signed-off-by: Tom Wilkie <[email protected]>
…ate NaN values Signed-off-by: Tom Wilkie <[email protected]>
Signed-off-by: Tom Wilkie <[email protected]>
pkg/chunk/chunk.go
  }
- sp.LogFields(otlog.Int("sample streams", len(sampleStreams)))
+ sp.LogFields(otlog.Int("sample streams", len(samples)))
The log name should probably be samples as well
I think the variable name is misleading - it's a map from fingerprint to a list of lists of samples (so we can do a proper tree-based merge later). Have updated the variable name to samplesBySeries and changed the log key to series.
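(For readers following along, here's a hedged sketch of the data shape and the tree-based merge being described. Only model.Fingerprint and model.SamplePair are real Prometheus types; the package and function names are hypothetical, not the actual Cortex code.)

```go
package samplemerge

import "github.com/prometheus/common/model"

// samplesBySeries maps a series fingerprint to several already-sorted
// sample slices (one per source), so they can be merged pairwise in a
// tree rather than in one long linear pass.
type samplesBySeries map[model.Fingerprint][][]model.SamplePair

// mergeTree merges k sorted slices in O(n log k) by recursively
// splitting the set of inputs in half.
func mergeTree(lists [][]model.SamplePair) []model.SamplePair {
	switch len(lists) {
	case 0:
		return nil
	case 1:
		return lists[0]
	}
	mid := len(lists) / 2
	return mergePair(mergeTree(lists[:mid]), mergeTree(lists[mid:]))
}

// mergePair merges two sorted slices, deduplicating samples that share
// a timestamp.
func mergePair(a, b []model.SamplePair) []model.SamplePair {
	out := make([]model.SamplePair, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i].Timestamp < b[j].Timestamp:
			out = append(out, a[i])
			i++
		case a[i].Timestamp > b[j].Timestamp:
			out = append(out, b[j])
			j++
		default: // equal timestamps: keep one copy
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}
```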
One small comment, otherwise LGTM.
Are you thinking of, at some point, using the matrix path for queries with a small number of chunks for speed, and the iterator path for a large number of chunks to not OOM?
Signed-off-by: Tom Wilkie <[email protected]>
Right now I'm just trying to stop the querier OOMing on high cardinality / long queries / lots of simultaneous queries. But yeah, once I've got that working it probably makes a lot of sense to switch based on the number of chunks per series - good idea!
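(As a hypothetical illustration of that heuristic - none of this is code from the PR; the types, placeholder constructors, and threshold are all made up:)

```go
package querystrategy

// Chunk and SeriesIterator stand in for the real chunk and Prometheus
// storage iterator types.
type Chunk struct{}

type SeriesIterator interface {
	Seek(t int64) bool
	Next() bool
}

// chunkThreshold is an illustrative cut-off, not a tuned value.
const chunkThreshold = 100

// Placeholders for the two real code paths.
func newMatrixIterator(chunks []Chunk) SeriesIterator { return nil }
func newMergeIterator(chunks []Chunk) SeriesIterator  { return nil }

// pickIterator decodes everything into a matrix when a series has few
// chunks (fast, and memory stays bounded), and streams via merging
// iterators when it has many (avoids OOM on heavy queries).
func pickIterator(chunks []Chunk) SeriesIterator {
	if len(chunks) <= chunkThreshold {
		return newMatrixIterator(chunks)
	}
	return newMergeIterator(chunks)
}
```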
...use iterators instead.
Also:
The big drawback of this approach is the optimisation necessary around Seek: as we don't know the end time for any particular seek'd iterator, we aren't technically able to filter out chunks in the future. This has a large performance impact (basically O(n^2)), as seeks at the very beginning of the range have to include all chunks. To avoid this I've assumed an end time of 48hrs, meaning vector queries with a duration larger than 24hrs won't work correctly (ie sum(rate(foo[48h]))). But OTOH this allows code closer to O(n log n). A sketch of the trade-off is below.
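(This is a hedged, self-contained sketch with hypothetical names, not the PR's code, showing how an assumed maximum query window lets a seek drop chunks that lie too far in the future:)

```go
package seekwindow

import "time"

// Without knowing the query's end time, Seek(t) cannot discard chunks
// lying entirely after the point being sought, so every seek near the
// start of the range scans all chunks (O(n^2) overall). Assuming no
// query spans more than maxLookahead bounds the scan, at the cost of
// breaking range selectors longer than that window.
const maxLookahead = 48 * time.Hour

// chunkDesc is a stand-in for a real chunk descriptor: just a time range.
type chunkDesc struct {
	from, through time.Time
}

// relevantChunks keeps only chunks that can overlap [t, t+maxLookahead].
func relevantChunks(chunks []chunkDesc, t time.Time) []chunkDesc {
	end := t.Add(maxLookahead)
	var out []chunkDesc
	for _, c := range chunks {
		if c.through.Before(t) || c.from.After(end) {
			continue
		}
		out = append(out, c)
	}
	return out
}
```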
Before vs after:
That's about an 8.2x speed-up in ns/op and 62x less memory usage.
Part of #209, closes #81 as redundant.