Store chunks in DynamoDB #418
Conversation
lgtm - @jml could you take a look as well please?
waiting for some tests as well ^
pkg/chunk/storage_client.go
Outdated
@@ -19,8 +19,8 @@ type StorageClient interface {
	QueryPages(ctx context.Context, query IndexQuery, callback func(result ReadBatch, lastPage bool) (shouldContinue bool)) error

	// For storing and retrieving chunks.
	PutChunk(ctx context.Context, key string, data []byte) error
	GetChunk(ctx context.Context, key string) ([]byte, error)
	PutChunks(ctx context.Context, chunks []Chunk, keys []string, data [][]byte) error
What are the constraints on how chunks, keys, and data relate to each other? Can we pick better types (e.g. map[string]Chunk) to make this less easy to mess up?
The lists must be the same length, and the ordering within them must be consistent.
You could probably consider this a micro-optimisation, as the chunk can generate the key and the buffer. I'll see if I can factor it out.
Have managed to tidy this up.
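For illustration, a hedged sketch of where that tidy-up plausibly lands, assuming each Chunk can derive its own key and serialised buffer (a sketch, not the merged code):

```go
// Once the key and buffer come from the Chunk itself, the three parallel
// slices collapse into one, so callers can no longer mismatch them.
type StorageClient interface {
	QueryPages(ctx context.Context, query IndexQuery, callback func(result ReadBatch, lastPage bool) (shouldContinue bool)) error

	// For storing and retrieving chunks.
	PutChunks(ctx context.Context, chunks []Chunk) error
	GetChunks(ctx context.Context, chunks []Chunk) ([]Chunk, error)
}
```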
pkg/chunk/schema.go
Outdated
@@ -49,6 +49,9 @@ type IndexQuery struct {
	RangeValuePrefix []byte
	RangeValueStart []byte

	// Used when fetching chunks
Better to say why we need it when fetching chunks.
Old code I should have removed, sorry!
pkg/chunk/storage_client.go
Outdated
	PutChunk(ctx context.Context, key string, data []byte) error
	GetChunk(ctx context.Context, key string) ([]byte, error)
	PutChunks(ctx context.Context, chunks []Chunk, keys []string, data [][]byte) error
	GetChunks(ctx context.Context, chunks []Chunk) ([]Chunk, error)
I don't understand this interface. You already have chunks (chunks []Chunk) and this returns the same thing. Why are the inputs & outputs the same types?
The chunks you're passing in are "empty", in that they just describe what to fetch.
I could separate out the parsed chunk ID (ChunkDescriptor) from the chunk itself, and embed one in the other. Then it could take a ChunkDescriptor and return a Chunk. WDYT?
SGTM
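To make the ChunkDescriptor idea concrete, a hypothetical sketch (field names invented for illustration, not taken from the PR; model is github.com/prometheus/common/model):

```go
import "github.com/prometheus/common/model"

// ChunkDescriptor is the parsed chunk ID: everything needed to say
// *what* to fetch, and nothing else.
type ChunkDescriptor struct {
	UserID      string
	Fingerprint model.Fingerprint
	From        model.Time
	Through     model.Time
}

// Chunk embeds its descriptor next to the payload a fetch fills in.
type Chunk struct {
	ChunkDescriptor
	Data []byte
}

// GetChunks would then read naturally as:
//   GetChunks(ctx context.Context, descriptors []ChunkDescriptor) ([]Chunk, error)
```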
f.Int64Var(&cfg.ChunkTableProvisionedWriteThroughput, "dynamodb.chunk-table.write-throughput", 3000, "DynamoDB chunk tables write throughput")
f.Int64Var(&cfg.ChunkTableProvisionedReadThroughput, "dynamodb.chunk-table.read-throughput", 300, "DynamoDB chunk tables read throughput")
f.Int64Var(&cfg.ChunkTableInactiveWriteThroughput, "dynamodb.chunk-table.inactive-write-throughput", 1, "DynamoDB chunk tables write throughput for inactive tables.")
f.Int64Var(&cfg.ChunkTableInactiveReadThroughput, "dynamodb.chunk-table.inactive-read-throughput", 300, "DynamoDB chunk tables read throughput for inactive tables")
Shouldn't these flags be on PeriodicChunkTableConfig? Better yet, could these be implemented in such a way that we don't have duplication with the periodic table config?
> Shouldn't these flags be on PeriodicChunkTableConfig?

I followed the pattern we used for PeriodicTableConfig, where only the flags that need to be shared are actually put in the shared struct.

> Better yet, could these be implemented in such a way that we don't have duplication with the periodic table config?

Eventually I want Cortex to self-tune its provisioned throughput, but for now the chunk table will need different throughput levels from the other tables.
Fair enough.
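For reference, a minimal sketch of the pattern being described, with struct and flag names assumed rather than taken from the PR: shared flags go on the shared struct, while deployment-specific throughput knobs stay on the enclosing config.

```go
// Hypothetical shared struct: only flags common to all periodic tables.
func (cfg *PeriodicChunkTableConfig) RegisterFlags(f *flag.FlagSet) {
	f.StringVar(&cfg.ChunkTablePrefix, "dynamodb.chunk-table.prefix", "cortex_chunks_", "Prefix for periodic chunk tables.")
}

// Enclosing config: throughput levels differ per table type, so they
// live here, as in the diff above.
func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
	cfg.PeriodicChunkTableConfig.RegisterFlags(f)
	f.Int64Var(&cfg.ChunkTableProvisionedWriteThroughput, "dynamodb.chunk-table.write-throughput", 3000, "DynamoDB chunk tables write throughput")
	// ...remaining chunk-table throughput flags as in the diff above...
}
```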
dynamoDBChunks, err = a.getDynamoDBChunks(ctx, dynamoDBChunks)
if err != nil {
	return nil, err
}
Any particular reason to do these sequentially?
I don't expect us to be doing both simultaneously except when we migrate, and then it will only occur for a couple of hours. So I didn't think it was worth the extra code to parallelise. Will add a comment to this effect.
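For contrast, a rough sketch of the parallel version that was deliberately skipped; getS3Chunks is assumed by symmetry with getDynamoDBChunks, and this is a fragment of the surrounding method, not self-contained code:

```go
// Fetch from S3 and DynamoDB concurrently. Not worth it today, since
// both stores are only populated simultaneously during migration.
var (
	wg    sync.WaitGroup
	s3Err error
)
wg.Add(1)
go func() {
	defer wg.Done()
	s3Chunks, s3Err = a.getS3Chunks(ctx, s3Chunks)
}()
dynamoDBChunks, err = a.getDynamoDBChunks(ctx, dynamoDBChunks)
wg.Wait()
if s3Err != nil {
	return nil, s3Err
}
if err != nil {
	return nil, err
}
```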
pkg/chunk/aws_storage_client.go
Outdated
// All other errors are fatal.
if err != nil {
	return result, err
}
Wouldn't it make more sense to push these to immediately after we get the error, just after we do the "record dynamodb error" bit? As it is, it's disconnected from that logic, and it's not clear whether it matters that this happens after turning the responses into chunks.
Sure, yeah, the logic is actually a little more subtle. Have restructured.
pkg/chunk/aws_storage_client.go
Outdated
func (b dynamoDBReadBatch) Len() int {
	return len(b)

// Fill 'to' with WriteRequests from 'from' until 'to' has at most max requests. Remove those requests from 'from'.
func (b dynamoDBWriteBatch) takeReqs(from dynamoDBWriteBatch, max int) {
'to' isn't defined.
Also, why is this private when Len and Add are public?
Made them all public.
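For readers following the thread, a reconstruction of what takeReqs does, with the receiver spelled 'b' so the doc comment matches; the logic is inferred from the fragments quoted here, not copied from the PR (uses "github.com/aws/aws-sdk-go/service/dynamodb"):

```go
type dynamoDBWriteBatch map[string][]*dynamodb.WriteRequest

// Len counts the requests across all tables in the batch.
func (b dynamoDBWriteBatch) Len() int {
	result := 0
	for _, reqs := range b {
		result += len(reqs)
	}
	return result
}

// takeReqs fills 'b' with WriteRequests from 'from' until 'b' has at
// most max requests, removing the taken requests from 'from'.
func (b dynamoDBWriteBatch) takeReqs(from dynamoDBWriteBatch, max int) {
	toFill := from.Len()
	if max > 0 && max-b.Len() < toFill {
		toFill = max - b.Len()
	}
	for tableName, fromReqs := range from {
		taken := len(fromReqs)
		if taken > toFill {
			taken = toFill
		}
		if taken > 0 {
			b[tableName] = append(b[tableName], fromReqs[:taken]...)
			from[tableName] = fromReqs[taken:]
			toFill -= taken
		}
	}
}
```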
pkg/chunk/aws_storage_client.go
Outdated
}

// Fill 'to' with WriteRequests from 'from' until 'to' has at most max requests. Remove those requests from 'from'.
func (b dynamoDBReadRequest) takeReqs(from dynamoDBReadRequest, max int) {
Ditto comment on the WriteRequest version.
		from[tableName].Keys = fromReqs.Keys[taken:]
		toFill -= taken
	}
}
I think you'll have fewer bugs if you make this (and the equivalent write version) immutable, i.e. have it return a new request populated from 'from' with 'max' or more, without changing 'b'.
The challenge is that this needs to be done twice: first picking requests from one list, then from another.
Hmm, I see.

take maxRequests (mappend from1 from2)

I think it's probably OK as-is, but if you wanted to pursue immutability, you could make 'max' the first parameter and accept a variadic number of requests.
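A hedged sketch of that immutable shape, purely to illustrate the suggestion (not what the PR does):

```go
// takeReqs returns a new batch with up to max requests drawn from the
// sources in order. The sources are left untouched, which is exactly
// the wrinkle raised below: the caller must still learn what was taken.
func takeReqs(max int, froms ...dynamoDBWriteBatch) dynamoDBWriteBatch {
	to := dynamoDBWriteBatch{}
	remaining := max
	for _, from := range froms {
		for tableName, reqs := range from {
			if remaining <= 0 {
				return to
			}
			n := len(reqs)
			if n > remaining {
				n = remaining
			}
			to[tableName] = append(to[tableName], reqs[:n]...)
			remaining -= n
		}
	}
	return to
}
```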
}
return chunkValue.B
This is the same as takeReqs on dynamoDBWriteBatch, right? Is there a way to avoid this duplication, perhaps using an interface?
Yeah, I tried. Will have another look.
Nah, can't find a nice way to unify these two. Open to suggestions though!
We're going to need a bigger type system.
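For what it's worth, Go 1.18 generics (which arrived years after this PR) would cover most of it, assuming the read side were first reshaped from map[string]*dynamodb.KeysAndAttributes into a map-of-slices like the write side; a rough sketch only:

```go
// One takeReqs over any element type. dynamoDBWriteBatch already has
// the shape map[string][]T; dynamoDBReadRequest would need reshaping
// into a map of key slices first.
func takeReqs[T any](to, from map[string][]T, max int) {
	have, avail := 0, 0
	for _, reqs := range to {
		have += len(reqs)
	}
	for _, reqs := range from {
		avail += len(reqs)
	}
	toFill := avail
	if max > 0 && max-have < toFill {
		toFill = max - have
	}
	for tableName, reqs := range from {
		taken := len(reqs)
		if taken > toFill {
			taken = toFill
		}
		if taken > 0 {
			to[tableName] = append(to[tableName], reqs[:taken]...)
			from[tableName] = reqs[taken:]
			toFill -= taken
		}
	}
}
```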
@jml thank you for the high-quality feedback! I think I've addressed most of it, but there are a few open questions. PTAL?
Probably the last round. Haven't looked at the tests, sorry.
pkg/chunk/aws_storage_client.go
Outdated
}

if err != nil {
	for tableName := range outstanding {
Shouldn't this be 'requests'? It's entirely possible that:
a) not all the tables in 'outstanding' got sent to dynamodb
b) some of the tables in 'unprocessed' did
Yep, good catch.
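That is, a hedged sketch of the fix, with the error-recording helper assumed rather than taken from the PR:

```go
if err != nil {
	// Back off everything attempted in this round: on a batch error,
	// some tables in 'outstanding' may not have been sent at all, and
	// some previously 'unprocessed' tables may have been.
	for tableName := range requests {
		recordDynamoError(tableName, err) // assumed helper
	}
}
```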
pkg/chunk/aws_storage_client.go
Outdated
// If we get provisionedThroughputExceededException, then no items were processed,
// so back off and retry all.
if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == provisionedThroughputExceededException {
I don't see how this code is reachable. 'err' is last set by processChunkResponse, which doesn't do AWS stuff.
I think it makes more sense to put it before processChunkResponse, since according to the comment, no items were processed anyway and we're just going to retry them all.
At that point, it might as well go inside the 'if err != nil' block that's starting on what's currently line 463.
+1
Thanks. Comment + code change makes the logic easier to follow too.
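A sketch of the reordered flow inside the retry loop, assuming the AWS call looks roughly like this; recordDynamoError and nextBackoff are assumed helper names, and result/backoff/continue belong to the enclosing loop:

```go
resp, err := a.DynamoDB.BatchGetItemWithContext(ctx, input)
if err != nil {
	for tableName := range requests {
		recordDynamoError(tableName, err) // record right where it happens
	}

	// If we get provisionedThroughputExceededException, then no items
	// were processed, so back off and retry them all.
	if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == provisionedThroughputExceededException {
		time.Sleep(backoff)
		backoff = nextBackoff(backoff)
		continue
	}

	// All other errors are fatal.
	return result, err
}

// Only now turn the responses we actually got into chunks.
```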
		from[tableName].Keys = fromReqs.Keys[taken:]
		toFill -= taken
	}
}
> Hmm, I see.
>
> take maxRequests (mappend from1 from2)
>
> I think it's probably OK as-is, but if you wanted to pursue immutability, you could make 'max' the first parameter and accept a variadic number of requests.

Right, but then you'd also need to have it return the new, immutable input maps too. Let's face it, it's just not pretty in Go.
pkg/chunk/aws_storage_client.go
Outdated
@@ -427,14 +427,18 @@ func (a awsStorageClient) getS3Chunk(ctx context.Context, chunk Chunk) (Chunk, e
	return chunk, nil
}

// As we're resuing the DynamoDB schema from the index for the chunk tables, |
re-using
Part #141