Fixed Cache fetch error on Redis Cluster #4056

Merged
merged 5 commits into from
May 27, 2021

Conversation

kamijin-fanta
Contributor

@kamijin-fanta kamijin-fanta commented Apr 8, 2021

What this PR does:
Redis Cluster may not support MGet. In that case, fall back to multiple Get.

Which issue(s) this PR fixes:
Fixes #4053

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Contributor

@stevesg stevesg left a comment

Looks good, just one question. Looks like a useful fix for people using Redis.

for i, val := range cmd.Val() {
if val != nil {
ret[i] = StringToBytes(val.(string))
_, isCluster := c.rdb.(*redis.ClusterClient)
Contributor

Just a question to better understand the fix (I'm not an expert on Redis or the Go bindings).

This is working around an issue/limitation in the bindings, correct? Or a limitation in a particular server version? I only ask because my first reaction to this is that checking the type of an interface feels wrong. If it's the only option, then we'll have to live with it of course.

Either way, it might be nice to put a comment here to explain why we have to check the type.

Contributor Author

Thank you for the review!

I thought the same thing, but I think this method is the simplest. The Redis client is initialized with NewUniversalClient:

https://github.com/cortexproject/cortex/blob/v1.8.0/pkg/chunk/cache/redis_client.go#L70

This is the implementation.

https://github.com/go-redis/redis/blob/v8.8.0/universal.go#L199-L206

UniversalClient can wrap either a Client or a ClusterClient. If it is a Client, the setup is a single node or Sentinel, and MGET is always supported. If it is a ClusterClient, the setup is Redis Cluster, and MGET may not be supported.

The only way to check if the Universal Client is Redis Cluster is by type assertion.
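
A minimal Go sketch of the pattern described in this thread, under stated assumptions: `universalClient`, `singleClient`, `clusterClient`, and `errMGetUnsupported` are hypothetical stand-ins for the go-redis types, with simplified signatures (no context, no command objects), so the example runs with the standard library only. It shows a type assertion choosing per-key `Get` calls for the cluster type and a single `MGet` otherwise. It is not the code merged in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errMGetUnsupported is a hypothetical error used by the cluster stand-in.
var errMGetUnsupported = errors.New("mget not supported by this cluster stand-in")

// universalClient is a hypothetical, simplified stand-in for redis.UniversalClient.
type universalClient interface {
	Get(key string) (val string, found bool)
	MGet(keys ...string) ([]interface{}, error)
}

// singleClient stands in for a single-node (or Sentinel) client.
type singleClient struct{ data map[string]string }

func (c *singleClient) Get(key string) (string, bool) {
	v, ok := c.data[key]
	return v, ok
}

func (c *singleClient) MGet(keys ...string) ([]interface{}, error) {
	out := make([]interface{}, len(keys))
	for i, k := range keys {
		if v, ok := c.data[k]; ok {
			out[i] = v // missing keys stay nil
		}
	}
	return out, nil
}

// clusterClient stands in for a cluster client whose MGet always fails here.
type clusterClient struct{ singleClient }

func (c *clusterClient) MGet(keys ...string) ([]interface{}, error) {
	return nil, errMGetUnsupported
}

// fetch returns one entry per key; entries for missing keys are nil.
func fetch(rdb universalClient, keys []string) ([][]byte, error) {
	ret := make([][]byte, len(keys))
	// The interface does not expose the deployment mode, so the concrete
	// type is checked and the cluster case falls back to one Get per key.
	if _, isCluster := rdb.(*clusterClient); isCluster {
		for i, k := range keys {
			if v, ok := rdb.Get(k); ok {
				ret[i] = []byte(v)
			}
		}
		return ret, nil
	}
	vals, err := rdb.MGet(keys...)
	if err != nil {
		return nil, err
	}
	for i, v := range vals {
		if v != nil {
			ret[i] = []byte(v.(string))
		}
	}
	return ret, nil
}

func main() {
	data := map[string]string{"a": "1"}
	keys := []string{"a", "missing"}
	clients := []universalClient{
		&singleClient{data: data},
		&clusterClient{singleClient: singleClient{data: data}},
	}
	for _, rdb := range clients {
		ret, err := fetch(rdb, keys)
		fmt.Printf("%T: %q %v\n", rdb, ret, err)
	}
}
```

The trade-off is the one raised in the review: the fallback costs one round trip per key, and the type assertion ties the caller to a concrete client type, which is why an explanatory comment at the assertion was requested.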

Contributor Author

Added a comment.

Contributor

@stevesg stevesg left a comment

LGTM

Contributor

@pstibrany pstibrany left a comment

Thank you. I don't know much about redis client, but given the comment, it looks reasonable.

Comment on lines 122 to 130
cmd := c.rdb.Get(ctx, key)
if err := cmd.Err(); err != nil {
	return nil, err
}
ret[i] = StringToBytes(cmd.Val())
Contributor

If a key doesn't exist, then ret[i] is expected to be nil. I don't think it's the case here.

Contributor Author

@pracucci thanks for reviewing! c.rdb.MGet(...).Val() returns []interface{}, but c.rdb.Get(...).Val() returns string. If the value is not found, it will return an empty string.

Contributor

If the value is not found, it will return an empty string.

I think this breaks the behaviour (see my previous comment above). Having a unit test to prove or disprove it would be great.

Contributor Author

understood. Thank you!

Contributor Author

Processed the empty values and added the tests. The tests confirm that the behavior is the same for Single and Cluster configurations of Redis.
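
One way the missing-key handling discussed in this thread could look, as a hedged Go sketch rather than the merged change: `errNil` is a hypothetical stand-in for the go-redis `redis.Nil` sentinel, and `getter` and `getEach` are illustrative names that do not come from the PR. A not-found result leaves the slot nil, while any other error aborts the fetch.

```go
package main

import (
	"errors"
	"fmt"
)

// errNil is a hypothetical stand-in for the go-redis redis.Nil sentinel,
// which signals that a key does not exist.
var errNil = errors.New("redis: nil")

// getter is a hypothetical, reduced view of a client's Get call.
type getter func(key string) (string, error)

// getEach issues one Get per key. A missing key leaves its slot nil so that
// callers can tell "not cached" apart from a stored value.
func getEach(get getter, keys []string) ([][]byte, error) {
	ret := make([][]byte, len(keys))
	for i, key := range keys {
		val, err := get(key)
		if errors.Is(err, errNil) {
			continue // key not found: keep ret[i] == nil
		}
		if err != nil {
			return nil, err // any other error fails the whole fetch
		}
		ret[i] = []byte(val)
	}
	return ret, nil
}

func main() {
	store := map[string]string{"a": "1"}
	get := func(key string) (string, error) {
		v, ok := store[key]
		if !ok {
			return "", errNil
		}
		return v, nil
	}
	ret, err := getEach(get, []string{"a", "missing"})
	fmt.Println(string(ret[0]), ret[1] == nil, err)
}
```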

@pracucci
Contributor

@kamijin-fanta 👋 Would you have some time to take a look at my last comment, please? Thanks!

@kamijin-fanta kamijin-fanta force-pushed the redis-cluster branch 2 times, most recently from cb7c71e to 7d2ffa3 Compare April 28, 2021 00:41
@gouthamve
Contributor

Can you please rebase against master to pull in #4137 and move the changelog entry to the top?

Contributor

@pracucci pracucci left a comment

Thanks a lot for addressing my feedback! The logic LGTM. I've left a couple of last comments and then we're good to go!

CHANGELOG.md Outdated
@@ -73,6 +73,7 @@
* [BUGFIX] Frontend, Query-scheduler: allow querier to notify about shutdown without providing any authentication. #4066
* [BUGFIX] Querier: fixed race condition causing queries to fail right after querier startup with the "empty ring" error. #4068
* [BUGFIX] Compactor: Increment `cortex_compactor_runs_failed_total` if compactor failed compact a single tenant. #4094
* [BUGFIX] Fixed cache fetch error on Redis Cluster. #4056
Contributor

Please rebase on master and move this CHANGELOG entry to the top. We've cut the 1.9 release in the meantime. Thanks!

Contributor

@kamijin-fanta This comment is still valid. Could you address it, please?


clients := []*RedisClient{single, cluster}
for i, c := range clients {
meg := []string{"run on single redis client", "run on cluster redis client"}[i]
Contributor

It's a little convoluted. Instead of passing meg (I guess you meant msg) to each assertion function, you can run the test within a t.Run(description, func(t *testing.T) { /* your test logic here */ }) and pass the description only to t.Run().

Following this example, a common pattern is to use table tests to define the test cases. As an example, here is a randomly picked test where we use table testing:

func TestSchemaConfig_Validate(t *testing.T) {
t.Parallel()
tests := map[string]struct {
config *SchemaConfig
expected *SchemaConfig
err error

Contributor Author

Thanks. Fixed: the name and client for each case are now assigned in a slice.
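
The `t.Run` suggestion in this thread can be sketched roughly as follows. This is an illustration under assumptions, not the test from the PR: `lookup`, `newLookup`, `clientCase`, and `clientCases` are hypothetical names, and both "clients" are backed by the same in-memory map so the file builds with the standard library only. In a real repository the test function would live in a `_test.go` file and run via `go test`; `main` is included only so the sketch also compiles as a standalone program.

```go
package main

import (
	"fmt"
	"testing"
)

// lookup is a hypothetical stand-in for a cache client; found reports a hit.
type lookup func(key string) (val string, found bool)

// newLookup builds a lookup over fixed data.
func newLookup(data map[string]string) lookup {
	return func(key string) (string, bool) {
		v, ok := data[key]
		return v, ok
	}
}

// clientCase pairs a subtest name with the client it exercises.
type clientCase struct {
	name   string
	client lookup
}

// clientCases returns the table of clients that share the same assertions.
func clientCases() []clientCase {
	data := map[string]string{"found": "value"}
	return []clientCase{
		{name: "single redis client", client: newLookup(data)},
		{name: "cluster redis client", client: newLookup(data)},
	}
}

// TestLookup runs identical assertions once per client. The description is
// passed only to t.Run, so failures are reported under the subtest name.
func TestLookup(t *testing.T) {
	for _, tc := range clientCases() {
		tc := tc // capture the loop variable for the closure
		t.Run(tc.name, func(t *testing.T) {
			if v, ok := tc.client("found"); !ok || v != "value" {
				t.Errorf("found key: got (%q, %v)", v, ok)
			}
			if _, ok := tc.client("missing"); ok {
				t.Error("missing key reported as found")
			}
		})
	}
}

func main() {
	for _, tc := range clientCases() {
		fmt.Println(tc.name)
	}
}
```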

Signed-off-by: kamijin_fanta <[email protected]>
Signed-off-by: kamijin_fanta <[email protected]>
Contributor

@pracucci pracucci left a comment

Thanks @kamijin-fanta! LGTM. Please address the comment in the CHANGELOG and then we can merge. Thanks again!

@pracucci pracucci enabled auto-merge (squash) May 27, 2021 09:04
@pracucci pracucci merged commit cd7f60d into cortexproject:master May 27, 2021
Successfully merging this pull request may close these issues.

redis cluster not supported
5 participants