-
Notifications
You must be signed in to change notification settings - Fork 21
Implementing failover retries #50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR, this looks great and really thorough. I'm pretty much ok with everything, I just noticed one minor detail. But maybe I'm just blind...
Also looks like some of the assertions are flaky, maybe there is something else to assert like a unique connection id (CLIENT ID) or so?
|
I've also adjusted the tests to check if the Redis |
|
Looks good to me, I'm releasing this as a major version as suggested. |
|
Hi All, We are running into a random error on our system where redis will timeout with this message: connection timed out In tracing through the code I see that this message is not in the array: ERROR_MESSAGES_INDICATING_UNAVAILABILITY As such, the shouldRetryRedisException method is returning false, and (from what I can tell) the system is not retrying. I am wondering if the omission of this error is on purpose as maybe the timeout does not qualify as something that indicates unavailability. We have Redis running in production and as you might guess there are lots and lots of hits to redis and it all works perfectly but randomly we will get this timeout. Since it's not retried it the exception will propagate up and cause our job to fail. Thanks! -Rob |
|
@streamingsystems I think you should open a new issue witht his information. To me this sounds like you are using some kind of connection pool that is not utilized frequently enough and some connections time out. However, I'm not opposed to add retry handling for this case. Maybe we can even make the error messages configurable (on top of the defaults), so that it is easier to test the changes before merging them. |
|
Ok thanks, I forked your repo and am running some tests and will open a separate issue. Have a great day. |
|
Looks like a good addition! 👍 |
It's been a while since I've opened #17, but a rolling update of Redis caused some information not to be stored due to the downtime that occurs when Sentinel has to elect a new leader. The basic way of connecting hasn't changed and should work on most installs, but it might be best to release this as a new major version if you're okay with the changes.
In short:
RedisExceptionis retryable or when DNS resolution errors occur. The number of retries and the delay are configurable — by default, 20 retries with a 1-second delay.RetryRedisExceptionis thrown.php-cs-fixerwithlaravel/pint.start-redis-cluster.shscript we use for local testing.start-redis-cluster.shscript now supports automatic restarts when a node goes down.We rely heavily on this logic in production, so I’d love to help keep maintaining it.