@dhiaayachi

These are the steps that a leadership transfer API call performs:

  • we select the closest follower (if no node is provided)
  • we wait for it to catch up (for at most electionTimeout)
  • we trigger an RPC call to that follower, which flips it to a candidate
  • the new candidate starts a new term and forces the old leader to step down (because the candidate has a newer term)
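
As an illustration of the first step, here is a minimal sketch of target selection. It assumes that "closest" means the follower whose replication progress is nearest to the leader's last index; the `follower` type, its `NextIndex` field, and `pickTarget` are hypothetical names for this example, not the code in this PR.

```go
package main

import "fmt"

// follower is a hypothetical view of one follower's replication progress.
type follower struct {
	ID        string
	NextIndex uint64 // next log index the leader would send to this follower
}

// pickTarget returns the follower with the highest NextIndex, i.e. the one
// that needs the least catching up. The boolean is false when there are no
// followers to choose from.
func pickTarget(followers []follower) (follower, bool) {
	var best follower
	found := false
	for _, f := range followers {
		if !found || f.NextIndex > best.NextIndex {
			best, found = f, true
		}
	}
	return best, found
}

func main() {
	target, ok := pickTarget([]follower{
		{ID: "a", NextIndex: 7},
		{ID: "b", NextIndex: 10},
		{ID: "c", NextIndex: 9},
	})
	fmt.Println(target.ID, ok)
}
```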

However, when the new candidate sends its vote request, the old leader (which has transitioned to a follower by that time) rejects it because it has newer log entries, logging "rejecting vote request since our last index is greater". This favours that same node becoming the leader again in the next term, and we are back to square one.
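
The rejection follows from the usual Raft rule that a voter only grants its vote to a candidate whose log is at least as up to date as its own. The sketch below is a simplified version of that check; `logPosition` and `candidateUpToDate` are hypothetical names chosen for illustration, and the terms and indexes are made-up values.

```go
package main

import "fmt"

// logPosition identifies the last entry in a node's log.
type logPosition struct {
	Term  uint64
	Index uint64
}

// candidateUpToDate reports whether a voter may grant its vote: the
// candidate's last entry must have a higher term, or the same term and an
// index that is not behind the voter's.
func candidateUpToDate(voter, candidate logPosition) bool {
	if voter.Term != candidate.Term {
		return candidate.Term > voter.Term
	}
	return candidate.Index >= voter.Index
}

func main() {
	// The transfer target caught up to index 10, but the old leader
	// appended one more entry before it stopped accepting writes.
	target := logPosition{Term: 3, Index: 10}
	oldLeader := logPosition{Term: 3, Index: 11}

	fmt.Println(candidateUpToDate(oldLeader, target)) // old leader refuses to vote for the target
	fmt.Println(candidateUpToDate(target, oldLeader)) // target would vote for the old leader
}
```

A single entry appended after the catch-up step is enough to make the old leader the only node that can win the next election.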

Three gaps lead to this issue:

  • When we start the leadership transfer, the flag LeadershipTransferInProgress, which prevents replication from happening during the transfer, is set inside a goroutine. This leaves a gap until the goroutine is scheduled.
  • The same flag, LeadershipTransferInProgress, is unset at the end of the goroutine, without any guarantee that the leader loop has finished in the old leader.
  • When the new leader has received a transfer request and then receives a replication request from the old leader (while the transfer is in progress), the replication request forces it back to the follower state, which cancels the transfer.

This PR closes those three gaps.
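
For the third gap, one possible shape of the guard is sketched below. It assumes the node remembers that its candidacy came from a transfer request and, in that case, does not step down on a replication request from the same term. The `node` type, the `transferCandidate` field, and `onAppendEntries` are hypothetical names for this example only.

```go
package main

import "fmt"

type role int

const (
	roleFollower role = iota
	roleCandidate
)

// node is a hypothetical, stripped-down Raft node.
type node struct {
	role              role
	term              uint64
	transferCandidate bool // set when the candidacy was triggered by a transfer request
}

// onAppendEntries handles a replication request carrying leaderTerm and
// returns the node's role afterwards.
func (n *node) onAppendEntries(leaderTerm uint64) role {
	if leaderTerm < n.term {
		return n.role // stale leader: the request is ignored
	}
	// Step down on a newer term, or when we are not a follower and our
	// candidacy was not requested by a leadership transfer.
	if leaderTerm > n.term || (n.role != roleFollower && !n.transferCandidate) {
		n.role = roleFollower
		n.term = leaderTerm
	}
	return n.role
}

func main() {
	guarded := &node{role: roleCandidate, term: 3, transferCandidate: true}
	fmt.Println(guarded.onAppendEntries(3) == roleCandidate) // transfer survives

	unguarded := &node{role: roleCandidate, term: 3}
	fmt.Println(unguarded.onAppendEntries(3) == roleFollower) // transfer cancelled
}
```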

@dhiaayachi dhiaayachi requested a review from mkeeler January 25, 2022 16:14

@hanshasselberg hanshasselberg left a comment

LGTM. I can see how that fixes the issues you described. Replicating that in tests is probably too complicated?!

@dhiaayachi

Thank you for reviewing this, @i0rek.
Yes, testing this is a challenge because those gaps are time-sensitive.
