New box.ctl.on_replication_split_brain_rollback event
#10943
CuriousGeorgiy started this conversation in RFC
The unconfirmed asynchronous transactions from the old term will be rolled back just as unconfirmed synchronous transactions are currently rolled back: by txn_limbo_read_rollback after the PROMOTE request is written.

To give users more flexibility and allow them to take action to save their data when replication_split_brain_handling_mode is not set to none, we will introduce a new system event box.ctl.on_replication_split_brain_rollback.

The event will be delivered before writing a PROMOTE request, once for each asynchronous transaction in the synchronous queue (with the TXN_EARLY_ACK flag set) that will be rolled back by the PROMOTE request. This will ensure that if the event trigger fails, the PROMOTE request will also fail, ER_SPLIT_BRAIN will be raised, and the asynchronously committed data will not be lost.

Event trigger arguments
Information about the asynchronous transaction that will be rolled back by a PROMOTE request will be passed to the trigger.

The trigger will receive a transaction statement iterator similar to that of other transaction event triggers. The iterator will yield transaction statement information as described in the format of the _repair_queue space (Space format). The information will be decoded from stmt→row and pushed onto the Lua stack in a format similar to that of the xlog module. That is, tuples will be pushed where possible; otherwise, Msgpack objects will be pushed.

Event trigger failure
If the trigger fails, the corresponding PROMOTE request will also fail with an ER_SPLIT_BRAIN error.

Transactions in event trigger
It will be guaranteed that before the event trigger is called, the active fiber will have no active transactions. That is, the rolled-back transaction will be detached from the active fiber.

Therefore, the trigger will be able to start new transactions. It will only be able to write to local spaces, which will be ensured by the limbo→is_in_rollback flag.

Once all the asynchronous transactions have been processed by the trigger, the last transaction, if any, will be committed. If the commit of the last transaction fails, the corresponding PROMOTE request will also fail with an ER_SPLIT_BRAIN error.

Retrying a failed PROMOTE request
The semantics of the box.ctl.on_replication_split_brain_rollback event require its event trigger to be called again if the PROMOTE request or the trigger fails. Therefore, the trigger must be idempotent with respect to transactions.

The transaction_id value (namely, the txn→id field) that will be yielded by the transaction statement iterator passed to the trigger will be sufficient to uniquely identify transactions.

Handling synchronous PROMOTE requests
With synchronous PROMOTE requests, the promote effect takes place only after the corresponding CONFIRM request.

To handle synchronous PROMOTE requests, the event will need to be delivered before writing the corresponding CONFIRM request, rather than before writing the PROMOTE request. The same semantics for failure and retrying a failed request will apply.
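
The failure and retry semantics above can be illustrated with a small toy model. This is only a sketch in plain Python, not Tarantool code: promote, make_saving_trigger, SplitBrainError and the statement strings are all hypothetical names invented for the illustration, and the real trigger would be registered through Tarantool's own trigger machinery. It shows why keying on transaction_id is enough to make a trigger idempotent when a failed request is retried.

```python
class SplitBrainError(Exception):
    """Stands in for ER_SPLIT_BRAIN in this toy model (hypothetical)."""


def promote(pending, trigger):
    """Deliver the event for every pending transaction, then 'write PROMOTE'.

    `pending` maps transaction_id -> list of statements. Any trigger failure
    aborts the promote and leaves `pending` untouched, so a retry delivers
    the same transactions again.
    """
    for txn_id, statements in pending.items():
        try:
            trigger(txn_id, iter(statements))
        except Exception as exc:
            raise SplitBrainError(f"trigger failed on txn {txn_id}") from exc
    rolled_back = list(pending)
    pending.clear()  # rollback happens only after every trigger call succeeded
    return rolled_back


def make_saving_trigger(saved, fail_once_on=None):
    """Build an idempotent trigger that copies statements into `saved`."""
    state = {"failed": False}

    def trigger(txn_id, statements):
        if txn_id in saved:
            return  # already saved on an earlier attempt: do nothing
        if txn_id == fail_once_on and not state["failed"]:
            state["failed"] = True
            raise RuntimeError("simulated transient failure")
        saved[txn_id] = list(statements)

    return trigger


pending = {101: ["insert A"], 102: ["insert B"], 103: ["insert C"]}
saved = {}
trigger = make_saving_trigger(saved, fail_once_on=102)

try:
    promote(pending, trigger)  # fails on transaction 102
except SplitBrainError:
    first_attempt_failed = True
else:
    first_attempt_failed = False

# Retry: 101 is skipped because it was already saved; 102 and 103 are saved.
rolled_back = promote(pending, trigger)
```

In this model the first attempt saves transaction 101 and then fails, so nothing is rolled back. The retry sees 101 again, skips it, and saves the remaining transactions exactly once before the rollback proceeds.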