Possible data loss on corruption of coordinating node replica #679

@engelsanchez

Russell brought this up today, and perhaps he can elaborate on this:

  • The replica on the node coordinating a put becomes unavailable because of corruption or user error.
  • The incoming object may or may not contain a vector clock.
  • We get a not_found in the local put, so we have lost the information recorded for this vnode's actor id.
  • The vector clock created for the write could carry a stale counter for this vnode's actor id: 1 if the incoming object had no vclock, or a value lower than the real one if more writes have happened since.
  • When the write is sent out, its vector clock is dominated by the clocks on the other replicas, so this write will lose and will eventually be read repaired completely out of existence.
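
To make the sequence above concrete, here is a minimal sketch in Python. It is a simplified model, not Riak's actual put path: a vector clock is represented as a plain dict from actor id to counter, and the names `increment`, `merge`, `descends`, and `coordinate_put` are hypothetical helpers invented for illustration. The actor id `"A"` and the counter values are also assumptions.

```python
# Simplified, hypothetical model of the scenario; not Riak's implementation.
# A vector clock is a dict mapping actor id -> counter.

def increment(vclock, actor):
    """Return a copy of vclock with the actor's counter advanced by one."""
    new = dict(vclock)
    new[actor] = new.get(actor, 0) + 1
    return new

def merge(a, b):
    """Per-actor maximum of two clocks."""
    return {actor: max(a.get(actor, 0), b.get(actor, 0))
            for actor in a.keys() | b.keys()}

def descends(a, b):
    """True if clock a has seen everything recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())

def coordinate_put(local_clock, incoming_clock, actor):
    """Build the clock for a new write on the coordinating vnode.

    local_clock is None when the local read returns not_found;
    incoming_clock is None when the client sent no vclock.
    """
    merged = merge(local_clock or {}, incoming_clock or {})
    return increment(merged, actor)

# The other replicas have seen five writes coordinated by vnode "A".
replica_clock = {"A": 5}

# Healthy coordinator: local data is intact, so the new write advances to 6.
healthy = coordinate_put({"A": 5}, None, "A")

# Coordinator lost its replica, client sent no vclock: counter restarts at 1.
lost = coordinate_put(None, None, "A")

# Coordinator lost its replica, client sent an old vclock: counter is behind.
stale = coordinate_put(None, {"A": 2}, "A")

for name, clock in [("healthy", healthy), ("lost", lost), ("stale", stale)]:
    verdict = "discarded as an ancestor" if descends(replica_clock, clock) else "kept"
    print(name, clock, verdict)
```

In this model the `lost` and `stale` writes both produce clocks that the other replicas already descend, so the new value looks like old history and is dropped, which matches the data loss described in the list.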

/cc @jtuple @jrwest @evanmcc @Vagabond @russelldb
