RFC: VTOrc to support cell locality #18982

@timvaillancourt

Description

Problems

Today, VTOrc for the most part has no concept of Vitess "cells". When it fetches tablets for probing, it reads them from the topo of every cell listed in the global topo.

This leads to several problems, detailed in the sections below.

Topo Reads

The call to fetch tablets from each cell-level topology is a heavy LIST call on the /tablets topo-path, which can return very large, expensive responses. At least in the Consul topo, this LIST call is strongly consistent, meaning the already-expensive call needs a quorum-on-read. At the scale of Slack's deployment, for example, this read-quorum chatter alone can pose network-throughput risks on Consul servers. That's not good 👎

Compounding the problem, a network-partition-safe (plus/minus some edge cases) deployment of VTOrc requires running VTOrc in a majority of cells, or all of them, to provide visibility/triangulation from several angles. In a typical deployment this means 2 or 3 VTOrcs, all scraping every cell, because they can't be any smarter: they don't know which cell they are actually in.

Global Topo Locking / Dependency

Finally, because every VTOrc currently watches every "cell", and many VTOrcs can be run, each one must assume that another VTOrc in ANY cell may also respond to the same problem. This means VTOrc needs to use global-topo locks to solve ANY problem, global-scoped or not.

VTOrc solves two classes of problems: "shard-wide" problems, such as DeadPrimary, and problems that are not shard-wide, such as ReplicaSemiSyncMustBeSet, which affects a single tablet in a single cell. A "shard-wide" problem DOES require a global-topo lock, but the latter class theoretically does not. Today VTOrc takes out global-topo locks for both classes of problems 👎

This approach has these undesired effects:

  1. If the global topo is down, VTOrc cannot fix any problem, anywhere 😱
  2. Shard lock contention, often for operations that are not actually shard-wide
  3. All problems are solved serially, per shard, even when serializing them has no value
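
The distinction between the two problem classes can be sketched as a small lookup that decides whether a fix needs the global-topo lock at all. This is only an illustration: `problemScope`, `problemScopes` and `needsGlobalLock` are hypothetical names, not VTOrc types, and the mapping covers only the two problems named above. Unknown problems are treated as shard-wide, on the assumption that over-locking is safer than under-locking.

```go
package main

import "fmt"

// problemScope is a hypothetical classification, not an existing VTOrc type.
type problemScope int

const (
	scopeTablet problemScope = iota // affects a single tablet in a single cell
	scopeShard                      // affects the whole shard
)

// problemScopes is an illustrative, incomplete mapping that only covers
// the two problems used as examples in this RFC.
var problemScopes = map[string]problemScope{
	"DeadPrimary":              scopeShard,
	"ReplicaSemiSyncMustBeSet": scopeTablet,
}

// needsGlobalLock reports whether fixing the problem requires a global-topo
// lock. Unknown problems are conservatively treated as shard-wide.
func needsGlobalLock(problem string) bool {
	scope, known := problemScopes[problem]
	return !known || scope == scopeShard
}

func main() {
	for _, p := range []string{"DeadPrimary", "ReplicaSemiSyncMustBeSet"} {
		fmt.Printf("%s needs global-topo lock: %t\n", p, needsGlobalLock(p))
	}
}
```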

Proposal

To address the duplicated all-cell, all-tablets topo reads, introducing cell-awareness and cell-locality to VTOrc can reduce the number of topo calls and the reliance on the global topo always being up.

Assumption: this proposal assumes that when cell-locality is enabled, VTOrc exists in every cell. That's a hard requirement. If you don't like that, you're stuck with the everything-is-global approach of today

The proposal is:

  1. VTOrc adds flags:
    • --cell - What cell this VTOrc instance is located in. Similar to vtgate
    • --cells-to-watch - What cells to watch, defaulting to local-cell only. Similar to vtgate again
  2. When cell-awareness is enabled, via the new flags:
    • Fetch, probe and fix non-global-scoped problems entirely in the local cell, using local-cell locks
      • This offloads some locking from the global topo 🚀
      • Examples: problems that are not shard-wide, such as ReplicaSemiSyncMustBeSet
    • Fetch, probe and fix global/shard-wide problems as we do today, using global-topo locks
      • VTOrcs in all cells will continue to fight for global locks to decide a leader
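
A rough sketch of how the two proposed flags could behave, including the "default to the local cell" rule. It uses Go's standard `flag` package only to stay self-contained (Vitess binaries register their flags differently), and `cellConfig` and `parseCellFlags` are hypothetical names. The comma-separated format for `--cells-to-watch` is an assumption, since the proposal does not define it.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"strings"
)

// cellConfig holds the two settings proposed above (hypothetical type).
type cellConfig struct {
	Cell         string
	CellsToWatch []string
}

// parseCellFlags parses --cell and --cells-to-watch from args.
// --cells-to-watch is assumed to be comma-separated and defaults to --cell.
func parseCellFlags(args []string) (cellConfig, error) {
	fs := flag.NewFlagSet("vtorc", flag.ContinueOnError)
	cell := fs.String("cell", "", "cell this VTOrc instance is located in")
	watch := fs.String("cells-to-watch", "", "cells to watch; defaults to the local cell")
	if err := fs.Parse(args); err != nil {
		return cellConfig{}, err
	}
	if *cell == "" {
		return cellConfig{}, errors.New("--cell is required when cell-awareness is enabled")
	}

	cfg := cellConfig{Cell: *cell}
	for _, c := range strings.Split(*watch, ",") {
		if c = strings.TrimSpace(c); c != "" {
			cfg.CellsToWatch = append(cfg.CellsToWatch, c)
		}
	}
	if len(cfg.CellsToWatch) == 0 {
		cfg.CellsToWatch = []string{cfg.Cell} // local-cell only by default
	}
	return cfg, nil
}

func main() {
	cfg, err := parseCellFlags([]string{"--cell", "zone1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("cell=%s watching=%v\n", cfg.Cell, cfg.CellsToWatch)
}
```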

What this would require:

  • Adding the new cell flags
  • Adding support for locks on cell/local-topos. I don't believe that exists today
  • When cell-awareness is enabled, support a new style of probing tablets:
    • Fetch/probe all tablets from the local-cell only
    • Fetch/probe shard PRIMARYs using:
      • The PrimaryAlias field of the global shard record - we already fetch this record
      • A GetTablet RPC to the cell of the PRIMARY
  • Tweaks for VTOrc to understand the new style of split local/global locks
  • When solving a shard-wide/global problem, fetch all tablets from all cells (unfortunately)
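
The new probing style could look roughly like the sketch below: one list of the local cell, plus a point read for each shard PRIMARY that lives elsewhere. Everything here is hypothetical and in-memory (`topoReader`, `memTopo`, `tabletsToProbe`, and the simplified `tablet` type); it does not use the real Vitess topo API, and it assumes the PRIMARY aliases have already been taken from the global shard records.

```go
package main

import "fmt"

// tabletAlias and tablet are simplified stand-ins for the real records.
type tabletAlias struct {
	Cell string
	UID  uint32
}

type tablet struct {
	Alias tabletAlias
	Type  string
}

// topoReader is a hypothetical, minimal view of a topo server.
type topoReader interface {
	ListTablets(cell string) ([]tablet, error)
	GetTablet(alias tabletAlias) (tablet, error)
}

// memTopo is an in-memory topoReader keyed by cell, for illustration only.
type memTopo map[string][]tablet

func (m memTopo) ListTablets(cell string) ([]tablet, error) {
	return m[cell], nil
}

func (m memTopo) GetTablet(alias tabletAlias) (tablet, error) {
	for _, t := range m[alias.Cell] {
		if t.Alias == alias {
			return t, nil
		}
	}
	return tablet{}, fmt.Errorf("tablet %s-%d not found", alias.Cell, alias.UID)
}

// tabletsToProbe returns every tablet in localCell plus each shard PRIMARY
// that lives in another cell. primaries are the PrimaryAlias values read
// from the global shard records.
func tabletsToProbe(ts topoReader, localCell string, primaries []tabletAlias) ([]tablet, error) {
	local, err := ts.ListTablets(localCell)
	if err != nil {
		return nil, err
	}
	out := append([]tablet(nil), local...) // copy so the topo's slice is not modified
	for _, p := range primaries {
		if p.Cell == localCell {
			continue // already covered by the local-cell list
		}
		t, err := ts.GetTablet(p) // possible cross-cell point read
		if err != nil {
			return nil, err
		}
		out = append(out, t)
	}
	return out, nil
}

func main() {
	ts := memTopo{
		"zone1": {{tabletAlias{"zone1", 100}, "PRIMARY"}, {tabletAlias{"zone1", 101}, "REPLICA"}},
		"zone2": {{tabletAlias{"zone2", 200}, "PRIMARY"}, {tabletAlias{"zone2", 201}, "REPLICA"}},
	}
	primaries := []tabletAlias{{"zone1", 100}, {"zone2", 200}}
	tablets, err := tabletsToProbe(ts, "zone1", primaries)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tablets {
		fmt.Printf("%s-%d %s\n", t.Alias.Cell, t.Alias.UID, t.Type)
	}
}
```

In this sketch a failed PRIMARY lookup aborts the whole probe; a real implementation would more likely record the failure and keep going, since an unreachable remote cell is exactly the situation VTOrc needs to reason about.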

This new cell-local approach will reduce the scrape topo calls from "all tablets globally" to "all tablets in the local cell" for local-cell problems, and "N x PRIMARYs (possible cross-cell reads)" for global-level problems.
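
To get a feel for the size of that reduction, here is a back-of-the-envelope sketch that counts tablet records read per scrape cycle. The numbers in `main` are made up for illustration and are not measurements from any deployment. `recordsToday` and `recordsCellLocal` are hypothetical helpers, and the cell-local model assumes exactly one VTOrc per cell. It ignores the all-cell fetch done while actually repairing a shard-wide problem.

```go
package main

import "fmt"

func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// recordsToday models the current behaviour: every VTOrc lists the tablets
// of every cell.
func recordsToday(vtorcs int, tabletsPerCell []int) int {
	return vtorcs * sum(tabletsPerCell)
}

// recordsCellLocal models the proposal with one VTOrc per cell: each VTOrc
// lists its own cell and point-reads the PRIMARYs that live in other cells.
// Both slices are indexed by cell and must have the same length.
func recordsCellLocal(tabletsPerCell, primariesPerCell []int) int {
	shards := sum(primariesPerCell)
	total := 0
	for i, local := range tabletsPerCell {
		total += local + (shards - primariesPerCell[i])
	}
	return total
}

func main() {
	// Made-up example: 3 cells of 300 tablets, 100 shards whose PRIMARYs
	// all live in the first cell, and 3 VTOrcs.
	tablets := []int{300, 300, 300}
	primaries := []int{100, 0, 0}
	fmt.Println("today:     ", recordsToday(3, tablets))
	fmt.Println("cell-local:", recordsCellLocal(tablets, primaries))
}
```

The record count understates the difference in practice, because the cross-cell PRIMARY reads are individual GetTablet calls, not a heavy LIST on the /tablets path.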

Use Case(s)

Users of VTOrc that deploy Vitess in many cells (as recommended)

Metadata

Labels

  • Component: VTorc - Vitess Orchestrator integration
  • Type: Enhancement - Logical improvement (somewhere between a bug and feature)
  • Type: RFC - Request For Comment
