Description
Problems
Today, VTOrc for the most part has no concept of Vitess "cells". So when it fetches tablets for probing, it fetches tablets from the topo of every cell that is available in the global topo
This leads to several problems, detailed in the sections below
Topo Reads
The call to fetch tablets from each cell-level topology is a heavy LIST call on the /tablets topo-path, which can return very large, expensive responses. At least in the Consul topo, this LIST call is strongly consistent, meaning the already-expensive call needs a quorum-on-read. At the scale of Slack's deployment, for example, this read-quorum chatter alone can pose network-throughput risks on Consul servers. That's not good 👎
Further compounding the problem, to have a network-partition-safe (plus/minus some edge cases) deployment of VTOrc, one must run many VTOrcs in a majority of, or all, "cells", to provide visibility/triangulation from several angles. In a typical deployment this means 2 or 3 x VTOrcs, all scraping every cell - because they can't be any smarter, as they don't know which cell they are actually in
Global Topo Locking / Dependency
Finally, because every VTOrc currently watches every "cell", and many VTOrcs can be run, each one must assume that another VTOrc in ANY cell may also respond to the same problem. This means VTOrc needs to use global-topo locks to solve ANY problem, global-scoped or not
VTOrc solves two classes of problems: problems that are "shard-wide", such as DeadPrimary, but also problems that are not shard-wide, such as ReplicaSemiSyncMustBeSet - affecting a single tablet in a single cell. The "shard-wide" problem DOES require a global-topo lock but the latter theoretically does not. Today VTOrc will take out global topo locks for both classes of problems 👎
This approach has these undesired effects:
- If the global topo is down, VTOrc cannot fix any problem, anywhere 😱
- Shard lock contention, often for operations that are not actually shard-wide
- All problems are solved serially, per shard, even if there is no value in solving them serially
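The two classes of problems described above could be told apart with a small lookup that decides which lock a recovery needs. This is only a sketch: `LockScope`, `cellLocalProblems` and `lockScopeFor` are hypothetical names, not existing VTOrc code, and only the two problem names mentioned in this issue are used.

```go
package main

import "fmt"

// LockScope says which topo a recovery would have to lock before acting.
// Hypothetical type, for illustration only.
type LockScope int

const (
	ScopeGlobal LockScope = iota // shard-wide problem: needs a global-topo lock
	ScopeCell                    // single-tablet problem: a cell-local lock would do
)

// String makes the scope printable with %s.
func (s LockScope) String() string {
	if s == ScopeCell {
		return "cell"
	}
	return "global"
}

// cellLocalProblems is a hypothetical allow-list of problems that affect a
// single tablet in a single cell.
var cellLocalProblems = map[string]bool{
	"ReplicaSemiSyncMustBeSet": true,
}

// lockScopeFor falls back to the global lock for anything that is not known
// to be cell-local, which matches today's (safe but expensive) behaviour.
func lockScopeFor(problem string) LockScope {
	if cellLocalProblems[problem] {
		return ScopeCell
	}
	return ScopeGlobal
}

func main() {
	for _, p := range []string{"DeadPrimary", "ReplicaSemiSyncMustBeSet"} {
		fmt.Printf("%s -> %s lock\n", p, lockScopeFor(p))
	}
}
```

Defaulting unknown problems to the global scope keeps the sketch conservative: a problem only moves to cell-local locking once it has been explicitly classified.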
Proposal
To address the duplicated all-cell, all-tablets topo reads, introducing cell-awareness and cell-locality to VTOrc can reduce both the number of topo calls and the reliance on the global topo always being up
Assumption: this proposal assumes that when cell-locality is enabled, VTOrc exists in every cell. That's a hard requirement. If you don't like that, you're stuck with the everything-is-global approach of today
The proposal is:
- VTOrc adds flags:
  - --cell - What cell this VTOrc instance is located in. Similar to vtgate
  - --cells-to-watch - What cells to watch, defaulting to local-cell only. Similar to vtgate again
- When cell-awareness is enabled, via the new flags:
  - Fetch, probe and fix non-global-scoped problems entirely in the local cell, using local-cell locks
    - This offloads some locking from the global topo 🚀
    - Examples: problems that are not shard-wide, such as ReplicaSemiSyncMustBeSet
  - Fetch, probe and fix global/shard-wide problems as we do today, using global-topo locks
    - VTOrcs in all cells will continue to fight for global locks to decide a leader
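The flag behaviour proposed above (watch only the local cell unless told otherwise) could look roughly like the following. This sketch uses Go's standard `flag` package purely to stay self-contained; `cellConfig` and `parseCellFlags` are hypothetical names, and the comma-separated format for `--cells-to-watch` is an assumption.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// cellConfig holds the two proposed settings. Hypothetical type.
type cellConfig struct {
	Cell         string
	CellsToWatch []string
}

// parseCellFlags parses --cell and --cells-to-watch. When --cells-to-watch
// is not given, it defaults to the local cell only, as the proposal suggests.
func parseCellFlags(args []string) (cellConfig, error) {
	fs := flag.NewFlagSet("vtorc-sketch", flag.ContinueOnError)
	cell := fs.String("cell", "", "cell this VTOrc instance is located in")
	watch := fs.String("cells-to-watch", "", "comma-separated cells to watch (assumed format)")
	if err := fs.Parse(args); err != nil {
		return cellConfig{}, err
	}

	cfg := cellConfig{Cell: *cell}
	if *watch == "" {
		// Default: watch the local cell only.
		if *cell != "" {
			cfg.CellsToWatch = []string{*cell}
		}
		return cfg, nil
	}
	for _, c := range strings.Split(*watch, ",") {
		if c = strings.TrimSpace(c); c != "" {
			cfg.CellsToWatch = append(cfg.CellsToWatch, c)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := parseCellFlags([]string{"--cell", "zone1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("cell=%s watching=%v\n", cfg.Cell, cfg.CellsToWatch)
}
```

Leaving `--cell` unset yields an empty watch list here; a real implementation would have to decide whether that means "cell-awareness disabled" and fall back to today's all-cell behaviour.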
What this would require:
- Adding the new cell flags
- Adding support for locks on cell/local-topos. I don't believe that exists today
- When cell-awareness is enabled, support a new style of probing tablets:
  - Fetch/probe all tablets from the local-cell only
  - Fetch/probe shard PRIMARYs using:
    - The PrimaryAlias field of the global shard record - we already fetch this record
    - A GetTablet RPC to the cell of the PRIMARY
- Tweaks for VTOrc to understand the new style of split local/global locks
- When solving a shard-wide/global problem, fetch all tablets from all cells (unfortunately)
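The probing style described in the list above can be sketched against an in-memory stand-in for the topo. Everything here is hypothetical and simplified: `fakeTopo`, `Tablet`, `Shard` and `tabletsToProbe` are illustration-only names, and the primary's alias and cell are modelled as two plain strings rather than a real tablet alias.

```go
package main

import (
	"fmt"
	"sort"
)

// Tablet and Shard are simplified stand-ins for topo records.
type Tablet struct {
	Alias string
	Cell  string
	Shard string
}

type Shard struct {
	Name         string
	PrimaryAlias string // stand-in for the PrimaryAlias field of the shard record
	PrimaryCell  string // cell the PRIMARY lives in
}

// fakeTopo is an in-memory stand-in for the global and cell-level topos.
type fakeTopo struct {
	shards []Shard             // global shard records
	byCell map[string][]Tablet // per-cell tablet listing
}

// listTablets models the expensive LIST of all tablets in one cell.
func (t *fakeTopo) listTablets(cell string) []Tablet { return t.byCell[cell] }

// getTablet models a single-record read against one cell.
func (t *fakeTopo) getTablet(cell, alias string) (Tablet, bool) {
	for _, tb := range t.byCell[cell] {
		if tb.Alias == alias {
			return tb, true
		}
	}
	return Tablet{}, false
}

// tabletsToProbe returns every tablet in the local cell, plus each shard
// PRIMARY that lives in another cell (one targeted read per such shard).
func tabletsToProbe(t *fakeTopo, localCell string) []Tablet {
	seen := map[string]bool{}
	var out []Tablet
	for _, tb := range t.listTablets(localCell) {
		seen[tb.Alias] = true
		out = append(out, tb)
	}
	for _, s := range t.shards {
		if s.PrimaryAlias == "" || seen[s.PrimaryAlias] {
			continue // no primary recorded, or already covered by the local LIST
		}
		if tb, ok := t.getTablet(s.PrimaryCell, s.PrimaryAlias); ok {
			seen[tb.Alias] = true
			out = append(out, tb)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Alias < out[j].Alias })
	return out
}

// sampleTopo builds two cells and two shards, one PRIMARY in each cell.
func sampleTopo() *fakeTopo {
	return &fakeTopo{
		shards: []Shard{
			{Name: "-80", PrimaryAlias: "zone1-100", PrimaryCell: "zone1"},
			{Name: "80-", PrimaryAlias: "zone2-200", PrimaryCell: "zone2"},
		},
		byCell: map[string][]Tablet{
			"zone1": {{Alias: "zone1-100", Cell: "zone1", Shard: "-80"}, {Alias: "zone1-101", Cell: "zone1", Shard: "80-"}},
			"zone2": {{Alias: "zone2-200", Cell: "zone2", Shard: "80-"}, {Alias: "zone2-201", Cell: "zone2", Shard: "-80"}},
		},
	}
}

func main() {
	for _, tb := range tabletsToProbe(sampleTopo(), "zone1") {
		fmt.Println(tb.Alias, tb.Cell, tb.Shard)
	}
}
```

In this sample, a VTOrc in zone1 probes both zone1 tablets and the zone2 PRIMARY, but never lists zone2's replicas.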
This new cell-local approach will reduce the topo reads per scrape from "all tablets globally" to "all tablets in the local cell" for local-cell problems, plus "N x PRIMARYs (possible cross-cell read)" for global-level problems
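As a rough back-of-the-envelope model of that reduction, the sketch below counts tablet records read per scrape cycle. The function names and all numbers are made up for illustration. It assumes one VTOrc per cell in the cell-local case and treats every shard PRIMARY as a separate read, so the cell-local figure is an upper bound.

```go
package main

import "fmt"

// tabletReadsToday models the current behaviour: every VTOrc lists the
// tablets of every cell.
func tabletReadsToday(vtorcs int, tabletsPerCell []int) int {
	total := 0
	for _, n := range tabletsPerCell {
		total += n
	}
	return vtorcs * total
}

// tabletReadsCellLocal models the proposal with one VTOrc per cell: each
// lists its own cell and reads at most one PRIMARY record per shard.
func tabletReadsCellLocal(tabletsPerCell []int, shards int) int {
	total := 0
	for _, n := range tabletsPerCell {
		total += n + shards
	}
	return total
}

func main() {
	cells := []int{1000, 1000, 1000} // hypothetical: 3 cells, 1000 tablets each
	fmt.Println(tabletReadsToday(3, cells))       // prints 9000
	fmt.Println(tabletReadsCellLocal(cells, 100)) // prints 3300
}
```

The model ignores the extra all-cell fetch that still happens while a shard-wide problem is being solved, so it describes steady-state scraping only.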
Use Case(s)
Users of VTOrc that deploy Vitess in many cells (as recommended)