Commit 3e4d81b

feat: custom slashing design doc (#61)
1 parent ef4ce62 commit 3e4d81b

File tree

1 file changed

+307
-0
lines changed
  • docs/custom-slashing

docs/custom-slashing/dd.md

Lines changed: 307 additions & 0 deletions
@@ -0,0 +1,307 @@
1+
# Custom Slashing Design Document
2+
3+
- Owner: @just-mitch
4+
- Approvers:
  - @LHerskind
  - @Maddiaa0
  - @aminsammara
8+
- Target DD Approval Date: 2025-05-09
9+
- Target Project Delivery Date: 2025-05-16
10+
11+
## Executive Summary
12+
13+
The `StakingLib` designates a "slasher" address which is able to slash arbitrary validators for arbitrary amounts.
14+
15+
The contract used as the slasher is currently `Slasher`, which takes directives from a `SlashingProposer`.
16+
17+
The `SlashingProposer` is an instance of `EmpireBase`, which operates in "rounds" of `M` L2 slots, during which at least `N` proposers must vote for a specific contract address "payload" to be executed.
18+
19+
The payload just calls "slash", with the list of validators to slash, and the amount to slash each.
20+
21+
So the L1 contracts currently allow arbitrary slashing as long as the motion has support from `N/M` validators.
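
As a rough illustration of the `N`-of-`M` rule described above, the sketch below tallies the signals cast in one round and reports the first payload to reach the quorum. It is a simplified off-chain model, not the `EmpireBase` implementation: the function name, the array-of-signals representation, and the "first to reach `N`" tie-breaking are all assumptions.

```typescript
// Hypothetical model of one round: signals[i] is the payload address signalled
// by the proposer of slot i, or undefined if that proposer did not signal.
function payloadReachingQuorum(
  signals: (string | undefined)[], // length M: one entry per L2 slot in the round
  quorum: number, // N: signals required before a payload can be executed
): string | undefined {
  const counts: Record<string, number> = {};
  for (const payload of signals) {
    if (payload === undefined) continue;
    const next = (counts[payload] ?? 0) + 1;
    counts[payload] = next;
    // Assumption: the first payload to collect N signals wins the round.
    if (next >= quorum) return payload;
  }
  return undefined;
}
```

With `M = 5` signals `["0xaaa", "0xbbb", "0xaaa", undefined, "0xaaa"]`, a quorum of 3 selects `0xaaa`, while a quorum of 4 selects nothing.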
22+
23+
In practice, however, there are only mechanisms built to create and vote for payloads to slash all validators in an epoch if the epoch is never proven, namely an out-of-protocol `SlashFactory` contract, and corresponding logic on the node to utilize it.
24+
25+
We want to expand this `SlashFactory` to allow nodes to programmatically create and vote for payloads to slash specific validators for specific amounts for specific "verifiable offences".
26+
27+
Specifically, we will automatically slash in the following cases:
28+
29+
1. (liveness) A block was proven, so slash all validators that did not attest to it.
30+
2. (data availability and finality) An epoch was not proven and either i. the data is unavailable, or ii. the data is available and the epoch was valid, so slash each validator that was in the epoch's committee.
31+
3. (safety) A validator proposed a block that was invalid, so slash the validator.
32+
33+
Last, we will add an override, which may be set by the node operator, which will configure the node to vote for a particular payload no matter what; this affords offline coordination to effect a slash.
34+
35+
## Requirements
36+
37+
The requirements with filled checkboxes are met by the design below.
38+
39+
- [x] There MUST be ready-made L1 contract(s) that can be used to slash specific validators for not participating in consensus.
40+
- [x] The Aztec Labs node client software ("the node") MUST automatically slash validators for not participating in consensus.
41+
- [x] It SHOULD be possible to slash more than one validator at a time.
42+
- [x] Coordinating the slash SHOULD NOT require any coordination between the validators beyond the existing voting/signalling mechanism; each validator SHOULD be able to inspect L1 and its state to determine:
  - If it agrees with the slash
  - How/where to vote/signal on L1
45+
- [x] Node operators SHOULD be able to configure their node to specify thresholds for what they consider "not participating".
46+
- [x] The "offence" that triggers the slash MAY be specified on L1.
47+
- [x] The amount to be slashed MAY be configurable without deploying a new factory contract.
48+
- [x] The node MUST NOT trigger a slash unless it is certain that the validator was "faulty" (in its opinion).
49+
- [ ] The threshold of number of validators (N/M) that need to signal/vote for the CustomSlashFactory payload MAY be configurable without deploying a new contract or a governance action.
50+
51+
## L1 Changes
52+
53+
We make no changes to the `Slasher` contract, or any other "in-protocol" contracts.
54+
55+
Refactor the `SlashFactory` to accept an array of validator addresses, amounts, and offences. I.e.
56+
57+
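```solidity
interface ISlashFactory {
  // Declared here because createSlashPayload references these errors;
  // the parameter names are assumed.
  error SlashPayloadAmountsLengthMismatch(uint256 expected, uint256 actual);
  error SlashPayloadOffencesLengthMismatch(uint256 expected, uint256 actual);

  event SlashPayloadCreated(
    address payloadAddress, address[] validators, uint256[] amounts, uint256[] offences
  );

  function createSlashPayload(
    address[] memory _validators,
    uint256[] memory _amounts,
    uint256[] memory _offences
  ) external returns (IPayload);
}
```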
72+
73+
The core function in the `SlashFactory` will look like:
74+
75+
```solidity
function createSlashPayload(
  address[] memory _validators,
  uint256[] memory _amounts,
  uint256[] memory _offences
) external override(ISlashFactory) returns (IPayload) {
  require(
    _validators.length == _amounts.length,
    ISlashFactory.SlashPayloadAmountsLengthMismatch(_validators.length, _amounts.length)
  );
  require(
    _validators.length == _offences.length,
    ISlashFactory.SlashPayloadOffencesLengthMismatch(_validators.length, _offences.length)
  );

  (address predictedAddress, bool isDeployed) =
    getAddressAndIsDeployed(_validators, _amounts, _offences);

  if (isDeployed) {
    return IPayload(predictedAddress);
  }

  // Use a salt so that validators don't create many payloads for the same slash.
  bytes32 salt = keccak256(abi.encodePacked(_validators, _amounts, _offences));

  // Don't need to pass _offences as they are not used in the payload.
  SlashPayload payload = new SlashPayload{salt: salt}(_validators, VALIDATOR_SELECTION, _amounts);

  emit SlashPayloadCreated(address(payload), _validators, _amounts, _offences);
  return IPayload(address(payload));
}
```
107+
108+
For now, the `offences` field will effectively be an enum, with the following possible values:
109+
110+
- 0: unknown
111+
- 1: proven block not attested to
112+
- 2: unproven valid epoch
113+
- 3: invalid block proposed
114+
115+
The use of `uint256` for offences rather than an explicit enum allows for future flexibility, e.g. adding more offences and interpreting them off-chain, or, by using `uint256` rather than `uint8`, using a hash/commitment to some external data/proof.
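
On the node side, the offence codes listed above could be mirrored roughly as follows. This is only a sketch: the constant names and the `offenceName` helper are hypothetical, and the raw value is taken as a decimal string because a `uint256` does not fit in a JavaScript number. Values outside the known range (for example a future hash/commitment) simply map to `undefined`.

```typescript
// Offence codes as listed in this document; the names are illustrative.
const Offence = {
  UNKNOWN: 0,
  PROVEN_BLOCK_NOT_ATTESTED: 1,
  UNPROVEN_VALID_EPOCH: 2,
  INVALID_BLOCK_PROPOSED: 3,
} as const;

type OffenceName = keyof typeof Offence;

// Hypothetical helper: map a raw on-chain value (decimal string) to a known
// offence name, or undefined if the value is not one of the listed codes.
function offenceName(raw: string): OffenceName | undefined {
  const names = Object.keys(Offence) as OffenceName[];
  return names.find((name) => String(Offence[name]) === raw);
}
```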
116+
117+
Creating the payload via the `SlashFactory` will emit an event with the payload address, and the validator addresses/amounts/offences.
118+
119+
Aztec nodes will listen for these events, and then check if the validator committed the alleged offence. If so, they will vote/signal for the payload on L1.
120+
121+
## Node Changes
122+
123+
Most of the work is in the node.
124+
125+
### SlasherClient
126+
127+
The SlasherClient will remain the interface that the SequencerPublisher uses to adjust the transaction it sends to the forwarder contract. That is, the SequencerPublisher will continue to call `SlasherClient.getSlashPayload` to get the address of the payload to signal/vote for.
128+
129+
Its internal operations will be different, though.
130+
131+
It will instantiate "Watchers", which will have the following responsibilities:
132+
133+
- emit WANT_TO_SLASH events with the arguments to `createSlashPayload`
134+
- expose a function which takes a validator address, amount, and offence and returns whether it agrees with the slash
135+
136+
The SlasherClient has the following responsibilities:
137+
138+
- listen for WANT_TO_SLASH events and create a payload from the arguments
139+
- listen for the payload to be created and insert it into a priority queue
140+
- return the payload with the highest priority when `getSlashPayload` is called
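
The split of responsibilities above might translate into interfaces along these lines. Only `WANT_TO_SLASH` and `getSlashPayload` come from this document; `SlashArgs`, `Watcher`, `agreesWithPayload`, and `staticWatcher` are hypothetical names, and the rule that every entry needs approval from at least one Watcher is an assumption.

```typescript
interface SlashArgs {
  validator: string;
  amount: number; // plain number for brevity; on-chain amounts are uint256
  offence: number;
}

interface Watcher {
  // Register a listener for this watcher's WANT_TO_SLASH events.
  onWantToSlash(listener: (args: SlashArgs[]) => void): void;
  // Whether this watcher agrees that the validator should be slashed.
  shouldSlash(args: SlashArgs): boolean;
}

// Assumed agreement rule: a payload is acceptable only if it is non-empty and
// every entry is approved by at least one watcher.
function agreesWithPayload(watchers: Watcher[], entries: SlashArgs[]): boolean {
  return entries.length > 0 && entries.every((e) => watchers.some((w) => w.shouldSlash(e)));
}

// Stub watcher that agrees to slash a fixed set of validators, for illustration.
function staticWatcher(faulty: string[]): Watcher {
  const listeners: ((args: SlashArgs[]) => void)[] = [];
  return {
    onWantToSlash: (listener) => {
      listeners.push(listener);
    },
    shouldSlash: (args) => faulty.indexOf(args.validator) !== -1,
  };
}
```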
141+
142+
### Payload priority
143+
144+
Validators will need to have a way to order the various slashing events they observe.
145+
146+
Each time a new payload is observed on L1, the node will:
147+
148+
1. Sum the amounts of new slash proposals, so we have a `totalSlashAmount` for each payload.
149+
2. Filter the payloads to only include those that the Watchers agree with.
150+
3. Insert the payload with metadata into a priority queue
151+
4. Sort the payloads by `totalSlashAmount`, largest to smallest.
152+
153+
Whenever `getSlashPayload` is called, the node will:
154+
155+
1. Check if there is an override payload. If so, signal/vote for it.
156+
2. Filter out payloads from the queue that are older than a configurable TTL.
157+
3. Return the first payload in the queue.
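
The two procedures above can be sketched as a small queue. The class and field names are hypothetical, amounts are plain numbers for brevity, and the sketch assumes the Watchers have already filtered what is passed to `add`.

```typescript
interface ObservedPayload {
  address: string;
  totalSlashAmount: number; // sum of the amounts in the payload
  observedAtMs: number; // when the node first saw the payload on L1
}

class PayloadQueue {
  private payloads: ObservedPayload[] = [];
  private readonly ttlMs: number;
  private readonly overridePayload: string | undefined;

  constructor(ttlMs: number, overridePayload?: string) {
    this.ttlMs = ttlMs;
    this.overridePayload = overridePayload;
  }

  // Insert a payload and keep the queue sorted by totalSlashAmount, largest first.
  add(payload: ObservedPayload): void {
    this.payloads.push(payload);
    this.payloads.sort((a, b) => b.totalSlashAmount - a.totalSlashAmount);
  }

  // The override wins; otherwise drop expired payloads and return the first one left.
  getSlashPayload(nowMs: number): string | undefined {
    if (this.overridePayload !== undefined) return this.overridePayload;
    this.payloads = this.payloads.filter((p) => nowMs - p.observedAtMs <= this.ttlMs);
    const first = this.payloads[0];
    return first === undefined ? undefined : first.address;
  }
}
```

Passing the current time into `getSlashPayload` rather than reading a clock inside it keeps the expiry behaviour easy to test.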
158+
159+
### Proven block not attested to
160+
161+
The first slashing event to be implemented will be for the case where a validator did not attest to a proven block.
162+
163+
This will be done by an `InactivityWatcher`, which will:
164+
165+
- listen for L2 blocks `chain-proven` events emitted from the `L2BlockStream`
166+
- for each slot, call `Sentinel.processSlot` to get a map of validators and whether they voted
167+
- emit a `WANT_TO_SLASH` event for each validator whose percentage of missed attestations exceeds `SLASH_INACTIVITY_CREATE_TARGET`, slashing them for the amount specified in `SLASH_INACTIVITY_CREATE_PENALTY`
168+
169+
When asked, it will agree to slash any validator whose percentage of missed attestations exceeds `SLASH_INACTIVITY_SIGNAL_TARGET`, so long as the amount is less than `SLASH_INACTIVITY_MAX_PENALTY`.
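
A minimal sketch of the two decisions the `InactivityWatcher` makes, assuming the targets are percentages of missed attestations (as the configuration section describes them) and that attestation history is available as one boolean per observed slot. The type and function names are hypothetical, and the strict "less than" comparison on the amount mirrors the wording above.

```typescript
interface InactivityConfig {
  createTargetPct: number; // SLASH_INACTIVITY_CREATE_TARGET
  signalTargetPct: number; // SLASH_INACTIVITY_SIGNAL_TARGET
  createPenalty: number; // SLASH_INACTIVITY_CREATE_PENALTY
  maxPenalty: number; // SLASH_INACTIVITY_MAX_PENALTY
}

// attested[i] is true if the validator attested in the i-th observed slot.
function missedPct(attested: boolean[]): number {
  if (attested.length === 0) return 0;
  const missed = attested.filter((a) => !a).length;
  return (100 * missed) / attested.length;
}

// Whether this node would emit WANT_TO_SLASH for the validator.
function shouldCreateSlash(attested: boolean[], cfg: InactivityConfig): boolean {
  return missedPct(attested) > cfg.createTargetPct;
}

// Whether this node would agree with someone else's payload for the validator.
function shouldAgreeToSlash(attested: boolean[], amount: number, cfg: InactivityConfig): boolean {
  return missedPct(attested) > cfg.signalTargetPct && amount < cfg.maxPenalty;
}
```

Keeping the signal target lower than the create target lets a node support payloads created by stricter peers without creating them itself.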
170+
171+
### A validator proposed an invalid block
172+
173+
This requires that full nodes have the ability to re-execute blocks.
174+
175+
Further, when executing a block, we will store invalid blocks in a cache, and emit an `invalid-block` event.
176+
177+
An `InvalidBlockWatcher` will take an executor as an argument, subscribe to the `invalid-block` event, and then emit a `WANT_TO_SLASH` event naming the proposer of the invalid block, slashing them for the amount specified in `SLASH_INVALID_BLOCK_PENALTY`.
178+
179+
When asked, it will agree to slash any validator that proposed an invalid block which it sees in its cache of invalid blocks, as long as the amount is less than `SLASH_INVALID_BLOCK_MAX_PENALTY`.
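
The agreement check could look roughly like this. `InvalidBlockCache` and `agreeToInvalidBlockSlash` are hypothetical names; the cache here only remembers proposer addresses and never evicts, which a real implementation would presumably need to bound.

```typescript
class InvalidBlockCache {
  private readonly proposers: string[] = [];

  // Record the proposer of a block that failed re-execution.
  add(proposer: string): void {
    const key = proposer.toLowerCase();
    if (this.proposers.indexOf(key) === -1) this.proposers.push(key);
  }

  // Whether this node has seen an invalid block from the given proposer.
  has(proposer: string): boolean {
    return this.proposers.indexOf(proposer.toLowerCase()) !== -1;
  }
}

// Agree only if the validator is in the local cache and the amount is below
// the configured SLASH_INVALID_BLOCK_MAX_PENALTY.
function agreeToInvalidBlockSlash(
  cache: InvalidBlockCache,
  validator: string,
  amount: number,
  maxPenalty: number,
): boolean {
  return cache.has(validator) && amount < maxPenalty;
}
```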
180+
181+
### A valid epoch was not proven
182+
183+
This requires that full nodes have the ability to re-execute blocks.
184+
185+
A `ValidEpochUnprovenWatcher` will listen to `chain-pruned` events emitted by the `L2BlockStream`, and emit a `WANT_TO_SLASH` event for all validators that were in the epoch that was pruned IF there were no `invalid-block` events emitted for that epoch, slashing them for the amount specified in `SLASH_PRUNE_PENALTY`.
186+
187+
When asked, it will agree to slash any validator that was in an epoch that was pruned and there were no `invalid-block` events emitted for that epoch, as long as the amount is less than `SLASH_PRUNE_MAX_PENALTY`.
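
The creation rule for pruned epochs can be sketched as a pure function. The `PrunedEpoch` shape and the way invalid epochs are tracked (a list of epoch numbers for which an `invalid-block` event was seen) are assumptions; the offence code 2 is the "unproven valid epoch" value from the offence list.

```typescript
interface PrunedEpoch {
  epochNumber: number;
  committee: string[]; // validators that were in the epoch's committee
}

interface PruneSlash {
  validator: string;
  amount: number;
  offence: number;
}

// Returns the WANT_TO_SLASH arguments for a pruned epoch, or an empty list if
// an invalid block was seen in that epoch (the epoch was then not "valid").
function slashesForPrunedEpoch(
  epoch: PrunedEpoch,
  epochsWithInvalidBlocks: number[],
  penalty: number, // SLASH_PRUNE_PENALTY
): PruneSlash[] {
  if (epochsWithInvalidBlocks.indexOf(epoch.epochNumber) !== -1) return [];
  return epoch.committee.map((validator) => ({ validator, amount: penalty, offence: 2 }));
}
```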
188+
189+
### New configuration
190+
191+
- `SLASH_PAYLOAD_TTL`: the maximum age of a payload to signal/vote for
192+
- `SLASH_OVERRIDE_PAYLOAD`: the address of a payload to signal/vote for no matter what
193+
- `SLASH_PRUNE_ENABLED`: whether to create a payload for epoch pruning
194+
- `SLASH_PRUNE_PENALTY`: the amount to slash each validator that was in an epoch that was pruned
195+
- `SLASH_PRUNE_MAX_PENALTY`: the maximum amount to slash each validator that was in an epoch that was pruned
196+
- `SLASH_INACTIVITY_ENABLED`: whether to signal/vote for a payload for inactivity
197+
- `SLASH_INACTIVITY_CREATE_TARGET`: the percentage of attestations missed required to create a payload
198+
- `SLASH_INACTIVITY_SIGNAL_TARGET`: the percentage of attestations missed required to signal/vote for the payload
199+
- `SLASH_INACTIVITY_CREATE_PENALTY`: the amount to slash each validator that is inactive
- `SLASH_INACTIVITY_MAX_PENALTY`: the maximum amount to slash each validator that is inactive
200+
- `SLASH_INVALID_BLOCK_ENABLED`: whether to signal/vote for a payload for invalid blocks
201+
- `SLASH_INVALID_BLOCK_PENALTY`: the amount to slash each validator that proposed an invalid block
202+
- `SLASH_INVALID_BLOCK_MAX_PENALTY`: the maximum amount to slash each validator that proposed an invalid block
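
As an illustration of how a node might read a few of these settings, the sketch below parses the inactivity options from an environment-like record. The helper names, the result shape, and every default value are assumptions; the document does not specify defaults or the parsing rules.

```typescript
type Env = Record<string, string | undefined>;

function readBool(env: Env, key: string, fallback: boolean): boolean {
  const raw = env[key];
  return raw === undefined ? fallback : raw === "true" || raw === "1";
}

function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined) return fallback;
  const value = Number(raw);
  if (isNaN(value)) throw new Error(key + " must be a number, got \"" + raw + "\"");
  return value;
}

// Defaults here are placeholders chosen so that nothing is slashed unless configured.
function parseInactivityConfig(env: Env) {
  return {
    enabled: readBool(env, "SLASH_INACTIVITY_ENABLED", false),
    createTargetPct: readNumber(env, "SLASH_INACTIVITY_CREATE_TARGET", 100),
    signalTargetPct: readNumber(env, "SLASH_INACTIVITY_SIGNAL_TARGET", 100),
    createPenalty: readNumber(env, "SLASH_INACTIVITY_CREATE_PENALTY", 0),
  };
}
```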
203+
204+
## Full Node Re-execution
205+
206+
There are two primary points in time where a full node would want to re-execute blocks:
207+
208+
1. When a block is proposed on the p2p network and is gathering attestations.
209+
2. When a block has been proposed to L1 and is part of the pending chain.
210+
211+
To slash all malicious validators, we only need to support the first case; we need to adjust the validator client to re-execute, even if the node is not in the committee for the block, and retain a cache of invalid blocks.
212+
213+
To avoid slashing honest validators who built on a bad block which they blindly accepted/synced from the previous committee/L1, we need to support the second case.
214+
215+
### A generic `BlockBuilder`
216+
217+
We will build a new `BlockBuilder` class which is a component of the `AztecNode`.
218+
219+
Its interface will be:
220+
221+
```typescript
interface GlobalContext {
  chainId: Fr;
  version: Fr;
  blockNumber: Fr;
  slotNumber: Fr;
  timestamp: Fr;
  coinbase: EthAddress;
  feeRecipient: AztecAddress;
  gasFees: GasFees;
}

interface BuiltBlockResult {
  block: L2Block;
  publicGas: Gas;
  publicProcessorDuration: number;
  numMsgs: number;
  numTxs: number;
  numFailedTxs: number;
  blockBuildingTimer: Timer;
  usedTxs: Tx[];
}

interface ExecutionOptions {}

interface BlockBuilder {
  gatherTransactions(txHashes: TxHash[]): Promise<Tx[]>;
  executeTransactions(
    txs: Tx[],
    globals: GlobalContext,
    options: ExecutionOptions
  ): Promise<BuiltBlockResult>;
}
```
255+
256+
`executeTransactions` will execute transactions against the current world state and archiver.
257+
258+
The caller then may compare the results against whatever they expect (e.g. the state roots a peer sent, or that they downloaded from L1).
259+
260+
We will then update the archiver to optionally take a `BlockBuilder` as an argument, and use this to validate blocks coming in on L1.
261+
262+
Further, the Sequencer and Validator clients will accept a `BlockBuilder` as an argument, which they will use to build/re-execute blocks.
263+
264+
Thus we have:
265+
266+
1. Block comes in on p2p
   1. validator is on committee
      1. block is good, broadcast attestation
      2. block is bad, add to invalid block cache
   2. validator is not on committee
      1. block is good, do nothing
      2. block is bad, add to invalid block cache
2. Block comes in on L1
   1. block is good, add to state
   2. block is bad, add to invalid block cache
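
The decision tree above collapses to a small function, since a bad block is always cached regardless of where it arrived or whether the node is on the committee. The type and function names are hypothetical.

```typescript
type BlockSource = "p2p" | "l1";
type BlockAction =
  | "broadcast-attestation"
  | "add-to-state"
  | "add-to-invalid-block-cache"
  | "do-nothing";

function actionForBlock(source: BlockSource, onCommittee: boolean, isValid: boolean): BlockAction {
  // Bad blocks are cached in every branch of the tree.
  if (!isValid) return "add-to-invalid-block-cache";
  // Committee membership is not consulted for blocks arriving from L1.
  if (source === "l1") return "add-to-state";
  return onCommittee ? "broadcast-attestation" : "do-nothing";
}
```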
276+
277+
## Notes
278+
279+
The amount to slash should be high for testnet (e.g. the minimum stake amount). We can use a different amount for mainnet.
280+
281+
The threshold of validators (N/M) that need to signal/vote for the SlashFactory payload will be the same as the threshold for any other slashing payload, i.e. this ratio is fixed and set per rollup.
282+
283+
This requires that N/M validators are listening to the same SlashFactory contract, and participating in the protocol.
284+
285+
If we do not have N/M validators participating, someone will need to make a "propose with lock" on the governance to deploy a new rollup instance. If that is not palatable, we could update the staking lib to allow the governance (not just the slasher) to slash validators; then someone could "propose with lock" to slash whatever validators governance decides.
286+
287+
In the future, we could also allow the slasher itself or governance to change who the slasher is.
288+
289+
## Timeline
290+
291+
Estimated timeline for the project:
292+
293+
- L1 contracts : 1 day
294+
- Refactor SlasherClient : 2 days
295+
- Implement InactivityWatcher (proven block not attested to) : 1-2 days
296+
- Implement Re-execution on Full Node : 2 days
297+
- Implement InvalidBlockWatcher : 1-2 days
298+
- Implement ValidEpochUnprovenWatcher : 1-2 days
299+
- Review/Polish : 2 days
300+
301+
Total: 10-13 days
302+
303+
The intent is to first merge functionality to slash inactive validators, then do the broader refactor needed for the later two cases.
304+
305+
## Disclaimer
306+
307+
The information set out herein is for discussion purposes only and does not represent any binding indication or commitment by Aztec Labs and its employees to take any action whatsoever, including relating to the structure and/or any potential operation of the Aztec protocol or the protocol roadmap. In particular: (i) nothing in these projects, requests, or comments is intended to create any contractual or other form of legal relationship with Aztec Labs or third parties who engage with this AztecProtocol GitHub account (including, without limitation, by responding to a conversation or submitting comments) (ii) by engaging with any conversation or request, the relevant persons are consenting to Aztec Labs’ use and publication of such engagement and related information on an open-source basis (and agree that Aztec Labs will not treat such engagement and related information as confidential), and (iii) Aztec Labs is not under any duty to consider any or all engagements, and that consideration of such engagements and any decision to award grants or other rewards for any such engagement is entirely at Aztec Labs’ sole discretion. Please do not rely on any information on this account for any purpose - the development, release, and timing of any products, features, or functionality remains subject to change and is currently entirely hypothetical. Nothing on this account should be treated as an offer to sell any security or any other asset by Aztec Labs or its affiliates, and you should not rely on any content or comments for advice of any kind, including legal, investment, financial, tax, or other professional advice.
