
Conversation

zelig
Member

@zelig zelig commented Apr 18, 2025

UPDATE: draft version ready
solidity code is generated and appended to the swip

seriously work in progress

a better solution than variable stakes for the one-operator-one-node-per-neighbourhood problem

  • a one-off flat fee (per playing neighbourhood) will massively simplify the stake management UI of the redistribution game
  • much stronger guarantee for local redundancy
  • much stronger guarantee for load balancing, aka smart neighborhood management

@zelig zelig self-assigned this Apr 18, 2025
@zelig zelig added the improvement (enhancement of an existing protocol/strategy/convention) and protocol (describes a process every swarm node must implement and adhere to) labels Apr 18, 2025
@zelig zelig changed the title Add SWIP-39: smart neighbourhood management Add SWIP-39: (placeholder, WIP) smart neighbourhood management Apr 18, 2025
@zelig zelig marked this pull request as ready for review July 21, 2025 07:55

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces SWIP-39, a smart neighbourhood management system for decentralized service networks. The proposal aims to solve the "one operator, one node in a neighbourhood" problem through a balanced assignment mechanism that ensures fair load distribution and prevents Sybil attacks.

Key changes include:

  • A comprehensive specification for balanced neighbourhood registry with random assignment
  • Smart contract implementation for managing node registration and neighbourhood assignments
  • Mathematical formulations for neighbourhood depth calculation and overlay address validation
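For orientation, a hedged reading of the depth rule, inferred from the register/deregister walk-throughs later in this thread rather than stated in this summary: the registry depth appears to be kept at the smallest value for which the current node count $N$ still leaves at least one neighbourhood free,

$$d \;=\; \min\{\, k \ge 0 : N < 2^{k} \,\} \;=\; \lfloor \log_2 N \rfloor + 1 \quad (N \ge 1),$$

which is consistent with the invariant $\mathrm{len}(R) = 2^{d} - N$ discussed below.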

@zelig zelig changed the title Add SWIP-39: (placeholder, WIP) smart neighbourhood management Add SWIP-39: balanced neighbourhood registry aka smart neighbourhood management Jul 21, 2025
@0xCardiE
Collaborator

0xCardiE commented Jul 25, 2025

Thanks for the well-thought-out SWIP — the design is elegant and clearly addresses Sybil resistance and balanced assignment. I had a few questions and points I’d like to discuss for further clarity and robustness:

  • Overlay mining and gaming – how is the off-chain mining step protected against precomputation attacks or delay tactics to gain strategic placement?
  • Expired stake handling – have you considered slashing (partial or full) instead of just locking the stake on expiry?

@0xCardiE
Collaborator

Could _upgradeDepth() become too expensive to execute as the number of assigned nodes grows?

The _upgradeDepth() function doubles the assignment and remaining lists, copies all existing nodes to their new positions using bitwise logic, and clears/rebuilds state — all in a single call. If the number of nodes reaches high volumes (e.g. 1,000+ or 10,000+), this could approach or exceed the block gas limit, making the function fail or stall the system.

Proposed solutions (see the sketch below):
• Consider breaking _upgradeDepth() into incremental steps, e.g. by:
  • using checkpointing to process part of the upgrade each time it is triggered;
  • spreading the reassignment of existing entries across multiple assign() calls.
• Alternatively, lazy-initialize entries in the assign() function, only when a neighbourhood is actually being filled.
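
A minimal sketch of what the checkpointed variant could look like (purely illustrative; the contract name, upgradeCursor, STEP and the slot layout are assumptions, not taken from the SWIP's appended code). Each assign() call would first advance the migration by a bounded number of slots, so no single transaction has to copy the whole table:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Illustrative sketch only: a depth upgrade spread over many calls.
// upgradeCursor remembers how far the copy has progressed; each call to
// _advanceUpgrade() migrates at most STEP old-depth slots.
contract CheckpointedDepthUpgrade {
    uint8 public depth;                 // current number of prefix bits
    bool public upgrading;              // true while a migration is in flight
    uint256 public upgradeCursor;       // next old-depth slot to migrate
    uint256 public constant STEP = 64;  // bounded work per call

    mapping(uint256 => bytes32) public slots;     // old-depth slot => node data
    mapping(uint256 => bytes32) public nextSlots; // new-depth table being built

    function _startUpgrade() internal {
        require(!upgrading, "upgrade in flight");
        upgrading = true;
        upgradeCursor = 0;
    }

    // Intended to be called at the top of assign(); does a bounded amount of work.
    function _advanceUpgrade() internal {
        if (!upgrading) return;
        uint256 oldSize = uint256(1) << depth;
        uint256 end = upgradeCursor + STEP;
        if (end > oldSize) end = oldSize;
        for (uint256 i = upgradeCursor; i < end; i++) {
            // an occupied old slot keeps its prefix and gains one trailing zero bit;
            // its new sibling slot (i << 1 | 1) starts out free
            nextSlots[i << 1] = slots[i];
        }
        upgradeCursor = end;
        if (end == oldSize) {
            depth += 1;
            upgrading = false;
            // a full implementation would now switch reads over to nextSlots
            // (e.g. via a generation counter) rather than copying back
        }
    }
}
```

A real version would also have to handle registrations that arrive mid-upgrade, e.g. by routing assign() through the not-yet-migrated part of the table until the cursor has passed the relevant slot.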

@0xCardiE
Collaborator

Can the committers array become inefficient with a large number of registrants (e.g. 20k nodes)?

The committers[] array is iterated over in _expire(), _findEntryFor(), and _removeCommitter() using for loops. If a large number of nodes register, or if expired entries are not promptly cleared, the gas cost of these operations can grow linearly and become prohibitively expensive.

Proposed solutions:
• Maintain a mapping from address to index to allow constant-time lookup and removal (sketched below).
• Use a circular buffer or a “head” pointer to avoid shifting array entries when removing expired ones.
• Alternatively, mark entries as expired/inactive with a boolean flag instead of removing them from the array immediately.
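
For the first option, a minimal sketch of the usual swap-and-pop pattern (illustrative only, not the SWIP's code): the mapping stores index + 1 so that zero can mean "absent", and removal moves the last array entry into the freed slot, so only that single entry ever needs reindexing.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical sketch of mapping-based bookkeeping for the committers list:
// committerIndex gives constant-time lookup, and removal swaps the last entry
// into the freed slot ("swap and pop"), so no global reindexing is needed.
contract CommitterIndex {
    address[] public committers;
    // stored as index + 1 so that 0 can mean "not present"
    mapping(address => uint256) private committerIndex;

    function _addCommitter(address who) internal {
        require(committerIndex[who] == 0, "already registered");
        committers.push(who);
        committerIndex[who] = committers.length; // index + 1
    }

    function _removeCommitter(address who) internal {
        uint256 idxPlusOne = committerIndex[who];
        require(idxPlusOne != 0, "not registered");
        uint256 idx = idxPlusOne - 1;
        uint256 lastIdx = committers.length - 1;
        if (idx != lastIdx) {
            address moved = committers[lastIdx];
            committers[idx] = moved;          // move the last entry into the hole
            committerIndex[moved] = idx + 1;  // only this one entry is reindexed
        }
        committers.pop();
        delete committerIndex[who];
    }

    function _isCommitter(address who) internal view returns (bool) {
        return committerIndex[who] != 0;
    }
}
```

Note that this does not preserve FIFO order in the array, which matters if the order of committers is semantically meaningful.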

@zelig
Member Author

zelig commented Jul 26, 2025

Can the committers array become inefficient with a large number of registrants (e.g. 20k nodes)?

I did not consider it realistic, since each registrant entry expires in at most 256 blocks, i.e. in fewer than 4 game rounds, and registrants lose their deposit if they refuse to pay, so most potential players will likely just wait it out organically.

The committers[] array is iterated over in _expire(), _findEntryFor(), and _removeCommitter() using for loops. If a large number of nodes register, or if expired entries are not promptly cleared, the gas cost of these operations can grow linearly and become prohibitively expensive.

Proposed solutions: • Maintain a mapping from address to index to allow constant-time lookup and removal. • Use a circular buffer or a “head” pointer to avoid shifting array entries when removing expired ones. • Alternatively, mark entries as expired/inactive with a boolean flag instead of removing them from the array immediately.

but they need to be removed at some point... and I am not sure how a mapping that needs to be reindexed after every removed entry would solve this.

@zelig
Member Author

zelig commented Jul 27, 2025

Can the committers array become inefficient with a large number of registrants (e.g. 20k nodes)?

well, maybe. To be honest, there is also another way. We do not need to allow just any length for the committer list. The length represents the queue, and the valid entries in it are the ones in the queue you can skip. This effectively quantifies the tries you get (effectively mining), but also the realistic probability that someone will come in and change the neighbourhood you (thought you were) assigned to. If this probability is high (there are a lot of nodes that can submit mined overlays), then it can easily happen that whenever an assigned neighbourhood is read off, nodes will frontrun. So it would just make sense to limit this skip queue to a fixed constant length, which means that the committers list should effectively be bounded. Now, if we simply reject registrations beyond this limit, then the shorter this length, the harder it is for the same number of currently aspiring nodes to commit. In order to avoid the registration tx having to be continuously retried (since it would most likely be frontrun by competing registrants), we should introduce another proper FIFO queue (unlimited, but not requiring iteration). In this case the validity period starts when you enter the limited queue.
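
A rough sketch of this two-queue shape (names and bounds such as MAX_ACTIVE and VALIDITY are illustrative assumptions, not part of the SWIP): the waiting queue is driven purely by head/tail counters so it never needs iteration, and the validity window only starts once an entry is promoted into the bounded skip queue.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical sketch: an unlimited FIFO waiting queue feeding a fixed-size
// "skip" queue of active committers whose validity only starts on promotion.
contract TwoQueueRegistry {
    uint256 public constant MAX_ACTIVE = 16;  // illustrative skip-queue bound
    uint256 public constant VALIDITY = 256;   // blocks an active entry stays valid

    struct Entry {
        address owner;
        uint256 validFrom; // set when promoted into the active queue
    }

    // unlimited FIFO: push at tail, pop at head, never iterate
    mapping(uint256 => address) public waiting;
    uint256 public head;
    uint256 public tail;

    Entry[] public active; // bounded by MAX_ACTIVE

    function enqueue() external {
        waiting[tail] = msg.sender;
        tail += 1;
        _promote();
    }

    // Promote waiting registrants into free active slots.
    function _promote() internal {
        _expireActive();
        while (active.length < MAX_ACTIVE && head < tail) {
            address next = waiting[head];
            delete waiting[head];
            head += 1;
            active.push(Entry({owner: next, validFrom: block.number}));
        }
    }

    // Drop active entries whose validity window has passed; swap-and-pop is
    // acceptable here because the active queue is small and bounded.
    function _expireActive() internal {
        uint256 i = 0;
        while (i < active.length) {
            if (block.number > active[i].validFrom + VALIDITY) {
                active[i] = active[active.length - 1];
                active.pop();
            } else {
                i += 1;
            }
        }
    }
}
```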

Proposed solutions: • Maintain a mapping from address to index to allow constant-time lookup and removal. • Use a circular buffer or a “head” pointer to avoid shifting array entries when removing expired ones. • Alternatively, mark entries as expired/inactive with a boolean flag instead of removing them from the array immediately.

Not sure I get how these structures would be useful: index needs reindexing or keeps inactive entries, head pointer just delays the problem and so does the inactive flag.

@significance
Member

Thanks for the well-thought-out SWIP — the design is elegant and clearly addresses Sybil resistance and balanced assignment. I had a few questions and points I’d like to discuss for further clarity and robustness:

  • Overlay mining and gaming – how is the off-chain mining step protected against precomputation attacks or delay tactics to gain strategic placement?
  • Expired stake handling – have you considered slashing (partial or full) instead of just locking the stake on expiry?

i. the mining step is just offloading computation rather than PoW; strategic placement is prevented by random allocation, with economic disincentives to be quantified forthwith
ii. in this model the stake as discussed, if always burned, could otherwise be seen as a fee, which should perhaps be considered for recirculation to rewards and/or burning

@significance
Member

Could _upgradeDepth() become too expensive to execute as the number of assigned nodes grows?

The _upgradeDepth() function doubles the assignment and remaining lists, copies all existing nodes to their new positions using bitwise logic, and clears/rebuilds state — all in a single call. If the number of nodes reaches high volumes (e.g. 1,000+ or 10,000+), this could approach or exceed the block gas limit, making the function fail or stall the system.

Proposed solutions: • Consider breaking _upgradeDepth() into incremental steps, e.g., by: • Using checkpointing to process part of the upgrade each time it is triggered. • Spreading the reassignment of existing entries across multiple assign() calls. • Alternatively, lazy-initialize entries in the assign() function, only when a neighbourhood is actually being filled.

agree with this; there has been some discussion around implementing a binary trie or similar data structure, which would ensure uniform gas usage while providing the necessary functionality

@significance
Member

very good swip, a few thoughts for discussion and expansion in the doc:

  1. it is important to ensure currency of nodes; methods must be added that record recent activity and penalise nodes not taking part, at least by removal from the allocated nodes pool, thus preventing squat attacks 🏋️
  2. a thorough quantification and parameterisation of economic disincentives should be performed, including costs arising from capital illiquidity
  3. a withdrawal queue or delay should be considered to improve network integrity promises
  4. likewise, a node onboarding process, which could perhaps include proof of well-behaved protocol adherence prior to admission into the rewards pool
  5. it would improve ui/security to provide a facility to decouple keys: a network key and a withdrawal address similar to eth, and maybe also a nominated admin address for the web ui
  6. a rigorous approach to nomenclature at this point would be prudent, given that the tree depths discussed here are distinct from those in the storage network itself

@awmacpherson

Andrew from Shtuka here. This proposal looks like it will have considerable impact on staking dynamics, so it's important for our work on "Swarmonomics" to understand what the economic impact would be and give feedback if it gives cause for concern, or if the proposal lacks details that we would need to make that determination.

I came into this assuming that the goal of this SWIP was to help the node population achieve a uniform distribution across neighbourhoods at the depth reported in the Redistribution game. Let's call this number $d_R$. But the depth parameter described here — I'll call it $d_{39}$ — has no relation to $d_R$. So my next best assumption is that the goal is to achieve a node population uniformly distributed among depth $d_{39}$ neighbourhoods.

I am having difficulty understanding how deregistrations are handled, especially when a deregistration would result in a decrease in $d_{39}$.

  1. Is "level" and "depth" the same thing in this document?
  2. What is an "unassigned" neighbourhood? It sounds like it means a depth $d_{39}$ neighbourhood that has no registered nodes. If so, do we need to also keep track of depth $d_{39}$ neighbourhoods with more than one node registered?
  3. If I assume "level" means the same thing as $d_{39}$, $R$ tracks unassigned (empty?) neighbourhoods of depth $d_{39}$. So I guess $R=R_{d_{39}}$. I don't believe the claimed invariant $\mathrm{len}(R) = 2^{d_{39}} - N$ is maintained if deregistrations are allowed. It is possible to deregister a node, hence decrement $N$, without causing a neighbourhood to become empty or changing $d_{39}$ (see below).
  4. Is the desired property that node address assignments are always balanced (i.e. being "balanced" is an invariant property of the address table) or that they "converge" towards being balanced? If the latter, how quickly must they converge?
  5. Is $d_{39}$ actually supposed to be allowed to decrease? The implementation attached to the proposal doesn't include any function that decrements d.

Suppose we start at $d=0$ (no nodes). How is the following sequence of updates handled?

# d = 0
# nodes = []
REGISTER   0b01...
# d++ (d=1)
REGISTER   0b11...
# d++ (d=2)
# R ?= [0b10, 0b00]
DEREGISTER 0b11...
# decrement d? 
# if d=1 next registration must start with 0b1
# leave d the same?
# if d=2, next registration could start with 0b0, leaving 0b1 bucket empty

Here it seems to me that allowing $d$ to decrease is closest to the intention of the proposal.

What about this one?

# d = 3
# nodes = [0b000..., 0b010..., 0b100..., 0b110...]
REGISTER   0b001...
DEREGISTER 0b110...
DEREGISTER 0b100...
# d = 2
# nodes = [0b000..., 0b001..., 0b010...]
# depth d neighbourhood 0b00 now has 2 nodes
DEREGISTER 0b001...
# N decreases without another depth 2 neighbourhood becoming empty

In this case, the scheme of assigning only empty depth $d_{39}$ neighbourhoods can't do anything to incentivise 0b000... or 0b001 to "balance" their distribution by migrating out of the overpopulated 0b00 neighbourhood. This brings us back to the question of clarifying what "balanced" means.

Migration

Upon migration, should nodes get to keep their existing overlay addresses or do they need to mine new ones according to the new assignment scheme? If the latter, what happens to all the data currently stored in the reserves of staking nodes?

Gameability

Clearly, if the automatic randomised assignment of neighbourhoods is meant to control how many nodes end up in each neighbourhood, we need to consider whether it might be gamed by operators who have preferences about their address that differ from what they would be assigned randomly. There are a few reasons that mining a prefix may be desirable:

  • Some $d_R$-neighbourhoods have less active stake than others. The operator can make more profit by deploying a node into such neighbourhoods.
  • The operator may wish to deploy multiple nodes into the same $d_R$-neighbourhood and save money by deduplicating backend storage, making the neighbourhood look like it has more redundancy than it really does.

If mining prefixes is valuable, it may be necessary to make it more costly by adding more financial or time costs to the entry or exit procedure.

If stake is instantly withdrawable, the cost of "rerolling" one's neighbourhood assignment as described in this proposal is a couple of transactions and a 1 block wait (for obtaining randomness). By itself, this is not very costly. For example, rerolling until a desired 10 bit prefix is achieved, provided it is actually in the pool of assignable neighbourhoods, would most likely take just a few hours and cost a negligible amount in gas. If stake is not withdrawable, the minimum stake deposit is added to the cost of rerolling.
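
As a rough back-of-the-envelope check on the "few hours" figure, assuming one reroll attempt per block, a block time of 5 to 12 seconds, and uniform assignment (all simplifying assumptions; the actual draw is from the pool of unassigned neighbourhoods):

$$\mathbb{E}[\text{rerolls}] \approx 2^{10} = 1024, \qquad 1024 \times 5\,\mathrm{s} \approx 1.4\,\mathrm{h}, \qquad 1024 \times 12\,\mathrm{s} \approx 3.4\,\mathrm{h}.$$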

Allocating addresses from a restricted range may lead to unexpected dynamics. For example, the set of assignable neighbourhoods is smaller when $N$ is nearly a power of 2. Operators can therefore forecast their allocations with greater accuracy at such times, which may make rerolling worthwhile even if it is costly. This proposal therefore adds a significant timing element to mining prefixes. It may also incentivise "squatting" behaviours, as @significance pointed out.

Relation to other proposals

In the Migration section of the proposal it is mentioned that some simplifications to the staking contract should occur before implementing this SWIP. I'm not sure which simplifications you mean other than fixed stake, but I'll comment on the latter.

  • In my view, there is no logical dependency between this proposal and fixed stake limits. It is true that, under the current elective stake scheme, random neighbourhood assignment could make it hard to identify the optimal stake deposit before registration. But if stake balance can be adjusted after assignment, the operator can compute the optimal amount and adjust after the randomness is drawn.
  • Fixed stake limits does imply that the $d_R$-neighbourhoods with less active stake also have a lower node count. If $d_R < d_{39}$ (which should normally be the case unless something is very wrong), the proposed scheme makes it more likely that new nodes go into underpopulated $d_R$-neighbourhoods, but not certain, so the incentive to reroll remains. In an elective stake system like we have now, there is no general relation between the stake balance of a $d_R$-neighbourhood and the probability of being assigned to it.
  • If the stake system is updated to allow withdrawals unconditionally, then introducing a withdrawal delay/queue provides a controlled way to throttle prefix mining. If stake continues to not be withdrawable, then the minimum deposit is itself a cost of rerolling (and at the current level of 10 BZZ, probably plenty to make rerolling for a deduplication attack not worth it). Mixed schemes where exiting costs money and time may also be worth exploring.
  • Updates that would make the system simpler would indeed be welcome before introducing new complications.

Recommendations

  1. Clarify what property the proposal is intended to achieve. In particular, is the desired property an invariant or a dynamical property? Ideally, we should be able to prove that the proposal achieves the property (under idealised conditions, if the property is dynamical).
  2. Rather than a full implementation example, append a set of Solidity tests that should pass on a successful implementation.
  3. If the cost of rerolls is configured suitably, the commit-reveal scheme could be made to implement any address assignment system. Consider whether we could achieve a good enough balancing effect with a different scheme that is easier to model, such as assigning overlay addresses uniformly at random. (Queues and other stateful systems are hard to model when rational agents are involved.)

@significance
Member

significance commented Aug 10, 2025

hi @awmacpherson thanks for this, very detailed and insightful

not comprehensive but some responses for you

  • depth has various levels just as in the physical property, so to some extent interchangeable yes
  • de/re registration in active discussion and we would welcome any input you have on this and thank you for what is here
  • fwiw i agree there are undesirable opportunities arising for the $(N-k)$-th node for small $k\in\mathbb{Z}_+$
  • as positions in the tree become available after deregistration, those causing the greatest imbalance could be weighted most heavily in the randomised allocation; it is important to find the right tradeoff here, and this is maybe the crux of the problem. an easy solution could be achieved by simply picking at random from the best $n$ addresses to achieve growth and uniformity, where $P = 1/n$ for some desired probability (composed of the deregistrations and next logical)
  • variable stake is definitely worth digging into in depth, in particular whether it should be directly proportional to storage space provision or not, as a somewhat orthogonal enquiry. stake per unit of storage with withdrawal of superfluous stake seems like a good solution either way
  • iiuc it will need to be dynamical

@significance significance changed the title Add SWIP-39: balanced neighbourhood registry aka smart neighbourhood management add SWIP-39: balanced neighbourhood registry aka smart neighbourhood management Aug 10, 2025
@awmacpherson

awmacpherson commented Aug 11, 2025

depth has various levels just as in the physical property, so to some extent interchangeable yes

If this is a response to whether "depth" and "level" are interchangeable, then I'm afraid I don't understand. To what extent are they interchangeable? What does "has various levels" mean?

as positions in the tree become available after deregistration, those causing the greatest imbalance could be weighted most heavily in the randomised allocation; it is important to find the right tradeoff here, and this is maybe the crux of the problem. an easy solution could be achieved by simply picking at random from the best $n$ addresses to achieve growth and uniformity, where $P = 1/n$ for some desired probability (composed of the deregistrations and next logical)

Here "the tree" means the tree of all bitstrings (of length <= 256)? What does it mean for a position in the tree to become available? The way the proposal is written suggests that only bitstrings of length $d_{39}$, where $d_{39}$ is globally defined at any one time, are assigned at any one time. Is the intention actually that nodes corresponding to prefixes of different lengths can be assigned at the same time?

Here is my attempt to make sense of this: given a set $S$ of overlay addresses, each address $a$ has a shortest prefix $p(a)$ not shared by any other address in the set. Take the subtree $T(S)$ of the tree of all bitstrings spanned by the set of prefixes $p(a)$. Then it makes sense to ask if $T(S)$ is balanced as a binary tree. It sounds as though this is the type of "balancing" you might be after. One can then cook up a metric measuring how far $T(S)$ is from being balanced and always prefer to assign addresses that reduce this distance.
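
Purely as an illustration of such a metric (my own example, not something proposed in the thread), one could take the spread of prefix lengths over $T(S)$:

$$\mathrm{imb}(S) \;=\; \max_{a \in S} \ell\big(p(a)\big) \;-\; \min_{a \in S} \ell\big(p(a)\big),$$

with $\mathrm{imb}(S) \le 1$ roughly corresponding to the usual notion of a balanced binary tree; the assignment rule would then prefer addresses whose insertion does not increase this value.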

Note that assigning addresses uniformly at random already has a weak version of this property, which is that if $x$ and $y$ are leaves and $\ell(x) < \ell(y)$, then $x$ is more likely to be assigned than $y$ because it corresponds to a larger address block.
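
To spell out the weak property: a uniformly random address falls into the block below a leaf $x$ with probability $2^{-\ell(x)}$, so

$$\ell(x) < \ell(y) \;\implies\; \Pr[\text{assigned under } x] = 2^{-\ell(x)} \;>\; 2^{-\ell(y)} = \Pr[\text{assigned under } y].$$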

@0xCardiE
Collaborator

[image attachment]

Made this, might be useful to add it to the SWIP, if it's correct :) let me know if anything needs to be changed or added

@0xCardiE
Collaborator

A note on semantics: the term “Ether address” is used in the SWIP multiple times, but it is technically incorrect; it should be “Ethereum address”, since Ether is the currency and the address belongs to the network, not to Ether.

@significance
Member

significance commented Sep 16, 2025

Here is my attempt to make sense of this: given a set $S$ of overlay addresses, each address $a$ has a shortest prefix $p(a)$ not shared by any other address in the set. Take the subtree $T(S)$ of the tree of all bitstrings spanned by the set of prefixes $p(a)$. Then it makes sense to ask if $T(S)$ is balanced as a binary tree. It sounds as though this is the type of "balancing" you might be after. One can then cook up a metric measuring how far $T(S)$ is from being balanced and always prefer to assign addresses that reduce this distance.

Note that assigning addresses uniformly at random already has a weak version of this property, which is that if $x$ and $y$ are leaves and $\ell(x) < \ell(y)$, then $x$ is more likely to be assigned than $y$ because it corresponds to a larger address block.

this is correct i believe. for the second part: yes, but i think it is too weak and that we must pursue an onboarding/off-boarding queue approach

cc: @zelig 👁️
