Self-stabilising Priority-Based Multi-Leader Election and Network Partitioning

Danilo Pianini, Roberto Casadei, Mirko Viroli

Leader election

  • Break symmetry in peer networks
    • Base block to create algorithms that require an “initial point”
  • Send (summarize) data to a sink
  • Find a coordinator for the whole network
  • Find a single access point for the whole network

Multi-leader election

  • Base block to build multi-scale network overlays
  • Send (summarize) data to a local sink
    • e.g., for later global summarization
  • Find a coordinator for the local portion of the network
  • Intercept phenomena whose scale is neither device-local nor global
    • e.g., non-spotty heavy rain may cause flooding

Priority-based leader election

  • Select leaders with a specified criterion
    • Assign demanding tasks to more performant nodes
    • Prefer static nodes over mobile ones for network stability
    • Coordinate activities in nodes with a higher data rate
    • Balance energy consumption by moving activities to high-battery nodes
  • The leader set can change dynamically
    • Enabling group tracking of underlying spatio-temporal phenomena

Self-stabilisation

Ability to converge:

  • to correct states
  • in finite time
  • from any arbitrary initial state

hence, in spite of transitory changes (resiliently)


“We call the system ‘self-stabilizing’ if and only if, regardless of the initial state and regardless of the privilege selected each time for the next move, at least one privilege will always be present and the system is guaranteed to find itself in a legitimate state after a finite number of moves.”

E. W. Dijkstra, “Self-stabilizing systems in spite of distributed control”, 1974

Example of a non-self-stabilising leader election

Despite no changes to the algorithm input, the output oscillates

Self-stabilising Priority-Based Multi-Leader Election and Network Partitioning

  • Inputs can change: the leader set adapts in finite time
    • As long as the inputs are self-stabilising expressions
  • Same inputs $\Rightarrow$ Same output: reproducible results
  • Convergence to a stable state guaranteed in finite time
  • No formal guarantees on the transient

What is out there

  • In J. Beal and M. Viroli, Building blocks for aggregate programming of self-organising applications, FoCAS 2014
    • “Sparse spatial choice” or just S, designed for multi-leader election
    • Based on expanding areas of influence and competition at the border
    • Likely self-stabilising, but never proven
  • In Y. Mo, J. Beal, and S. Dasgupta, An aggregate computing approach to self-stabilizing leader election, eCAS 2018
    • Focused on single leader election
    • Relies on network diameter estimation feedback loops
    • Self-stabilising (combines self-stabilising blocks in a self-stabilising manner)
  • In Y. Mo, G. Audrito, S. Dasgupta, and J. Beal, A resilient leader election algorithm using aggregate computing blocks, to appear
    • Extension of the former
    • Tunable for multi-leader election, but with a partition size proportional to the network diameter
    • Still relies on diameter estimation, collection, and propagation

Goals

  • Multi-leader election
  • Explicit priority control
  • Explicit partition size
  • Adaptable distance metric
    • Must work in any metric space, regardless of whether it maps a physical space or not
  • Possibly, no feedback loops
    • Diffusing information is more robust than accumulating it
    • Accumulation + diffusion weighs on performance and robustness

Trivial idea

  1. Make yourself a candidate leader
  2. Select the best leader including information from neighbors (if any)
  3. Gossip the information on the best leader to all neighbors
  4. If the best leader is too far away, re-run the algorithm ignoring nodes managed by the best leader
    • Basically, remove the first partition from the domain and repeat (see the sketch below)
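
To make the recursion concrete, here is a minimal centralised sketch in Python. The hop-count metric, the networkx dependency, and all names are assumptions of this illustration, not part of the original formulation, which runs distributedly via gossip (see the aggregate-computing implementation later in this deck).

# Hypothetical, centralised preview of the recursive scheme: pick the best
# candidate, carve out its partition, and re-run on the remaining domain.
import networkx as nx

def recursive_election(g: nx.Graph, priority: dict, radius: int) -> dict:
    leaders = {}
    remaining = set(g.nodes)
    while remaining:
        sub = g.subgraph(remaining)
        # Best candidacy: highest priority, ties broken by smallest id
        best = min(sub.nodes, key=lambda n: (-priority[n], n))
        dist = nx.single_source_shortest_path_length(sub, best)
        served = {n for n, d in dist.items() if d <= radius}
        for n in served:
            leaders[n] = best  # these nodes join the best known leader
        remaining -= served    # step 4: drop the first partition, repeat
    return leaders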

Trivial idea running example, I

The network begins the computation

Trivial idea running example, II

Every device knows only itself: each assumes leadership and informs its neighbors

Trivial idea running example, III

Gossiping of the best leader information induces the creation of partitions

Trivial idea running example, IV

Regions form, but the spreading now reaches the maximum allowed extent (2 hops)

Trivial idea running example, V

Nodes that realise they cannot join the best leader they know of restart the algorithm

Trivial idea running example, VI

The “second best” leader search wave moves across the network

Trivial idea running example, VII

Previous areas get disrupted!

Trivial idea running example, VIII

New regions begin to form again in parts of the network that can’t be controlled by the best node

Trivial idea running example, IX

All nodes joined the “second wave” of gossiping…

Trivial idea running example, X

…which does not cover the whole network, so a third restart happens

Trivial idea running example, XI

All nodes are now either stable or in their last recursion of the algorithm

Trivial idea running example, XII

Stable nodes do not propagate information on unstable partitions, slowing down convergence

Trivial idea running example, XIII

In one step, stability will be reached

Trivial idea running example, XIV

Done

Trivial idea

Pros

  • Works
  • Uses solely diffusion, with no accumulation
  • Can be implemented very succinctly using frameworks such as aggregate computing
def leaderElection(id, priority, radius, metric) {
  -- Gossip [priority, id] pairs, keep the minimum, extract the id (steps 1-3)
  let best = gossip([priority, id], min).get(1)
  if (distanceToWithMetric(best == id, metric) <= radius) {
    best -- we can be managed by the best known leader
  } else {
    -- Branching restricts the domain: the recursive call runs only among
    -- the nodes that cannot be managed by the best leader (step 4)
    leaderElection(id, priority, radius, metric)
  }
}

Cons

  • It is not self-stabilising for many implementations (including the one above)
    • Gossip is not self-stabilising
    • It can be made self-stabilising by running overlapping replicates
  • Discards useful information when re-stabilising!
    • We already had information on the second-best leader, yet we re-computed it…

Bounded Election

Instead of recursively gossiping, parallelise the computation and compete at the border

  1. Every device produces its opinion on the best leader, which includes:
    1. the leader’s priority;
    2. the distance to the leader;
    3. the leader’s id
  2. At the beginning, every device nominates itself as a candidate
  3. Every device $d$ observes the best candidates in its neighbourhood, discarding those that promote $d$ itself and those whose proposed leader is too far away.
  4. The new best leader is the best between itself and the best (if any) of those acquired from the neighbourhood.
  5. The best leader can be communicated to the neighbours.
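
As a concrete reading of the steps above, here is a minimal single-device sketch in Python. The triple encoding, the nbr_candidacies structure (each neighbour's last shared candidacy, paired with the estimated distance to that neighbour), and all names are illustrative assumptions, not the paper's API.

# One Bounded Election round on a single device. Candidacies are
# (-priority, distance, leader_id) triples compared lexicographically,
# so lower is better: higher priority first, then shorter distance.
def bounded_election_round(my_id, my_priority, radius, nbr_candidacies):
    local = (-my_priority, 0.0, my_id)  # step 2: self-candidacy
    best = local
    for candidacy, edge_len in nbr_candidacies.values():
        value, distance, lead = candidacy
        lifted = (value, distance + edge_len, lead)
        # Step 3: discard candidacies promoting this very device and
        # those whose proposed leader would end up too far away.
        if lead != my_id and lifted[1] <= radius:
            best = min(best, lifted)  # step 4: best of local and neighbours
    return best  # step 5: shared with the neighbours at the next round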

Bounded Election running, I

The network begins the computation

Bounded Election running, II

Every device knows only itself: each assumes leadership and informs its neighbors

Bounded Election running, III

Partitions begin to form; several nodes remain in an inconsistent state (gray-filled)

Bounded Election running, IV

The first two partitions are formed! 25/32 nodes reached stability in two rounds

Bounded Election running, V

Stabilizing the outskirts of the network takes longer, especially if previous information is useless

Bounded Election running, VI

In unlucky cases, several nodes may reset when their former leader gets absorbed into another partition

Bounded Election running, VII

In the worst case, a partition restabilises in the time required for the new leader to reach the farthest node

Bounded Election running, VIII

The network is thus stable in eight rounds (versus the 14 of the recursive version)

Bounded Election

Pros

  • Works
  • Uses solely diffusion, with no accumulation
  • Can be implemented succinctly using frameworks such as aggregate computing
  • Stabilises multiple partitions in parallel
  • Supports prioritisation and control of the distance metric
  • It is proven to be self-stabilising (more in a moment)

Cons

  • As with the recursive version, there are pathological cases
    • Can be concocted in simulation
    • The base idea is to place high priority nodes inconveniently, and let the second-highest priority node move continuously across the border of the highest-priority partition, triggering continuous re-adaptation in the whole network

Self-stabilisation of Bounded Election

Proof structure:

  1. Pick a demonstrably self-stabilising fragment $\Rightarrow$ the minimising share
  2. Re-write parts of it in a way that preserves self-stabilisation
  3. Repeat until you get to an implementation of Bounded Election

Proof

The minimising share pattern

We start from a proven self-stabilising fragment:

share(x <- e) { fR(minHoodLoc(fMP(x, s̄), e), x)}

Where x is a neighbour field, namely a mapping from each device in the neighbourhood of the executing device to some value.

Rewrite: identity for fR

  • fR(x, p) is a raising function w.r.t. the partial orders of x and p, where p is the previous value of x
share(x <- e) { fR(minHoodLoc(fMP(x, s̄), e), x)}

$\big\Downarrow$

share(x <- e) { minHoodLoc(fMP(x, s̄), e) }

Rewrite: local candidacy for e

  • e is any self-stabilising expression
  • We rewrite it with a triple representing the local candidacy, where:
    • value is the local priority
    • distance is the distance from the leader
    • lead is the unique identifier of the local node
  • Moreover, we assume in our scope:
    • a priority value representing the leader strength
    • an id value representing a unique identifier for the local node
share(x <- e) { minHoodLoc(fMP(x, s̄), e) }

$\big\Downarrow$

local <- (value: -priority, distance: 0, lead: id)
share(x <- local) { minHoodLoc(fMP(x, s̄), local) }

Rewrite: return the id

  • We want to return only the leader id
  • It does not alter the share pattern
local <- (value: -priority, distance: 0, lead: id)
share(x <- local) { minHoodLoc(fMP(x, s̄), local) }

$\big\Downarrow$

local <- (value: -priority, distance: 0, lead: id)
share(x <- local) { minHoodLoc(fMP(x, s̄), local) }.lead

Rewrite: fMP

  • fMP(x, s̄) is a monotonic progressive function of x
  • s̄ are additional arguments that can be passed to the function, as long as they are self-stabilising expressions
  • We assume a metric() function returning a field of distances from each neighbour
  • We assume a maximum partition radius
  • We replace it with map and filter functions that:
    • add the distance to valid neighbor candidacies
    • remove invalid candidacies
  • It cannot return a field holding a value smaller than the smallest value of x, and it is thus a valid replacement
local <- (value: -priority, distance: 0, lead: id)
share(x <- local) { minHoodLoc(fMP(x, s̄), local) }.lead

$\big\Downarrow$

local <- (value: -priority, distance: 0, lead: id)
share(x <- local) {
  minHoodLoc(
    x.map { (it.value, it.distance + metric(), it.lead) }
      .filter { it.distance <= radius && it.lead != local.lead },
    local
  )
}.lead

Minimisation

  • minHoodLoc is a function selecting the minimum among the field values and the provided local value
  • We assume triples to be ordered based on their components
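
As a quick illustration (hypothetical, not part of the proof): Python tuples already compare componentwise, which matches the assumed ordering.

# Lexicographic ordering of candidacy triples, as assumed by minHoodLoc.
local = (-5, 0.0, "d7")                         # (value, distance, lead)
field = [(-7, 3.2, "d2"), (-7, 9.9, "d4"), (-5, 1.0, "d1")]
assert min(field + [local]) == (-7, 3.2, "d2")  # best priority, then closest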

No rewriting is needed. Since:

local <- (value: -priority, distance: 0, lead: id)
share(x <- local) {
  minHoodLoc(
    x.map { (it.value, it.distance + metric(), it.lead) }
      .filter { it.distance <= radius && it.lead != local.lead },
    local
  )
}.lead

is a valid implementation of Bounded Election,

Bounded Election is self-stabilising.

Evaluation

  • Scale-free: devices switch priority at runtime; tests adaptation transients
  • Edge: some devices are static; tests prioritisation effectiveness
  • Random: all devices move randomly; searches for pathological cases

Metric

Measuring adaptation is not so easy

  • We concocted a metric of instability
  • Informally:
    • how many times every node changed its leader over how many times it could have done so, in the last 10 seconds
    • the lower, the better

Formally

  • let $I$ be the set of all possible device UIDs;
  • let $l^d_t \in I$ be the leader selected in device $d$ at round $t$;
  • let $T$ be the current round;
  • for each device $d$, we consider the set $L^d_{T,R} = \{ l^d_{T}, l^d_{T-1}, \dots, l^d_{T-R} \}$ comprising the leaders perceived in a mobile window spanning the last $R+1$ rounds;
  • we then consider the $R$ pairs of consecutive leaders $S^d = \{ (l^d_i,\ l^d_{i+1})\ |\ l^d_i, l^d_{i+1} \in L^d_{T,R} \}$;
  • we define a function over pairs $c: I^2 \to \{0, 1\}$ $$ c((k_1, k_2)) = \begin{cases} 1, & k_1 \neq k_2 \\ 0, & \text{otherwise} \end{cases} $$ that outputs $1$ iff the elements of the pair differ, i.e., iff the leader changed between the two rounds;
  • we then count how many times in the last $R + 1$ rounds the leader changed, normalised over the number of considered pairs: $C^d = \frac{\sum_{x \in S^d} c(x)}{|S^d|}$; we consider this a measure of local instability;
  • finally, we obtain our global instability metric $Y = \frac{\sum_{d \in I} C^d}{|I|}$ by averaging the local contributions.
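
Equivalently, as a sketch in Python, assuming each device's leader history is available as a list of its last $R+1$ leader ids (all names are illustrative):

# Local instability: fraction of consecutive-round pairs with a leader change.
def local_instability(leaders):
    pairs = list(zip(leaders, leaders[1:]))                # S^d
    return sum(l1 != l2 for l1, l2 in pairs) / len(pairs)  # C^d

# Global instability: average of the local contributions.
def global_instability(histories):  # histories: device id -> leader history
    return sum(local_instability(h) for h in histories.values()) / len(histories)  # Y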

Compared algorithms

  • Bounded election (proposed)
  • Baseline: recursive election (recursive)
    • To measure whether, and by how much, parallelisation helps
  • Baseline: sparse-choice (classic)
    • To compare with the state of the art, as it is the closest and the only one using solely propagation (hence, likely the fastest)

Details

Scale-free network

Expectation: Sparse-choice and Bounded Election perform similarly, while the recursive version is worse

  • Bounded Election consistently has the fastest convergence
  • All algorithms self-stabilised
  • The recursive version is very sensitive to changes in the leader priority (using central leaders makes it outperform the classic one), while Bounded Election and Sparse-choice are not

Asymmetric nodes

Expectation: Bounded Election performs best; the recursive version should outperform Sparse-choice or be close (prioritisation of static nodes should compensate for occasional large disruptions)

  • The algorithms behave as predicted
  • Bounded Election performs best, as it can select the stable leaders via prioritisation and recovers from disruptions faster than the recursive version.
  • Sparse-choice’s average performance is similar to the recursive one, but more predictable

Randomly moving nodes

Expectation: Nodes moving randomly favour Sparse-choice, which should perform best as it minimises changes within partitions by competing only at the border.

  • The behaviour is surprising: Bounded Election performs best again
  • Sparse-choice’s inner-partition stability seems to generate instability globally
    • Our current hypothesis is that, although the partition core is more stable when competition happens only at the border, most nodes are in the outskirts and do not benefit from it
    • Probably, with so much randomness, occasional network-wide disruptions are less impactful than continuous border adjustments

Conclusion

Bounded Election is a provably self-stabilising algorithm for multi-leader election that induces network partitioning and supports explicit prioritisation and arbitrary metrics.

  • It performs better than other multi-leader election algorithms in most scenarios
  • Pathological corner cases can be constructed
    • but they seem unlikely to occur in practice

What lies ahead

  • More testing is needed to assess how frequently configurations can induce unwanted behaviour
  • Tackle cases where the behaviour is not ideal
  • Towards a new family of self-stabilising leader election algorithms?