Survive Protocol
This document describes the election protocol survive. It is intended to select a single active instance from a cluster of instances with fast convergence if the active instance fails. All members of a cluster are well-known and must be able to reach each other bidirectionally to prevent the cluster from splitting up.
Terms and Definitions
| Term | Definition |
|---|---|
| FSM | finite-state machine |
| check | a condition that is continuously tested on a node |
| cluster | a group of instances |
| instance | a survive FSM running on a node |
| node | a host with instances of independant clusters |
Design
Survive is designed to ensure that a single active instance is selected from a cluster of instances. As there is no quorum-like mechanism used it is acceptable that a cluster of instances may partition into multiple clusters, each cluster electing an active instance.
In the worst case a cluster of n instances can partition into n clusters, each with a single active instance. An application which relies on the active instance selection must be able to handle such situations gracefully or implement additional locking.
Overview
A cluster of instances acting under a survice protocol instance are identified by:
- an unique instance ID
(uint64) - a PSK used for HMAC-based authentication
- reachability information of instances in a cluster
(IP addresses and port)
The instances needs to comply the following assumptions:
- they may vanish at any time (i.e. due to node failures)
- they will not make Byzantine failures
- they can communicate with each other over IP, although the communication may not be always reliable
State Engine
A survice instance implements the following FSM:
INIT-
This is the initial state when a survive protocol instance starts.
Possible transitions to:
FAILED: when any critical check failsSTANDBY: when all critical check passes
STANDBY-
In this state the instance will participate in cluster communitcation and is ready to take over if the active instance fails.
Possible transitions to:
FAILED: when any critical check failsACTIVE: when this instance is elected
ACTIVE-
In this state the instance is the active instance of the cluster.
Possible transitions to:
FAILED: when any critical check failsSTANDBY: when this instance is preempted by another instance.
FAILED-
This state is reached when a critical check does not pass or the instance is going to shutdown. The instances on other nodes get a notification that the instance has left the cluster for faster convergence time.
Possible transitions to:
INIT: when all critical checks have been passed- termination: during instance shutdown