Evolution of Consensus Algorithms

Blockchain technology

  • Concurrency: each computer in the network executes events independently at the same time as other computers in the network.
  • No common physical clock: there is no single global clock that determines the sequence of events happening across all the computers in the network. For a distributed system to work, we need a way to determine the order of events. However, in a set of computers operating concurrently, it is sometimes impossible to say that one of two events occurred first, as computers are spatially separated. So, time and order of events are fundamental obstacles in a system of distributed computers that are spatially separated(Lamport,1978).
  • Independent failure of components: components in a distributed system are faulty. This is why it’s called “fault-tolerant distributed computing. No distributed network is of zero faults. These faults occur in different computers/nodes in a network are independent to each other in a distributed system. Real systems are subject to a number of possible flaws or defects, such as, a process crashing, messages being lost, distorted, or duplicated, a network partition delaying or dropping messages or even a process going completely haywire and sending messages according to some malevolent plan.
  • No shared memory: This is a key feature that requires “message-passing” for communication. This feature implies the absence of the common physical clock. This message passing may between one or more computers in the system. The coordination and communication of the distributed systems are achieved by message passing. Message passing can be done synchronously or asynchronously.
  • Crash-fail: The component stops working without warning (e.g., the computer crashes)( Process failure models can be fail-stop or crash where in fail-stop model other processes can learn that the process has failed)
  • Omission: The component sends a message but it is not received by the other nodes (e.g., the message was dropped)(Process failure models can be receive omission errors, send omission errors or general omission errors)
  • Byzantine: The component behaves arbitrarily. This type of fault is irrelevant in controlled environments (e.g., Google or Amazon data centers) where there is presumably no malicious behavior. Instead, these faults occur in what’s known as an “adversarial context.” Basically, when a decentralized set of independent actors serve as nodes in the network, these actors may choose to act in a “Byzantine” manner. This means they maliciously choose to alter, block, or not send messages at all.

Synchronous message passing

  • There is a known upper bound on the message communication delay.
  • There is a known bounded drift rate for the local clock of each processor with respect to real-time. The drift rate between two clocks is defined as the rate at which their values diverge.
  • There is a known upper bound on the time taken by a process to execute a logical step in the execution.

Asynchronous message passing

  • Some of computers/nodes being faulty.
  • The network is not reliable and messages may fail to deliver, be delayed, or be out of order.
  • There is no global clock to help determine the order of events.

Byzantine agreement problem and consensus problem

1. Byzatine agreement problem

2. Consensus problem

FLP impossibility result.

  1. Using synchrony assumptions.
  2. Use non-determinism.

1. Consensus using synchrony assumptions

Paxos

  1. Proposers
  2. Acceptors
  3. Learners

Condition 1: Acceptors must accept the first proposed value that they receive.

  • The proposer chooses a new proposal version number (n) and sends a “prepare request” to the acceptors.
  • If acceptors receive a prepare request (“prepare,” n) with n greater than that of any prepare request they had already responded to, the acceptors send out (“ack,” n, n’, v’) or (“ack,” n, ^ , ^).
  • Acceptors respond with a promise not to accept any more proposals numbered less than n.
  1. Promise never to accept a proposal less than n (n is the new proposal number)
  2. Respond with the proposal with the highest number less than n that the acceptor has accepted.)
  • Acceptors suggest the value (v) of the highest-number proposal that they have accepted, if any. Or else, they respond with ^.

Condition 2: If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v.

  • If the proposer receives responses from a majority of acceptors, then it can issue an accept request (“accept,” n, v) with number n and value v.
  • n is the number that appeared in the prepare request.
  • v is the value of the highest-numbered proposal among the responses.
  • If the acceptor receives an accept request (“accept,” n, v), it accepts the proposal unless it has already responded to a prepare request with a number greater than n.
  • Whenever an acceptor accepts a proposal, it responds to all learners (“accept,” n, v).
  • Learners receive (“accept,” n, v) from a majority of acceptors, decide v, and send (“decide,” v) to all other learners.(Variations of this process can be used where either all acceptors inform corresponding learners of their decisions or acceptors respond to a distinct set of learners who then propagate the message to the rest of the learners.)
  • Learners receive (“decide,” v) and the decided v.

Raft

Raft server states and transitions
  1. Leader Election
  2. Log Replication
  3. Safety

Byzantine fault tolerant algorithms

Practical Byzantine Fault Tolerant(PBFT)

  1. A new transaction happened on a client and was broadcast to the primary.
  2. The primary multicasted it to all the backups.
  3. The backups executed the transaction and sent a reply to the client.
  4. The client wanted x + 1 replies from backups with the same result. This was the final result, and the state transition happened.

2. Non deterministic consensus algorithms

Nakamoto consensus algorithms

  • Block rewards: Nodes must use a huge amount of energy in-order to compete to become the leader. In Order to economically motivate nodes to repeatedly perform computationally expensive puzzles a block reward is introduced. By this nodes get chance of randomly winning a large reward for appending a correct block.
  • Sybil resistance: which is resisting attacks one person tries to take over the network by creating multiple accounts, nodes or computers. This may lead to out-voting honest nodes or even 51% attack(1) which may lead to double spending.
  • Peer-to-peer gossip: gossip protocol is a peer to peer communication protocol which is based on the way epidemics spread.Assume a node is only connected to a subset of other nodes. Then use the peer-to-peer protocol where messages are being gossiped between nodes.

Proof of Work (PoW)

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store