Back to Blog

Consensus Algorithms: PoA, IBFT or Raft?

The Kaleido platform offers three types of consensus algorithms between two blockchain protocols. With the Go-ethereum client it offers Proof-of-Authority, and Quorum client supports Istanbul BFT¬†and Raft. Let’s take a closer look at them so you can decide with confidence which algorithm best fits the business needs for your consortium.

Background

Geth is an open source blockchain protocol implementing the standard Ethereum specification plus other useful features such as a pluggable consensus engine. Geth is among the most widely used implementations of Ethereum nodes. It supports both the standard DAG based Proof-of-Work consensus and Proof-of-Authority (PoA) consensus with its implementation called Clique.

The Geth nodes in the Kaleido platform are based on release version v1.7.3 stable.

Quorum is an open source blockchain protocol specially designed for use in a private blockchain network, where there is only a single member owning all the nodes, or, a consortium blockchain network, where multiple members each own a portion of the network. Quorum is derived from Ethereum by modifying the Geth client. The latest release as of the writing of this blog is based on Geth 1.7.2.

Some of the key features of Quorum include:

  • Privacy via private transactions: members of a Quorum network can send private transactions that are addressed to a subset of nodes, such that the contents of the transaction are not exposed to non-privy members.
  • Peer permissioning: a Quorum network can be configured to run in *permissioned* mode such that all nodes must be explicitly listed in an access control list enforced by all nodes. This prevents foreign nodes from tapping into the network and replicating blocks as is the case in permissionless networks.
  • Flexible consensus: described in greater detail later in this post, Quorum supports Raft and IBFT as valid consensus options. Both support transaction finality (i.e. lack of chain forking) and offer shorter block intervals than proof-of-work.

For more information on Quorum, visit the github.com repository or the Quorum wiki.

What Is “Consensus” Anyway?

If you have not come across distributed systems before, the concept of “consensus” may be foreign to you. In short, because distributed systems act independently when processing information (i.e. executing transactions) and updating state, there must be non-disputable agreement in the resulting states among the nodes. The process of achieving the agreement among the distributed nodes’ states is called *consensus*. For a concise introduction to consensus algorithms, visit this published course by Duke University.

There are many different kinds of consensus algorithms. In the blockchain world you come across myriad *proof-of-X* variations. These types of algorithms tend to rely on laws of physics (limit on computing speed) or economics (incentives for honest behaviors or disincentives for dishonest behaviors) to guarantee agreement; they apply to the cryptocurrency networks in a public setting. Discussions on this class of algorithms are out of the scope of this blog. Visit this blog post for a comprehensive run down.

A private or consortium blockchain has different properties, in contrast to public blockchains, that afford opportunities to use alternative consensus algorithms. One key aspect is the fact that all nodes must be explicitly allowed to join the network, aka peer permissioning. In addition, clearly designated node identities apply accountability toward the participating members. These properties make it possible to employ consensus algorithms that are designed for a distributed network comprised entirely of trusted (honest) nodes. Examples include Paxos (as seen in Kafka Zookeeper implementations) and Raft (to be discussed below). For networks where participants don’t assume honesty toward each other, there are a number of Byzantine Fault Tolerant variants (e.g. Istanbul, addressed later in this post) that can be employed.

Clique (Proof-of-Authority)

Geth nodes in Kaleido use Proof-of-Authority (PoA) as the consensus algorithm. In particular, the implementation by the Geth team is called Clique. Clique uses digital signatures to seal the blocks and achieve data immutability. For consensus, PoA relies on a set of trusted nodes called Authorities that use a simplified messaging algorithm to achieve better performance than typical PBFT algorithms. There is only one round of messages exchanged among the authorities in PoA, compared to 3 rounds in PBFT. Thus, better performance is one of the claims of PoA when measured against other BFT algorithms, especially PBFT.

A PoA network can tolerate up to N/2 – 1 byzantine authority nodes. Namely, it can operate correctly when a simple majority of the authority nodes, N/2 + 1, are honest. For each block, multiple authority nodes are allowed to propose. The algorithm relies on Ethereum’s GHOST protocol to resolve forks that can result from multiple authorities proposing at the same time. Compared to PBFT algorithms, the design of PoA sacrifices consistency (forking can happen) for better availability (faster block committal). And in particular Clique offers eventual consistency PBFT vs. PoA analysis, when the forks get sorted out by the GHOST protocol.

PoA produce blocks at a configurable but fixed interval, regardless of whether there are transactions to include. Kaleido, as of this writing, sets the block interval to 5 seconds for all nodes using PoA.

Clique is an appropriate choice for a network containing parties that do not trust each other. When considering this algorithm, makes sure to understand that forking will happen in a network with more than 4 authorities. In Kaleido, all nodes are currently established as authorities. So in a network of 6 nodes, 5 for the users plus the System Monitor node, up to 2 authority nodes can propose blocks for any given block interval. The formula is `N – (N/2 + 1)`.

Another rather important aspect of Clique is that each block is sealed by only one signature – that of the proposer’s. This is significantly different than PBFT where at least a super majority of validators provide their signatures to every block. The implication is that the data immutability guarantee is weaker than PBFT. However, it still provides a strong guarantee across a chain with many blocks, because every proposer is only allowed to sign once every N/2 + 1 blocks. Given the strong tamper proof feature of the blockchain itself, this means that in order to attack the data on the blockchain the attacker would still have to compromise the whole set of authority nodes.

Example block in a Geth chain using Clique PoA:

> eth.getBlock(10000)
{
 difficulty: 2,
 extraData: "0xd783010703846765746887676f312e392e36856c696e75780000000000000000c845989a1fd72cc3d45a9fc1a96429621f716461ef9b6fffca9e72f2fcdcdcbb719c36f47e26da5c749c59efd5692fe8fb5b664580297fb2d11f7eb8b9bb45da01",
 gasLimit: 4712388,
 gasUsed: 857520,
 hash: "0x4eac41573e910786b94536eb14eb8fb70fcce6ab62e6034d72a63660edf9b789",
 logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
 miner: "0x0000000000000000000000000000000000000000",
 mixHash: "0x0000000000000000000000000000000000000000000000000000000000000000",
 nonce: "0x0000000000000000",
 number: 10000,
 parentHash: "0x0f588f510c528f3101220b70268d4162bc56dd693c1ed9c8f3456273ffa31df9",
 receiptsRoot: "0xbbdc940f6b7efb1930667aba9647ba133d75770d418ccdc0e388191610890ca9",
 sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
 size: 6762,
 stateRoot: "0xa9df63c158d4aeff04c3313f50b05bea1f37a57c515a6ee048a94e0d9c8a92c3",
 timestamp: 1526067551,
 totalDifficulty: 19858,
 transactions: ["0x9829a5e1c1d0b360e72e1d5bf6955d49d306d6b81decfe1576251c8bb7dc614e", "0xbdc1e9d58acc97eacc18b856738e0963af79b3e203093f9652281119327870ed", "0x8617a28e960c17d257a27b3f508fe4d01cc8b4eb4ab1310c9d5f7a8240bebdff"],
 transactionsRoot: "0x986a94e7fdc3f54a48c8b8785b64e55355f2fe8374685f6d309786c235a9ca32",
 uncles: []
}

Istanbul Byzantine Fault Tolerance (IBFT)

Quorum has integrated a Byzantine Fault Tolerance consensus algorithm to allow mutually distrusting nodes to form a network where the majority of the nodes are assumed to be correct at any point in time. This has significant implications to expanding the usage of Quorum to many real-world scenarios.

Istanbul BFT, or IBFT, is an implementation of the Practical Byzantine Fault Tolerance algorithm with modifications. The main difference from Raft, or any other Crash Fault Tolerance algorithms, is that while Raft followers blindly trust their leader, in IBFT each block requires multiple rounds of voting by the set of validators to arrive at a mutual agreement, which is recorded as a collection of signatures on the block content. A validator never assumes the “leader”, or “block proposer”, to be correct or honest. Instead it verifies the proposed block just like other consensus engines operating in an untrusted environment (Proof-of-Work, etc.)

By definition, Byzantine Fault Tolerance means a network can continue to function correctly even if some nodes are dishonest and attempt to propose invalid blocks, or blocks that benefit certain parties at the expense of others. In particular, a PBFT implementation, which IBFT is one of, can tolerate up to *f* number of dishonest (faulty) nodes in a network of 3f + 1 nodes. This roughly translates to 1/3 of faulty nodes being tolerated. This can be even more roughly translated to “super-majority rules” algorithm.

When considering IBFT, it’s important to understand the properties of the nodes. If they belong to parties that you don’t want to assume will always act fairly and honestly, like your fiercest competitors, or may be susceptible to compromises, like those deployed in a less-than-ideal security environment, then IBFT is a necessary choice.

A network using IBFT will always produce blocks at a constant interval, regardless of whether there are pending transactions. This means there can be many blocks with zero transactions if the transaction load is low. Kaleido, as of this writing, sets the block interval to 10 seconds for all nodes using IBFT.

Blocks produced by IBFT are strongly protected against tampering through the collection of signatures from the proposer and the voting validators. It will be impossible to rewrite the block content without having access to all the private signing keys of the proposer and the validator nodes. This provides strong guarantees to the immutability of the resulting blockchain.

The list of validators that get involved in voting for each block can be dynamically expanded or shrunk, by asking existing validators to vote:

  • To add a new validator, at least 2/3 of the existing validators must call `istanbul.propose(new_node_address, true);`
  • To evict an existing validator, at least 2/3 of the existing validators must call `istanbul.propose(new_node_address, false);`

Finally, the block interval is configurable, with the default being 1 second. Kaleido today sets the block interval to 10 second for all IBFT networks, with all nodes of the network participating in block voting as validators.

Example block in a Quorum chain using IBFT:

> eth.getBlock(2001)
{
 difficulty: 1,
 extraData: "0xd783010702846765746887676f312e372e33856c696e75780000000000000000f89ed594abbcf2086d69783e7c7bd88193860da082ac5a6fb84131cd6969c6eddd7cef61ce42c881566966922ebf530868b8769db1cb2742031a6045ee9dd1857a04654c672bdd3a05d25435ec9caa98a2388ab2cd1c9e91dad900f843b841f829ba18c2d4a55da9b62179f25488ae7205069e79d94cbf2da3b9ecb62746e50620c1ac308b01e0f23d85de147ec25826e8a03560b8efb068fd08894f8d8b5701",
 gasLimit: 113962254,
 gasUsed: 278693,
 hash: "0x65f173061375a984adc24edec314262f2c47d91ed026d11b0654ecb40747b497",
 logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
 miner: "0x0000000000000000000000000000000000000000",
 mixHash: "0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365",
 nonce: "0x0000000000000000",
 number: 2001,
 parentHash: "0xfb307e7034bcf9553308afaf876991e0a6510ccdb4ac9ed5a918475cf7f9e269",
 receiptsRoot: "0x1ccca06221387a6e390167e39e1f08647686fe4150891fce5e5073a40d1dced4",
 sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
 size: 2708,
 stateRoot: "0xf0e5b4667d71b8505be207177df97e65a763aa09174da5c28350df279ed519f5",
 timestamp: 1521830564,
 totalDifficulty: 2002,
 transactions: ["0x7b031bda65dc52d2940ba221d81c3d4287bcc00b41f63693d60828f2fffdb15a", "0x91da07e8f350b55d1411dbf25ce8c4f675be494c2be73891c562265413d82b15"],
 transactionsRoot: "0x67e1cd0b9f6bd9973b11c5891a5b9711862191e9571b9b38f7b1eb1ea82b49bc",
 uncles: []
}

Raft

Quorum uses a Raft implementation in `etcd` to provide a Crash Fault Tolerant (CFT) consensus engine. Raft is a well-known consensus algorithm developed by researchers at Stanford University. The `etcd` implementation has been battle-tested in many distributed frameworks including Kubernetes, Docker Swarm, CloudFoundry Diego containers, etc.

This is a CFT consensus, as opposed to BFT (Byzantine Fault Tolerance) because in Raft the leader is assumed to always act correctly (honestly). All the followers blindly replicate the entries proposed by the leader with no questions asked. If the leader crashes, the remainder of the network will automatically elect a new leader after a set period of timeout, and the network will continue to function. When the crashed node recovers, it will become a follower and start replicating the blocks it has missed while offline.

Blocks minted in a Quorum Raft consensus network, as of this writing, are not protected by either a unique hash, as in Proof-of-Work, or by signatures, as in other consensus algorithms like BFT. As a result, the blockchain data can be re-written rather easily by modifying historical data (transaction inputs) and re-calculating the block hash (as well as other relevant fields that must be kept coherent, like transactions trie root, etc.). Other means of data protection may need to be employed to ensure the integrity of the blockchain data.

Raft consensus does not mint blocks unless there are pending transactions. This can result in significant storage savings, especially when the transaction load is low, because no empty blocks containing zero transactions are being minted.

Another advantage of using Raft is faster block time compared to IBFT. The leader mints a block within 50ms of receiving the transaction (per the default setting), and flowing a proposed block through the Raft cluster and collecting majority acknowledgements is a very fast process. Under most typical network conditions, average block times are consistently sub-seconds.

Example block in a Quorum chain using Raft:

> eth.getBlock(2)
{
 difficulty: 131072,
 extraData: "0x",
 gasLimit: 802677719,
 gasUsed: 26620,
 hash: "0xc7afe8d21071120517fac800c5619b26efcbfba23f7974effcaf1f100eb1671a",
 logsBloom: "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
 miner: "0x0000000000000000000000000000000000000000",
 mixHash: "0x0000000000000000000000000000000000000000000000000000000000000000",
 nonce: "0x0000000000000000",
 number: 2,
 parentHash: "0x341fed9780faa832a5207e1b7ccbc1d27903ff4d39e92fb02241d707cb49a488",
 receiptsRoot: "0xa13fa2556b7c907468667b128ebd08c4b69d9826e946252dbd230a46d9b1fe8d",
 sha3Uncles: "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
 size: 652,
 stateRoot: "0xc8ed03350b0c8b5a272e8602ca49c91a52b67beb04463e755d120f33ce52d786",
 timestamp: 1522098611952556500,
 totalDifficulty: 262144,
 transactions: ["0x68ebea2905de935128660065eeb019d893f9fe86eff2fe06f56f6eb3a63646a8"],
 transactionsRoot: "0x7c77d4ffc657dfbe00b3d7dba1924041bc1a34b91ee95776ae740c924c31c144",
 uncles: []
}
Prev ConsenSys Unveils Kaleido in Collaboration with Amazon Web Services to Simplify Enterprise Blockchain Adoption Next The Collusion Conundrum