Chris Natoli

Bitcoin for mathematicians, part 4: The Bitcoin network

published 13 January 2018

This series of blog posts is an exposition of Satoshi Nakamoto’s original whitepaper from 2008, supplemented by bitcoin.org’s developer guide. Since there are already plenty of Bitcoin explainers for laypeople, this series is aimed at a narrower audience: mathematicians who could benefit from a brief yet still precise explanation of Bitcoin.

Nodes and wallets

The Bitcoin peer-to-peer network is the decentralized graph of all nodes that follow the Bitcoin protocol. Nodes generally come in two types: full nodes and SPV (simplified payment verification) nodes. Full nodes are machines that store the entire blockchain locally (about 150 GB at the time of writing). Whenever a new block is mined and broadcast to the network, full nodes verify that all transactions in the block are valid (e.g., no output is double-spent and every signature checks out) before appending the block to their local copy of the blockchain and relaying it to other nodes. Full nodes must constantly listen to the network in case other nodes have a longer chain.
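The double-spending check that a full node performs can be sketched as bookkeeping on the set of unspent outputs. This is a minimal illustration rather than the real validation logic: the names Tx and apply_block are hypothetical, signature verification and amounts are omitted, and an output is identified simply by the pair (transaction id, output index).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tx:
    txid: str        # hypothetical identifier of the transaction
    inputs: tuple    # outputs being spent, each a (txid, output_index) pair
    n_outputs: int   # number of new outputs this transaction creates

def apply_block(utxos: set, block: list) -> set:
    """Return the unspent-output set after the block, or raise ValueError.

    The caller's set is left untouched, so a rejected block has no effect.
    """
    new_utxos = set(utxos)
    for tx in block:
        for outpoint in tx.inputs:
            if outpoint not in new_utxos:
                # Either the output never existed or it was already spent.
                raise ValueError(f"{tx.txid} spends unavailable output {outpoint}")
            new_utxos.remove(outpoint)
        for i in range(tx.n_outputs):
            new_utxos.add((tx.txid, i))
    return new_utxos

utxos = {("a", 0)}
good_block = [Tx("b", (("a", 0),), 2)]
bad_block = [Tx("b", (("a", 0),), 1), Tx("c", (("a", 0),), 1)]  # spends ("a", 0) twice
after_good = apply_block(utxos, good_block)
```

Applying bad_block raises ValueError at the second transaction, because its input was already consumed by the first.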

In contrast, SPV nodes store only the chain of block headers locally (about 40 MB at the time of writing). To verify a particular transaction, they must request the relevant Merkle tree data from full nodes and then perform simplified payment verification. These lighter requirements allow even smartphones to run SPV nodes.
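To make the Merkle tree data concrete, here is a minimal sketch of how a light node might check that a transaction hash belongs to a block whose Merkle root it already knows from the header. It uses double SHA-256 and duplicates the last hash on odd levels, as Bitcoin does, but it ignores Bitcoin’s serialization and byte-order conventions. The function names are illustrative and the leaf list is assumed to be non-empty.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def next_level(level: list) -> list:
    """Hash adjacent pairs, duplicating the last hash if the level is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves: list) -> bytes:
    level = list(leaves)
    while len(level) > 1:
        level = next_level(level)
    return level[0]

def merkle_branch(leaves: list, index: int) -> list:
    """Sibling hashes from leaf to root, each tagged with whether it sits on the left."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        branch.append((level[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return branch

def verify_branch(leaf: bytes, branch: list, root: bytes) -> bool:
    """What the light node does: recompute the root from the leaf and its siblings."""
    h = leaf
    for sibling, sibling_is_left in branch:
        h = dsha256(sibling + h) if sibling_is_left else dsha256(h + sibling)
    return h == root

leaves = [dsha256(bytes([i])) for i in range(5)]   # stand-ins for transaction hashes
root = merkle_root(leaves)                          # stored in the block header
proof = merkle_branch(leaves, 3)                    # supplied by a full node
included = verify_branch(leaves[3], proof, root)
```

The proof contains one hash per tree level, so its length grows logarithmically in the number of transactions in the block.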

Miners are usually full nodes, although some mining machines are instead connected to a subnetwork of other miners to pool their computations and thus don’t store the entire blockchain. Most full nodes are not miners, however; indeed, very few nodes on the network are miners.

End-users – such as consumers who buy goods in bitcoin and cryptocurrency traders – run a special class of nodes called wallets, which are usually but not necessarily SPV nodes. In theory, a wallet is just a set of private keys associated with bitcoin addresses, so a wallet could simply be written on a piece of paper and stuffed under one’s mattress. However, most wallets are programs that facilitate transactions – by generating keypairs, distributing bitcoin addresses, listening for outputs sent to those addresses, signing transactions that spend those outputs, and broadcasting those transactions – and therefore must run nodes connected to the Bitcoin network.

Some wallets, called full-service wallets, do all of the aforementioned tasks on a single machine connected to the internet. However, since the private keys are stored on that same machine, they are prone to being stolen over the internet. Therefore some wallets are split into two, a signing-only wallet and a networked wallet, only the latter of which is connected to the internet. The signing-only wallet generates keys and signs transactions, while the networked wallet distributes addresses, listens for payments, constructs unsigned transactions, and broadcasts transactions after the signing-only wallet signs them. One implementation of this system uses two machines, the signing-only machine being totally isolated from the internet, and a USB drive to transfer public keys and transaction data between the two. Another implementation uses a dedicated device called a hardware wallet as the signing-only wallet, whose specialized firmware prevents most attacks.

Consensus

Since there is no central authority that maintains a single ledger of all valid transactions – i.e., decides which blockchain is the blockchain and which transactions ought to be recorded – rules must be included in the Bitcoin protocol to maintain agreement across the network. In particular, the consensus rules are the set of rules that full nodes use to validate blocks. Nodes are said to be in consensus if their local blockchains are identical.

Users can introduce new features to the blockchain or to transactions by changing the consensus rules on their full nodes and then publicizing the new rules in the hope that other nodes will upgrade their software to adopt them. For example, one might introduce a new space-saving feature, in which case blocks mined according to the new consensus rules would be rejected by non-upgraded nodes. Such a feature is not backwards-compatible. Alternatively, one might introduce a new security feature, in which case blocks mined according to the old consensus rules would be rejected by upgraded nodes. Since blocks mined according to the new consensus rules are still accepted by non-upgraded nodes, this feature is backwards-compatible. Depending on whether or not the new feature is backwards-compatible, a change to the consensus rules can lead to different types of forks in the blockchain.

If the feature is not backwards-compatible, the resulting split in the blockchain is called a hard fork. Suppose a sizeable subset of the network refuses to upgrade to the new consensus rules. Then the split cannot be resolved by the tendency to adopt the longest blockchain, because non-upgraded nodes reject the upgraded branch as invalid no matter how long it grows. Both branches therefore persist indefinitely. Depending on the ratio of upgraded nodes to non-upgraded nodes, one of the branches might be socially negligible.

If the feature is backwards-compatible, the resulting split is called a soft fork. Suppose that upgraded nodes control a majority of mining computations per second (called the hash rate). Eventually, they’ll produce a longer chain than that of non-upgraded nodes. Since non-upgraded nodes consider blocks mined according to the new consensus rules to be valid, they’ll switch to the longer blockchain produced by upgraded nodes, thus resolving the fork.
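The difference between the two kinds of fork can be illustrated with a toy model in which consensus rules are predicates on blocks and each node follows the longest chain that is entirely valid under its own rules. Everything here is a simplifying assumption: the Block fields, the particular rules (a size cap and a hypothetical extra check), and the treatment of chains as plain lists without a shared history.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    size: int         # block size in bytes
    has_check: bool   # hypothetical property demanded by a new security feature

def old_rules(b: Block) -> bool:
    return b.size <= 1_000_000

def hard_fork_rules(b: Block) -> bool:
    # Loosened rules: some blocks valid here are invalid under old_rules.
    return b.size <= 8_000_000

def soft_fork_rules(b: Block) -> bool:
    # Tightened rules: every block valid here is also valid under old_rules.
    return old_rules(b) and b.has_check

def best_chain(rules, chains):
    """Longest chain whose blocks all satisfy the given rules."""
    valid = [c for c in chains if all(rules(b) for b in c)]
    return max(valid, key=len, default=[])

small_ok = Block(500_000, True)
small_unchecked = Block(500_000, False)
big = Block(2_000_000, True)

# Soft fork: the upgraded majority builds the longer chain.
upgraded_chain = [small_ok, small_ok, small_ok]
legacy_chain = [small_ok, small_unchecked]
soft = [upgraded_chain, legacy_chain]
soft_resolves = (best_chain(old_rules, soft) is upgraded_chain
                 and best_chain(soft_fork_rules, soft) is upgraded_chain)

# Hard fork: the upgraded chain is longer, yet old nodes cannot accept it.
big_chain = [small_ok, big, big]
small_chain = [small_ok, small_ok]
hard = [big_chain, small_chain]
hard_persists = (best_chain(old_rules, hard) is small_chain
                 and best_chain(hard_fork_rules, hard) is big_chain)
```

In the soft-fork scenario both kinds of node end up on the same chain, whereas in the hard-fork scenario each kind of node settles on a different chain regardless of length.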

To ensure that soft forks will eventually be resolved, some features wait until a preset, high percentage of the hash rate comes from miners who have publicly stated that they will adopt the new rules. Once that percentage is attained, those miners and other interested full nodes start validating blocks according to the new consensus rules. These forks are called miner-activated soft forks. Contrast this with user-activated soft forks, in which the proposer of a new backwards-compatible feature waits until a preset time or blockchain length, at which point interested nodes upgrade to the new consensus rules.
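The two activation mechanisms can be sketched as follows. The sketch assumes that miner support is estimated by counting, in consecutive non-overlapping windows of blocks, how many blocks carry a readiness signal, which is one plausible proxy for the share of hash rate. The function names, the window length, and the required count are illustrative parameters, not values taken from the protocol.

```python
def miner_activation_height(signals, window, required):
    """First height at which the new rules apply, or None if never reached.

    signals[h] is True when the miner of block h signalled readiness.
    The rules activate right after the first full window containing
    at least `required` signalling blocks.
    """
    for start in range(0, len(signals) - window + 1, window):
        if sum(signals[start:start + window]) >= required:
            return start + window
    return None

def user_activation_height(flag_height):
    """A user-activated fork simply fixes the activation height in advance."""
    return flag_height

def rules_active(height, activation):
    return activation is not None and height >= activation

signals = [True, False, False, True,   # window 0: 2 of 4 signal
           True, True, False, True]    # window 1: 3 of 4 signal
activation = miner_activation_height(signals, window=4, required=3)
```

With these inputs the threshold is first met in the second window, so the new rules would apply from height 8 onward.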

Bitcoin vs. Bitcoin Cash

Before August 2017, the size of a block was capped at 1 MB. This limited the number of transactions that could fit in a single block, which in turn limited the rate of recorded transactions to roughly 4 per second – far lower than the transaction rate of centralized electronic payment systems. But as Bitcoin’s popularity steadily increased, so did the number of transactions being broadcast to the network. Consequently, users either had to wait several blocks for their transactions to be recorded on the blockchain or pay a higher transaction fee to incentivize miners to include their transaction over others’.
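The figure of roughly 4 transactions per second can be recovered with a back-of-the-envelope calculation. The sketch below assumes a ten-minute average interval between blocks (Bitcoin’s target, not stated above) and a hypothetical average transaction size of 400 bytes. Real transaction sizes vary, so the result is only an order-of-magnitude estimate.

```python
def max_tx_rate(max_block_bytes, avg_tx_bytes, block_interval_s):
    """Upper bound on recorded transactions per second."""
    tx_per_block = max_block_bytes // avg_tx_bytes
    return tx_per_block / block_interval_s

MAX_BLOCK_BYTES = 1_000_000   # the 1 MB cap
AVG_TX_BYTES = 400            # assumption: hypothetical average transaction size
BLOCK_INTERVAL_S = 600        # assumption: one block every ten minutes on average

rate = max_tx_rate(MAX_BLOCK_BYTES, AVG_TX_BYTES, BLOCK_INTERVAL_S)
print(f"{rate:.2f} transactions per second")
```

Under these assumptions a block holds 2500 transactions, which works out to just over 4 per second.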

This problem prompted much debate in the Bitcoin community, from which two major solutions emerged. One solution, called Segregated Witness or segwit, uses some bookkeeping tricks to artificially increase the maximum block size and improve security: it moves the signing data from each transaction to a segregated section at the end of the block, changes the maximum block size from one million bytes to one million units, and counts one byte of usual block data as one unit but one byte of signing data as one fourth of a unit. Technical details aside, the advantage of segwit is that it’s backwards-compatible.
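The unit accounting, and the reason it stays backwards-compatible, can be sketched in a few lines. To avoid fractions the sketch multiplies everything by four, so a byte of usual block data costs 4, a byte of signing data costs 1, and the cap becomes 4,000,000. It also assumes that a non-upgraded node is shown the block without its segregated signing data and therefore measures only the usual block data. The function names are illustrative.

```python
MAX_UNITS_X4 = 4_000_000      # one million units, scaled by 4
LEGACY_MAX_BYTES = 1_000_000  # the old one-million-byte cap

def units_x4(base_bytes: int, signing_bytes: int) -> int:
    """Four times the unit count: base bytes cost 1 unit, signing bytes 1/4 unit."""
    return 4 * base_bytes + signing_bytes

def valid_for_upgraded(base_bytes: int, signing_bytes: int) -> bool:
    return units_x4(base_bytes, signing_bytes) <= MAX_UNITS_X4

def valid_for_legacy(base_bytes: int, signing_bytes: int) -> bool:
    # Assumption: a non-upgraded node sees only the base data.
    return base_bytes <= LEGACY_MAX_BYTES

base, signing = 600_000, 1_600_000
total_bytes = base + signing                       # 2.2 million bytes overall
fits_new = valid_for_upgraded(base, signing)
fits_old = valid_for_legacy(base, signing)
```

Here a block of 2.2 million bytes passes both checks. In general, 4·base + signing ≤ 4,000,000 forces base ≤ 1,000,000, so in this model every block accepted by upgraded nodes is also accepted by non-upgraded ones.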

The second solution simply increases the maximum block size to 8 MB without segwit. This required a hard fork, which occurred on 1 August 2017. The branch followed by upgraded nodes is now called Bitcoin Cash, while the branch followed by non-upgraded nodes is now called Bitcoin (or sometimes mainline Bitcoin). The latter activated the segwit consensus rules on 24 August 2017, although adoption has been slow across the mainline network.