You've been following cryptocurrencies in the news: ludicrous transaction fees, unstable prices, hard forks (whatever those are)... What's going on here? Surely, it can't be so hard to create a digital currency that works?
Before running off to create the next great cryptocurrency, I'd highly recommend taking some time to really grok Bitcoin. Bitcoin gives us a really great opportunity to study the clever application of some wonderful techniques from computer science. In this blog post, I'll summarize some of the areas that I find most interesting. All of these areas are covered in greater detail in Andreas Antonopoulos' excellent book, Mastering Bitcoin.
All data in computing at some level gets serialized into bits—1's and 0's. In many situations, we'd like to have something that represents a sort of fingerprint of our data. In this way, we can talk about a specific piece of data without having to transmit it in its entirety to ensure that we're talking about the same piece of data.
A cryptographic hash function is a mathematical function that takes a binary value that represents a piece of information (like a book, movie, image, website, chat message, etc.) as its input and returns a number. This number is potentially very large, but the space required to store it is fixed and may be far smaller than the input data—32 bytes in the case of
SHA256. In cryptographic applications, it is vital that changing any bit in the original data will change the output in an unpredictable way. With this condition met, it is computationally infeasible for an attacker to discover the input given only the hashed output.
const crypto = require('crypto'); const hash = crypto.createHash('sha256') .update('Neither a borrower nor a lender be; for loan doth oft lose both itself and friend, and borrowing dulls the edge of husbandry.', 'utf8') .digest('hex'); console.log(hash); // Prints: // 06a6d5a208edba59dc15a8573f8bb019742fb23060b5ceccdb7dee2b4eaf782e
Understandably, Bitcoin makes extensive use of cryptographic hashing e.g. payment authorization, transaction verification, mining, addresses, bloom filters, etc.
Digital signatures are a means of proving a piece of data's authenticity. When data is signed, its signature will change if any bit is modified in a way that only the holder of the private key can know. In Bitcoin, signatures are used to prove that a user has permission to spend an unspent transaction output.
To digitally sign a piece of data, we hash it to produce a 32-byte number in the case of
SHA256. Then, we encrypt the result using our private key. Anyone with our public key can verify that we signed the data by decrypting the signature using our public key and comparing the result to a hash of the data. If they are working with a different version of the data than what we signed, the result will not match.
In Bitcoin, when we want to spend an unspent transaction output that was addressed to a hash of our public key, we sign it using our private key and provide our public key. Other nodes can check that the signature is correct, meaning the spend is valid, given the unspent transaction output details and the public key.
Elliptic Curve Cryptography
To generate public and private keys, Bitcoin uses elliptic curve cryptography. Elliptic curve cryptography was chosen for Bitcoin over RSA because it offers better security for a fraction of the storage space. Elliptic curve cryptography takes advantage of a strange symmetry that exists in a class of mathematical curves defined by
$$ y^2 = x^3 + ax + b $$
The curve can vary in shape by changing
b, but the symmetry property remains the same. In Bitcoin's elliptic curve,
a = 0 and
b = 7 as is defined in Secp256k1.
As is shown in the figure above, any line that intersects two points on the curve will also intersect at a third point unless the line is perfectly vertical. We can use this property to establish public and private cryptographic keys as follows.
- A "generator" point
Gis chosen from the set of all points on the curve.
- We choose a random private key
kfrom a cryptographic source of randomness. This is just a large number.
- We multiply
k * Gto calculate the public key.
But wait! How do we multiply a number by a point on a graph? This is where elliptic curve cryptography gets really interesting. In the same way that the
+ operator is defined for numbers, we can define
+ for points on an elliptic curve. For points
P2, we define
P1 + P2 = P3 such that
P3 is determined by drawing a line through
P2. Because of the symmetry property of elliptic curves, this line will intersect at a third point,
P3', on the curve as shown in the image above. We then reflect this point about the x-axis. In other words, we make its y-component negative. This gives us
P3. With addition defined, multiplication follows. In
k * G, we apply the addition operation
k-1 times, e.g.
G + G + G + .... When adding
G + G, instead of intersecting two points on the elliptic curve, the line will be tangent to the curve since the two points overlap. That's totally fine. We can still carry out our addition operation as described above since the line will always intersect one additional point on the curve, and we will arrive at our public key. It's impossible for someone who knows our public key to try to determine our private key
k. However, the public key and private key are mathematically related such that the public key can be used to encrypt data, and the private key can be used to decrypt them or vice versa (digital signatures).
For the sake of simplicity, I did leave one important piece of information out, but it doesn't really change the concept above. In Secp256k1,
y are restricted to just prime integers that are modulo a very large number defined in the standard. Modulus is a mathematical operation that returns the remainder of a division operation, e.g.
10 modulo 3 = 1 and
10 modulo 100 = 10.
Special care must be taken in the design and verification of elliptic curve-based algorithms. Consider Dual_EC_DRBG—a random number generator based on elliptic curves. Parameters were chosen by the NSA such that a backdoor was created to enable easy cracking of encryption used in popular software tools by those with knowledge of a particular set of numbers.
A Merkle tree is a tree structure of hashes of hashes such that the root node is a hash that encodes all of the information in the leaves.
ABCDEFGH / \ ABCD EFGH / \ / \ AB CD EF GH / \ / \ / \ / \ A B C D E F G H
In the diagram above,
AB represents the hash of
A concatenated with
SHA256(A + B).
ABCD represents the hash of
AB concatenated with
SHA256(AB + CD)—and so on...
In bitcoin, Merkle trees provide lightweight wallets (Simplified Payment Verification) with an easy way to verify that a transaction is part of a block without downloading the entire block. This is helpful for wallets that may be running in environments where resources are limited such as mobile wallets. First, let's take a look at how blocks are stored in Bitcoin.
|1-9 (varint)||number of transactions|
|32||hash of previous block's header|
|32||Merkle root hash|
|4||approx. seconds from Unix Epoch|
|4||nonce chosen by miner to solve Proof-Of-Work|
Blocks in bitcoin are stored as contiguous bits that are parsed into the data structures shown above. A wallet may elect to only download the block headers from the network. Since each block header is only 80 bytes, storing all of the block headers requires nearly 10,000 times less disk space than storing the entire blockchain.
As you can see, the block header contains the Merkle root hash. The leaves of the Merkle tree are the hashes of each transaction contained in the block. Each hash is attained by applying
SHA256 to the transaction and applying it again to the resulting hash. If there is an odd number of transactions, the last transaction hash is repeated. If a lightweight wallet wishes to verify that transaction
F exists in the block, it only needs to ask the network for hashes
ABCD. From there, it can complete the remaining hashes to see if it arrives at
ABCDEFGH. Because of the mathematical properties of
SHA256, finding fake values for
ABCD that lead to the correct root hash would be infeasible.
Bloom filters, without a doubt, are one of the most fascinating data structures. A bloom filter is a probabilistic data structure that can answer the question of whether a value is absent from a set while maintaining a constant space requirement and a constant lookup time. The caveat is that a bloom filter can only answer, with certainty, that an item is not in a set. If the bloom filter responds that an item does exist in a set, it may be a false positive.
A bloom filter consists of an array of bits (a bit field) and a set of hashing functions that each return a number that corresponds to the index of a bit in the bit field. To encode a value, we pass the value as an input to each hashing function. We set the bit at each returned index to 1. If the bit at a given index is already 1, no change occurs. To ask the whether a value already exists in the bloom filter, we run the value through each hashing function. If any function returns an index in the bit field that is still 0, we can say for certain that the value has not yet been encoded.
In Bitcoin, bloom filters allow lightweight wallets to request the transactions they care about without revealing the user's identity. For example, the wallet may encode its addresses into a bloom filter and send the bit field in a request to the network. The answering node returns a list of transactions that involve addresses for which the bloom filter returns a positive result. The list of transactions returned will contain many false positives. The false positives help hide which addresses actually belong to the requester.
Though none of these technologies are particularly new—bloom filters are nearly 50 years old—studying their clever application in solving incredibly difficult problems that impact the lives of millions of people around the world is fascinating to me and hopefully to you too. I hope as the evolution of cryptocurrencies progresses, we'll continue to look to the old to inform the new.