But wait! How do we multiply a number by a point on a graph? This is where elliptic curve cryptography gets really interesting. In the same way that the
+ operator is defined for numbers, we can define
+ for points on an elliptic curve. For points
P2, we define
P1 + P2 = P3 such that
P3 is determined by drawing a line through
P2. Because of the symmetry property of elliptic curves, this line will intersect at a third point,
P3', on the curve as shown in the image above. We then reflect this point about the x-axis. In other words, we make its y-component negative. This gives us
P3. With addition defined, multiplication follows. In
k * G, we apply the addition operation
k-1 times, e.g.
G + G + G + .... When adding
G + G, instead of intersecting two points on the elliptic curve, the line will be tangent to the curve since the two points overlap. That's totally fine. We can still carry out our addition operation as described above since the line will always intersect one additional point on the curve, and we will arrive at our public key. It's impossible for someone who knows our public key to try to determine our private key
k. However, the public key and private key are mathematically related such that the public key can be used to encrypt data, and the private key can be used to decrypt them or vice versa (digital signatures).
For the sake of simplicity, I did leave one important piece of information out, but it doesn't really change the concept above. In Secp256k1,
y are restricted to just prime integers that are modulo a very large number defined in the standard. Modulus is a mathematical operation that returns the remainder of a division operation, e.g.
10 modulo 3 = 1 and
10 modulo 100 = 10.
Special care must be taken in the design and verification of elliptic curve-based algorithms. Consider Dual_EC_DRBG—a random number generator based on elliptic curves. Parameters were chosen by the NSA such that a backdoor was created to enable easy cracking of encryption used in popular software tools by those with knowledge of a particular set of numbers.
A Merkle tree is a tree structure of hashes of hashes such that the root node is a hash that encodes all of the information in the leaves.
/ \ / \
AB CD EF GH
/ \ / \ / \ / \
A B C D E F G H
In the diagram above,
AB represents the hash of
A concatenated with
SHA256(A + B).
ABCD represents the hash of
AB concatenated with
SHA256(AB + CD)—and so on...
In bitcoin, Merkle trees provide lightweight wallets (Simplified Payment Verification) with an easy way to verify that a transaction is part of a block without downloading the entire block. This is helpful for wallets that may be running in environments where resources are limited such as mobile wallets. First, let's take a look at how blocks are stored in Bitcoin.
|1-9 (varint)||number of transactions|
|32||hash of previous block's header|
|32||Merkle root hash|
|4||approx. seconds from Unix Epoch|
|4||nonce chosen by miner to solve Proof-Of-Work|
Blocks in bitcoin are stored as contiguous bits that are parsed into the data structures shown above. A wallet may elect to only download the block headers from the network. Since each block header is only 80 bytes, storing all of the block headers requires nearly 10,000 times less disk space than storing the entire blockchain.
As you can see, the block header contains the Merkle root hash. The leaves of the Merkle tree are the hashes of each transaction contained in the block. Each hash is attained by applying
SHA256 to the transaction and applying it again to the resulting hash. If there is an odd number of transactions, the last transaction hash is repeated. If a lightweight wallet wishes to verify that transaction
F exists in the block, it only needs to ask the network for hashes
ABCD. From there, it can complete the remaining hashes to see if it arrives at
ABCDEFGH. Because of the mathematical properties of
SHA256, finding fake values for
ABCD that lead to the correct root hash would be infeasible.
Bloom filters, without a doubt, are one of the most fascinating data structures. A bloom filter is a probabilistic data structure that can answer the question of whether a value is absent from a set while maintaining a constant space requirement and a constant lookup time. The caveat is that a bloom filter can only answer, with certainty, that an item is not in a set. If the bloom filter responds that an item does exist in a set, it may be a false positive.
A bloom filter consists of an array of bits (a bit field) and a set of hashing functions that each return a number that corresponds to the index of a bit in the bit field. To encode a value, we pass the value as an input to each hashing function. We set the bit at each returned index to 1. If the bit at a given index is already 1, no change occurs. To ask the whether a value already exists in the bloom filter, we run the value through each hashing function. If any function returns an index in the bit field that is still 0, we can say for certain that the value has not yet been encoded.
In Bitcoin, bloom filters allow lightweight wallets to request the transactions they care about without revealing the user's identity. For example, the wallet may encode its addresses into a bloom filter and send the bit field in a request to the network. The answering node returns a list of transactions that involve addresses for which the bloom filter returns a positive result. The list of transactions returned will contain many false positives. The false positives help hide which addresses actually belong to the requester.
Though none of these technologies are particularly new—bloom filters are nearly 50 years old—studying their clever application in solving incredibly difficult problems that impact the lives of millions of people around the world is fascinating to me and hopefully to you too. I hope as the evolution of cryptocurrencies progresses, we'll continue to look to the old to inform the new.