Understanding blockchain
I don’t like cryptocurrency. My reasons for that are kind of complicated and a bit technical, and I’m writing a two-part series about them. Now, I want to explain myself in a way that’s accessible to a non-technical audience, but people have used a thousand imprecise analogies to explain blockchain, and all of them seem to fall flat. So in this first part, I’m going to explain what blockchain is, what problems it solves, how it works, and what the alternatives are, without papering over details using analogies.
Digital signatures
In order to really understand blockchain (which is the technology underlying cryptocurrency), you have to understand two cryptographic concepts first.
A cryptographic hash is a program (such as sha1
) that scrambles up some information in a reproducible way, and does it so thoroughly that it can’t be unscrambled.
For instance, the sha1
hash of the phrase “hello world” is 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
.
Any other phrase would correspond to a totally different value.
Moreover, there’s no way to take that number and turn it back into “hello world,”
so if I hadn’t told you that that hash corresponded to “hello world,” you’d have no way of knowing what I’d hashed to get it.
A digital signature is like a regular signature, in that it is irrefutable proof that a particular person approved a particular thing. If you sign this document, the idea is that you’ve approved it, and no one (including you) can deny that. In practice, how this works is that a program generates for you a public key and a private key. These are like a username and password — everyone knows your public key, but only you know your private key. The fact that only you know your private key is how you prove you are the person corresponding to the public key, the same way that your password is how you prove to a website that you’re the person corresponding to a particular username.
If I want to sign a document (say, one that says “I’m transferring $20 to Claire”), I take the document and my private key and enter them into a program that creates a signature, which is a short string of letters and numbers. Then, anybody can take the document, my public key, and the signature I made, and do some math to prove that the signature is legitimate. Only I can create a legitimate signature, because only I know my private key.
Why blockchain?
Suppose Alan is a painter, and he paints a lovely painting. In order to prove it’s authentic, he writes a certificate of authenticity. Then he sells the painting to Betty on July 5th. And in order to prove that Betty didn’t steal the painting, he writes at the end of the certificate, “Alan sold this painting to Betty on July 5th.” Then, he hashes the certificate and digitally signs the hash.
Now Betty wants to sell the painting to Claire. So she writes at the end of the certificate, “Betty sold this painting to Claire on July 10th,” and then she hashes the whole certificate and signs it herself.
Our certificate now looks like this:
- This is a certificate of authenticity for a painting by Alan.
- Alan sold this painting to Betty on July 5th.
- The hash of lines 1 and 2 of this document is
3hd234dgf
. - Alan signed the hash
3hd234dgf
and his signature is9h98hdf
. - Betty sold this painting to Claire on July 10th.
- The hash of lines 1-5 of this document is
xxdi92g2
. - Betty signed the hash
xxdi92g2
and her signature is234kn53
.
So does this certificate work to prove who legitimately owns the painting? Suppose Dale steals the painting from Claire on July 15th and then wants to forge a certificate of authenticity that will “prove” he bought it legitimately. So he tries to sell it to himself, by adding a line saying “Claire sold this painting to Dale on July 15th.” But in order for the new certificate to be valid, the hash of the newly-amended certificate would have to be signed by Claire. So without Claire’s permission, Dale can’t add anything new to the certificate. Looks like the certificate is secure so far.
So he tries to change line 5 to read “Betty sold this painting to Dale.”
If he can’t add to the certificate, maybe he can change what’s already there.
This doesn’t work, because the hash of lines 1-5 isn’t xxdi92g2
anymore.
Now it’s 89m34hdf
, and Betty’s old signature is invalid, because it was computed using the old hash xxdi92g2
.
And he can’t change line 5 and forge a new signature, because he doesn’t have Betty’s secret key.
So far, it looks like it’s impossible for Dale to create a falsified certificate.
Then he asks Betty for help. He gives her some cash to sweeten the deal. Betty copies the first three lines of the certificate and then, on line four, writes “Betty sold this painting to Dale on July 15th.” She hashes the certificate, signs the hash, and now Dale has a valid certificate of authenticity.
But then what if Claire accuses Dale of stealing the painting? Both of them have valid certificates, so which one is the real one? Well, if we look at the dates, Betty sold the painting to Claire on July 10th, and then again to Dale on July 15th. Clearly the sale to Dale was the illegitimate one. Dale still hasn’t quite gotten away with it.
But what’s stopping Betty from writing “sold to Dale on July 1st” on the certificate instead? Now it looks like Claire has the invalid certificate. Betty has just, in effect, sold a certificate of authenticity twice.
This is called the “double-spending problem,” and that’s what blockchain is meant to solve. If we can solve the double-spending problem, then we can use cryptography to create unforgeable records of financial transactions. It’s enough to build a whole digital currency system on. So let’s talk about some solutions to the problem.
Solution 1: Trust someone
The whole problem here is that it’s easy for an untrustworthy person to fake dates. So let’s trust someone trustworthy instead. Suppose the government steps in and says, “Okay, anybody who wants to date any document for any reason can submit it to us. We’ll take the hash of the document, write the date after that, and digitally sign it.”
If you trust the government, then this solution works fine. But is it really worth putting all our faith in the idea that the government is never corrupt?
Solution 2: Rely on someone, but don’t trust them
In order to make sure that the government isn’t faking signatures, we demand that they publish every document they sign, when they sign it. Watchdogs monitor the government publications, and if a signature is wrong, they’ll sue the government for corruption.
But this solution kind of fragile, though, isn’t it? How many signatures will the watchdogs need to keep track of, over the years? Billions, surely. And if anything slips through the cracks, then the government can easily forge a timestamp from the past and say “oh you must have just lost your record of this one.”
Let’s innovate a bit. Now, every time the government timestamps something, they hash the document, they hash the last timestamp they published, they write the date, and then they sign the whole thing. So now the government is publishing something that looks like this:
| Record number: 101
| Hash of record 100: 89d1234
| Attached document hash: 12f4e67
| Time: 2021-03-05 10:00
| Government's signature: 56c8b01
|
|--> Hash of record 101 = aa12345 --------->|
|
| Record number: 102 |
| Hash of record 101: aa12345 <-----|
| Attached document hash: e34e678
| Time: 2021-03-05 10:05
| Government's signature: f78901f
|
|--> Hash of record 102 = 3bb87ab --------->|
|
| Record number: 103 |
| Hash of record 102: 3bb87ab <-----|
| Attached document's hash: 945cd89
| Time: 2021-03-05 10:10
| Government's signature: 7a901c3
|
|--> Hash of record 103 = 3bb87ab ---> ...
All that a watchdog needs to do now is keep track of the latest record in the stream. If the government tries to publish a fradulently-early timestamp, it’s only got two options. It could put the bad record at the end of the chain, but that would mean that the chain is no longer in order. How could an earlier record come after a later one? This proves fraud easily — and since the records are signed, the evidence is undeniable.
It could also try to put the bad record earlier in the chain. Suppose the government makes a new record 101, which is subtly different from the original record 101, and it tries to replace the old version with the new one. The new record has a different hash from the old one, so this means the government has to create a new record 102 in order for their history to stay consistent, because the hash of record 101 is stored in record 102. And so on and so forth, until you need a new version of the latest block. And any watchdog who’s been keeping track of things will realize, hey! There are two contradictory versions of the latest block! So if the government tries to alter the history of the system in any way, we’ll find the inconsistencies in the record as soon as anyone tries to audit it.
This is the simplest form of a blockchain.1 In my example, I called them “records,” but they’re usually called blocks, and they’re all chained together by hashes. For simplicity, I only included one document per block, but actual blockchains typically publish a bundle of documents in each blocks. A good example of this sort of software is Google’s Trillian project.
Solution 3: Rely on a group of people
So far, our solution only works on one server. This isn’t very robust — if the server crashes, the whole thing falls down. So let’s generalize it to a group of servers. In order to append a block to the chain, the servers have a vote. If the majority of them approve of the block, then the servers sign it and it gets appended to the chain. This way, we don’t even have to rely on any single server: So long as most of the servers are working properly, the blockchain will persist.
This is a distributed blockchain — “distributed” because it’s distributed across a group of computers. More precisely, it’s a permissioned blockchain, because a server needs special permission in order to be part of the group. After all, not just anyone can vote in the elections — unless everyone recognizes you as an official elector, your votes and your signatures will mean nothing. A good example of a permissioned blockchain is Hyperledger.
Solution 4: Rely on no one in particular
Solution 3 is still somewhat expensive, though. Running N servers splits up the cost, but since so many requests are being served, it’s still expensive overall. So what if we opened the system up to everybody? This is called a permissionless blockchain.
Are there any drawbacks? Well, it doesn’t cost anything to sign up as a voter — running a server that is constantly signing user-submitted documents is expensive, but if anyone can vote, that means that even a tiny laptop doing barely any work can. So suppose I set up a million tiny virtual computers and sign them all up as voters. I’ve sneakily gained a majority of the votes for myself. This is called a 51% attack. Now I can reject valid blocks and approve invalid blocks. A watchdog might notice, but what can they do? I’m not the government, and I’ve made no legal commitment to not try to exploit the system. Now everyone needs to go and find a new blockchain, because I have control over this one.
Solution 4.1: Rely on work
How can we stop someone from doing a 51% attack so easily? We want to limit the number of votes any one person can cast. And the simplest way to do that is to attach some kind of cost to voting.
Ok let’s back up a bit. Take a little trip with me. There’s a cryptographic puzzle called hashcash, and it works like this: I give you a word, like “example,” and a number, four. Now find a string of letters and numbers beginning with “example” such that, when you hash it, the hash ends in four zeros.
The more zeros at the end, the longer it takes for someone to guess an answer. And when they do have that answer, you know they worked for it, because the only way to find it is by guessing over and over again.
For example, if we guess numbers starting from zero, here’s how long it takes to find a solution to various hashcash problems based on the word “example”:
# of zeros | First solution | Number of guesses |
---|---|---|
1 | sha1 of example23 is f14801ab151cd0de061a7050f11b49258fb89770 | 23 |
2 | sha1 of example809 is 7dec1d12017a106308fc7f904f6c61deee309700 | 809 |
3 | sha1 of example1795 is 5961cbc2ee4920465b6a0fa4cfb28d56b8f39000 | 1795 |
4 | sha1 of example218808 is f4c5095cd234bbfffda611df07da074d33de0000 | 218808 |
So suppose we’ve got a giant network of people trying to manage a permissionless blockchain. People keep asking to have documents timestamped, and I want to collate those documents’ hashes into a block, timestamp them, and append that block to the chain.
We’re not voting anymore. Instead, I take a group of documents, I take the hash of the last block in the chain, I digitally sign the whole thing, and then I start trying to solve a hashcash puzzle with it. The puzzle is hard enough that it takes everyone in the system guessing constantly for around 10 minutes for a single person to get lucky and solve it.2 Suppose I’m the lucky one who finds a solution first. This is called mining a block. No one else has solved this puzzle yet, so I publish the block, everyone appends it to their copies of the chain, and the process continues. Unless my block is invalid, in which case everyone else rejects it.
Because my right to append a block to the blockchain depends on my ability to prove that I’ve done a certain amount of computational work, this system is called proof of work. This is the big one — this is how Bitcoin works.
What if someone doesn’t want to append that block to their chain? Well, everyone else is moving on without them, and if they want to remain a part of the network, they’re going to have to keep trying to append to the latest block. If someone figures out the answer to a hashcash puzzle from yesterday, no one’s going to accept the block they produce, because when the blockchain branches, people flock to the longest branch.
You can still do a 51% attack, in theory. Suppose I go back to yesterday’s puzzle and I mine out enough blocks that I create an alternative branch which is longer than the current one. Then, everyone who is following the rules will come over to my branch of the blockchain, and I’ll have rewritten history. This works, but it’ll cost you a lot of money in computing power, because you need to solve Hashcash puzzles on your own faster than everyone else in the network put together — and you’re already behind, so you have a lot of catching up to do. It’s so hard that it’s almost impossible, given a large enough community of miners with no massive conspiracies going on.
But mining like this is expensive, and if we want a large community of miners, we need to incentivize people to mine. This is where cryptocurrency comes in. In order to perform a cryptocurrency transaction, I publish a message saying, “I transfer ten bitcoins from myself to Alan. I’ll pay one bitcoin as a transaction fee,” and I sign it with my private key. When somebody mines a block with that message in it, they claim that transaction fee for themselves. And, depending on how the blockchain in question works, they might be entitled to mint some brand new bitcoins for themselves, too.
If anyone overdraws their account, or mints too many new bitcoins, their block is invalid and no one else will accept it. So long as a majority of the mining isn’t controlled by any one group, nobody can break the rules.
This mining incentive — the transaction fees and the newly-minted money — is why people spend computing power mining in the first place. If people value bitcoins enough, then you can turn a profit mining bitcoin.
Solution 4.2: Rely on money
But of course, now we reach the problem with proof-of-work. With all these people solving puzzles, we’re wasting a huge amount of power. No one has to sponsor the system — it’s all happening in the free market — but collectively we’re wasting far, far more energy than any of our other solutions did.
This is a real problem, and it renders proof-of-work basically irredeemable in my opinion. But there is an alternative. Proof-of-stake resurrects the voting, with a twist. Instead of being implicitly trusted, everyone who wants to vote has to stake a certain amount of their own money as a sort of investment. Everyone who offers a stake has a random chance of being selected in each vote.3 The larger your stake, the more of a chance you have of being chosen.
All the voters vote “valid” or “invalid.” If the block is deemed valid, it is appended to the chain. Anyone who voted with the majority gets a share of the incentives. Anyone who voted with the minority loses their stake entirely.
This means that doing a 51% attack in a proof-of-stake system requires an unfeasibly large upfront investment of stakes. And, since the identities of all the voters are not known until the votes are all cast, engaging in fraud by deliberately voting wrong is extremely risky, because you might lose all of your expensive stakes if your voting bloc is smaller than you thought.
Like proof-of-work, proof-of-stake also depends on the existence of a cryptocurrency. Otherwise, what are people going to put up as the stakes? Proof-of-work is still a lot more common than proof-of-stake among permissionless blockchains, but there do exist some blockchains that use proof-of-stake, such as Algorand.
The situation so far
So, today we learned several ways of timestamping documents in a provably-correct way. But which of these solutions is best? Obviously it’s not 4.1 — the waste inherent to proof-of-work is disgusting. It’s tempting to say 4.2, but I don’t think so. I think that strategy 3 — the permissioned blockchain — is a lot better. In fact, I think that permissionless blockchains are really pretty bad in general. And I’ll tell you why… in part two.
— Pan-fried, 2021-03-21. Back to top