## Week One - Introduction to Crypto and Cryptocurrencies

### Hash function

Takes any string as input
fixed-size output (e.g. 256bits just as BitCoin)
efficiently computable

#### Security properties:

Collision-free
Nobody can find x and y such that x!=y and H(x)=H(y).

Note: Collisions do exist. The possible outputs are finite(string of 256 bits in size), while the possible inputs can be a string of any size.

To find a collision of any hash function: Trying $2^{130}$ randomly chosen inputs and chances are 99.8% that two of them will collide. Well, it takes too long to matter.
For some possible H’s, there’s a faster way to find collisions; for others, we haven’t known yet.
No H has been proven collision-free.

Hiding
(r|x) means that take all the bits of r and put after them all the bits of x.
If r is chosen from a probability distribution that has high min-entropy, then given H(r|x), it is infeasible to find x.
High min-entropy means that the distribution is “very spread out”, so that no particular value is chosen with more than negligible probability.

Puzzle-friendly
For every possible output value y, if k is chosen from a distribution with high min-entropy, then it is infeasible to find x such that H(k|x)=y.

#### Applications

Hash as message digest.
If we know H(x)=H(y), it’s safe to assume that x=y.
To recognize a file that we saw before, just remember its hash.
It’s useful because the hash is small, while a file may be very big.

Commitment
Want to “seal a value in an envelope”, and “open the envelope” later.
Commit to a value, reveal it later.

To seal msg in envelope:
(com, key) := commit(msg) — then publish com
To open envelope:
publish key, msg
anyone can use verify() to check validity

Commitment API:
commit(msg):=(H(key|msg),key) , where key is a random 256-bit value
verify(com, key, msg):=(H(key|msg)==com)

Security properties:
Hiding: Given com, infeasible to find msg.
Binding: Infeasible to find msg != msg' such that verify(commit(msg), msg') == true.

Search Puzzle
Given a “puzzle ID” id (from high min-entropy distribution) and a target set Y,
try to find a “solution” x such that H(id|x)∈Y.
Puzzle-friendly property implies that no solving strategy is much better than trying random values of x.

#### SHA256

SHA-256 takes the message that you’re hashing, and it breaks the message up into blocks that are 512 bits in size(add some padding at the end 100…00).
IV: 256 bit value(look up in a standards document).
c: the compression function. Take 768 bits(256+512) and run through this function and out comes 256 bits.

### Hash Pointer

Hash pointer is pointer to where some info is stored, and (cryptographic) hash of the info.
If we have a hash pointer, we can ask to get the info back, and verify that it hasn’t changed.

Data structure of blockchain:

This is blockchain, and it’s similar to linked list.
If some adversary wants to change a block’s data(e.g., the left one), then the content of that block is changed. And the middle block’s hash pointer is not consistent to the left block any more. So the adversary has to change middle block’s header, and next block’s header and so on.

Merkle tree
Tree holds many items, but just need to remember the root hash.
Can verify membership in O(log n) time/space.

### Digital Signature

What we want from signatures is two things:

1. Only you can make your signature, but anyone who sees your signature can verify that it’s valid.
2. Signature is tied to a particular document, because signature is not just a signature, it signifies your agreement or endorsement of a particular document.

API for digital signatures

Requirements for signatures
·Valid signatures verify
verify(pk, message, sign(sk, message))==true
·Can’t forge signatures
An adversary who knows your public key and gets to see signatures on some other messages, can’t forge your signature on some message that he wants to forge it on.

Practical stuff
algorithms are randomized
need good source of randomness
limit on message size
fix: use Hash(message) rather than message
fun trick: sign a hash pointer
signature “covers” the whole structure

### Simple Cryptocurrencies

#### GoofyCoin

Goofy can create new coins.

A coin’s owner can spend it.

The recipient can pass on the coin again.

Problem: Double-spending attack
the main design challenge in digital currency

#### ScroogeCoin

CreateCoin transaction creates new coins.

PayCoin transaction consumes (and destroys) some coins, and create new coins of the same total value.

Valid if:
-consumed coins valid
-total value out = total value in
-signed by owners of all consumed coins

Note: Coins are immutable, that is, coins can’t be transferred, subdivided, or combined.

## Week Two - How Bitcoin Achieves Decentralization

Why consensus protocols?
Traditional motivation: reliability in distributed systems
Distributed key-value store enables various applications: DNS, public key directory, stock trades …

## Distributed consensus

The protocol terminates and all correct nodes decide on the same value
This value must have been proposed by some correct node

Why consensus is hard?
Nodes may crash
Nodes may be malicious
Network is imperfect (Not all pairs of nodes connected; faults in network; latency)

Consensus algorithm (simplified)

1. New transactions are broadcast to all nodes
2. Each node collects new transactions into a block
3. In each round a random node gets to broadcast its block
4. Other nodes accept the block only if all transactions in it are valiid (unspent, valid signatures)
5. Nodes express their acceptance of the block by including its hash in the next block they create

What can a malicious node do? - Double Spending

Honest nodes will extend the longest valid branch. In the above image, the green block and the red block are identical. So chances are that the next node will extend the red block, and so on, which makes the double-spending attack.

However, from Bob’s view, it looks like this:

Double-spend attack only occurs with 1 confirmations. If Bob is patient enough and wait for some other confirmations, he’s not likely to suffer double-spend.
Double-spend probability decreases exponentially with number of confirmations.
(Most common heuristic: 6 confirmations)

### Incentives and Proof of Work

Incentive 1: block reward
Creator of block gets to

• include special coin-creation transaction in the block
• choose recipient address of this transaction

Note block creator gets to “collect” the reward only if the block ends up on long-term consensus branch.

Incentive 2: transaction fees
Creator of transaction can choose to make output value less than input value
Remainder is a transaction fee and goes to block creator
Purely voluntary, like a tip.

PoW property
1: difficult to compute
Only some nods bother to compute - miners

2: parameterizable cost
Nodes automatically re-calculate the target every two weeks.
Goal: average time between blocks = 10 minutes

3: trivial to verify
Nonce must be published as part of block
Other miners simply verify that $H(nonce||prev_hash||tx||...||tx)

Key security assumption
Attacks infeasible if majority of miners weighted by hash power follow the protocol.

for individual miner:

**What can a “51% attacker” do?
Steal coins from existing address? ×

Suppress some transactions?
· From the block chain √
· From the P2P network ×

Change the block reward? ×

Destroy confidence in Bitcoin? √√

## Week Three

### Bitcoin transactions

Comparison between an account-based ledger and a transaction-based ledger:
An account-based ledger

If we want to check whether a transaction is valid, we might need to scan backwards until genesis. That’s quite inconvenient and inefficiency.

A transaction-based ledger

The verification just needs finite scan with hash pointers.

Then, we can easily realize functions like merging value and joint payments.
Merging value

(The slides shown on Coursera has some mistakes, so I have to screenshot on the course)

Joint payments

Here’s what a Bitcoin transaction look like:

### Bitcoin Scripts

Note that in Bitcoin transaction, instead of assigning a public key, Bitcoin uses script.

Design goals

• Build for Bitcoin (inspired by Forth)
• Simple, compact
• Support for cryptography
• Stack-based
• Limits on time/memory
• No looping

Instructions
256 opcodes total (15 disabled, 75 reserved)

• Arithmetic
• If/then
• Logic/data handling
• Crypto: Hashes, signature verification, multi-signature verification

Instruction:

<sig> : Input script. Push data onto the stack.
<pubKey> : Push data onto the stack.

OP_DUP : Duplicate instruction. Take the value on the top of the stack, pop it off, and then write two copies back to the stack.

OP_HASH160 : Take the top value on the stack and compute a cryptographic hash of it.

pubKeyHash : The hash of public key that was actually used by the recipient when trying to claim the coins.
pubKeyHash? : Specified by the sender of the coins. The public key that the sender specified, had to be used to generate the signature to redeem these coins.

OP_EQUALVERIFY : Check if the two values at the top of the stack equal. It the two values aren’t equal, an error’s gonna be thrown and the script will stop executing. It they are, the instruction will consume those two data items that are at the top of the stack.

OP_CHECKSIG : Verify the entire transaction was successfully signed. Pop those remaining two items off of the stack, check the signature is valid.

OP_CHECKMULTISIG :
Built-in support for joint signatures
Specify n public key
Specify t (threshold)
Verification requires t signatures
(BUG: Extra data value popped from the stack and ignored)

### Bitcoin Blocks

There’s a special transaction, which is the coinbase transaction:

Since this transaction creates new coins, it’s prev_out has a null hash pointer.
Miners can put anything in coinbase.
The value is slightly more than the set value, which contains transaction fees.

### Bitcoin Network

P2P Network

• Ad-hoc protocol (runs on TCP port 8333)
• Ad-hoc network with random topology
• All nodes are equal
• New nodes can join at any time
• Forget non-responding nodes after 3 hr

#### Propagation

Transaction propagation
· Transaction valid with current blockchain
· (default) script matches a whitelist - avoid unusual scripts
· Haven’t seen before - Avoid infinite loops
· Doesn’t conflict with others I’ve relayed - avoid double-spends

Block propagation
Relay a new block when you hear it if:
· Block meets the hash target
· Block has all valid transactions - Run all scripts, even if you wouldn’t relay
· Block builds on current longest chain - Avoid forks

#### Nodes

Fully-validating nodes
· Permanently connected
· Store entire block chain (Storage cost: 20 GB)
· Hear and forward every node/transaction

Thin/SPV clients(not fully-validating)
Ideas: don’t store everything
· Store block headers only (1000x cost saving)
· Request transactions as needed - to verify incoming payment
· Trust fully-validating nodes

### Limitations & Improvements

Hard-coded limits in Bitcoin
· 10 min. average creation time per block
· 1 M bytes in a block
· 20,000 signature operations per block
·· 100 M satoshis per bitcoin
·· 21 M total bitcoins maximum
·· 50,25,12,5… bitcoin mining reward
·· : These affect economic balance of power too much to change now

Throughput limits in Bitcoin
· 1 M bytes/block (10 min)
· >250 bytes/transaction
· 7 transactions/sec

Cryptographic limits in Bitcoin
· Only 1 signature algorithm (ECDSA/P256)
· Hard-coded hash functions

#### Forking

Hard-forking
If old nodes didn’t update software, they will reject all the new transactions/blocks.
Old nodes will never catch up.

Soft forks
Observation: we can add new features which only limit the set of valid transactions

Need majority of nodes to enforce new rules

Old nodes will approve.

Risks exist that old nodes might mine now-invalid blocks, since there’re new limits on blocks.

Soft fork possibilities
· New signature schemas
· Extra per-block metadata - Shove in the coinbase parameter; Commit to UTXO tree in each block

Hard forks
· New op codes
· Changes to size limits
· Changes to mining rate
· Many small bug fixes (like bug in MULTISIG)

Currently seem very unlikely to happen.

## Week Five

### Mining Hardware

#### CPU

Throughput on a high-end PC = 10-20 MHz ≈ 2^24
139461 years on average to find a block today

#### GPU

GPUs designed for high-performance graphics - high parallelism & high throughput
First used for Bitcoin ca. October 2010
Implemented in OpenCL - Later: hacks for specific cards

· easily available, easy to set up
· parallel ALUs
· bit-specific instructions
· can drive many from 1 CPU
· can overclock

· poor utilization of hardware
· poor cooling
· large power draw
· few boards to hold multiple GPUs

Throughput on a good card = 20-200 MHz ≈ 2^27
173 years on average to find a block w/100 cards

#### FPGA

First used for Bitcoin ca. June 2011
Implemented in Verilog

· higher performance than GPUs - excellent performance on bitwise operations
· better cooling
· extensive customization, optimization

· higher power draw than GPUs designed for - frequent malfunctions, errors
· poor optimization of 32-bit adds
· fewer hobbyists with sufficient expertise
· more expensive than GPUs
· marginal performance/cost advantage over GPUs

Throughput on a good card = 100-1000 MHz ≈ 2^30
25 years on average to find a block w/100 boards

#### Bitcoin ASIC

· special purpose - approaching known limits on feature sizes, less than 10x performance improvement expected
· designed to be run constantly for life
· require significant expertise, long lead-times
· perhaps the fastest chip development ever

## Week Six

### Anonymity

Anonymity in computer science:

Pseudonymity: Bitcoin addresses are public key hashes rather than real identities
Unlinkability: different interactions of the same user with the system should not be linkable to each other