Merkle Trees

Merkle trees are fundamental to blockchain efficiency. They enable verification of large datasets with minimal data transfer—you can prove a transaction exists without downloading the entire blockchain.

What Is a Merkle Tree?

A Merkle tree (or hash tree) is a binary tree where:

Leaf nodes contain hashes of data blocks
Internal nodes contain hashes of their children
The root summarizes the entire dataset

Text

                    ┌─────────────┐
                    │   Root Hash │
                    │   (H_ABCD)  │
                    └──────┬──────┘
                           │
             ┌─────────────┴─────────────┐
             │                           │
       ┌─────┴─────┐               ┌─────┴─────┐
       │   H_AB    │               │   H_CD    │
       └─────┬─────┘               └─────┬─────┘
             │                           │
       ┌─────┴─────┐               ┌─────┴─────┐
       │           │               │           │
    ┌──┴──┐     ┌──┴──┐         ┌──┴──┐     ┌──┴──┐
    │ H_A │     │ H_B │         │ H_C │     │ H_D │
    └──┬──┘     └──┬──┘         └──┬──┘     └──┬──┘
       │           │               │           │
    ┌──┴──┐     ┌──┴──┐         ┌──┴──┐     ┌──┴──┐
    │  A  │     │  B  │         │  C  │     │  D  │
    └─────┘     └─────┘         └─────┘     └─────┘

    H_A = hash(A)
    H_B = hash(B)
    H_AB = hash(H_A + H_B)
    H_ABCD = hash(H_AB + H_CD)

Why Merkle Trees Matter

Efficiency Gains

Without Merkle trees, proving a transaction exists requires the entire block:

Download all transactions
Check if target transaction is present
Data: O(n) where n = number of transactions

With Merkle trees:

Download only the Merkle proof
Verify by hashing up to root
Data: O(log n)

Text

For 1 million transactions:
Without Merkle: ~1,000,000 hashes to verify
With Merkle:    ~20 hashes to verify

That's 50,000x more efficient!

Use Cases

SPV (Simple Payment Verification): Light wallets verify transactions without full blockchain
State synchronization: Quickly identify which parts of state differ
Data integrity: Detect tampering in any part of dataset
Efficient updates: Change one leaf, update only O(log n) nodes

Building a Merkle Tree

Step-by-Step Construction

TypeScript

import { createHash } from "crypto";

function sha256(data: Buffer): Buffer {
  return createHash("sha256").update(data).digest();
}

// Combine two hashes
function hashPair(left: Buffer, right: Buffer): Buffer {
  return sha256(Buffer.concat([left, right]));
}

function buildMerkleTree(leaves: Buffer[]): Buffer[][] {
  if (leaves.length === 0) {
    return [[Buffer.alloc(32)]]; // Empty tree
  }

  // Start with leaf hashes
  let currentLevel = leaves.map((leaf) => sha256(leaf));
  const tree: Buffer[][] = [currentLevel];

  // Build tree bottom-up
  while (currentLevel.length > 1) {
    const nextLevel: Buffer[] = [];

    for (let i = 0; i < currentLevel.length; i += 2) {
      const left = currentLevel[i];
      // If odd number of nodes, duplicate the last one
      const right = currentLevel[i + 1] ?? currentLevel[i];
      nextLevel.push(hashPair(left, right));
    }

    tree.push(nextLevel);
    currentLevel = nextLevel;
  }

  return tree;
}

// Get the root hash
function getMerkleRoot(leaves: Buffer[]): Buffer {
  const tree = buildMerkleTree(leaves);
  return tree[tree.length - 1][0];
}

Example with Real Data

TypeScript

// Transactions as buffers
const transactions = [
  Buffer.from("Alice pays Bob 5 BTC"),
  Buffer.from("Bob pays Carol 2 BTC"),
  Buffer.from("Carol pays Dave 1 BTC"),
  Buffer.from("Dave pays Eve 0.5 BTC"),
];

const tree = buildMerkleTree(transactions);

console.log(
  "Leaf level:",
  tree[0].map((h) => h.toString("hex").slice(0, 8)),
);
console.log(
  "Level 1:",
  tree[1].map((h) => h.toString("hex").slice(0, 8)),
);
console.log("Root:", tree[2][0].toString("hex").slice(0, 8));

// Output:
// Leaf level: ['a1b2c3d4', 'e5f6a7b8', 'c9d0e1f2', 'a3b4c5d6']
// Level 1: ['f7e8d9c0', 'b1a2f3e4']
// Root: ['d5c6b7a8']

Merkle Proofs

A Merkle proof proves that a leaf is part of the tree without revealing other leaves.

Proof Structure

Text

To prove Transaction B is in the tree:

                    ┌─────────────┐
                    │    Root     │  ← Known (in block header)
                    └──────┬──────┘
                           │
             ┌─────────────┴─────────────┐
             │                           │
       ┌─────┴─────┐               ┌─────┴─────┐
       │   H_AB    │               │ * H_CD *  │  ← Proof element
       └─────┬─────┘               └───────────┘
             │
       ┌─────┴─────┐
       │           │
    ┌──┴──┐     ┌──┴──┐
    │*H_A*│     │ H_B │  ← Proof element    ← Target (we have B)
    └─────┘     └──┬──┘
                   │
                ┌──┴──┐
                │  B  │  ← Data we're proving
                └─────┘

Proof for B: [H_A, H_CD] + positions
Verifier computes:
1. H_B = hash(B)
2. H_AB = hash(H_A + H_B)
3. Root' = hash(H_AB + H_CD)
4. Check: Root' == Root

Proof Generation

TypeScript

interface MerkleProof {
  leaf: Buffer;
  leafIndex: number;
  proof: Array<{
    hash: Buffer;
    position: "left" | "right";
  }>;
  root: Buffer;
}

function generateProof(leaves: Buffer[], targetIndex: number): MerkleProof {
  const tree = buildMerkleTree(leaves);
  const proof: MerkleProof["proof"] = [];

  let index = targetIndex;

  // Walk up the tree, collecting siblings
  for (let level = 0; level < tree.length - 1; level++) {
    const isRightNode = index % 2 === 1;
    const siblingIndex = isRightNode ? index - 1 : index + 1;

    if (siblingIndex < tree[level].length) {
      proof.push({
        hash: tree[level][siblingIndex],
        position: isRightNode ? "left" : "right",
      });
    } else {
      // Odd number of nodes, sibling is self (duplicated)
      proof.push({
        hash: tree[level][index],
        position: "right",
      });
    }

    // Move to parent index
    index = Math.floor(index / 2);
  }

  return {
    leaf: leaves[targetIndex],
    leafIndex: targetIndex,
    proof,
    root: tree[tree.length - 1][0],
  };
}

Proof Verification

TypeScript

function verifyProof(proof: MerkleProof): boolean {
  let hash = sha256(proof.leaf);

  for (const step of proof.proof) {
    if (step.position === "left") {
      hash = hashPair(step.hash, hash);
    } else {
      hash = hashPair(hash, step.hash);
    }
  }

  return hash.equals(proof.root);
}

// Usage
const proof = generateProof(transactions, 1); // Prove transaction at index 1
const isValid = verifyProof(proof);
console.log("Proof valid:", isValid); // true

Merkle Trees in Bitcoin

Block Header Structure

Text

Bitcoin Block Header (80 bytes):
┌────────────────────────────────────────┐
│ Version (4 bytes)                      │
│ Previous Block Hash (32 bytes)         │
│ Merkle Root (32 bytes)  ←──────────────┤ Hash of all transactions
│ Timestamp (4 bytes)                    │
│ Difficulty Target (4 bytes)            │
│ Nonce (4 bytes)                        │
└────────────────────────────────────────┘

SPV Verification

Light clients can verify transactions with minimal data:

TypeScript

async function verifySPV(
  txHash: string,
  blockHeader: BlockHeader,
  merkleProof: MerkleProof,
): Promise<boolean> {
  // 1. Verify block header is valid (enough PoW)
  if (!verifyBlockHeader(blockHeader)) {
    return false;
  }

  // 2. Verify merkle proof leads to block's merkle root
  if (!merkleProof.root.equals(blockHeader.merkleRoot)) {
    return false;
  }

  // 3. Verify proof is valid
  if (!verifyProof(merkleProof)) {
    return false;
  }

  // 4. Verify the proven data matches our transaction
  const proofTxHash = sha256(merkleProof.leaf);
  if (proofTxHash.toString("hex") !== txHash) {
    return false;
  }

  return true;
}

Merkle Trees in Solana

Transaction Merkle

Solana blocks contain a Merkle root of transaction signatures:

Text

Entry in Solana Block:
┌─────────────────────────────────────────┐
│ PoH Hash                                │
│ Transaction Merkle Root  ←──────────────┤
│ Transactions [...]                      │
└─────────────────────────────────────────┘

Account State Merkle

Solana's bank uses Merkle-like structures for state:

Text

Account State:
┌─────────────────────────────────────────┐
│ Accounts Hash (Merkle-ish)              │
│                                         │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐    │
│ │Account A│ │Account B│ │Account C│    │
│ └─────────┘ └─────────┘ └─────────┘    │
└─────────────────────────────────────────┘

Advanced: Merkle Patricia Tries

Ethereum uses a more sophisticated structure—the Merkle Patricia Trie:

Text

Combines:
1. Merkle tree (cryptographic proofs)
2. Patricia trie (prefix compression)
3. Radix tree (path optimization)

Enables:
- Proof of account existence
- Proof of account non-existence
- Efficient storage updates

Patricia Trie Structure

Text

Storing keys: "dog", "doge", "cat"

Traditional trie:
       root
      /    \
     c      d
     |      |
     a      o
     |      |
     t      g
     *     / \
          *   e
              *

Patricia trie (compressed):
       root
      /    \
    cat    dog
     *     / \
          *  e
              *

Fewer nodes = more efficient

Performance Analysis

Merkle Tree Complexity

Operation	Time Complexity	Space Complexity
Build tree	O(n)	O(n)
Generate proof	O(log n)	O(log n)
Verify proof	O(log n)	O(log n)
Update leaf	O(log n)	O(1)
Find root	O(1)	O(1)

Practical Performance

TypeScript

// Benchmark merkle operations
function benchmark() {
  const sizes = [100, 1000, 10000, 100000];

  for (const size of sizes) {
    const leaves = Array.from({ length: size }, (_, i) =>
      Buffer.from(`Transaction ${i}`),
    );

    // Build tree
    const buildStart = performance.now();
    const root = getMerkleRoot(leaves);
    const buildTime = performance.now() - buildStart;

    // Generate proof
    const proofStart = performance.now();
    const proof = generateProof(leaves, Math.floor(size / 2));
    const proofTime = performance.now() - proofStart;

    // Verify proof
    const verifyStart = performance.now();
    const valid = verifyProof(proof);
    const verifyTime = performance.now() - verifyStart;

    console.log(`Size: ${size}`);
    console.log(`  Build: ${buildTime.toFixed(2)}ms`);
    console.log(`  Proof: ${proofTime.toFixed(2)}ms`);
    console.log(`  Verify: ${verifyTime.toFixed(2)}ms`);
    console.log(`  Proof size: ${proof.proof.length} hashes`);
  }
}

Common Merkle Tree Variants

Binary Merkle Tree (Standard)

Each node has exactly 2 children
Most common in blockchains

Sparse Merkle Tree

Fixed depth, most leaves empty
Enables proofs of non-membership
Used for account states

Merkle Mountain Range

Append-only, multiple peaks
Good for growing datasets
Used in some blockchains for transaction history

Key Takeaways

Merkle trees summarize large datasets in a single hash
Merkle proofs enable O(log n) verification
Blockchain headers store Merkle roots, not full data
SPV clients verify transactions without full blockchain
Space efficiency enables light clients and sharding

Common Mistakes

Forgetting odd-node handling: Trees with odd leaves need duplication or other handling
Order dependency: Merkle roots are order-dependent; same data in different order = different root
Not salting leaves: Without per-leaf salt, identical leaves produce identical hashes

Try It Yourself

Build a Merkle tree: Implement the algorithms above and verify a proof manually
Calculate proof size: For a block with 2,048 transactions, how many bytes is a Merkle proof? (Assuming SHA-256)
Explore real proofs: Use a block explorer API to fetch a Merkle proof for a real Bitcoin transaction

Next: Cryptographic Hashing - Deep dive into the hash functions that power Merkle trees and blockchain security.