Gossip Protocol

Solana uses a gossip protocol for peer-to-peer communication between nodes. This decentralized communication layer enables cluster discovery, data propagation, and network health monitoring without relying on central coordination.

What Is Gossip?

Gossip protocols spread information like rumors in a social network:

Text
GOSSIP SPREADING PATTERN
════════════════════════

Round 0: Node A has info
    [A]--info

Round 1: A tells B and C
    [A]-->[B]
     └──>[C]

Round 2: A,B,C each tell 2 more
    [A]-->[D]     [B]-->[F]     [C]-->[H]
     └──>[E]      └──>[G]       └──>[I]

Round 3: Exponential spread continues
    ...18 more nodes learn (9 informed nodes × 2 each)

Round N: Entire network knows

Time to full propagation: O(log n) rounds
If the informed set merely doubles each round:
For 1,000 nodes: ~10 rounds
For 1M nodes: ~20 rounds
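
The round counts above follow from geometric growth. The sketch below assumes an idealized model in which every informed node tells `fanout` previously uninformed nodes each round, with no duplicates and no message loss. `roundsToReach` is a hypothetical helper written for this explanation, not part of any Solana library.

```typescript
// Idealized model: every informed node informs `fanout` new nodes per round,
// so the informed set grows by a factor of (1 + fanout) each round.
function roundsToReach(nodeCount: number, fanout: number): number {
  if (nodeCount < 1 || fanout < 1) {
    throw new RangeError("nodeCount and fanout must be >= 1");
  }
  let informed = 1; // Round 0: only the origin knows
  let rounds = 0;
  while (informed < nodeCount) {
    informed += informed * fanout;
    rounds++;
  }
  return rounds;
}

console.log(roundsToReach(1_000, 1)); // 10 (doubling per round)
console.log(roundsToReach(1_000_000, 1)); // 20
console.log(roundsToReach(1_000, 2)); // 7 (tripling, as drawn in the diagram)
```

Doubling (fanout 1) reproduces the ~10 and ~20 rounds quoted above. With the fanout of 2 drawn in the diagram the counts drop to 7 and 13. Real gossip needs somewhat more rounds than this model because randomly chosen peers are often already informed.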

Why Gossip?

Property         Gossip Advantage
Fault tolerance  No single point of failure
Scalability      O(log n) propagation
Resilience       Works with node churn
Simplicity       No leader election needed
Bandwidth        Even distribution

Solana's Gossip Implementation

CRDS (Cluster Replicated Data Store)

Solana's gossip is built on CRDS:

TypeScript
interface CRDSValue {
  // Who created this value
  from: PublicKey;

  // Signature proving authenticity
  signature: Signature;

  // Wall-clock timestamp (milliseconds) set by the creator
  wallclock: number;

  // The actual data
  data: CRDSData;
}

type CRDSData =
  | ContactInfo // Node addresses
  | Vote // Consensus votes
  | LowestSlot // Pruning info
  | SnapshotHashes // State snapshots
  | EpochSlots // Slot availability
  | LegacyVersion // Software version
  | NodeInstance; // Instance identifier
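
A replicated store like this needs a rule for resolving two values from the same origin. The sketch below assumes values are keyed by origin plus data kind and that the value with the larger `wallclock` wins. The `CrdsTable` class, the `kind` field, and the string pubkeys are simplifications introduced here for illustration; they are not Solana's actual types.

```typescript
// Simplified stand-in for a CRDS value (hypothetical shape).
interface SimpleCrdsValue {
  from: string; // Origin pubkey as a string
  kind: string; // e.g. "ContactInfo", "Vote"
  wallclock: number; // Creator's wall-clock time in ms
  payload: unknown;
}

class CrdsTable {
  private readonly values = new Map<string, SimpleCrdsValue>();

  // One slot per (origin, kind): a newer value replaces the older one.
  private static keyFor(from: string, kind: string): string {
    return `${from}:${kind}`;
  }

  // Returns true if the value was stored, false if it was stale or a duplicate.
  upsert(value: SimpleCrdsValue): boolean {
    const key = CrdsTable.keyFor(value.from, value.kind);
    const existing = this.values.get(key);
    if (existing && existing.wallclock >= value.wallclock) {
      return false;
    }
    this.values.set(key, value);
    return true;
  }

  get(from: string, kind: string): SimpleCrdsValue | undefined {
    return this.values.get(CrdsTable.keyFor(from, kind));
  }
}

const table = new CrdsTable();
table.upsert({ from: "nodeA", kind: "ContactInfo", wallclock: 1_000, payload: "v1" });
table.upsert({ from: "nodeA", kind: "ContactInfo", wallclock: 2_000, payload: "v2" });
table.upsert({ from: "nodeA", kind: "ContactInfo", wallclock: 1_500, payload: "late" });
console.log(table.get("nodeA", "ContactInfo")?.payload); // "v2"
```

Because every replica applies the same rule, nodes that have seen the same set of values end up with the same table regardless of arrival order (ties on `wallclock` aside, which this sketch does not break).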

Contact Info

Nodes advertise themselves through ContactInfo:

TypeScript
interface ContactInfo {
  // Node identity
  pubkey: PublicKey;

  // Network addresses
  gossip: SocketAddr; // Gossip protocol
  tpu: SocketAddr; // Transaction submission
  tpuForwards: SocketAddr; // Transaction forwarding
  tvu: SocketAddr; // Block validation
  repair: SocketAddr; // Block repair requests
  serveRepair: SocketAddr; // Serve repair responses
  rpc: SocketAddr; // RPC API
  pubsub: SocketAddr; // WebSocket subscriptions

  // Metadata
  shredVersion: number; // Cluster/fork identifier; must match to peer
  wallclock: number; // Update timestamp
}

Message Types

Text
GOSSIP MESSAGE TYPES
════════════════════

1. PULL REQUEST
   └── "Give me values I don't have"
   └── Uses bloom filter for efficiency

2. PULL RESPONSE
   └── "Here are values you're missing"
   └── Responds to pull request

3. PUSH MESSAGE
   └── "Here's new info I have"
   └── Proactively shares updates

4. PRUNE MESSAGE
   └── "Stop sending me duplicates"
   └── Optimizes bandwidth

5. PING/PONG
   └── "Are you alive?"
   └── Maintains peer health
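
One way to model these message types in code is a discriminated union with an exhaustive handler. The shapes below are hypothetical: the field names (`filter`, `values`, `origins`, `token`) are chosen for illustration and do not mirror Solana's wire format.

```typescript
// Hypothetical message shapes; field names are illustrative only.
type GossipMessage =
  | { type: "pullRequest"; filter: Uint8Array }
  | { type: "pullResponse"; values: string[] }
  | { type: "push"; values: string[] }
  | { type: "prune"; origins: string[] }
  | { type: "ping"; token: number }
  | { type: "pong"; token: number };

// Exhaustive dispatch: the compiler flags any message type left unhandled.
function describeMessage(msg: GossipMessage): string {
  switch (msg.type) {
    case "pullRequest":
      return `pull request with a ${msg.filter.length}-byte filter`;
    case "pullResponse":
      return `pull response carrying ${msg.values.length} value(s)`;
    case "push":
      return `push carrying ${msg.values.length} value(s)`;
    case "prune":
      return `prune for ${msg.origins.length} origin(s)`;
    case "ping":
    case "pong":
      return `${msg.type} with token ${msg.token}`;
    default: {
      const unreachable: never = msg;
      return unreachable;
    }
  }
}

console.log(describeMessage({ type: "prune", origins: ["nodeB"] }));
// prune for 1 origin(s)
```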

How Gossip Works

Push Protocol

Proactive information sharing:

TypeScript
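// Simplified push protocol
abstract class GossipPush {
  protected activeSet = new Set<PublicKey>(); // Peers we push to
  private pendingPush: CRDSValue[] = []; // Values to push
  // Origins each peer asked us to stop sending, keyed by pubkey string
  private pruned = new Map<string, Set<string>>();

  // Supplied by the node: peer sampling and network send
  protected abstract selectRandomPeers(count: number): PublicKey[];
  protected abstract sendPushMessage(
    peer: PublicKey,
    values: CRDSValue[],
  ): Promise<void>;

  // When we have new information
  onNewValue(value: CRDSValue) {
    this.pendingPush.push(value);
  }

  // Periodic push (every ~100ms)
  async pushToActiveSet() {
    if (this.pendingPush.length === 0) return;

    // Take the batch first so values queued during the awaits are not lost
    const batch = this.pendingPush;
    this.pendingPush = [];

    // Select random subset of active set
    const targets = this.selectRandomPeers(6);

    for (const peer of targets) {
      // Skip origins this peer has pruned
      const skip = this.pruned.get(peer.toString());
      const values = batch.filter((v) => !skip?.has(v.from.toString()));
      if (values.length > 0) await this.sendPushMessage(peer, values);
    }
  }

  // Prune on duplicates
  onPruneMessage(from: PublicKey, origins: PublicKey[]) {
    // Stop pushing these origins to this peer:
    // they're getting them from someone else
    const key = from.toString();
    const skip = this.pruned.get(key) ?? new Set<string>();
    for (const origin of origins) skip.add(origin.toString());
    this.pruned.set(key, skip);
  }
}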

Pull Protocol

Request missing information:

TypeScript
// Simplified pull protocol
class GossipPull {
  private crds: Map<string, CRDSValue>;

  // Periodic pull (every ~5 seconds)
  async pullFromPeer(peer: PublicKey) {
    // Create bloom filter of what we have
    const filter = this.createBloomFilter();

    // Request what we're missing
    const response = await this.sendPullRequest(peer, filter);

    // Add new values to our store
    for (const value of response) {
      if (this.isValid(value) && this.isNewer(value)) {
        this.crds.set(this.keyFor(value), value);
      }
    }
  }

  // Bloom filter for efficiency
  createBloomFilter(): BloomFilter {
    const filter = new BloomFilter(10000, 0.01);

    // Only the keys go into the filter; the values stay local
    for (const key of this.crds.keys()) {
      filter.add(key);
    }

    return filter;
  }
}

Bloom Filters

Bloom filters enable efficient "what do I have" queries:

Text
BLOOM FILTER CONCEPT
════════════════════

Problem: How to efficiently ask "what values do you have?"
Sending all keys would be expensive.

Solution: Bloom filter (probabilistic set membership)

Properties:
├── Space efficient: ~10 bits per element
├── False positives: Possible (tunable rate)
├── False negatives: Impossible
└── Perfect for "I might have this"

Node A: "Here's my bloom filter (1000 elements, 1% false positive)"
Node B: "You're missing these 50 values" (skips values in filter)

Even with false positives, sync still converges:
├── True positive → A already has it → skipped
├── False positive → A lacks it, but B skips it → A gets it on a later pull or via push
└── Negative → A needs it → sent
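
To make the mechanics concrete, here is a minimal Bloom filter sketch and the requester/responder exchange built on it. The `SimpleBloomFilter` class, its FNV-1a hashing, and the filter size are choices made for brevity in this example; nothing here is taken from Solana's implementation.

```typescript
// Minimal Bloom filter using double hashing over two seeded FNV-1a hashes.
class SimpleBloomFilter {
  private readonly bits: Uint8Array;

  constructor(
    private readonly bitCount: number,
    private readonly hashCount: number,
  ) {
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // 32-bit FNV-1a; the seed perturbs the offset basis to get a second hash.
  private static fnv1a(key: string, seed: number): number {
    let hash = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      hash ^= key.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash;
  }

  // Bit positions for a key: h1 + i * h2 (mod bitCount).
  private positions(key: string): number[] {
    const h1 = SimpleBloomFilter.fnv1a(key, 0);
    const h2 = (SimpleBloomFilter.fnv1a(key, 0x9e3779b9) | 1) >>> 0; // odd step
    const out: number[] = [];
    for (let i = 0; i < this.hashCount; i++) {
      out.push((h1 + i * h2) % this.bitCount);
    }
    return out;
  }

  add(key: string): void {
    for (const p of this.positions(key)) {
      this.bits[p >> 3] |= 1 << (p & 7);
    }
  }

  // true = "possibly present", false = "definitely absent"
  mightContain(key: string): boolean {
    return this.positions(key).every(
      (p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0,
    );
  }
}

// Requester: summarize the keys it already holds.
const have = ["nodeA:ContactInfo", "nodeB:ContactInfo", "nodeB:Vote"];
const filter = new SimpleBloomFilter(1024, 7);
have.forEach((k) => filter.add(k));

// Responder: send only values whose keys the filter does not claim to hold.
const responderKeys = [...have, "nodeC:ContactInfo", "nodeD:Vote"];
const toSend = responderKeys.filter((k) => !filter.mightContain(k));
console.log(toSend);
```

The responder never resends a key the requester added, because added keys always test as present. A key the requester lacks can occasionally be withheld by a false positive, which is why a later pull or a push is needed to fill the gap.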

Cluster Discovery

Bootstrap Process

Text
NEW NODE JOINING
════════════════

1. Start with entrypoint(s)
   └── Well-known addresses for each cluster
   └── mainnet-beta.solana.com:8001
   └── devnet.solana.com:8001

2. Initial gossip pull
   └── Request ContactInfo from entrypoints
   └── Learn about other nodes

3. Build peer table
   └── Store all discovered nodes
   └── Select active set for pushing

4. Verify cluster identity
   └── Check genesis hash matches
   └── Check shred version matches
   └── Reject incompatible peers

5. Start normal gossip
   └── Push/pull cycle begins
   └── Discover entire cluster in minutes
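
Steps 3 and 4 can be sketched as a filter over the discovered entries. The `PeerInfo` shape and `buildPeerTable` function are hypothetical and trimmed down for illustration; the sketch only covers the shred-version check, not the genesis-hash check.

```typescript
// Hypothetical, trimmed-down ContactInfo for illustration.
interface PeerInfo {
  pubkey: string;
  gossip: string; // "host:port"
  shredVersion: number;
}

// Keep only peers on our cluster/fork, dropping ourselves and duplicates.
function buildPeerTable(
  self: PeerInfo,
  discovered: PeerInfo[],
): Map<string, PeerInfo> {
  const table = new Map<string, PeerInfo>();
  for (const peer of discovered) {
    if (peer.pubkey === self.pubkey) continue; // skip our own entry
    if (peer.shredVersion !== self.shredVersion) continue; // incompatible
    table.set(peer.pubkey, peer); // later entries overwrite earlier ones
  }
  return table;
}

const me: PeerInfo = { pubkey: "me", gossip: "10.0.0.1:8001", shredVersion: 42 };
const peers = buildPeerTable(me, [
  { pubkey: "a", gossip: "10.0.0.2:8001", shredVersion: 42 },
  { pubkey: "b", gossip: "10.0.0.3:8001", shredVersion: 7 }, // other cluster
  { pubkey: "me", gossip: "10.0.0.1:8001", shredVersion: 42 },
]);
console.log([...peers.keys()]); // ["a"]
```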

Peer Selection

TypeScript
// How nodes select gossip peers
interface PeerSelection {
  // Active set: Peers we actively push to
  activeSet: {
    size: 6; // Fanout factor
    selection: "stake_weighted_random";
    rotation: "periodic"; // Rotate to ensure connectivity
  };

  // Pull peers: Nodes we pull from
  pullPeers: {
    selection: "random_with_stake_bias";
    frequency: "5_seconds";
  };

  // Repair peers: Nodes for block repair
  repairPeers: {
    selection: "by_availability";
    considers: ["slots_available", "latency", "reliability"];
  };
}

Vote Propagation

Consensus votes spread through gossip:

Text
VOTE PROPAGATION
════════════════

Validator A votes on slot 100:
│
├── Creates Vote CRDSValue
│   ├── from: A's pubkey
│   ├── signature: A's signature
│   └── data: Vote { slot: 100, hash: "...", timestamp: ... }
│
├── Pushes to active set (6 peers)
│
└── Exponential spread (newly informed nodes per round, ideal case):
    Round 1: 6 nodes
    Round 2: 36 nodes
    Round 3: 216 nodes
    Round 4: 1296 nodes
    ...

Full cluster (~1700 validators) in ~5 rounds (4 rounds reach at most 1,555 nodes)
At 100ms per round: ~500ms for vote propagation
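
The idealized counts above assume every push lands on a node that has not yet seen the vote. The simulation below is a sketch under a different, stated assumption: each node forwards the vote once, to `fanout` uniformly random peers, when it first receives it. `simulatePush` and the seeded `mulberry32` generator are helpers written for this example; stake weighting, pruning, and pull are deliberately left out.

```typescript
// Small seeded PRNG (mulberry32) so the run is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Each node forwards a value once, to `fanout` random peers, on first receipt.
function simulatePush(nodeCount: number, fanout: number, seed: number) {
  if (fanout < 1 || fanout >= nodeCount) {
    throw new RangeError("need 1 <= fanout < nodeCount");
  }
  const rand = mulberry32(seed);
  const informed = new Set<number>([0]); // node 0 is the voter
  let frontier = [0]; // nodes that learned the value in the previous round
  let rounds = 0;

  while (frontier.length > 0) {
    rounds++;
    const next: number[] = [];
    for (const sender of frontier) {
      // Pick `fanout` distinct peers other than the sender
      const targets = new Set<number>();
      while (targets.size < fanout) {
        const peer = Math.floor(rand() * nodeCount);
        if (peer !== sender) targets.add(peer);
      }
      for (const peer of targets) {
        if (!informed.has(peer)) {
          informed.add(peer);
          next.push(peer);
        }
      }
    }
    frontier = next;
  }
  // `rounds` counts rounds until no node has anything new to forward
  return { rounds, reached: informed.size };
}

const result = simulatePush(1_700, 6, 1);
console.log(result);
```

In runs like this, push alone typically reaches nearly every node but can leave a few stragglers, since later pushes mostly hit nodes that already have the vote. Closing that gap is the job of the pull protocol.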

Vote Processing

TypeScript
// Handling received votes
class VoteReceiver {
  private processedVotes = new Set<string>();

  onVoteReceived(vote: Vote, from: PublicKey) {
    // 1. Deduplicate (keyed on the sender passed alongside the vote)
    const voteKey = `${from.toString()}-${vote.slot}`;
    if (this.processedVotes.has(voteKey)) {
      return; // Already seen
    }

    // 2. Verify signature before recording anything,
    //    so a forged vote cannot shadow the real one
    if (!this.verifyVoteSignature(vote, from)) {
      return; // Invalid vote
    }

    // 3. Check if vote is for known slot
    if (!this.knownSlot(vote.slot)) {
      return; // Haven't seen this slot yet; left unmarked so it can be retried
    }
    this.processedVotes.add(voteKey);

    // 4. Update consensus state
    this.towerBft.recordVote(vote);

    // 5. Forward to peers (push)
    this.gossip.pushValue(vote);
  }
}

Turbine: Block Propagation

While gossip handles metadata, Turbine handles block data:

Text
TURBINE vs GOSSIP
═════════════════

Gossip:
├── Small data: ContactInfo, Votes, Metadata
├── Flat peer-to-peer mesh (no fixed tree)
├── Eventual consistency OK
└── ~1-5 KB messages

Turbine:
├── Large data: Block shreds
├── Tree structure for efficiency
├── Latency critical
└── ~1 KB shreds, many per block

Turbine Tree:
                    [Leader]
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
       [Layer 1]   [Layer 1]   [Layer 1]
          │            │            │
     ┌────┼────┐  ┌────┼────┐  ┌────┼────┐
     ▼    ▼    ▼  ▼    ▼    ▼  ▼    ▼    ▼
   [L2] [L2] [L2] ...

Propagation: O(log n) steps
Bandwidth per node: O(1) - only forward to children
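
The O(log n) depth can be checked with a few lines of arithmetic. This sketch assumes a complete tree in which every node forwards to exactly `fanout` children, and it uses the fanout of 3 from the diagram. `turbineLayers` and `childrenOf` are hypothetical helpers; Solana's actual fanout and node ordering are not specified in this section.

```typescript
// Number of layers below the leader needed to cover `nodeCount` nodes
// (leader excluded) when every node forwards to `fanout` children.
function turbineLayers(nodeCount: number, fanout: number): number {
  if (nodeCount < 1 || fanout < 2) {
    throw new RangeError("need nodeCount >= 1 and fanout >= 2");
  }
  let covered = 0;
  let layerSize = 1; // the leader
  let layers = 0;
  while (covered < nodeCount) {
    layerSize *= fanout; // each node in the previous layer has `fanout` children
    covered += layerSize;
    layers++;
  }
  return layers;
}

// Children of the node at `index` when nodes are laid out layer by layer
// (index 0 is the leader, `totalNodes` includes it), like an array-backed heap.
function childrenOf(index: number, fanout: number, totalNodes: number): number[] {
  const children: number[] = [];
  for (let k = 1; k <= fanout; k++) {
    const child = index * fanout + k;
    if (child < totalNodes) children.push(child);
  }
  return children;
}

console.log(turbineLayers(1_700, 3)); // 7
console.log(childrenOf(0, 3, 13)); // [1, 2, 3]
console.log(childrenOf(1, 3, 13)); // [4, 5, 6]
```

Each node sends a shred to at most `fanout` children no matter how large the cluster is, which is where the constant per-node bandwidth comes from. A larger fanout trades more per-node bandwidth for a shallower tree.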

Querying Gossip Data

TypeScript
import { Connection } from "@solana/web3.js";

const connection = new Connection("https://api.mainnet-beta.solana.com");

// Get cluster nodes (from gossip ContactInfo)
const clusterNodes = await connection.getClusterNodes();

console.log(`Cluster has ${clusterNodes.length} nodes`);

for (const node of clusterNodes.slice(0, 5)) {
  console.log({
    pubkey: node.pubkey,
    gossip: node.gossip,
    tpu: node.tpu,
    rpc: node.rpc,
    version: node.version,
  });
}

// Get vote accounts (on-chain vote state; the votes themselves also travel via gossip)
const voteAccounts = await connection.getVoteAccounts();

console.log(`Active validators: ${voteAccounts.current.length}`);
console.log(`Delinquent validators: ${voteAccounts.delinquent.length}`);

for (const voter of voteAccounts.current.slice(0, 3)) {
  console.log({
    nodePubkey: voter.nodePubkey,
    votePubkey: voter.votePubkey,
    stake: voter.activatedStake,
    commission: voter.commission,
    lastVote: voter.lastVote,
  });
}

Network Health Monitoring

TypeScript
// Monitor gossip health indicators
async function checkNetworkHealth(connection: Connection) {
  // 1. Cluster size
  const nodes = await connection.getClusterNodes();
  const nodeCount = nodes.length;

  // 2. Active vs delinquent validators
  const voteAccounts = await connection.getVoteAccounts();
  const activeCount = voteAccounts.current.length;
  const delinquentCount = voteAccounts.delinquent.length;

  // 3. Slot progression (gossip enables coordination)
  const slot1 = await connection.getSlot();
  await new Promise((r) => setTimeout(r, 5000));
  const slot2 = await connection.getSlot();
  const slotsPerSecond = (slot2 - slot1) / 5;

  // 4. Recent block production
  const [sample] = await connection.getRecentPerformanceSamples(1);
  const validatorCount = activeCount + delinquentCount;

  return {
    nodeCount,
    validators: {
      active: activeCount,
      delinquent: delinquentCount,
      // Guard against an empty validator list
      healthRatio: validatorCount > 0 ? activeCount / validatorCount : 0,
    },
    performance: {
      slotsPerSecond,
      expectedSlotsPerSecond: 2.5, // 400ms per slot
      // No sample available means no throughput figure
      tps: sample ? sample.numTransactions / sample.samplePeriodSecs : 0,
    },
  };
}

Gossip Optimization

Stake-Weighted Selection

Higher-stake nodes get more gossip traffic:

TypeScript
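// Stake-weighted peer selection (sampling without replacement)
function selectGossipPeers(
  allPeers: { pubkey: PublicKey; stake: number }[],
  count: number,
): PublicKey[] {
  const remaining = [...allPeers];
  let remainingStake = remaining.reduce((sum, p) => sum + p.stake, 0);
  const selected: PublicKey[] = [];

  while (selected.length < count && remaining.length > 0 && remainingStake > 0) {
    // Weighted random selection among peers not yet chosen
    let target = Math.random() * remainingStake;
    let index = remaining.length - 1; // Fallback guards against rounding error

    for (let i = 0; i < remaining.length; i++) {
      target -= remaining[i].stake;
      if (target < 0) {
        index = i;
        break;
      }
    }

    // Remove the winner so it cannot be picked twice
    const [peer] = remaining.splice(index, 1);
    remainingStake -= peer.stake;
    selected.push(peer.pubkey);
  }

  return selected;
}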

// Benefits of stake-weighted gossip:
// 1. High-stake (important) nodes stay well-connected
// 2. Vote propagation prioritizes validators
// 3. Sybil resistance - can't cheaply create many peers

Prune Optimization

Text
PRUNE MESSAGES
══════════════

Problem: Duplicate messages waste bandwidth

Node A pushes its values to B and C
Node B forwards A's values to C
Node C receives A's values twice: from A and from B

Solution: Prune messages

C to B: "Prune origins [A]"
Meaning: "I'm already getting A's data elsewhere"

B stops forwarding A's data to C
Network self-optimizes toward a spanning tree per origin
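
On the receiving side, pruning comes down to remembering which relayer first delivered each origin's data. The `PruneTracker` below is a hypothetical sketch that prunes on the very first duplicate; a production implementation would likely tolerate some redundancy before pruning.

```typescript
// Receiver-side bookkeeping (hypothetical): remember which relayer first
// delivered each origin's data, and ask later relayers to stop.
class PruneTracker {
  // origin pubkey -> relayer we accept that origin's data from
  private readonly preferredRelayer = new Map<string, string>();

  // Returns the origins to put in a prune message addressed to `relayer`
  // (empty if nothing needs pruning).
  onPush(relayer: string, origin: string): string[] {
    const preferred = this.preferredRelayer.get(origin);
    if (preferred === undefined) {
      this.preferredRelayer.set(origin, relayer);
      return [];
    }
    return preferred === relayer ? [] : [origin];
  }
}

// A node receives origin O's value via relayer R1, then again via R2.
const tracker = new PruneTracker();
console.log(tracker.onPush("R1", "O")); // []
console.log(tracker.onPush("R2", "O")); // ["O"]: ask R2 to prune origin O
console.log(tracker.onPush("R1", "O")); // []: R1 stays the chosen path
```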

Security Considerations

Eclipse Attacks

Text
ECLIPSE ATTACK
══════════════

Attack: Surround a node with malicious peers
Result: Victim only sees attacker's view of network

Prevention in Solana:
├── Stake-weighted peer selection
│   └── Attackers need significant stake
│
├── Diverse peer selection
│   └── Include peers from different IP ranges
│
├── Outbound connections
│   └── Node chooses who to connect to
│
└── Signature verification
    └── Can't forge messages from validators
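
The "different IP ranges" idea can be sketched as a cap on how many selected peers share a network prefix. Grouping by /24 IPv4 prefix and the `maxPerSubnet` cap are assumptions made for illustration; the text above does not say how Solana measures IP diversity, and `diversifyBySubnet` is a hypothetical helper.

```typescript
// Cap how many peers may share the same /24 IPv4 prefix (hypothetical policy).
function diversifyBySubnet(
  candidates: { pubkey: string; ip: string }[],
  maxPerSubnet: number,
): string[] {
  const perSubnet = new Map<string, number>();
  const chosen: string[] = [];
  for (const peer of candidates) {
    const subnet = peer.ip.split(".").slice(0, 3).join("."); // first 3 octets
    const used = perSubnet.get(subnet) ?? 0;
    if (used < maxPerSubnet) {
      perSubnet.set(subnet, used + 1);
      chosen.push(peer.pubkey);
    }
  }
  return chosen;
}

console.log(
  diversifyBySubnet(
    [
      { pubkey: "a", ip: "203.0.113.10" },
      { pubkey: "b", ip: "203.0.113.11" }, // same /24 as "a"
      { pubkey: "c", ip: "198.51.100.7" },
    ],
    1,
  ),
); // ["a", "c"]
```

An attacker who controls many addresses in one range then gains at most `maxPerSubnet` slots in the victim's peer set from that range.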

Spam Prevention

TypeScript
// Rate limiting gossip messages
interface GossipRateLimits {
  // Push messages
  pushPerSecond: 100;
  pushBurstSize: 200;

  // Pull requests
  pullRequestsPerMinute: 20;

  // Per-origin limits
  valuesPerOriginPerMinute: 50;

  // Size limits
  maxMessageSize: 65_535; // bytes (just under 64 KiB)
  maxValuesPerPush: 100;
}

// Invalid message handling
function handleGossipMessage(msg: GossipMessage, from: SocketAddr) {
  // Check rate limits
  if (rateLimiter.isExceeded(from)) {
    return; // Drop message
  }

  // Verify signatures
  if (!verifySignature(msg)) {
    banPeer(from, "1_hour");
    return;
  }

  // Check message freshness
  if (msg.wallclock < Date.now() - 10_000) {
    return; // Too old, likely replay
  }

  // Process valid message
  processMessage(msg);
}
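
The `rateLimiter.isExceeded` call above could be backed by a token bucket, which allows short bursts while enforcing a sustained rate. This is one possible sketch, not a description of Solana's limiter; `TokenBucketLimiter` and its injectable clock are hypothetical, and the 100/200 figures are simply the push limits from the interface above.

```typescript
// Token bucket per sender: `ratePerSecond` sustained, `burstSize` peak.
class TokenBucketLimiter {
  private readonly buckets = new Map<string, { tokens: number; last: number }>();

  constructor(
    private readonly ratePerSecond: number,
    private readonly burstSize: number,
    private readonly now: () => number = Date.now, // injectable clock (ms)
  ) {}

  // Consumes one token; returns true when the sender is over its limit.
  isExceeded(sender: string): boolean {
    const t = this.now();
    const bucket = this.buckets.get(sender) ?? { tokens: this.burstSize, last: t };
    // Refill in proportion to elapsed time, never above the burst size
    const refill = ((t - bucket.last) / 1000) * this.ratePerSecond;
    bucket.tokens = Math.min(this.burstSize, bucket.tokens + refill);
    bucket.last = t;
    this.buckets.set(sender, bucket);
    if (bucket.tokens < 1) return true;
    bucket.tokens -= 1;
    return false;
  }
}

let clock = 0; // fake clock so the example is deterministic
const limiter = new TokenBucketLimiter(100, 200, () => clock);
let accepted = 0;
for (let i = 0; i < 250; i++) {
  if (!limiter.isExceeded("10.0.0.9:8001")) accepted++;
}
console.log(accepted); // 200: the burst allowance
clock += 1_000; // one second later, 100 tokens have been refilled
console.log(limiter.isExceeded("10.0.0.9:8001")); // false
```

Keying the buckets by sender address keeps one noisy peer from exhausting the budget of others.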

Debugging Gossip

Bash
# View gossip stats (if running a validator)
solana-gossip spy --entrypoint mainnet-beta.solana.com:8001

# Check node's view of cluster
solana gossip

# Verify the cluster is confirming transactions (needs a funded keypair)
solana ping

# Check gossip port accessibility
nc -zv your-node.com 8001

# Monitor gossip traffic (requires validator access)
solana-validator monitor

Key Takeaways

  1. Gossip enables decentralized coordination without central servers
  2. Push + Pull protocols ensure both fast propagation and consistency
  3. Bloom filters make pull requests bandwidth-efficient
  4. Stake-weighted selection provides Sybil resistance
  5. Turbine handles blocks while gossip handles metadata

Next: Tower BFT - Solana's consensus mechanism built on gossip.