Distributed Systems

Raft

Distributed leader election system implementing core Raft consensus mechanics.

Rust, Tokio, Axum, Consensus, Distributed Systems

<1 second

Re-election Time

3 nodes

Cluster Size

150–300ms

Election Timeout

50ms

Heartbeat Interval

01

Problem

Problem statement, constraint shape, and the gap this project explores.

Problem Statement

How do you guarantee exactly one leader in a distributed cluster under concurrent failures?

Challenge

Built a distributed leader election system implementing core Raft consensus mechanics, including quorum-based voting, randomized election timeouts, and heartbeat stabilization over HTTP RPC.

Why Existing Approaches Failed

Distributed systems require consensus: multiple nodes must agree on a single leader even when messages are delayed, nodes crash, and elections happen simultaneously. Raft was designed to be easier to understand than earlier consensus algorithms such as Paxos, but implementing it correctly still requires handling a surprising number of edge cases around term numbering, vote splitting, and heartbeat timing.

02

Constraints

Non-negotiable boundaries that shaped the implementation.

Nodes

3-node cluster

Transport

HTTP RPC over localhost

Failure Model

Node crash and restart

Election Timeout

Randomized to prevent split votes

Convergence

<1 second after leader failure

03

Architecture

The primary design surface: flow, subsystem roles, and state boundaries.

Architecture Brief

Each node runs an Axum HTTP server exposing /vote and /heartbeat endpoints. Nodes maintain Raft state (Follower, Candidate, Leader) and a current term. A Tokio interval drives election timeout checks. The leader broadcasts heartbeats to prevent unnecessary elections.
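The per-node state described above could be modeled roughly as follows. This is a minimal std-only sketch, not the project's actual code: the names `Role`, `NodeState`, `voted_for`, `last_heartbeat`, and `election_due` are assumptions made for illustration, and the Axum handlers and Tokio interval are left out.

```rust
use std::time::{Duration, Instant};

/// The three Raft roles a node can occupy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Follower,
    Candidate,
    Leader,
}

/// In-memory election state for one node (hypothetical field names).
#[derive(Debug)]
pub struct NodeState {
    pub id: u64,
    pub role: Role,
    pub current_term: u64,
    pub voted_for: Option<u64>,
    pub last_heartbeat: Instant,
}

impl NodeState {
    /// Every node starts as a follower in term 0.
    pub fn new(id: u64) -> Self {
        Self {
            id,
            role: Role::Follower,
            current_term: 0,
            voted_for: None,
            last_heartbeat: Instant::now(),
        }
    }

    /// Called by the periodic timer check: has the leader been silent
    /// for at least `timeout`? Leaders never time themselves out.
    pub fn election_due(&self, now: Instant, timeout: Duration) -> bool {
        self.role != Role::Leader
            && now.duration_since(self.last_heartbeat) >= timeout
    }
}
```

In a running node, a struct like this would typically sit behind a shared lock so that the HTTP handlers and the timer task can both update it.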

Execution Flow

01

Election timeout expires

02

Increment term, vote for self

03

Broadcast RequestVote

04

Collect votes

05

If majority: become Leader

06

Broadcast Heartbeat

07

Followers reset timeout
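The candidate side of this flow can be sketched as a small state transition. The code below is illustrative only: `Node`, `on_election_timeout`, and `on_vote_granted` are hypothetical names, networking is omitted, and the candidate is assumed to count its own vote when it starts the election.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Follower,
    Candidate,
    Leader,
}

struct Node {
    cluster_size: usize,
    role: Role,
    term: u64,
    votes: usize,
}

impl Node {
    fn new(cluster_size: usize) -> Self {
        Node { cluster_size, role: Role::Follower, term: 0, votes: 0 }
    }

    /// Steps 01-03: on timeout, bump the term, become a candidate, count
    /// our own vote, and return the term to send in the RequestVote broadcast.
    fn on_election_timeout(&mut self) -> u64 {
        self.term += 1;
        self.role = Role::Candidate;
        self.votes = 1; // the candidate's own vote
        self.term
    }

    /// Steps 04-05: count one granted vote. Returns true only when this
    /// vote completes a majority and the node becomes leader, at which
    /// point the caller would start broadcasting heartbeats (step 06).
    fn on_vote_granted(&mut self, vote_term: u64) -> bool {
        // Ignore votes from older elections or after the race is over.
        if self.role != Role::Candidate || vote_term != self.term {
            return false;
        }
        self.votes += 1;
        if self.votes >= self.cluster_size / 2 + 1 {
            self.role = Role::Leader;
            return true;
        }
        false
    }
}
```

In a 3-node cluster, one granted vote in addition to the candidate's own is enough to win.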

01

Raft State Machine

Manages Follower → Candidate → Leader transitions with term tracking.

02

Election Timer

Randomized timeout (150–300ms) triggers candidacy if no heartbeat received.
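One way to pick a timeout in that window is to map a random sample onto the 150 to 300 ms range. The sketch below is std-only and takes the sample as a parameter; the function name `election_timeout` and the constants are assumptions, and the source of randomness (for example a random-number crate) is left to the caller.

```rust
use std::time::Duration;

const ELECTION_TIMEOUT_MIN_MS: u64 = 150;
const ELECTION_TIMEOUT_MAX_MS: u64 = 300;

/// Maps an arbitrary random sample onto the inclusive 150-300 ms window.
/// The caller supplies the sample; how it is generated is not shown here.
fn election_timeout(random_sample: u64) -> Duration {
    let span = ELECTION_TIMEOUT_MAX_MS - ELECTION_TIMEOUT_MIN_MS + 1;
    Duration::from_millis(ELECTION_TIMEOUT_MIN_MS + random_sample % span)
}
```

With a 50 ms heartbeat interval, even the shortest timeout (150 ms) spans three heartbeat intervals, so a single delayed heartbeat does not trigger an election.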

03

Vote RPC

HTTP POST /vote — candidates request votes from all peers.
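The receiving side of a vote request has to enforce one vote per term. A minimal sketch of that rule, assuming hypothetical `VoteRequest`, `VoteResponse`, and `Voter` types (the real request body and handler are not shown in this write-up):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VoteRequest {
    term: u64,
    candidate_id: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VoteResponse {
    term: u64,
    vote_granted: bool,
}

#[derive(Debug, Default)]
struct Voter {
    current_term: u64,
    voted_for: Option<u64>,
}

impl Voter {
    fn handle_vote(&mut self, req: VoteRequest) -> VoteResponse {
        // A newer term means we have not voted in that term yet.
        if req.term > self.current_term {
            self.current_term = req.term;
            self.voted_for = None;
        }
        // Grant only for the current term, and to at most one candidate
        // per term. Repeating a vote for the same candidate is allowed,
        // which keeps retried requests idempotent.
        let granted = req.term == self.current_term
            && self.voted_for.map_or(true, |id| id == req.candidate_id);
        if granted {
            self.voted_for = Some(req.candidate_id);
        }
        VoteResponse { term: self.current_term, vote_granted: granted }
    }
}
```

Full Raft adds a log up-to-date check to the grant condition and has a node step down to follower when it sees a newer term; both are omitted here.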

04

Heartbeat RPC

HTTP POST /heartbeat — leader suppresses elections across followers.
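On the follower side, accepting a heartbeat amounts to a term check followed by a timer reset. The following sketch uses hypothetical names (`Node`, `handle_heartbeat`) and passes the current time in explicitly so the logic is easy to test:

```rust
use std::time::Instant;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Follower,
    Candidate,
    Leader,
}

struct Node {
    role: Role,
    current_term: u64,
    last_heartbeat: Instant,
}

impl Node {
    /// Returns true if the heartbeat was accepted.
    fn handle_heartbeat(&mut self, leader_term: u64, now: Instant) -> bool {
        // A heartbeat from an older term comes from a stale leader: reject it.
        if leader_term < self.current_term {
            return false;
        }
        self.current_term = leader_term;
        // Candidates (and leaders from older terms) step down.
        self.role = Role::Follower;
        // Recording the arrival time is what resets the election timer.
        self.last_heartbeat = now;
        true
    }
}
```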

05

Quorum Checker

Grants leadership only when majority (⌊N/2⌋+1) votes received.
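The majority rule reduces to integer arithmetic. A short sketch, with `quorum` and `has_majority` as illustrative names:

```rust
/// Smallest number of votes that forms a majority: floor(N/2) + 1.
fn quorum(cluster_size: usize) -> usize {
    cluster_size / 2 + 1
}

/// True when `votes` (including the candidate's own) reaches a majority.
fn has_majority(votes: usize, cluster_size: usize) -> bool {
    votes >= quorum(cluster_size)
}
```

For the 3-node cluster here, `quorum(3)` is 2, so the cluster can still elect a leader with one node down. An even-sized cluster of 4 needs 3 votes and tolerates no more failures than a cluster of 3.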

04

Engineering Tradeoffs

Design review notes: what was optimized and what was deliberately left behind.

EDR-01
Decision

HTTP vs raw TCP for RPC

Why Chosen

HTTP simplifies request routing, debugging, and future extension. For a 3-node local cluster, the latency difference is negligible.

Alternative Rejected

Raw TCP, which would have offered lower latency

Impact

HTTP via Axum

EDR-02
Decision

Randomized timeouts vs fixed

Why Chosen

Identical fixed timeouts make nodes start elections at the same moment: every candidate votes for itself and no majority forms. Randomization breaks that symmetry.

Alternative Rejected

Fixed timeouts, which would have given deterministic election timing

Impact

Randomized election timeouts (150–300ms)

EDR-03
Decision

In-memory state vs persistent log

Why Chosen

The implementation focuses on leader election, not log replication. Persistent state is required for full Raft but out of scope here.

Alternative Rejected

A persistent log, which would have allowed crash recovery of committed log entries

Impact

In-memory state only

05

Failure Modes

Incident-style notes for the ways the design can break.

Split vote

FM-01
Impact

If all nodes become candidates simultaneously, each votes for itself and no majority forms.

Mitigation

Randomized timeouts make simultaneous elections statistically rare.

IPv4/IPv6 mismatch

FM-02
Impact

Axum bound to 0.0.0.0 (IPv4) while peer URLs used ::1 (IPv6), which caused silent connection failures.

Mitigation

Resolved by normalizing all addresses to 127.0.0.1.
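One way to enforce that normalization in code is to rewrite any loopback peer address to IPv4 loopback before building peer URLs. This is a sketch under assumptions: the function name `normalize_peer` is hypothetical, the ports in the usage note are made up, and the write-up does not say how the project actually applied the fix.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Rewrites any loopback peer address (127.0.0.1 or ::1) to IPv4
/// loopback, keeping the port. Non-loopback addresses pass through.
fn normalize_peer(addr: SocketAddr) -> SocketAddr {
    if addr.ip().is_loopback() {
        SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), addr.port())
    } else {
        addr
    }
}
```

For example, a peer configured as `[::1]:8001` would be normalized to `127.0.0.1:8001`, which matches a server listening on IPv4.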

Term regression on restart

FM-03
Impact

A restarted node comes back with term 0, so it accepts heartbeats from any leader, including a stale one from an old term.

Mitigation

Not mitigated in this implementation; full Raft requires persisting currentTerm to disk.
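Persisting the term could look roughly like the sketch below, which is not part of the project. The helper names `save_term` and `load_term` and the one-number text file format are assumptions. Full Raft also persists votedFor alongside currentTerm, and a durable version would flush the file to disk before renaming it.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes the term to a temporary file, then renames it over the real
/// one so a crash mid-write does not leave a truncated file behind.
/// Note: this does not call fsync, so it is not fully durable.
fn save_term(path: &Path, term: u64) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, term.to_string())?;
    fs::rename(&tmp, path)
}

/// Reads the stored term; a missing file means a fresh node (term 0).
fn load_term(path: &Path) -> io::Result<u64> {
    match fs::read_to_string(path) {
        Ok(text) => text
            .trim()
            .parse::<u64>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(0),
        Err(e) => Err(e),
    }
}
```

A node would call `load_term` once at startup and `save_term` before responding to any RPC that changed its term, so that a restart never moves the term backwards.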

Heartbeat lost under load

FM-04
Impact

Under high CPU load, heartbeat HTTP requests timed out, and the missed heartbeats triggered unnecessary elections.

Mitigation

Mitigated with explicit timeout configuration on the RPC clients.

06

Benchmarks

Environment first, numbers second. Metrics should be inspectable, not ornamental.

Test Environment
Runtime

Rust / Tokio

Workload

3-node cluster

Stack

Rust, Tokio, Axum, Consensus

Scope

Distributed Systems

Evidence

Project-level benchmark notes

Performance Results
Re-election Time

<1 second

Cluster Size

3 nodes

Election Timeout

150–300ms (randomized)

Heartbeat Interval

50ms

Quorum

2 of 3 nodes

07

Lessons Learned

Engineering takeaways from the implementation, including remaining work.

01

Raft looks simple on paper but has many subtle correctness requirements, especially around term numbering and vote grant conditions.

02

Network address configuration errors (IPv4/IPv6 mismatches) produce silent failures that look like algorithm bugs.

03

Randomized timeouts are essential and elegant: a small implementation detail that prevents a catastrophic failure mode.

04

Future: implement log replication and persistence to complete the full Raft protocol and support state machine commands.

Akshat