HashiCorp’s Raft is a mature implementation of the Raft protocol, but its codebase has some readability challenges. This article covers the cleanup process I went through to improve clarity and maintainability. The code is published on GitHub.
The first notable issue is the absence of the state pattern. Given Raft’s finite-state-machine nature, it is well suited to this design, yet HashiCorp’s implementation does not follow it. Instead, it starts and stops separate loops for the follower, candidate, and leader states, duplicating the main loop logic across them. By adopting a proper state pattern, where each state directly handles its input, I was able to make the main loop more distinct and eliminate the redundancy.
State transitions in Raft can be triggered by multiple goroutines, making them prone to race conditions. Previously, a node transitioned states by directly updating its state and term, forcing the current state's loop to exit and a new one to start. I found that this deviates from the state pattern and makes transitions harder to track.
In my design, transitions are explicitly managed through `dispatchTransition`, which sends a transition message to the `receiveTransitions` loop. This loop then delegates the transition to the current state. Each state implements its own `HandleTransition` method, which directly follows Raft’s state transition diagram, ensuring that only valid transitions occur.
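To make the flow concrete, here is a minimal sketch of how such a design could look in Go. The `Transition` fields, channel names, and the `State` interface shown here are simplified illustrations, not the exact types in the repository:

```go
package raft

// Transition describes a requested state change; the fields are
// deliberately simplified for this sketch.
type Transition struct {
	To   string // "follower", "candidate", or "leader"
	Term uint64
}

// State is implemented by the follower, candidate, and leader states.
type State interface {
	HandleTransition(t Transition)
}

type Raft struct {
	state        State
	transitionCh chan Transition
	shutdownCh   chan struct{}
}

// dispatchTransition can be called from any goroutine; it only sends a
// message, so no state is mutated outside the receiving loop.
func (r *Raft) dispatchTransition(t Transition) {
	select {
	case r.transitionCh <- t:
	case <-r.shutdownCh:
	}
}

// receiveTransitions serializes all transitions and delegates each one to
// the current state, which accepts or rejects it per Raft's diagram.
func (r *Raft) receiveTransitions() {
	for {
		select {
		case t := <-r.transitionCh:
			r.state.HandleTransition(t)
		case <-r.shutdownCh:
			return
		}
	}
}
```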
Figure 1: Raft consensus algorithm.
The second issue is the confusion between `Configuration` and `Config`. The concept of `Configuration` is entirely about `Membership`. To clarify this, I replaced `Configuration` with the `Membership` struct, which manages both the latest and committed memberships. It also encapsulates all membership-related operations, such as checking, updating, and creating new memberships.

The `HandleMembershipChange` method in the leader state now utilizes these methods to either reject membership changes or accept, dispatch, and apply them. This eliminates the scattered handling logic found in functions like `configurationChangeChIfStable`.
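A rough sketch of what such a struct might look like, assuming simplified `Server` fields and an illustrative method set (the real code tracks more than this):

```go
package raft

import "sync"

// ServerID and Server are simplified stand-ins for the real types.
type ServerID string

type Server struct {
	ID      ServerID
	Address string
	IsVoter bool
}

// Membership keeps both the latest and the last committed membership,
// together with the log indices they were created at. The method set below
// is illustrative: validation, lookup, and creation of new memberships live
// here instead of being scattered across the Raft struct.
type Membership struct {
	mu             sync.Mutex
	latest         []Server
	latestIndex    uint64
	committed      []Server
	committedIndex uint64
}

// IsStable reports whether the latest membership has been committed, which
// is the precondition for accepting another membership change.
func (m *Membership) IsStable() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.latestIndex == m.committedIndex
}

// IsVoter reports whether the given server currently has a vote.
func (m *Membership) IsVoter(id ServerID) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	for _, s := range m.latest {
		if s.ID == id {
			return s.IsVoter
		}
	}
	return false
}
```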
The third issue is the lack of encapsulation in the leader state's replication logic. For example, Raft's `replicate` method takes a `followerReplication` instance and runs the replication loop, while `replicateTo` attempts to send logs to a follower up to the latest index.

To address this, I refined `followerReplication` into `peerReplication`, encapsulating all replication-related methods within it. Additionally, the leader state now directly manages its peer replication map rather than delegating it to Raft.

Here's a summary of the re-encapsulation of these methods, where `pr` stands for `peerReplication`:
| original | cleanup | intent |
| --- | --- | --- |
| `Raft.replicate` | `pr.run` | run the replication loop for a peer |
| `Raft.replicateTo` | `pr.replicate` | send the latest logs |
| `Raft.heartbeat` | `pr.heartbeat` | run the heartbeat loop |
| `Raft.sendLatestSnapshot` | `pr.sendLatestSnapshot` | send the latest snapshot to the peer |
| `Raft.pipelineReplicate` | `pr.runPipeline` | run replication in pipeline mode |
| `Raft.pipelineSend` | `pr.pipelineReplicate` | send the latest logs in pipeline mode |
| `Raft.pipelineDecode` | `pr.receivePipelineResponses` | loop to handle peer responses in pipeline mode |
| `Raft.startStopReplication` | `Leader.startReplication` | start a `peerReplication` for each peer, skipping those already running |
| | `Leader.startPeerReplication` | directly start a `peerReplication` |
| | `Leader.stopPeerReplication` | directly stop a `peerReplication` |
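As a rough illustration of this encapsulation, here is a sketch of a `peerReplication` that owns its own loops. The fields and bodies are placeholders; only the shape of the method set mirrors the table above:

```go
package raft

import "time"

// peerReplication owns everything needed to replicate to a single follower.
type peerReplication struct {
	peerID        string
	nextIndex     uint64
	triggerCh     chan struct{} // signals that new logs are ready
	stopCh        chan struct{}
	heartbeatTick time.Duration
}

// run is the replication loop for one peer (formerly Raft.replicate).
func (pr *peerReplication) run() {
	for {
		select {
		case <-pr.triggerCh:
			pr.replicate()
		case <-pr.stopCh:
			return
		}
	}
}

// replicate sends the latest logs up to the leader's last index
// (formerly Raft.replicateTo). The body is elided in this sketch.
func (pr *peerReplication) replicate() {}

// heartbeat runs the heartbeat loop (formerly Raft.heartbeat).
func (pr *peerReplication) heartbeat() {
	ticker := time.NewTicker(pr.heartbeatTick)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			// send an empty AppendEntries to assert leadership (elided)
		case <-pr.stopCh:
			return
		}
	}
}
```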
Helper methods like `waitForReplicationTrigger`, `waitForHeartbeatTrigger`, and `waitForBackoff` were added to eliminate repetitive `select` statements, making the replication and heartbeat logic clearer. Additionally, key methods were rewritten to eliminate `goto` statements and labels, resulting in a more linear flow that improves readability and makes the logic easier to follow.
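A sketch of what one of these helpers might look like, with illustrative channel parameters rather than the actual ones:

```go
package raft

import "time"

// waitForReplicationTrigger blocks until there are new logs to send, the
// periodic commit sync fires, or replication is stopped. It returns false
// only when the replication loop should exit. The channel parameters are
// illustrative; the real helper likely hangs off peerReplication.
func waitForReplicationTrigger(triggerCh, stopCh <-chan struct{}, syncInterval time.Duration) bool {
	timer := time.NewTimer(syncInterval)
	defer timer.Stop()
	select {
	case <-triggerCh:
		return true
	case <-timer.C:
		return true // periodic sync of the leader's commit index
	case <-stopCh:
		return false
	}
}
```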
A small but potentially confusing detail is `CommitTimeout`. Despite its name, it is actually an interval that periodically triggers replication to synchronize the leader's commit index on followers; it has nothing to do with commit timeouts. To clarify its purpose, I renamed it to `CommitSyncInterval`.
With the more powerful `Membership` and `peerReplication` structs, I was able to fully implement the staging feature. Previously, the staging step was skipped, and a new peer became a voter immediately. In my implementation, a dedicated `staging` struct ensures that only one peer can be staged at a time. It waits for `peerReplication` to sync logs and for the new membership to stabilize before promoting the peer to a voter. When a new leader takes over, it checks for any ongoing staging and completes it.
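A minimal sketch of such a `staging` struct, assuming illustrative field names and a channel-based sync signal:

```go
package raft

// staging tracks the single peer currently being staged. Only one peer may
// be staged at a time; once its logs are caught up and the new membership
// has stabilized, the leader promotes it to a voter. Fields are illustrative.
type staging struct {
	peerID   string
	active   bool
	syncedCh chan struct{} // closed by peerReplication once logs are caught up
}

// tryStage starts staging a peer, rejecting the request if another peer is
// already being staged.
func (s *staging) tryStage(peerID string) bool {
	if s.active {
		return false
	}
	s.peerID = peerID
	s.active = true
	s.syncedCh = make(chan struct{})
	return true
}

// finish clears the staging slot after the peer has been promoted, or when a
// new leader takes over and completes the in-flight staging.
func (s *staging) finish() {
	s.active = false
	s.peerID = ""
	s.syncedCh = nil
}
```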
The fourth issue is the confusion between the application state and the FSM. The actual FSM is `Raft` itself, which follows a state transition diagram. Raft has a loop that receives commits and requests from other loops, applying them to the application state. However, the application state is unlikely to be another FSM.
To address this, the `FSM` interface (responsible for applying commits, handling snapshots, and restoring state) has been refined into the `CommandsState` interface. Additionally, a `MembershipApplier` interface has been introduced to handle membership-related commits. The `AppState` struct now manages `CommandsState`, `MembershipApplier`, supporting channels, and relevant indices.

Furthermore, the `runFSM` loop has been refined into the `receiveMutations` method, simplifying its logic and making it more explicit.
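The interfaces could be sketched roughly like this; the method signatures are simplified assumptions, not the exact ones in the cleanup:

```go
package raft

// CommandsState is the application-facing interface previously called FSM:
// it applies committed commands and handles snapshot and restore. The
// signatures are simplified for this sketch.
type CommandsState interface {
	ApplyCommand(index uint64, data []byte) interface{}
	Snapshot() ([]byte, error)
	Restore(snapshot []byte) error
}

// MembershipApplier handles membership-related commits separately from
// ordinary commands.
type MembershipApplier interface {
	ApplyMembership(index uint64, membership []byte)
}

// AppState owns the application state plus the channels and indices needed
// to feed it; receiveMutations would drain mutationCh and apply each entry.
type AppState struct {
	commands     CommandsState
	memberships  MembershipApplier
	mutationCh   chan []byte // committed entries waiting to be applied
	appliedIndex uint64
}
```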
The fifth issue is the poor organization of Raft’s core code. The Raft struct is declared in `api.go`, while `raft.go` is overloaded with various responsibilities, including the follower, candidate, and leader state loops, leader step-up and step-down, the candidate's election run, and request handling.
In the refined design, the code is cleanly split based on functionality:
- `raft_api.go` – Contains only the exposed API methods that clients use to interact with Raft.
- `raft.go` – Focuses solely on defining the Raft struct and managing its critical loops, such as the main loop, the `receiveHeartbeat` loop, the `receiveTransitions` loop, and the `receiveSnapshotRequest` loop.
- `raft_builder.go` – Implements the builder pattern to facilitate flexible object creation and streamline the initialization of a new Raft node.
- State-specific files – No more state loops in `raft.go`. Each state has its own file and implements the methods that satisfy the `State` interface. Furthermore, state-specific code is defined there:
  - `state_candidate.go` – contains `runElection`.
  - `state_leader.go` – contains `stepUp` and `stepDown`.
- `raft_internals.go` – Houses all internal Raft methods that support the states and loops. These methods have been renamed and refactored for clarity. Some notable method renamings include:
| original | cleanup | intent |
| --- | --- | --- |
| `processRPC` | `handleRPC` | check the RPC type and delegate handling to the matching handler |
| `appendEntries` | `handleAppendEntries` | handle the appendEntries request |
| `requestVote` | `handleRequestVote` | handle the vote request |
| `installSnapshot` | `handleInstallSnapshot` | handle the install snapshot request |
| `timeoutNow` | `handleCandidateNow` | handle the request to transition Raft to candidate immediately |
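The dispatch shape of `handleRPC` might look roughly like this, with placeholder request types and empty handler stubs:

```go
package raft

// The request types and RPC envelope below are simplified stand-ins; the
// point is the dispatch shape of handleRPC.
type AppendEntriesRequest struct{}
type RequestVoteRequest struct{}
type InstallSnapshotRequest struct{}

type RPC struct {
	Command interface{}
}

type Raft struct{}

// handleRPC checks the RPC type and delegates to the matching handler.
func (r *Raft) handleRPC(rpc RPC) {
	switch req := rpc.Command.(type) {
	case *AppendEntriesRequest:
		r.handleAppendEntries(rpc, req)
	case *RequestVoteRequest:
		r.handleRequestVote(rpc, req)
	case *InstallSnapshotRequest:
		r.handleInstallSnapshot(rpc, req)
	default:
		// unknown command: the real code responds with an error
	}
}

// Handler bodies are elided in this sketch.
func (r *Raft) handleAppendEntries(rpc RPC, req *AppendEntriesRequest)     {}
func (r *Raft) handleRequestVote(rpc RPC, req *RequestVoteRequest)         {}
func (r *Raft) handleInstallSnapshot(rpc RPC, req *InstallSnapshotRequest) {}
```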
The sixth issue is the overly complicated transport. I found that TCP-based network transport is sufficient for both testing and real use. As a result, redundant interfaces and functions in `transport.go` and `tcp_transport.go` have been removed. Additionally, `inmem_transport.go` is dropped, as testing is now done directly with `netTransport`.

To simulate network partitions, I introduced the `ConnGetter` interface:
- `TransparentConnGetter` (used in real deployments) does not interfere with how `netTransport` connects nodes.
- `BlockableConnGetter` (used in tests) can block connections to simulate network partitions.
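A sketch of how such an interface could be defined, assuming a simple dial-based signature (the real one may carry more context):

```go
package transport

import (
	"net"
	"sync"
	"time"
)

// ConnGetter abstracts how netTransport obtains a connection to a peer.
// The interface and both implementations below are an illustrative sketch.
type ConnGetter interface {
	GetConn(target string, timeout time.Duration) (net.Conn, error)
}

// TransparentConnGetter is used in real deployments: it simply dials.
type TransparentConnGetter struct{}

func (TransparentConnGetter) GetConn(target string, timeout time.Duration) (net.Conn, error) {
	return net.DialTimeout("tcp", target, timeout)
}

// BlockableConnGetter is used in tests: connections to blocked targets fail,
// simulating a network partition.
type BlockableConnGetter struct {
	mu      sync.Mutex
	blocked map[string]bool
}

func (b *BlockableConnGetter) Block(target string)   { b.set(target, true) }
func (b *BlockableConnGetter) Unblock(target string) { b.set(target, false) }

func (b *BlockableConnGetter) set(target string, v bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.blocked == nil {
		b.blocked = map[string]bool{}
	}
	b.blocked[target] = v
}

func (b *BlockableConnGetter) GetConn(target string, timeout time.Duration) (net.Conn, error) {
	b.mu.Lock()
	isBlocked := b.blocked[target]
	b.mu.Unlock()
	if isBlocked {
		return nil, &net.OpError{Op: "dial", Net: "tcp", Err: net.ErrClosed}
	}
	return net.DialTimeout("tcp", target, timeout)
}
```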
The following redundant components have been removed: `serverAddressProvider`, `getConnFromAddressProvider`, and `getProviderAddressOrFallback`.
Several refinements were made after consolidating transport logic:
- A single `NewNetTransport` replaces multiple scattered constructors (`NewNetworkTransport`, `NewNetworkTransportWithLogger`, `NewNetworkTransportWithConfig`).
- The `backoff` struct now encapsulates exponential backoff logic, simplifying the `listen` loop (a sketch follows this list).
- `handleCommand` is split into `handleMessage` and dedicated handlers for different message types. Helper functions (`dispatchWaitRespond`, `sendUntilStop`, `receiveUntilStop`) eliminate repetitive code.
- `netConn` is renamed to `peerConn`, encapsulating related methods (`SetDeadline`, `sendMsg`, `readResp`).
- `genericRPC` is renamed to `unaryRPC`, and `sendRPC` is refactored into `streamRPC` to better reflect their behavior.
- `netPipeline` is renamed to `replicationPipeline`, making its role explicit: it only sends requests and reads responses. Running the loops is now handled by `peerReplication`.
- The transport now sends heartbeat RPCs to a dedicated heartbeat channel. Raft explicitly processes them in a loop, aligning with how RPCs are handled in `mainLoop`. The previous `heartbeatFn` and `heartbeatFnLock` fields were removed, as they added unnecessary complexity and obscured the logic.
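Here is a minimal sketch of what the `backoff` struct could look like, with illustrative constants and method names:

```go
package transport

import "time"

// backoff encapsulates the exponential backoff used by the listen loop when
// Accept fails repeatedly. The field and method names are illustrative.
type backoff struct {
	min     time.Duration
	max     time.Duration
	current time.Duration
}

// next returns how long to sleep before the next Accept attempt and doubles
// the delay, capped at max.
func (b *backoff) next() time.Duration {
	if b.current == 0 {
		b.current = b.min
	}
	d := b.current
	b.current *= 2
	if b.current > b.max {
		b.current = b.max
	}
	return d
}

// reset is called after a successful Accept.
func (b *backoff) reset() { b.current = 0 }
```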
The seventh issue is that the leader verification process is unnecessarily complex.
- When a client sends a verification request, it goes to the `mainLoop`.
- The leader registers it in each `peerReplication` and triggers an immediate heartbeat.
- After the heartbeat, `peerReplication` notifies pending verification requests of success or failure.
- Requests count the votes, and if verification fails or has enough votes, they resend themselves to Raft's `mainLoop`.
Resending verification requests to `verifyCh` complicates the logic. This mechanism was initially designed to implicitly force a leader stepdown when the heartbeat receives a response with a higher term during verification, but it is unnecessary. In the cleanup code, the leader always steps down if a heartbeat receives a higher-term response, regardless of the verification process, and as soon as a request succeeds or fails, the result is sent immediately to the client. Removing the resend logic makes `verifyRequest` handling cleaner, and `leaderState.notify` and `cleanNotify` are removed. Furthermore, the vague `notifyAll` is renamed to `verifyAll` for clarity.
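A sketch of the simplified vote counting, where the quorum math and field names are assumptions for illustration; `resultCh` is buffered so a vote never blocks:

```go
package raft

import "sync"

// verifyRequest tracks heartbeat confirmations needed to prove the node is
// still the leader. The field names and quorum math here are illustrative.
type verifyRequest struct {
	mu        sync.Mutex
	total     int // number of voters, including the leader itself
	required  int // votes needed for a quorum
	granted   int
	responded int
	done      bool
	resultCh  chan bool // buffered with capacity 1 so vote never blocks
}

// vote is called by each peerReplication after its heartbeat round trip.
// As soon as the outcome is known, the result goes straight to the client
// instead of being resent through the main loop.
func (v *verifyRequest) vote(confirmed bool) {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.done {
		return
	}
	v.responded++
	if confirmed {
		v.granted++
	}
	switch {
	case v.granted >= v.required:
		v.done = true
		v.resultCh <- true
	case v.responded-v.granted > v.total-v.required:
		// enough failures that a quorum can no longer be reached
		v.done = true
		v.resultCh <- false
	}
}
```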
Previously, leader self-verification was handled via `Raft.checkLeaderLease` in `mainLoop`. In the cleanup, I refactored it into `selfVerify` and `checkFollowerContacts`, encapsulated within `Leader`. The `selfVerify` loop now starts when a leader steps up and exits when it steps down. It continuously monitors follower contact and forces a stepdown if the leader loses contact with the majority.
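A sketch of how this loop could be structured, with an illustrative `Leader` struct and lease-based ticker; the real contact check consults each `peerReplication`:

```go
package raft

import "time"

// Leader is trimmed down to what selfVerify needs; the field names and the
// lease-based ticker are assumptions for this sketch.
type Leader struct {
	leaseTimeout time.Duration
	stepDownCh   chan struct{}
	stopCh       chan struct{} // closed when the leader steps down
}

// selfVerify runs for the lifetime of a leadership term: started on stepUp,
// exited on stepDown. It forces a stepdown when contact with a majority of
// followers is lost.
func (l *Leader) selfVerify() {
	ticker := time.NewTicker(l.leaseTimeout)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if !l.checkFollowerContacts() {
				select {
				case l.stepDownCh <- struct{}{}:
				case <-l.stopCh:
				}
				return
			}
		case <-l.stopCh:
			return
		}
	}
}

// checkFollowerContacts reports whether a majority of followers were
// contacted within the lease window; the real check would consult each
// peerReplication's last-contact timestamp. Elided here.
func (l *Leader) checkFollowerContacts() bool {
	return true
}
```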
Aside from the refinements mentioned above, improvements were also made to other areas, including the observer, voting, and testing. Every detail has been polished to enhance readability and eliminate friction and confusion.
Tests now run in parallel to improve speed, allowing them to run more frequently.
Thanks for reading through! Hope you find this useful.