06 Apr 2023

3x Faster MongoDB Controlled Failovers

I recently modified our MongoDB failover procedure at work so that a controlled failover causes about 3.5 seconds of interruption instead of 14, by changing the election settings ahead of the failover. I tested this on a 3.4 cluster, but it should hold for later versions as well. Starting in 4.0.2 it’s less valuable for controlled failovers, because rs.stepDown() has the outgoing primary nominate a secondary to call an election immediately, but it’s still useful as a tuning setting for uncontrolled failovers.

How it works

The premise is to make the shard’s replica set call a new election as fast as possible by temporarily reducing electionTimeoutMillis and heartbeatIntervalMillis.
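As a minimal sketch of that idea, the helper below returns a copy of a replica set config with the two timings overridden. The name withElectionTimings is hypothetical (it is not a MongoDB API), and it only assumes the config has the shape returned by rs.conf().

```javascript
// Hypothetical helper, not part of MongoDB: returns a copy of a replica set
// config (shaped like the output of rs.conf()) with new election timings.
function withElectionTimings(cfg, electionTimeoutMillis, heartbeatIntervalMillis) {
  // Copy the settings sub-document so the caller's config is left untouched.
  var settings = Object.assign({}, cfg.settings || {});
  settings.electionTimeoutMillis = electionTimeoutMillis;
  settings.heartbeatIntervalMillis = heartbeatIntervalMillis;
  // Shallow-copy the top level and swap in the modified settings.
  return Object.assign({}, cfg, { settings: settings });
}
```

In the shell this could be used as rs.reconfig(withElectionTimings(rs.conf(), 1000, 100)) before the failover, and again with the normal values afterwards. Keeping the change in one function makes it harder to update one timing and forget the other.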

Procedure:

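// on the primary: shorten the election timings
cfg = rs.conf()
cfg.settings = cfg.settings || {}
cfg.settings["electionTimeoutMillis"] = 1000
cfg.settings["heartbeatIntervalMillis"] = 100
rs.reconfig(cfg)

// wait 60 seconds for the new config to propagate to all members
// note: on 3.4 the primary closes client connections when it steps down,
// so expect the shell to report a connection error here
rs.stepDown()

// wait 60 seconds for the election to settle
// connect to the new primary and restore the normal timings

cfg = rs.conf()
cfg.settings["electionTimeoutMillis"] = 10000
cfg.settings["heartbeatIntervalMillis"] = 1000
rs.reconfig(cfg)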
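The "connect to primary" step needs to know which member won the election. A minimal sketch, assuming a document shaped like the output of rs.status(); the helper name findPrimary is hypothetical and not part of the shell:

```javascript
// Hypothetical helper: given a document shaped like rs.status() output,
// return the "host:port" name of the member reporting PRIMARY,
// or null if no member currently reports that state.
function findPrimary(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].stateStr === "PRIMARY") {
      return members[i].name;
    }
  }
  return null;
}
```

Calling findPrimary(rs.status()) from any reachable member should return the new primary once the election has settled, and null while it is still in progress. I have not used this in our procedure; the fixed waits above are what was actually tested.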

These settings are also worth tuning if you’re on a high-quality, low-latency network. With a 10-second electionTimeoutMillis, you miss out on faster failovers in uncontrolled circumstances every time mongo insists on waiting 10 seconds before allowing an election, even when heartbeats are already failing.

PS - While browsing the docs I found this ^_^, which is non-intuitive: I would expect no writes to one shard, but no impact on the other shards. Presumably it’s a typo and “cluster” means “replica set”.

During the election process, the cluster cannot accept write operations until it elects the new primary.