In some systems, implementing a distributed resetting counter can be a critical challenge. This is particularly common in applications like restaurant management software, where each day, visitor numbers are issued sequentially starting from 0. The counter resets daily and increments with every order placed.
While several methods exist for implementing this functionality, some solutions are overly complex. In this post, I’ll outline a simpler and more efficient solution that leverages Redis to tackle this challenge.
What is a Resetting Distributed Counter?
Imagine visiting a restaurant, where each customer is assigned a unique number when placing an order. Every day, the counter resets to 0, and each new customer gets the next available number, incremented by 1.
Now imagine that many restaurants use our service, and each of them needs its own counter with this behavior.
This functionality defines a resetting distributed counter:
- Resetting: The counter restarts from 0 at the beginning of each day.
- Distributed: The counter operates across multiple nodes, in a SaaS solution.
Redis can be used as an in-memory data store to manage the counter efficiently. Redis’s high-speed operations, support for atomic commands, and Time-To-Live (TTL) functionality make it an ideal choice for this use case.
Advantages of Redis:
- High Throughput: Redis operates entirely in memory, providing fast read/write performance.
- Atomic Operations: Redis commands like INCR and SET are atomic on their own; under heavy traffic, however, a sequence of several commands still needs extra care to avoid race conditions.
- Automatic Reset: TTL expires the counter key, so it resets automatically about once a day without external scheduling.
- Resource Efficiency: The counter is stored in memory, reducing reliance on database queries.
Consideration:
Redis keeps its data in memory, so counter values can be lost if the server restarts or fails before they are written to disk. To mitigate the risk of data loss, we periodically persist counter values in a database. This ensures that even if Redis fails, the system can recover gracefully.
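A minimal sketch of that periodic persistence could look like the following. It assumes a connected node-redis style client that exposes get(), and a hypothetical db.saveCounterSnapshot(orgUUID, value) function; neither the function name nor the 60-second interval comes from the original design.

```javascript
// Copy the current counter value for one organization from Redis to the database.
// `client` is assumed to be a connected node-redis style client exposing get();
// `db.saveCounterSnapshot` is a hypothetical persistence function.
async function persistCounter(client, db, orgUUID) {
  const key = `orgCounter-${orgUUID}`;
  const raw = await client.get(key);
  if (raw === null) {
    return null; // counter expired or not created yet: nothing to persist
  }
  const value = Number(raw);
  await db.saveCounterSnapshot(orgUUID, value);
  return value;
}

// Run the snapshot on a fixed interval (illustrative default: 60 seconds).
// The returned function stops the timer.
function startCounterPersistence(client, db, orgUUIDs, intervalMs = 60000) {
  const timer = setInterval(() => {
    for (const orgUUID of orgUUIDs) {
      persistCounter(client, db, orgUUID).catch((err) =>
        console.error(`Failed to persist counter for ${orgUUID}:`, err)
      );
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

A shorter interval narrows the window of numbers that could be lost on a Redis failure, at the cost of more database writes.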
Implementation Details
Key Design Choices:
- Counter Granularity: Each organization (e.g., restaurant) gets its own counter, identified by a key such as orgCounter-{orgUUID}.
- TTL with Randomization: To handle peak loads (e.g., high traffic at midnight), the TTL is randomized between 24 and 27 hours to stagger resets across organizations.
- Fallback Mechanism: In case Redis is unavailable, a fallback query to the database retrieves and increments the counter.
Redis Counter Service
Below is the implementation of the counter service:
const redis = require('redis');

const client = redis.createClient();
client.on('error', (err) => console.error('Redis client error:', err));
// node-redis v4+ needs an explicit connection before commands can be sent
const ready = client.connect();

const BASE_TTL = 86400; // 24 hours in seconds
const TTL_RANDOMIZATION = 10800; // up to 3 extra hours, in seconds

// Generate a random TTL between 24 and 27 hours
function getRandomTTL() {
  return BASE_TTL + Math.floor(Math.random() * TTL_RANDOMIZATION);
}

// Return the next number for the organization, creating the counter if needed
async function setOrgCounterTTL(orgUUID) {
  await ready;
  const key = `orgCounter-${orgUUID}`;
  // Check if the counter exists
  const existingCounter = await client.get(key);
  if (existingCounter !== null) {
    // If the counter exists, increment it
    return client.incr(key);
  }
  // If the counter does not exist, initialize it with a TTL.
  // NX makes SET a no-op (returning null) if another request created the key first.
  const created = await client.set(key, 1, { EX: getRandomTTL(), NX: true });
  // If we lost that race, increment the counter the other request created
  return created === null ? client.incr(key) : 1;
}

module.exports = { setOrgCounterTTL };
Order Creation Flow
The order creation service integrates with the Redis-based counter:
const { setOrgCounterTTL } = require('./distributedCounterService');
// saveOrderToDatabase and fallbackToDatabaseCounter are assumed to be
// provided by the application's persistence layer.

// Handle new order creation
async function createOrder(orgUUID) {
  let counter;
  try {
    // Fetch the next counter value from Redis
    counter = await setOrgCounterTTL(orgUUID);
  } catch (error) {
    console.error("Error fetching counter from Redis:", error);
    // Fall back to the database in case Redis is unavailable
    counter = await fallbackToDatabaseCounter(orgUUID);
    console.warn(`Fallback counter value: ${counter}`);
  }
  // Assign the counter to the new order
  const newOrder = {
    organization: orgUUID,
    numberForTheDay: counter,
    // Add other order details here...
  };
  // Save the new order to the database
  await saveOrderToDatabase(newOrder);
  return newOrder;
}
We can ensure that no race condition occurs when incrementing the counter in two ways. The first approach uses the WATCH and MULTI commands. WATCH monitors the key for changes until the transaction executes; if the key was incremented or reset in the meantime, the transaction is aborted and EXEC returns null, in which case we retry. Another point to consider is that Redis transactions are not supported across keys stored on different cluster nodes. To handle this, we pin each organization's keys to a specific slot using a Redis hash tag.
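async setOrGetCounter(key: string): Promise<number> {
  // WATCH aborts the transaction below if the key changes before EXEC runs
  await this.client.watch(key);
  const existing = await this.client.get(key);
  const multi = this.client.multi();
  if (existing === null) {
    // No counter yet: create it with a randomized TTL
    multi.set(key, 1, { EX: this.getRandomWithin24HoursTTL() });
  } else {
    multi.incr(key);
  }
  // A null result means the watched key changed and nothing was applied.
  // Some clients throw an error here instead; if yours does, retry from a catch block.
  const results = await multi.exec();
  if (results === null) {
    return this.setOrGetCounter(key); // Retry with the fresh value
  }
  return existing === null ? 1 : Number(results[0]);
}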
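The retry in the method above is unbounded, which can turn into a tight loop when many requests contend for the same key. One way to cap it is a small helper like the sketch below. It assumes the transactional step is refactored to return null on a conflict instead of calling itself; the helper name, the attempt limit, and the delay are illustrative choices, not part of the original implementation.

```javascript
// Retry an async operation that signals a lost race by resolving to null.
// maxAttempts and baseDelayMs are illustrative defaults.
async function retryOnConflict(operation, maxAttempts = 5, baseDelayMs = 10) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await operation();
    if (result !== null) {
      return result;
    }
    if (attempt < maxAttempts) {
      // Wait a little longer after each conflict, with some random jitter
      const delay = baseDelayMs * attempt + Math.random() * baseDelayMs;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error(`Counter update still conflicting after ${maxAttempts} attempts`);
}
```

When the helper gives up, the caller can treat the error like any other Redis failure and use the database fallback.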
The second solution uses Lua scripting from our Node.js application. This is the preferred approach: Redis executes commands on a single thread, so the whole script runs atomically, as if it were one command:
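async setOrGetCounter(key: string): Promise<number> {
  // The whole script runs atomically on the Redis server
  const luaScript = `
    local current = redis.call("GET", KEYS[1])
    if current then
      return redis.call("INCR", KEYS[1])
    else
      redis.call("SET", KEYS[1], 1, "EX", ARGV[1], "NX")
      return 1
    end
  `;
  const ttl = this.getRandomWithin24HoursTTL();
  // node-redis v4 style call: keys and arguments go in an options object,
  // and script arguments must be strings
  const result = await this.client.eval(luaScript, {
    keys: [key],
    arguments: [String(ttl)],
  });
  return Number(result);
}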
Additional Considerations
Eventual Consistency
In distributed systems, eventual consistency is a key principle. While Redis ensures near-real-time updates, there is a small chance of temporary inconsistencies (e.g., network partitions or Redis failures). This tradeoff is acceptable given the performance benefits.
Peak Traffic and Randomized TTL
In multi-tenant systems, peak traffic can strain Redis: with an identical TTL, counters for tenants whose first order of the day arrives at about the same time would also expire and be recreated at about the same time. Randomizing the TTL makes counters across organizations reset at slightly different times, spreading the load of the daily reset.
Enabling Users to Reset the Counter Manually
Each shift may need to reset the counter manually when it starts, so provide an endpoint that lets staff do this with a single click in the UI.
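As a rough sketch, the reset itself can be a single DEL on the organization's key. The function below assumes a connected node-redis style client passed in by the caller; the function name is hypothetical and the HTTP layer is left out. An endpoint handler would call it after checking that the user is allowed to reset this organization's counter.

```javascript
// Reset an organization's counter by deleting its key.
// The next order then recreates the counter, starting from 1 with a fresh TTL.
async function resetOrgCounter(client, orgUUID) {
  const key = `orgCounter-${orgUUID}`;
  const removed = await client.del(key); // DEL returns the number of keys removed
  return removed > 0; // false means there was no counter to reset
}
```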
Data Persistence
The counter is tied directly to the order service: whenever a new order is created, we request the next number from Redis and save it with the order in the database.
Fallback Strategy (Graceful Degradation)
In case Redis is unavailable:
- Fetch the last persisted counter value from the database.
- Increment and use this value temporarily until Redis is restored.
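The two steps above could be sketched as follows. The db object and its getCounterSnapshot and saveCounterSnapshot functions are hypothetical stand-ins for the real persistence layer, and unlike the call in the order flow, this version receives db as an explicit parameter.

```javascript
// Fallback used while Redis is unavailable.
// `db.getCounterSnapshot` (resolves to a number, or null if nothing was persisted)
// and `db.saveCounterSnapshot` are hypothetical persistence functions.
async function fallbackToDatabaseCounter(db, orgUUID) {
  // 1. Fetch the last persisted counter value
  const last = await db.getCounterSnapshot(orgUUID);
  // 2. Increment it and store the new value for the next request
  const next = (last ?? 0) + 1;
  // Note: this read-then-write is not atomic. A real implementation should run it
  // in one database transaction or a single atomic UPDATE statement.
  await db.saveCounterSnapshot(orgUUID, next);
  return next;
}
```

Because the last persisted value can lag behind Redis, numbers issued during an outage may repeat ones already handed out since the last snapshot; how much that matters depends on how often the counter is persisted.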
Redis Clusters
Redis does not support transactions when the keys involved are located on different slots or nodes. We can overcome this by pinning each organization's keys to a specific slot using a hash tag.
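To illustrate how a hash tag pins keys to one slot, the sketch below reimplements the slot calculation described in the Redis Cluster specification: CRC16 of the key modulo 16384, where only the text between the first { and the following } is hashed when that text is non-empty. This matters once a transaction or script touches more than one key per organization. The orgKey helper, the second key prefix, and the sample UUID are made up for the example.

```javascript
// CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), the checksum
// Redis Cluster uses to map keys onto its 16384 hash slots.
function crc16(bytes) {
  let crc = 0;
  for (const byte of bytes) {
    crc ^= byte << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? (crc << 1) ^ 0x1021 : crc << 1;
      crc &= 0xffff; // keep the value within 16 bits
    }
  }
  return crc;
}

// Compute the cluster slot for a key, honoring hash tags.
function hashSlot(key) {
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) {
      key = key.slice(open + 1, close); // hash only the non-empty tag
    }
  }
  return crc16(Buffer.from(key)) % 16384;
}

// Hypothetical key builder: wrapping the UUID in braces makes it the hash tag.
function orgKey(prefix, orgUUID) {
  return `${prefix}-{${orgUUID}}`;
}

const orgUUID = '3f2b6c1e-0000-4000-8000-000000000001'; // made-up identifier
const counterSlot = hashSlot(orgKey('orgCounter', orgUUID));
const snapshotSlot = hashSlot(orgKey('orgCounterSnapshot', orgUUID));
console.log(counterSlot === snapshotSlot); // prints true: both keys hash only the UUID
```

Note that the key in the counter service is built as orgCounter-${orgUUID}, without braces, so it carries no hash tag as written; the braces have to be added explicitly as in orgKey.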
Conclusion
Using Redis for a resetting distributed counter provides a scalable, high-performance solution that eliminates the bottlenecks of traditional database-based approaches. With atomic operations, TTL randomization, and fallback mechanisms, this implementation is robust, efficient, and capable of handling peak traffic while maintaining reliability.
Ahmed Rakan