ZeeshanAli-0704

Posted on Jan 15

Rate limiting : Global, Tumbling Window, and Sliding Window

#systemdesignwithzeeshanali #systemdesign

Rate limiting is a critical aspect of system design to ensure fair usage, prevent abuse, and maintain system stability. It controls how many requests a user or a client can make in a given timeframe. The three most common rate-limiting strategies are Global, Tumbling Window, and Sliding Window.

Below, I will explain each with its use cases, advantages, disadvantages, and examples.

1. Global Rate Limiting

Definition: Limits the total number of requests across all users or clients globally within a given time window.
Key Characteristics:
- Applies to the entire system, not individual users.
- Focuses on protecting the backend infrastructure from overload.

Example:

If the global rate limit is set to 10,000 requests per second, it means that the system will reject any requests after the 10,000th request in that second.

Use Case:

APIs or services that experience high traffic and need to prevent total overload.
Protecting expensive or rate-sensitive backend operations, such as database queries or third-party API calls.

Advantages:

Simple to implement.
Prevents system-wide resource exhaustion.

Disadvantages:

Does not differentiate between clients. A single client can exhaust the entire limit, affecting others.

2. Tumbling Window Rate Limiting

Definition: Divides time into fixed, non-overlapping intervals (windows) and counts the requests within each window.
Key Characteristics:
- Resets the counter at the start of each new window.
- Requests are limited to a specific count within each time window.

Example:

A tumbling window with a 1-minute duration allows 100 requests per minute per user. The count resets at the start of the next minute.

Use Case:

Scenarios where strict boundaries for request limits are acceptable.
Ideal for simpler rate-limiting requirements.

Advantages:

Easy to implement and understand.
Ensures fairness within each time window.

Disadvantages:

Can lead to a "burst" problem: Users can send requests at the end of one window and the beginning of the next window, effectively doubling the allowed rate momentarily.

3. Sliding Window Rate Limiting

Definition: Keeps a rolling record of requests and checks how many have been made in the last fixed duration from the current time.
Key Characteristics:
- Offers a more accurate and consistent rate-limiting mechanism.
- Avoids the "burst" problem of tumbling windows.

Example:

If the sliding window is 1 minute and the limit is 100 requests per minute, the system will count requests over the past 60 seconds from the current timestamp, irrespective of window boundaries.

Use Case:

APIs requiring smoother and more consistent rate limiting.
Preventing abuse without allowing short-term bursts.

Advantages:

Prevents bursty traffic patterns.
Provides better fairness across users.

Disadvantages:

More complex to implement than tumbling windows.
Requires maintaining a time-based log of requests (e.g., timestamps in a queue or array).

How to Implement Rate Limiting

Data Structures

Fixed Counter (Global and Tumbling): Use a simple counter variable for requests within a time window.
Sliding Window (Efficient): Use a Deque (Double-Ended Queue) or a Priority Queue to store timestamps of requests.

Implementation Examples

Here are implementations of Global, Tumbling Window, and Sliding Window rate-limiting strategies in JavaScript.

1. Global Rate Limiting (Tumbling Window)

This implementation tracks the total number of requests across the system within a fixed time window.

class GlobalRateLimiter {
  constructor(limit, windowSize) {
    this.limit = limit; // Max requests allowed
    this.windowSize = windowSize * 1000; // Window size in milliseconds
    this.startTime = Date.now();
    this.requestCount = 0;
  }

  allowRequest() {
    const currentTime = Date.now();

    // Reset the counter if the window has passed
    if (currentTime - this.startTime >= this.windowSize) {
      this.startTime = currentTime;
      this.requestCount = 0;
    }

    // Check if within limit
    if (this.requestCount < this.limit) {
      this.requestCount++;
      return true; // Request allowed
    }
    return false; // Request denied
  }
}

// Example usage
const globalLimiter = new GlobalRateLimiter(100, 10); // 100 requests per 10 seconds

setInterval(() => {
  if (globalLimiter.allowRequest()) {
    console.log("Request allowed");
  } else {
    console.log("Rate limit exceeded");
  }
}, 100); // Simulate a request every 100ms

2. Tumbling Window Rate Limiting (Per User)

This implementation uses a map to track the request count for each user within their tumbling window.

class TumblingWindowRateLimiter {
  constructor(limit, windowSize) {
    this.limit = limit; // Max requests per user
    this.windowSize = windowSize * 1000; // Window size in milliseconds
    this.userRequests = new Map(); // Store user request data
  }

  allowRequest(userId) {
    const currentTime = Date.now();

    if (!this.userRequests.has(userId)) {
      this.userRequests.set(userId, { count: 1, startTime: currentTime });
      return true;
    }

    const userData = this.userRequests.get(userId);

    // Reset the user's window if the time has passed
    if (currentTime - userData.startTime >= this.windowSize) {
      this.userRequests.set(userId, { count: 1, startTime: currentTime });
      return true;
    }

    // Check if the user is within their limit
    if (userData.count < this.limit) {
      userData.count++;
      return true;
    }

    return false; // User has exceeded their limit
  }
}

// Example usage
const userLimiter = new TumblingWindowRateLimiter(5, 10); // 5 requests per 10 seconds per user

for (let i = 0; i < 7; i++) {
  const result = userLimiter.allowRequest("user1");
  console.log(`Request ${i + 1}:`, result ? "Allowed" : "Blocked");
}

3. Sliding Window Rate Limiting (Per User)

This implementation uses a Map and a queue (array) to store timestamps for each user's requests.

class SlidingWindowRateLimiter {
  constructor(limit, windowSize) {
    this.limit = limit; // Max requests per user
    this.windowSize = windowSize * 1000; // Window size in milliseconds
    this.userRequests = new Map(); // Store user requests as timestamps
  }

  allowRequest(userId) {
    const currentTime = Date.now();

    if (!this.userRequests.has(userId)) {
      this.userRequests.set(userId, []);
    }

    const requestQueue = this.userRequests.get(userId);

    // Remove timestamps outside the window
    while (requestQueue.length > 0 && requestQueue[0] <= currentTime - this.windowSize) {
      requestQueue.shift();
    }

    // Check if the user is within their limit
    if (requestQueue.length < this.limit) {
      requestQueue.push(currentTime);
      return true; // Request allowed
    }

    return false; // Request denied
  }
}

// Example usage
const slidingLimiter = new SlidingWindowRateLimiter(5, 10); // 5 requests per 10 seconds per user

setInterval(() => {
  const result = slidingLimiter.allowRequest("user1");
  console.log("Request:", result ? "Allowed" : "Blocked");
}, 1500); // Simulate a request every 1.5 seconds

Comparison of the Implementations

Global Rate Limiter: Simple but applies to all users collectively.
Tumbling Window Rate Limiter: Tracks each user independently with strict boundaries.
Sliding Window Rate Limiter: Smooth and accurate for high-traffic systems, but requires maintaining timestamps.

Comparison Table

Strategy	Granularity	Accuracy	Use Case	Complexity
Global	System-wide	Low	Protect infrastructure from overload	Low
Tumbling Window	Per time window	Moderate	Enforcing strict limits per window	Low
Sliding Window	Rolling time interval	High	Smoother and consistent rate limiting	Moderate to High

More Details:

Get all articles related to system design
Hastag: SystemDesignWithZeeshanAli

systemdesignwithzeeshanali

Git: https://github.com/ZeeshanAli-0704/SystemDesignWithZeeshanAli

Top comments (3)

Suneeh • Jan 15

Implementing a Fixed Window (tumbling per user) can also be achieved via Memory Cache.
When a Request comes in, check if your Limiting Key exists in the Cache and decrease it, or add it with a expiration time of whatever your Window is.

It felt super comfortable to write it that way for me. Maybe give it a try!

ZeeshanAli-0704 • Jan 15

Hi Suneesh,

Thanks for sharing you thoughts.

Can you help me with some blogs around this?

Suneeh • Jan 15

Sadly I can't. I just implemented this at work, so sadly I'm not allowed to share the code.

Also I wanna mention that I'm working with C# and .Net8. There is a MemoryCache object that stores any object you want for X seconds. So its perfect for handling rate limiting with a specified window.

DEV Community

Rate limiting : Global, Tumbling Window, and Sliding Window

1. Global Rate Limiting

Example:

Use Case:

Advantages:

Disadvantages:

2. Tumbling Window Rate Limiting

Example:

Use Case:

Advantages:

Disadvantages:

3. Sliding Window Rate Limiting

Example:

Use Case:

Advantages:

Disadvantages:

How to Implement Rate Limiting

Data Structures

Implementation Examples

1. Global Rate Limiting (Tumbling Window)

2. Tumbling Window Rate Limiting (Per User)

3. Sliding Window Rate Limiting (Per User)

Comparison of the Implementations

Comparison Table

Top comments (3)

Read next

Kickstarting Weekly System Design Deep Dives: Building Scalable Systems

CDNs in Distributed Systems: Beyond Caching for Better Performance

A Curious Encounter: Unraveling the Roles of Microservices, API Gateways, and API Servers

Scaling Node.js: Handling 1 Million Requests Like a Pro