Introduction
Rate limiting is a technique used to control the number of requests a system processes within a specific time frame. It helps prevent abuse, ensures fair usage, protects against DDoS attacks, and maintains system stability.
This blog explores the most commonly used rate-limiting algorithms, their advantages and disadvantages, and how to implement them in Java.
1️⃣ Token Bucket Algorithm
How It Works
- A bucket holds a fixed number of tokens.
- Tokens are added to the bucket at a constant rate.
- Each request consumes a token.
- If the bucket is empty, excess requests are denied until new tokens are added.
Pros & Cons
✅ Allows short bursts of traffic while controlling overall request rate.
✅ Low memory footprint; adapts well to distributed systems when tokens are tracked in a shared store.
❌ If the bucket drains quickly, requests may be blocked until tokens are refilled.
Java Implementation
```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class TokenBucketRateLimiter {
    private final Semaphore tokens;
    private final int capacity;

    public TokenBucketRateLimiter(int capacity, int refillRatePerSecond) {
        this.capacity = capacity;
        this.tokens = new Semaphore(capacity);

        // Background refill thread: tops the bucket up once per second without exceeding capacity.
        Thread refiller = new Thread(() -> {
            while (true) {
                int missing = capacity - tokens.availablePermits();
                int toAdd = Math.min(refillRatePerSecond, missing);
                if (toAdd > 0) {
                    tokens.release(toAdd);
                }
                try {
                    TimeUnit.SECONDS.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // stop refilling instead of spinning after an interrupt
                }
            }
        });
        refiller.setDaemon(true); // don't keep the JVM alive just for token refills
        refiller.start();
    }

    // Each request consumes one token; returns false when the bucket is empty.
    public boolean allowRequest() {
        return tokens.tryAcquire();
    }
}
```
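To see the limiter in action, here is a small (hypothetical) driver; the capacity of 5 and refill rate of 2 tokens per second are just example values.

```java
class TokenBucketDemo {
    public static void main(String[] args) throws InterruptedException {
        // Example values: a bucket of 5 tokens, refilled at 2 tokens per second.
        TokenBucketRateLimiter limiter = new TokenBucketRateLimiter(5, 2);

        for (int i = 1; i <= 10; i++) {
            System.out.println("Request " + i + " allowed? " + limiter.allowRequest());
            Thread.sleep(100); // requests arriving every 100 ms
        }
    }
}
```

The first few requests pass immediately (the burst), and later ones are rejected until the refill thread adds new tokens.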
2️⃣ Leaky Bucket Algorithm
How It Works
- Requests enter a queue (bucket).
- Requests are processed at a fixed rate (like water leaking from a bucket).
- If the queue overflows, extra requests are discarded.
Pros & Cons
✅ Ensures a steady flow of requests.
✅ Prevents sudden traffic spikes from overloading the system.
❌ Can introduce delays if the queue is full.
Java Implementation
```java
import java.util.ArrayDeque;
import java.util.Queue;

class LeakyBucketRateLimiter {
    private final Queue<Long> queue;        // queued (pending) requests
    private final int capacity;             // maximum queue size
    private final long leakIntervalMillis;  // one request leaks per interval
    private long lastLeakTime;

    public LeakyBucketRateLimiter(int capacity, int leakRatePerSecond) {
        this.capacity = capacity;
        this.leakIntervalMillis = 1000L / leakRatePerSecond;
        this.queue = new ArrayDeque<>();
        this.lastLeakTime = System.currentTimeMillis();
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();

        // Leak: remove one queued request for every full interval that has elapsed.
        long leaked = (now - lastLeakTime) / leakIntervalMillis;
        for (long i = 0; i < leaked && !queue.isEmpty(); i++) {
            queue.poll();
        }
        if (leaked > 0) {
            lastLeakTime += leaked * leakIntervalMillis;
        }

        // Accept the request only if the bucket still has room; otherwise it overflows.
        if (queue.size() < capacity) {
            queue.add(now);
            return true;
        }
        return false;
    }
}
```
3️⃣ Fixed Window Counter Algorithm
How It Works
- A counter tracks the number of requests per fixed time window (e.g., per minute).
- If the count exceeds the allowed limit, further requests are rejected until the next window.
Pros & Cons
✅ Easy to implement.
✅ Works well for predictable traffic patterns.
❌ Can lead to spikes at window boundaries.
Java Implementation
```java
import java.util.concurrent.atomic.AtomicInteger;

class FixedWindowRateLimiter {
    private final int limit;
    private final long windowSizeMillis;
    private long windowStart;
    private final AtomicInteger requestCount;

    public FixedWindowRateLimiter(int limit, int windowSizeSeconds) {
        this.limit = limit;
        this.windowSizeMillis = windowSizeSeconds * 1000L;
        this.windowStart = System.currentTimeMillis();
        this.requestCount = new AtomicInteger(0);
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();

        // Start a new window once the current one has elapsed.
        if (now - windowStart >= windowSizeMillis) {
            windowStart = now;
            requestCount.set(0);
        }

        // Allow the request only while this window's counter stays within the limit.
        return requestCount.incrementAndGet() <= limit;
    }
}
```
4️⃣ Sliding Window Counter Algorithm
How It Works
- Divides the tracking window into smaller sub-windows, each with its own request counter.
- The effective count is the sum of the sub-window counters that fall inside the sliding window, which smooths enforcement across window boundaries (see the sketch below).
Pros & Cons
✅ More accurate than Fixed Window Counter.
✅ Reduces request bursts at window boundaries.
❌ More complex to implement.
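Java Implementation (Sketch)
Here is a minimal sketch of the sub-window variant described above. The class name, the deque of per-sub-window counters, and the `subWindowsPerWindow` parameter are illustrative choices rather than a standard API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SlidingWindowCounterRateLimiter {
    private final int limit;
    private final long windowSizeMillis;
    private final long subWindowMillis;
    // Each entry is one sub-window: [start timestamp, request count].
    private final Deque<long[]> subWindows = new ArrayDeque<>();

    public SlidingWindowCounterRateLimiter(int limit, int windowSizeSeconds, int subWindowsPerWindow) {
        this.limit = limit;
        this.windowSizeMillis = windowSizeSeconds * 1000L;
        this.subWindowMillis = windowSizeMillis / subWindowsPerWindow;
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();
        long currentSubWindowStart = now - (now % subWindowMillis);

        // Drop sub-windows that have slid completely out of the window.
        while (!subWindows.isEmpty()
                && subWindows.peekFirst()[0] <= now - windowSizeMillis) {
            subWindows.pollFirst();
        }

        // Total requests seen across the sub-windows still inside the window.
        long total = subWindows.stream().mapToLong(w -> w[1]).sum();
        if (total >= limit) {
            return false;
        }

        // Count this request in the current sub-window (creating it if needed).
        long[] last = subWindows.peekLast();
        if (last != null && last[0] == currentSubWindowStart) {
            last[1]++;
        } else {
            subWindows.addLast(new long[] {currentSubWindowStart, 1});
        }
        return true;
    }
}
```

Smaller sub-windows give finer accuracy at the cost of a little more memory; a weighted two-window variant is another common compromise.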
5️⃣ Sliding Window Log Algorithm
How It Works
- Stores the timestamp of every accepted request in a log.
- On each new request, timestamps older than the allowed time window are removed; the request is allowed only if the remaining log size is below the limit (see the sketch below).
Pros & Cons
✅ High accuracy in rate limiting.
❌ High memory usage due to storing request timestamps.
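Java Implementation (Sketch)
A minimal in-memory sketch: a deque holds the timestamp of every accepted request, and anything older than the window is evicted before the decision is made. Names are illustrative; a production version would usually keep the log in a shared store for distributed enforcement.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SlidingWindowLogRateLimiter {
    private final int limit;
    private final long windowSizeMillis;
    private final Deque<Long> log = new ArrayDeque<>(); // timestamps of accepted requests

    public SlidingWindowLogRateLimiter(int limit, int windowSizeSeconds) {
        this.limit = limit;
        this.windowSizeMillis = windowSizeSeconds * 1000L;
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();

        // Evict timestamps that have fallen out of the sliding window.
        while (!log.isEmpty() && log.peekFirst() <= now - windowSizeMillis) {
            log.pollFirst();
        }

        // Accept only if fewer than `limit` requests remain inside the window.
        if (log.size() < limit) {
            log.addLast(now);
            return true;
        }
        return false;
    }
}
```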
6️⃣ Adaptive Rate Limiting
How It Works
- Uses machine learning or heuristics to adjust rate limits dynamically.
- Can consider factors like server load, request patterns, and user behavior.
Pros & Cons
✅ Dynamically adjusts limits for better efficiency.
❌ Complex implementation.
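Java Implementation (Sketch)
Adaptive limiters vary widely, so the sketch below shows just one simple heuristic rather than an ML-based approach: the permitted requests per window shrink as a caller-supplied load signal (e.g., CPU utilization between 0.0 and 1.0) rises. The class name, the `loadSupplier` parameter, and the scaling rule are all illustrative assumptions.

```java
import java.util.function.DoubleSupplier;

class AdaptiveRateLimiter {
    private final int baseLimitPerWindow;
    private final long windowSizeMillis;
    private final DoubleSupplier loadSupplier; // returns current load in [0.0, 1.0]
    private long windowStart;
    private int requestCount;

    public AdaptiveRateLimiter(int baseLimitPerWindow, int windowSizeSeconds, DoubleSupplier loadSupplier) {
        this.baseLimitPerWindow = baseLimitPerWindow;
        this.windowSizeMillis = windowSizeSeconds * 1000L;
        this.loadSupplier = loadSupplier;
        this.windowStart = System.currentTimeMillis();
    }

    public synchronized boolean allowRequest() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowSizeMillis) {
            windowStart = now;
            requestCount = 0;
        }

        // Heuristic: at 0% load allow the full base limit, at 100% load allow roughly 10% of it.
        double load = Math.max(0.0, Math.min(1.0, loadSupplier.getAsDouble()));
        int effectiveLimit = (int) Math.max(1, Math.round(baseLimitPerWindow * (1.0 - 0.9 * load)));

        if (requestCount < effectiveLimit) {
            requestCount++;
            return true;
        }
        return false;
    }
}
```

Plugging in a different signal (queue depth, error rate, downstream latency) or a smarter adjustment policy is where the machine learning or heuristics mentioned above would come in.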
Comparison Table
| Algorithm | Burst Handling | Smoothness | Memory Usage | Complexity |
|---|---|---|---|---|
| Token Bucket | ✅ Yes | ❌ No (allows bursts) | 🔹 Low | 🔹 Simple |
| Leaky Bucket | ❌ No | ✅ Yes | 🔹 Low | 🔹 Simple |
| Fixed Window Counter | ❌ No | ❌ No | 🔹 Low | 🔹 Simple |
| Sliding Window Counter | ✅ Yes | ✅ Yes | 🔹 Medium | 🔹 Medium |
| Sliding Window Log | ✅ Yes | ✅ Yes | 🔴 High | 🔴 Complex |
| Adaptive Rate Limiting | ✅ Yes | ✅ Yes | 🔴 High | 🔴 Complex |
Interview Preparation
Commonly Asked Questions:
1️⃣ What is the difference between Token Bucket and Leaky Bucket?
2️⃣ Which algorithm prevents bursts better?
3️⃣ How would you implement rate limiting in a distributed system?
4️⃣ How do you ensure fairness in rate limiting?
Conclusion
Choosing the right rate-limiting algorithm depends on your system's requirements. For APIs, Token Bucket is widely used due to its efficiency. If smooth processing is essential, Leaky Bucket is a good choice. For better accuracy, Sliding Window methods work well.
Want a deeper dive into API Gateway-based rate limiting? Let me know! 🚀