Building Production-Ready Rate Limiters in Go: A Complete Implementation Guide

Aarav Joshi

As a best-selling author, I invite you to explore my books on Amazon. Don't forget to follow me on Medium and show your support. Thank you! Your support means the world!

Rate limiting is a crucial technique for controlling resource usage and maintaining system stability in high-performance applications. I've spent considerable time implementing and optimizing rate limiters in production environments, and I'll share comprehensive insights into building efficient rate limiting solutions in Go.

Rate limiting fundamentally controls the flow of requests or operations within a defined time window. In Go, we can implement this using several algorithms, each with specific use cases and trade-offs.

The Token Bucket Algorithm is one of the most effective approaches. It maintains a bucket of tokens that replenishes at a fixed rate. Each request consumes one token, and when the bucket is empty, requests are rejected. Here's an implementation:

import (
    "math"
    "sync"
    "time"
)

type TokenBucket struct {
    rate       float64   // tokens replenished per second
    maxTokens  float64   // bucket capacity (maximum burst)
    tokens     float64   // tokens currently available
    lastUpdate time.Time // last refill timestamp
    mu         sync.Mutex
}

func NewTokenBucket(rate, maxTokens float64) *TokenBucket {
    return &TokenBucket{
        rate:       rate,
        maxTokens:  maxTokens,
        tokens:     maxTokens, // start full so callers get an initial burst
        lastUpdate: time.Now(),
    }
}

func (tb *TokenBucket) Allow() bool {
    tb.mu.Lock()
    defer tb.mu.Unlock()

    // Lazily refill based on elapsed time, capped at the bucket's capacity.
    now := time.Now()
    elapsed := now.Sub(tb.lastUpdate).Seconds()
    tb.tokens = math.Min(tb.maxTokens, tb.tokens+(elapsed*tb.rate))
    tb.lastUpdate = now

    // Spend one token if available; otherwise reject.
    if tb.tokens >= 1 {
        tb.tokens--
        return true
    }
    return false
}
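
In an HTTP service, the simplest integration point is middleware. Here's a minimal sketch; the handler wiring and the 429 response are my additions, not part of the limiter itself:

import "net/http"

func RateLimitMiddleware(tb *TokenBucket, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if !tb.Allow() {
            // Reject with 429 Too Many Requests once the bucket is empty.
            http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}

Wrap your mux once at startup, e.g. http.ListenAndServe(":8080", RateLimitMiddleware(NewTokenBucket(100, 200), mux)).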

For distributed systems, a single process's in-memory state isn't enough. Redis-based rate limiting shares the counter across instances; this version keeps a sorted set of request timestamps per key:

// RedisRateLimiter implements a sliding window log over a Redis sorted set.
// It assumes a go-redis v8 client in the package-level redisClient variable.
func RedisRateLimiter(ctx context.Context, key string, limit int, window time.Duration) bool {
    pipe := redisClient.Pipeline()
    now := time.Now().UnixNano()

    // Drop timestamps that fell out of the window, record this request,
    // count what remains, and refresh the key's TTL - all in one round trip.
    pipe.ZRemRangeByScore(ctx, key, "0", fmt.Sprintf("%d", now-window.Nanoseconds()))
    pipe.ZAdd(ctx, key, &redis.Z{Score: float64(now), Member: now})
    pipe.ZCard(ctx, key)
    pipe.Expire(ctx, key, window)

    cmds, err := pipe.Exec(ctx)
    if err != nil {
        // Fail closed: reject the request when Redis is unreachable.
        return false
    }

    // The third pipelined command (ZCard) holds the post-insert count.
    return cmds[2].(*redis.IntCmd).Val() <= int64(limit)
}
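
To call it, you need a shared client. The original doesn't show the client setup, so this sketch assumes go-redis v8 and a package-level variable:

import (
    "context"
    "time"

    "github.com/go-redis/redis/v8"
)

var redisClient = redis.NewClient(&redis.Options{Addr: "localhost:6379"})

func handleRequest(ctx context.Context, userID string) bool {
    // At most 100 requests per user per minute, enforced cluster-wide.
    return RedisRateLimiter(ctx, "rate:"+userID, 100, time.Minute)
}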

The sliding window counter offers better precision than fixed windows by tracking per-second request counts and expiring buckets as they age out:

type SlidingWindow struct {
    capacity int           // maximum requests per window
    window   time.Duration // window length
    requests map[int64]int // per-second request counts, keyed by Unix time
    mu       sync.Mutex
}

func NewSlidingWindow(capacity int, window time.Duration) *SlidingWindow {
    return &SlidingWindow{
        capacity: capacity,
        window:   window,
        requests: make(map[int64]int),
    }
}

func (sw *SlidingWindow) Allow() bool {
    sw.mu.Lock()
    defer sw.mu.Unlock()

    now := time.Now().Unix()
    windowStart := now - int64(sw.window.Seconds())

    // Evict buckets that have aged out of the window and total up the rest.
    count := 0
    for timestamp, requests := range sw.requests {
        if timestamp < windowStart {
            delete(sw.requests, timestamp)
        } else {
            count += requests
        }
    }

    if count >= sw.capacity {
        return false
    }

    sw.requests[now]++
    return true
}
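
A quick demonstration of the behavior; the timings are illustrative, and note the counter works at one-second granularity:

import (
    "fmt"
    "time"
)

func main() {
    sw := NewSlidingWindow(5, time.Second)

    // The first five calls pass; the rest are rejected within the window.
    for i := 0; i < 8; i++ {
        fmt.Println(i, sw.Allow())
    }

    // Once the window has fully passed, capacity is available again.
    time.Sleep(2 * time.Second)
    fmt.Println("after window:", sw.Allow())
}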

For high-performance applications, we can implement an adaptive rate limiter that adjusts its rate based on system load. This version uses heap utilization as its load signal and scales the token bucket's refill rate down as pressure rises:

type AdaptiveRateLimiter struct {
    baseline   *TokenBucket
    baseRate   float64 // the unadjusted refill rate
    loadFactor float64 // how aggressively to throttle under load (0..1)
    mu         sync.Mutex
}

func (al *AdaptiveRateLimiter) Allow() bool {
    al.mu.Lock()
    defer al.mu.Unlock()

    // ReadMemStats is not free; in production, sample it periodically
    // rather than on every request.
    var m runtime.MemStats
    runtime.ReadMemStats(&m)

    // Heap utilization as a rough proxy for memory pressure (0..1).
    systemLoad := float64(m.HeapInuse) / float64(m.HeapSys)

    // Scale the refill rate down as load rises and apply it to the bucket.
    al.baseline.mu.Lock()
    al.baseline.rate = al.baseRate * (1 - systemLoad*al.loadFactor)
    al.baseline.mu.Unlock()

    return al.baseline.Allow()
}
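
Construction is straightforward. The values below are illustrative, and baseRate mirrors the bucket's initial rate so adjustments never compound:

limiter := &AdaptiveRateLimiter{
    baseline:   NewTokenBucket(100, 200), // 100 req/s, burst of 200
    baseRate:   100,
    loadFactor: 0.5, // at full heap utilization, halve the rate
}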

When rate limiting microservices, the natural integration point is an interceptor, so every RPC is checked consistently. Here's a gRPC server-side rate limiter:

type GRPCRateLimiter struct {
    limiter *TokenBucket
}

// UnaryInterceptor rejects calls with ResourceExhausted once the limit is
// hit, a status that gRPC clients can translate into backoff-and-retry.
func (gl *GRPCRateLimiter) UnaryInterceptor(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
    if !gl.limiter.Allow() {
        return nil, status.Error(codes.ResourceExhausted, "rate limit exceeded")
    }
    return handler(ctx, req)
}
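
Registering the interceptor happens when constructing the server; the bucket size here is just an example:

import "google.golang.org/grpc"

func newServer() *grpc.Server {
    rl := &GRPCRateLimiter{limiter: NewTokenBucket(50, 100)}

    // Every unary RPC now passes through the limiter before its handler runs.
    return grpc.NewServer(grpc.UnaryInterceptor(rl.UnaryInterceptor))
}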

Channels offer another idiomatic approach: a refill goroutine feeds a buffered channel that acts as the token bucket:

type ChannelRateLimiter struct {
    ticker *time.Ticker
    sem    chan struct{}
    done   chan struct{}
}

func NewChannelRateLimiter(rate int, burst int) *ChannelRateLimiter {
    ch := make(chan struct{}, burst)

    // Start with a full burst allowance so early requests aren't starved.
    for i := 0; i < burst; i++ {
        ch <- struct{}{}
    }

    ticker := time.NewTicker(time.Second / time.Duration(rate))
    done := make(chan struct{})

    // Refill one token per tick; drop the token if the bucket is full.
    go func() {
        for {
            select {
            case <-ticker.C:
                select {
                case ch <- struct{}{}:
                default:
                }
            case <-done:
                return
            }
        }
    }()

    return &ChannelRateLimiter{
        ticker: ticker,
        sem:    ch,
        done:   done,
    }
}

func (cl *ChannelRateLimiter) Allow() bool {
    select {
    case <-cl.sem:
        return true
    default:
        return false
    }
}

// Stop ends the refill goroutine and releases the ticker.
func (cl *ChannelRateLimiter) Stop() {
    cl.ticker.Stop()
    close(cl.done)
}
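
Usage looks like this; the rate and loop are illustrative:

cl := NewChannelRateLimiter(10, 5) // ~10 ops/s after an initial burst of 5
defer cl.Stop()

for i := 0; i < 20; i++ {
    if cl.Allow() {
        // handle the request
    } else {
        // shed load, queue, or return an error to the caller
    }
}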

Rate limiters should also handle cleanup and resource management:

type CleanupAwareRateLimiter struct {
    *TokenBucket
    cleanup func()          // invoked periodically to reclaim stale state
    ctx     context.Context // controls the background goroutine's lifetime
    cancel  context.CancelFunc
}

// Start launches a background goroutine that runs the cleanup callback
// once a minute until the limiter's context is cancelled.
func (cl *CleanupAwareRateLimiter) Start() {
    go func() {
        ticker := time.NewTicker(time.Minute)
        defer ticker.Stop()

        for {
            select {
            case <-ticker.C:
                cl.cleanup()
            case <-cl.ctx.Done():
                return
            }
        }
    }()
}
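
The snippet above omits construction, so here's a minimal sketch of a constructor and shutdown hook for those fields:

import "context"

func NewCleanupAwareRateLimiter(rate, maxTokens float64, cleanup func()) *CleanupAwareRateLimiter {
    ctx, cancel := context.WithCancel(context.Background())
    return &CleanupAwareRateLimiter{
        TokenBucket: NewTokenBucket(rate, maxTokens),
        cleanup:     cleanup,
        ctx:         ctx,
        cancel:      cancel,
    }
}

// Stop ends the background cleanup goroutine started by Start.
func (cl *CleanupAwareRateLimiter) Stop() {
    cl.cancel()
}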

These implementations provide a solid foundation for rate limiting in Go applications. The choice of algorithm depends on specific requirements like precision, performance, and distribution needs.

For production deployment, consider monitoring rate limiter behavior:

// RateLimiter is the minimal interface every limiter above satisfies.
type RateLimiter interface {
    Allow() bool
}

type MonitoredRateLimiter struct {
    limiter RateLimiter
    metrics *prometheus.CounterVec // counts decisions, labeled by outcome
}

func (ml *MonitoredRateLimiter) Allow() bool {
    allowed := ml.limiter.Allow()
    if allowed {
        ml.metrics.WithLabelValues("allowed").Inc()
    } else {
        ml.metrics.WithLabelValues("rejected").Inc()
    }
    return allowed
}
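
Wiring up the counter with the Prometheus client library might look like this; the metric name is my choice, not a standard:

import "github.com/prometheus/client_golang/prometheus"

func NewMonitoredRateLimiter(limiter RateLimiter) *MonitoredRateLimiter {
    metrics := prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "rate_limiter_decisions_total",
            Help: "Rate limiter allow/reject decisions.",
        },
        []string{"decision"},
    )
    prometheus.MustRegister(metrics)

    return &MonitoredRateLimiter{limiter: limiter, metrics: metrics}
}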

Rate limiting is essential for building reliable, scalable systems. The approaches above cover scenarios from simple API throttling in a single process to coordination across a distributed deployment.

Remember to test rate limiters under load and adjust parameters based on real-world usage patterns. Regular monitoring and tuning ensure optimal performance and protection for your applications.
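
A quick way to exercise a limiter under contention is to hammer it from many goroutines and compare the allowed and rejected counts against the configured rate; the numbers here are arbitrary:

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    tb := NewTokenBucket(1000, 1000)

    var allowed, rejected int64
    var wg sync.WaitGroup

    for i := 0; i < 50; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < 1000; j++ {
                if tb.Allow() {
                    atomic.AddInt64(&allowed, 1)
                } else {
                    atomic.AddInt64(&rejected, 1)
                }
            }
        }()
    }

    wg.Wait()
    fmt.Printf("allowed=%d rejected=%d\n", allowed, rejected)
}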


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.

Check out our book Golang Clean Code available on Amazon.

Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!

Our Creations

Be sure to check out our creations:

Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | JS Schools


We are on Medium

Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva
