Chandra Shettigar

Scaling Rails Background Jobs in Kubernetes: From Queue to HPA

Ever tried processing a million records in a Rails controller action? Yeah, that's not going to end well. Your users will be staring at a spinning wheel, your server will be gasping for resources, and your ops team will be giving you that "we need to talk" look.

The Problem: Long-Running Requests

Picture this: Your Rails app needs to:

  • Generate complex reports from millions of records
  • Process large file uploads
  • Send thousands of notifications
  • Sync data with external systems

[Image: Long-running requests in Rails]

Doing any of these in a controller action means:

  • Timeout issues (Nginx, Rails, Load Balancer)
  • Blocked server resources
  • Poor user experience
  • Potential data inconsistency if the request fails

Step 1: Moving to Background Processing

First, let's move these long-running tasks to background jobs:

# app/controllers/reports_controller.rb
class ReportsController < ApplicationController
  def create
    report_id = SecureRandom.uuid
    ReportGenerationJob.perform_later(
      user_id: current_user.id,
      report_id: report_id,
      parameters: report_params
    )

    render json: { 
      report_id: report_id,
      status: 'processing',
      status_url: report_status_path(report_id)
    }
  end
end

# app/jobs/report_generation_job.rb
class ReportGenerationJob < ApplicationJob
  queue_as :default

  def perform(user_id:, report_id:, parameters:)
    # Process report
    report_data = generate_report(parameters)

    # Store results
    store_report(report_id, report_data)

    # Notify user
    ReportMailer.completed(user_id, report_id).deliver_now
  end
end
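The status_url in the response lets clients poll for progress instead of holding a connection open. Here's a minimal sketch of that endpoint, assuming the job records its state in Rails.cache under a predictable key (the controller name and cache key are hypothetical):

# app/controllers/report_statuses_controller.rb (hypothetical)
class ReportStatusesController < ApplicationController
  def show
    # Assumes the job writes 'processing' / 'completed' / 'failed'
    # to this key as it progresses
    status = Rails.cache.read("report_status:#{params[:id]}")

    if status
      render json: { report_id: params[:id], status: status }
    else
      render json: { report_id: params[:id], status: 'unknown' }, status: :not_found
    end
  end
end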

[Image: Long-running requests in Rails, with async jobs]

Great! Now our users get immediate feedback, and our server isn't blocked. But we've just moved the problem - now it's in our job queue.

The Scaling Challenge

A single Rails worker instance with Sidekiq needs proper configuration for queues and concurrency. Here's a basic setup:

# config/initializers/sidekiq.rb
# Note: This is a simplified example to demonstrate the concept.
# Actual syntax might vary based on your Sidekiq version and requirements.

Sidekiq.configure_server do |config|
  # Configure Redis connection
  config.redis = { url: ENV.fetch('REDIS_URL', 'redis://localhost:6379/0') }

  # Configure concurrency based on environment
  config[:concurrency] = if Rails.env.production?
    ENV.fetch('SIDEKIQ_CONCURRENCY', 25).to_i
  else
    10
  end

  # Queue configuration with weights: a higher weight means the queue
  # is checked more often, not that it strictly preempts the others
  config[:queues] = [
    ['critical', 5],
    ['sequential', 3],
    ['default', 2],
    ['low', 1]
  ]
end

And in your config/sidekiq.yml:

# Note: This is a simplified example. Adjust based on your needs
:verbose: false
:concurrency: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 25) %>
:timeout: 25

# Environment-specific configurations
production:
  :concurrency: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 25) %>
  :queues:
    - [critical, 5]
    - [sequential, 3]
    - [default, 2]
    - [low, 1]

This gives us 25 concurrent job threads per worker process in production, but what happens when:

  • We have 1000 reports queued up
  • Some jobs need to run sequentially (like financial transactions)
  • Different jobs need different resources
  • We have mixed workloads (quick jobs vs long-running jobs)

Queue Strategy: Not All Jobs Are Equal

Let's organize our jobs based on their processing requirements:

class FinancialTransactionJob < ApplicationJob
  queue_as :sequential
  # sidekiq_options works on ActiveJob classes only with the Sidekiq
  # adapter on Sidekiq 6.0.1+; otherwise use ActiveJob's retry_on
  sidekiq_options retry: 3, backtrace: true

  def perform(transaction_id)
    # Must process one at a time (see the dedicated worker note below)
    process_transaction(transaction_id)
  end
end

class ReportGenerationJob < ApplicationJob
  queue_as :default
  sidekiq_options retry: 5, backtrace: true

  def perform(report_id)
    # Can process many simultaneously
    generate_report(report_id)
  end
end

class NotificationJob < ApplicationJob
  queue_as :low
  sidekiq_options retry: 3

  def perform(user_ids)
    # Quick jobs, high volume
    send_notifications(user_ids)
  end
end
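One caveat before we scale anything: putting jobs on a queue named sequential doesn't make Sidekiq run them sequentially; any worker with more than one thread will happily process them in parallel. A common approach is a dedicated worker pinned to that queue with a concurrency of 1, kept out of the autoscaled pool. In Kubernetes terms (next section), that's a separate single-replica Deployment whose container runs:

# Container command for a dedicated sequential worker: one replica,
# one queue, one thread, so jobs run strictly one at a time
command: ["bundle", "exec", "sidekiq", "-q", "sequential", "-c", "1"]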

Enter Kubernetes HPA: Dynamic Worker Scaling

[Image: Long-running requests in Rails, with jobs on Kubernetes HPA]

Now we can set up our worker deployment and HPA:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rails-workers
spec:
  selector:
    matchLabels:
      app: rails-workers
  template:
    metadata:
      labels:
        app: rails-workers
    spec:
      containers:
      - name: sidekiq
        image: myapp/rails:latest
        command: ["bundle", "exec", "sidekiq"]
        env:
        - name: RAILS_ENV
          value: "production"
        - name: SIDEKIQ_CONCURRENCY
          value: "25"
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "2Gi"
            cpu: "2"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rails-workers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rails-workers
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        # Custom metric; requires a metrics adapter (see Monitoring below)
        name: sidekiq_queue_depth
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60

This setup gives us:

  • Minimum 2 worker pods (50 concurrent jobs)
  • Maximum 10 worker pods (250 concurrent jobs)
  • Automatic scaling based on queue depth
  • Conservative scale-down to prevent thrashing
  • Resource limits to protect our cluster
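For intuition on how the HPA reacts, its core formula (simplified, ignoring stabilization windows and rate limits) is desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). A quick worked example:

# Simplified HPA math: 2 pods each reporting sidekiq_queue_depth = 300
# against the target averageValue of 100
desired = (2 * 300.0 / 100).ceil
# => 6 pods, subject to maxReplicas (10) and the scaleUp policy above
# (at most 2 new pods per 60s), so the pool ramps 2 -> 4 -> 6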

Monitoring and Fine-Tuning

To make this work smoothly, monitor:

  1. Queue depths by queue type
  2. Job processing times
  3. Error rates
  4. Resource utilization

Add Prometheus metrics:

# config/initializers/sidekiq.rb
require 'prometheus/client'

Sidekiq.configure_server do |config|
  config.on(:startup) do
    queue_depth_gauge = Prometheus::Client.registry.gauge(
      :sidekiq_queue_depth,
      docstring: 'Sidekiq queue depth',
      labels: [:queue]
    )

    # Sidekiq has no built-in periodic hook for this, so sample queue
    # sizes on a background thread
    Thread.new do
      loop do
        Sidekiq::Queue.all.each do |queue|
          queue_depth_gauge.set(queue.size, labels: { queue: queue.name })
        end
        sleep 30
      end
    end
  end
end
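One more moving part: the HPA can't read this gauge directly. It has to be exposed on an HTTP endpoint that Prometheus scrapes from each worker pod, then translated into Kubernetes' custom metrics API, typically with prometheus-adapter. A sketch of an adapter rule, assuming the series arrives with standard namespace and pod labels:

# prometheus-adapter config (a sketch): maps the scraped gauge to the
# custom metrics API so the HPA can query it per pod
rules:
  - seriesQuery: 'sidekiq_queue_depth{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: { resource: "namespace" }
        pod: { resource: "pod" }
    name:
      matches: "sidekiq_queue_depth"
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'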

Best Practices and Gotchas

  1. Queue Isolation
    • Separate queues for different job types
    • Consider dedicated workers for critical queues
    • Use queue priorities effectively
  2. Resource Management
    • Set appropriate memory/CPU limits
    • Monitor job memory usage
    • Use batch processing for large datasets (see the sketch after this list)
  3. Error Handling
    • Implement retry strategies
    • Set up dead letter queues
    • Monitor failed jobs
  4. Scaling Behavior
    • Set appropriate scaling thresholds
    • Use stabilization windows
    • Consider time-of-day patterns
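As an example of the batch-processing point above, fanning out notifications in batches keeps each job small and memory-predictable. A sketch, reusing NotificationJob from earlier (NotificationFanOutJob and the notifications_enabled flag are hypothetical):

class NotificationFanOutJob < ApplicationJob
  queue_as :low

  def perform
    # Enqueue one NotificationJob per 500 users instead of handling
    # every user in a single long-running job
    User.where(notifications_enabled: true).in_batches(of: 500) do |batch|
      NotificationJob.perform_later(batch.ids)
    end
  end
end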

Conclusion

By combining Rails' background job capabilities with Kubernetes' scaling features, we can build a robust, scalable system for processing long-running tasks. The key is to:

  1. Move long-running tasks to background jobs
  2. Organize queues based on job characteristics
  3. Configure worker processes appropriately
  4. Use HPA for dynamic scaling
  5. Monitor and adjust based on real-world usage

Remember: The goal isn't just to scale - it's to provide a reliable, responsive system that efficiently processes work while maintaining data consistency and user experience.
