Ever tried processing a million records in a Rails controller action? Yeah, that's not going to end well. Your users will be staring at a spinning wheel, your server will be gasping for resources, and your ops team will be giving you that "we need to talk" look.
The Problem: Long-Running Requests
Picture this: Your Rails app needs to:
- Generate complex reports from millions of records
- Process large file uploads
- Send thousands of notifications
- Sync data with external systems
Doing any of these directly in a controller action (sketched right after this list) means:
- Timeout issues (Nginx, Rails, Load Balancer)
- Blocked server resources
- Poor user experience
- Potential data inconsistency if the request fails
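For concreteness, this is the kind of inline processing we're talking about. The Record model and ReportBuilder below are purely illustrative:
# app/controllers/reports_controller.rb
# Anti-pattern: all the heavy lifting happens inside the request cycle.
class ReportsController < ApplicationController
  def create
    # Loads a huge result set and builds the report while the user waits;
    # any Nginx, Rails, or load balancer timeout kills it partway through.
    records = Record.where(created_at: 1.year.ago..Time.current)
    report  = ReportBuilder.new(records).generate

    send_data report.to_csv, filename: 'report.csv'
  end
end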
Step 1: Moving to Background Processing
First, let's move these long-running tasks to background jobs:
# app/controllers/reports_controller.rb
class ReportsController < ApplicationController
  def create
    report_id = SecureRandom.uuid

    ReportGenerationJob.perform_later(
      user_id: current_user.id,
      report_id: report_id,
      parameters: report_params
    )

    render json: {
      report_id: report_id,
      status: 'processing',
      status_url: report_status_path(report_id)
    }
  end
end

# app/jobs/report_generation_job.rb
class ReportGenerationJob < ApplicationJob
  queue_as :reports

  def perform(user_id:, report_id:, parameters:)
    # Process report
    report_data = generate_report(parameters)

    # Store results
    store_report(report_id, report_data)

    # Notify user
    ReportMailer.completed(user_id, report_id).deliver_now
  end
end
Great! Now our users get immediate feedback, and our server isn't blocked. But we've just moved the problem - now it's in our job queue.
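Since the controller responds with a status_url, the client needs an endpoint to poll. Here's a minimal sketch of one, assuming the job writes a small status hash to Redis under the report_id; the ReportStatusesController name and key layout are illustrative:
# app/controllers/report_statuses_controller.rb
class ReportStatusesController < ApplicationController
  def show
    # Assumes the job stores {status:, download_url:, ...} as JSON in Redis.
    raw = redis.get("report:#{params[:id]}:status")

    if raw
      render json: JSON.parse(raw)
    else
      render json: { status: 'processing' }
    end
  end

  private

  def redis
    @redis ||= Redis.new(url: ENV.fetch('REDIS_URL', 'redis://localhost:6379/0'))
  end
end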
The Scaling Challenge
Even a single Sidekiq worker process needs its queues and concurrency configured deliberately. Here's a basic setup:
# config/initializers/sidekiq.rb
# Note: This is a simplified example. config.options is the Sidekiq 6.x API;
# on newer Sidekiq versions, prefer setting concurrency and queues in
# config/sidekiq.yml instead.
Sidekiq.configure_server do |config|
  # Configure Redis connection
  config.redis = { url: ENV.fetch('REDIS_URL', 'redis://localhost:6379/0') }

  # Configure concurrency based on environment
  config.options[:concurrency] = case Rails.env
                                 when 'production'
                                   ENV.fetch('SIDEKIQ_CONCURRENCY', 25).to_i
                                 else
                                   10
                                 end

  # Queue configuration with weights
  config.options[:queues] = [
    ['critical', 5], # Higher weight = checked more often (probabilistic, not strict priority)
    ['sequential', 3],
    ['default', 2],
    ['low', 1]
  ]
end
And in your config/sidekiq.yml:
# Note: This is a simplified example. Adjust based on your needs
:verbose: false
:concurrency: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 25) %>
:timeout: 25

# Environment-specific configurations
production:
  :concurrency: <%= ENV.fetch("SIDEKIQ_CONCURRENCY", 25) %>
  :queues:
    - [critical, 5]
    - [sequential, 3]
    - [default, 2]
    - [low, 1]
This gives us 25 concurrent jobs in production, but what happens when:
- We have 1000 reports queued up
- Some jobs need to run sequentially (like financial transactions)
- Different jobs need different resources
- We have mixed workloads (quick jobs vs long-running jobs)
Queue Strategy: Not All Jobs Are Equal
Let's organize our jobs based on their processing requirements:
# Note: sidekiq_options inside ActiveJob classes requires a reasonably recent
# Sidekiq; on older setups, use ActiveJob's retry_on/discard_on instead.
class FinancialTransactionJob < ApplicationJob
  queue_as :sequential
  sidekiq_options retry: 3, backtrace: true

  def perform(transaction_id)
    # Must process one at a time. The queue name alone doesn't enforce this;
    # drain this queue with a dedicated worker running concurrency 1
    # (e.g. bundle exec sidekiq -q sequential -c 1).
    process_transaction(transaction_id)
  end
end

class ReportGenerationJob < ApplicationJob
  queue_as :default
  sidekiq_options retry: 5, backtrace: true

  def perform(report_id)
    # Can process many simultaneously
    generate_report(report_id)
  end
end

class NotificationJob < ApplicationJob
  queue_as :low
  sidekiq_options retry: 3

  def perform(user_ids)
    # Quick jobs, high volume
    send_notifications(user_ids)
  end
end
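Since queue placement alone doesn't guarantee one-at-a-time processing, it's worth adding a safeguard at the data layer as well. Here's a sketch using a database row lock; the Transaction model and its account association are hypothetical, and process_transaction is the same helper referenced above:
class FinancialTransactionJob < ApplicationJob
  queue_as :sequential

  def perform(transaction_id)
    transaction = Transaction.find(transaction_id)

    # with_lock wraps the block in a DB transaction and takes a row-level lock
    # on the account, so two workers can't touch the same account concurrently
    # even if queue concurrency is misconfigured.
    transaction.account.with_lock do
      process_transaction(transaction)
    end
  end
end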
Enter Kubernetes HPA: Dynamic Worker Scaling
Now we can set up our worker Deployment and HPA. One prerequisite: sidekiq_queue_depth is a custom metric, so the cluster needs a custom metrics pipeline (for example prometheus-adapter, or an event-driven autoscaler like KEDA) to serve it to the HPA:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rails-workers
spec:
  selector:
    matchLabels:
      app: rails-workers
  template:
    metadata:
      labels:
        app: rails-workers
    spec:
      containers:
        - name: sidekiq
          image: myapp/rails:latest
          command: ["bundle", "exec", "sidekiq"]
          env:
            - name: RAILS_ENV
              value: "production"
            - name: SIDEKIQ_CONCURRENCY
              value: "25"
          resources:
            requests:
              memory: "1Gi"
              cpu: "1"
            limits:
              memory: "2Gi"
              cpu: "2"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rails-workers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rails-workers
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: sidekiq_queue_depth
        target:
          type: AverageValue
          averageValue: 100
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
This setup gives us:
- Minimum 2 worker pods (50 concurrent jobs)
- Maximum 10 worker pods (250 concurrent jobs)
- Automatic scaling based on queue depth
- Conservative scale-down to prevent thrashing
- Resource limits to protect our cluster
Monitoring and Fine-Tuning
To make this work smoothly, monitor:
- Queue depths by queue type
- Job processing times
- Error rates
- Resource utilization
Add Prometheus metrics:
# config/initializers/sidekiq.rb
# Assumes the prometheus-client gem (the current labels:/docstring: API).
require 'prometheus/client'

Sidekiq.configure_server do |config|
  config.on(:startup) do
    queue_depth_gauge = Prometheus::Client.registry.gauge(
      :sidekiq_queue_depth,
      docstring: 'Sidekiq queue depth',
      labels: [:queue]
    )

    # Sample queue sizes on an interval in a background thread.
    Thread.new do
      loop do
        Sidekiq::Queue.all.each do |queue|
          queue_depth_gauge.set(queue.size, labels: { queue: queue.name })
        end
        sleep 30
      end
    end
  end
end
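Registering the gauge only stores values in the worker's memory; Prometheus still needs an HTTP endpoint on the Sidekiq pod to scrape. Ready-made gems such as yabeda-sidekiq handle this for you. If you want to stay with plain prometheus-client, here is a sketch that runs a tiny Rack exporter inside the worker process; it assumes a Rack version that still provides Rack::Handler::WEBrick, and the port number is arbitrary:
# config/initializers/sidekiq_metrics_exporter.rb
require 'rack'
require 'webrick' # on Ruby 3+, add webrick to your Gemfile
require 'prometheus/middleware/exporter'

Sidekiq.configure_server do |config|
  config.on(:startup) do
    Thread.new do
      app = Rack::Builder.new do
        # Serves the default registry at /metrics
        use Prometheus::Middleware::Exporter
        run ->(_env) { [404, { 'content-type' => 'text/plain' }, ['Not Found']] }
      end

      Rack::Handler::WEBrick.run(app, Host: '0.0.0.0', Port: 9394)
    end
  end
end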
Best Practices and Gotchas
Queue Isolation
- Separate queues for different job types
- Consider dedicated workers for critical queues
- Use queue priorities effectively

Resource Management
- Set appropriate memory/CPU limits
- Monitor job memory usage
- Use batch processing for large datasets (see the sketch after this list)

Error Handling
- Implement retry strategies
- Set up dead letter queues
- Monitor failed jobs

Scaling Behavior
- Set appropriate scaling thresholds
- Use stabilization windows
- Consider time-of-day patterns
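On the batch-processing point: loading millions of rows inside a single job is almost as bad as doing it in a controller. Here's a minimal sketch of the fan-out pattern, assuming a User model with a notifications_enabled flag; the fan-out job name is hypothetical, and NotificationJob is the job defined earlier:
class NotificationFanOutJob < ApplicationJob
  queue_as :low

  BATCH_SIZE = 500

  def perform
    # Stream ids from the database in batches instead of loading every user
    # into memory, and enqueue one small NotificationJob per batch.
    User.where(notifications_enabled: true).in_batches(of: BATCH_SIZE) do |batch|
      NotificationJob.perform_later(batch.pluck(:id))
    end
  end
end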
Conclusion
By combining Rails' background job capabilities with Kubernetes' scaling features, we can build a robust, scalable system for processing long-running tasks. The key is to:
- Move long-running tasks to background jobs
- Organize queues based on job characteristics
- Configure worker processes appropriately
- Use HPA for dynamic scaling
- Monitor and adjust based on real-world usage
Remember: The goal isn't just to scale - it's to provide a reliable, responsive system that efficiently processes work while maintaining data consistency and user experience.