TL;DR:
If your application goes down during a Redis outage while you're using Redis as a cache, check your Redis client configuration. In that case you're not actually using Redis as a cache.
Redis cache anti-pattern
Redis is most often used as a cache to speed up server response times. Yet I've often seen applications crash when the Redis server fails (because its memory fills up or its connection limit is reached). This is an anti-pattern and shouldn't happen: your application should keep functioning during a Redis outage, albeit with increased latency. The problem typically occurs because the Redis client is configured as if Redis were the primary data source, so when Redis goes down it keeps retrying with backoff instead of gracefully falling back to the primary database.
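The graceful-fallback idea can be sketched as a cache-aside read. This is a minimal sketch, not the demo's code: `redis` and `fetchFromDb` are stand-ins for an ioredis client and a database query, and the stubbed client below simulates an outage.

```javascript
// Cache-aside read with graceful fallback.
// `redis` and `fetchFromDb` are hypothetical stand-ins; in a real app they
// would be the ioredis client and a database query.
async function getItem(redis, fetchFromDb, id) {
  try {
    const cached = await redis.get(`item:${id}`);
    if (cached) return JSON.parse(cached);
  } catch (err) {
    // Redis is down or slow: log and fall through to the database
    console.error("cache read failed:", err.message);
  }
  const item = await fetchFromDb(id); // primary data source
  try {
    // Best-effort cache write with a TTL; ignore failures
    await redis.set(`item:${id}`, JSON.stringify(item), "EX", 300);
  } catch {}
  return item;
}

// Simulate an outage: every cache call rejects
const downRedis = {
  get: async () => { throw new Error("ECONNREFUSED"); },
  set: async () => { throw new Error("ECONNREFUSED"); },
};

getItem(downRedis, async (id) => ({ id, name: "from db" }), 1)
  .then((item) => console.log(item.name)); // prints "from db"
```

The key point is that both cache calls are wrapped in try/catch, so a cache failure degrades to a database read instead of an unhandled rejection.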
Demo
I created a simple demo of this using Node.js and the ioredis library here.
In the `main` branch, we have a simple Redis configuration like this:
```javascript
const Redis = require("ioredis");

// Basic connection -- no error handling yet
const redis = new Redis({
  host: process.env.REDIS_HOST || "redis",
  port: Number(process.env.REDIS_PORT) || 6379,
  password: process.env.REDIS_PASSWORD || "",
});
```
Let's run it using Docker:

```shell
docker compose up
```

```shell
# Create a new item
curl -X POST http://localhost:8080/items \
  -H "Content-Type: application/json" \
  -d '{"name":"test item","value":"test value"}'

# Get the item
curl http://localhost:8080/items/1
```
Now, let's bring Redis down using `docker stop `. The application crashes, because the ioredis connection error is unhandled.
Now let's check out another branch where we handle Redis connection errors:
```javascript
// Handle errors without crashing
redis.on("error", (err) => {
  console.error("[ioredis] Connection error:", err.message);
  // Don't crash the app, just log the error
});
```
Now, if we stop the Redis Docker container, the app doesn't crash anymore. We've made some progress. However, our requests are taking forever to get a response.
Why is that? Because our configuration still isn't resilient to Redis outages: every command waits while ioredis keeps retrying it against a server that is down (the default `maxRetriesPerRequest` is 20, far too many for a cache).
Let's make it even better:

```javascript
const Redis = require("ioredis");

// Redis connection tuned for cache usage
const redis = new Redis({
  host: process.env.REDIS_HOST || "redis",
  port: Number(process.env.REDIS_PORT) || 6379,
  password: process.env.REDIS_PASSWORD || "",
  connectTimeout: 100, // fail fast: 100 ms connection timeout
  maxRetriesPerRequest: 1, // try each command once, then give up
});
```
Now, our application is resilient to Redis outages. It remains responsive even when Redis goes down.
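As a further refinement (a sketch, not part of the demo), ioredis also lets you cap reconnection attempts with a custom `retryStrategy`; returning `null` tells the client to stop reconnecting. The policy itself is just a function of the attempt count:

```javascript
// Hypothetical reconnect policy: linear backoff capped at 500 ms,
// giving up entirely after 10 attempts
function retryStrategy(times) {
  if (times > 10) return null; // null tells ioredis to stop reconnecting
  return Math.min(times * 50, 500); // delay in milliseconds before the next attempt
}

// Usage with ioredis (sketch):
// const redis = new Redis({ host: "redis", retryStrategy });

console.log(retryStrategy(1)); // → 50
console.log(retryStrategy(11)); // → null
```

This bounds how long a dead Redis can hold client resources, instead of reconnecting forever in the background.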
Conclusion
Of course, you should run a high-availability setup and configure `maxmemory` with an eviction policy so that Redis doesn't go down in the first place, and have alerts in place if it does. But a Redis cache outage shouldn't turn into a P0 incident for your application.
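For reference, a typical cache-oriented `redis.conf` fragment looks like this (the values are illustrative, not a recommendation):

```
maxmemory 256mb
maxmemory-policy allkeys-lru
```

With `allkeys-lru`, Redis evicts the least recently used keys when memory fills up instead of refusing writes.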