DEV Community

Seth Orell for AWS Community Builders

Originally published at sethorell.substack.com

How to Crater Your Database, An Introduction

I've worked with lots of engineers who never consider scale. That's OK. Not everyone has to. However, someone on the team must consider scaling, or your carefully constructed application, which is now starting to grow, could collapse in on itself like a sinkhole. This series of articles is for both classes of engineers: those new to scaling and the experts who might need a refresher.

I'll give away the secret before I even start. If you want to crater[1] your database, do things that don't scale, and then try to scale. That's it!

"But," you may ask, "how do I know which things scale and which don't?" That is a great question, and it will be the subject of several articles, starting with this one.

But I don't need scale

Not every company needs to invest in scale right now. If you are prototyping for market fit, you can use any data store you want, even a flat file.[2]

Kent Beck has an interesting metaphor that helps classify three broad phases of a business: Explore, Expand, and Extract.[3] He refers to this as "3X" and uses it to delineate different approaches to building software.

In the first phase, "Explore," you don't care about scale or load; you care about quickly expressing ideas to generate feedback about what to do next. Companies in this phase should optimize their choices to favor fast prototypes.

The second phase, "Expand," is a growth phase. Scale suddenly becomes paramount. Your load is increasing, and the focus should be on eliminating bottlenecks.

He calls the last phase "Extract." As your scaling rate becomes manageable, efficiency will matter more than growth.

I will be gearing these articles toward software engineers in companies moving into (or neck deep in) phase 2, Expand. If you tell me, "I just need the functionality; it will handle our MVP just fine," you are in the Explore phase. That's fine. Keep these articles bookmarked for when your product becomes successful.


Predictable scaling

Scale means significant growth. Not 10%, but 10x (or more). It's when your biggest customer plans an ambitious expansion and wants to sign a new contract. Here's how it can look.

Sales: "Acme plans to roll out our product to 50 new locations!"
Engineer: "How many locations do they use us on today?"
Sales: "Five."
Engineer (to herself): "But Acme already eats up 80% of our database capacity..."

Part of the problem here is that the engineer doesn't know how the system will behave after a 10x increase. She expects it to suffer, but in what way? To put it another way, the system scales unpredictably.
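To make the engineer's worry concrete, here is a back-of-the-envelope sketch of the Acme numbers. It assumes, purely for illustration, that database load grows linearly with the number of locations. That assumption is exactly what she cannot rely on, so treat the result as a rough lower bound rather than a forecast. The function names and the 80% headroom target are hypothetical.

```python
def projected_utilization(current_utilization: float,
                          current_units: int,
                          added_units: int) -> float:
    """Project utilization as a fraction of today's capacity.

    ASSUMPTION: load grows linearly with the number of units (locations).
    Real systems often degrade faster than this as contention rises.
    """
    per_unit = current_utilization / current_units
    return per_unit * (current_units + added_units)


def capacity_multiple_needed(projected: float,
                             target_utilization: float = 0.8) -> float:
    """How many times today's capacity is needed to stay at the target."""
    return projected / target_utilization


# Acme today: 5 locations consuming 80% of database capacity.
# Acme tomorrow: 50 additional locations.
projected = projected_utilization(0.80, current_units=5, added_units=50)
multiple = capacity_multiple_needed(projected)

print(f"Projected load: {projected:.0%} of today's capacity")
print(f"Capacity needed to stay at 80%: about {multiple:.0f}x today's")
```

Even under the generous linear assumption, Acme alone would need roughly 880% of the capacity you have today. If you cannot say whether the real curve is linear, better, or much worse, the system scales unpredictably.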

Designing a system to scale predictably is an achievement. You must carefully examine many of your previous assumptions (things that worked for a prototype) and ask yourself, "How will this scale?"

The database

The database is a known bottleneck. Every company I've worked with that has gone through a scaling event has fought hard-to-scale database constraints. The architectural advances in scaling stateless resources like compute and networking have exposed the one area that has not made similar progress: the database.

Note that you can broaden the term "database" to "datastore." The database problems I will describe also occur in search engines and file systems. Using the same bad techniques, you can crater these systems, too.

The sweet spot: Serverless

If done prematurely, the energy spent on building scalable systems is wasted. Since you don't yet know if your product will be a hit, you need fast prototypes and quick pivots. Choose technologies that allow you to build quickly.

To scale up a system, you must make architectural and technological choices that support predictable scaling.

The "sweet spot" between these two approaches is to use technologies that let you build fast and offer predictable scaling. If you are comfortable with fully managed AWS services like Lambda, API Gateway, S3, and DynamoDB, you can build fast with predictable scaling from the beginning. In other words, consider a "serverless first" approach.

Any investment in becoming more proficient with these services will pay off in spades as you find yourself in more "Explore" and "Expand" situations.[4] It has for me.
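As a rough illustration of how little code a "serverless first" endpoint can need, here is a minimal sketch of a Lambda handler behind an API Gateway proxy integration that writes one item to a DynamoDB table. Everything specific in it is an assumption made for the example: the `pk` key name, the `LOCATION#` prefix, the `name` field, and the `make_handler` helper. The table is injected so the sketch runs without AWS; in a deployment you would pass something like a boto3 DynamoDB `Table` resource, whose `put_item(Item=...)` call has the same shape.

```python
import json
import uuid
from typing import Any, Callable, Dict, List, Protocol


class Table(Protocol):
    """The single table operation this sketch relies on."""

    def put_item(self, *, Item: Dict[str, Any]) -> Any: ...


def make_handler(table: Table) -> Callable[[Dict[str, Any], Any], Dict[str, Any]]:
    """Build a Lambda handler that stores one location per request."""

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        # API Gateway proxy events deliver the request body as a string.
        try:
            body = json.loads(event.get("body") or "{}")
        except json.JSONDecodeError:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "invalid JSON"})}
        if not isinstance(body, dict) or "name" not in body:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "name is required"})}

        # Hypothetical key design: one item, addressed by its own key.
        item = {"pk": f"LOCATION#{uuid.uuid4()}", "name": body["name"]}
        table.put_item(Item=item)
        return {"statusCode": 201, "body": json.dumps(item)}

    return handler


class InMemoryTable:
    """Stand-in for a DynamoDB table so the sketch runs locally."""

    def __init__(self) -> None:
        self.items: List[Dict[str, Any]] = []

    def put_item(self, *, Item: Dict[str, Any]) -> None:
        self.items.append(Item)


table = InMemoryTable()
handler = make_handler(table)
response = handler({"body": json.dumps({"name": "Store 6"})}, None)
print(response["statusCode"], len(table.items))
```

The handler keeps no state of its own and does a single keyed write per request, which is the kind of access pattern that tends to behave the same at 5 locations as at 55.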

Next Up

In the next installment, we will examine aggregations, one of the most common techniques for cratering your database.

Happy building!

References

Kent Beck's Substack: Software Design: Tidy First?
Robert C. Martin: No DB
Ownership Matters: Everything Suffers from Cold Starts


  1. Merriam-Webster defines crater, used as a verb, to mean “to fail or fall suddenly and dramatically: Collapse, Crash.” 

  2. A tip of the hat to Uncle Bob here. 

  3. Kent Beck: The Product Development Triathlon 

  4. I will close the series with a brief prediction of how serverless interacts with the "Extract" phase.
