Ephemeral environments are the new industry standard for testing and previewing features before they hit staging. They’re generally provisioned upon every pull request, so every feature can be tested in isolation prior to a merge.
Using ephemeral environments to test features is a shortcut to a great developer experience: engineers are most productive when their test environments mirror their current deployment workflows (read: no manual overhead). Swapping them in for traditional static environments helps teams ship faster, test more often, review more thoroughly, and, most of all, trust their deployments.
Ephemeral environment architecture
Ephemeral environments sit at the sweet spot between cost efficiency and production parity. A few architecture patterns help you get the most value out of them:
- They’re entirely isolated from the rest of your infrastructure
  Ephemeral environments shouldn’t share dependencies or services with each other or any of your other environments. This isolates their blast radius in the event of a bug or a regression.
- They’re full-stack, near-production copies of your app
  The best way to test your features? Alongside real services, APIs, etc. Many features don’t get adequate testing until staging (or even production). Your pre-production environments are overwhelmingly more valuable if they can help you shift your real, late-stage testing left.
- They’re GitOps-enabled
  Many environment bottlenecks occur on the human side: e.g. do you need someone to manually create/push your code to a test environment? Ephemeral environments automate the repetitive tasks: they should use GitOps to create and sync environments with your SSOT repository, so they always reflect your latest code changes.
- They’re short-lived
  Treating your environments like cattle is a common pattern for good reason: you don’t need your test environments to be running 24/7, nor do you need them to have static infrastructure. Keeping environments short-lived cuts down on cloud costs, and enables you to quickly provision/deprovision them when needed.
Uber maintains an in-house ephemeral environment system called SLATE (Short-Lived Application Testing Environment). They use it to test new features with their production dependencies.
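To make the patterns above a bit more concrete, here is a minimal Python sketch of how a per-PR environment could be described. Everything in it is an assumption for illustration: the `EphemeralEnvSpec` class, its field names, the namespace naming scheme, and the two-hour idle timeout are hypothetical, not taken from SLATE or any particular platform. It shows isolation (one namespace per pull request), GitOps-style syncing (the spec tracks a commit SHA), and short-livedness (an idle timeout).

```python
import re
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EphemeralEnvSpec:
    """Hypothetical description of one per-PR environment."""

    repo: str
    pr_number: int
    commit_sha: str
    idle_timeout: timedelta = timedelta(hours=2)  # assumed default, tune as needed

    @property
    def namespace(self) -> str:
        # Isolation: each PR gets its own namespace, so nothing is shared
        # between environments. The naming scheme is an assumption.
        slug = re.sub(r"[^a-z0-9]+", "-", self.repo.lower()).strip("-")
        return f"{slug}-pr-{self.pr_number}"

    def synced_to(self, new_sha: str) -> "EphemeralEnvSpec":
        # GitOps: the repository is the source of truth, so a new commit
        # yields a new spec rather than a manual change to the environment.
        return replace(self, commit_sha=new_sha)

    def should_spin_down(self, last_activity: datetime, now: datetime) -> bool:
        # Short-lived: stop the environment once it has been idle long enough.
        return now - last_activity >= self.idle_timeout


if __name__ == "__main__":
    spec = EphemeralEnvSpec(repo="Acme/Web_App", pr_number=42, commit_sha="abc123")
    now = datetime.now(timezone.utc)
    print(spec.namespace)
    print(spec.should_spin_down(last_activity=now - timedelta(hours=3), now=now))
```

A real system would hand a spec like this to whatever provisions the infrastructure; the point here is only that the environment’s identity and lifetime can be derived entirely from the pull request.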
A typical ephemeral environment workflow
Ephemeral environments, when done right, can cut days off deployment lead time. Because they’re on-demand, more team members can work and review asynchronously, without waiting on DevOps/Platform team intervention.
Here’s what a workflow might look like:
- A developer completes a feature and opens a PR
- An ephemeral environment automatically provisions, reflecting the dev’s latest code changes
- The CI pipeline runs unit, integration, and E2E tests against the new environment
- If all tests pass, the dev interacts with the environment and makes sure the feature works as intended. The environment spins down after inactivity
- They loop in peers for code review, design review, product review, etc.
- Reviewers spin up the environment when convenient, approving or requesting changes
- Developer pushes new commits (if applicable), environment syncs and runs CI jobs again
- Once the feature is ready, the dev merges the branch and the environment spins down
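The workflow above can be sketched as a small state machine. This is a minimal illustration, not a real platform’s API: the state names, the event names (`pr_opened`, `commit_pushed`, `idle_timeout`, `reviewer_visit`, `pr_merged`), and the helper functions are all hypothetical.

```python
from enum import Enum
from typing import Iterable


class EnvState(Enum):
    ABSENT = "absent"    # no environment exists for the PR
    RUNNING = "running"  # provisioned and serving traffic
    STOPPED = "stopped"  # spun down after inactivity, can be resumed


# (current state, event) -> next state. Event names are assumptions.
TRANSITIONS = {
    (EnvState.ABSENT, "pr_opened"): EnvState.RUNNING,
    (EnvState.RUNNING, "commit_pushed"): EnvState.RUNNING,  # resync, rerun CI
    (EnvState.RUNNING, "idle_timeout"): EnvState.STOPPED,
    (EnvState.STOPPED, "reviewer_visit"): EnvState.RUNNING,
    (EnvState.STOPPED, "commit_pushed"): EnvState.RUNNING,
    (EnvState.RUNNING, "pr_merged"): EnvState.ABSENT,
    (EnvState.STOPPED, "pr_merged"): EnvState.ABSENT,
}


def next_state(state: EnvState, event: str) -> EnvState:
    """Return the state after `event`; unknown combinations are ignored."""
    return TRANSITIONS.get((state, event), state)


def replay(events: Iterable[str]) -> EnvState:
    """Apply a sequence of events, starting with no environment."""
    state = EnvState.ABSENT
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    history = ["pr_opened", "idle_timeout", "reviewer_visit",
               "commit_pushed", "pr_merged"]
    print(replay(history))
```

Keeping the transitions in a table makes it easy to see that every path ends with the environment torn down once the PR is merged, whether it was running or stopped at the time.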
This whole loop can take anywhere from a couple of hours to a few days, depending on how complex the feature is. Test environment bottlenecks are a huge time sink, so switching to an ephemeral model can increase development velocity by upwards of 40%.
The DORA and DevEx impact
Ephemeral environments have a positive ripple effect on engineers’ lives, but aside from the obvious (ticket velocity, lead time, etc.), how can you quantify their value?
DORA metrics
DORA (DevOps Research and Assessment) metrics are the de-facto way to measure feature throughput and stability. They help teams benchmark their performance in a standard, straightforward way, and set metric goals to foster continuous improvement.
Ephemeral environments, conveniently, have a direct impact on the four key DORA metrics: deployment frequency, lead time for changes, change failure rate, and time to restore service. At its core, DORA guides teams towards good deployment processes (continuous delivery, flexible infrastructure, scalable architecture, etc.) so they can improve the quality of their processes (and consequently, their software).
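As a rough illustration of how a team might track these numbers before and after adopting ephemeral environments, here is a minimal Python sketch that summarizes the four key metrics from a list of deployment records. The `Deployment` record shape, the `dora_summary` function, and the choice of medians are assumptions made for this example, not an official DORA calculation.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional, Sequence


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""

    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deploys: Sequence[Deployment], window_days: float) -> dict:
    """Summarize the four key metrics over a window of `window_days` days."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
    ]
    failures = [d for d in deploys if d.failed]
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployment_frequency_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": len(failures) / len(deploys),
        # None when there were no restored failures in the window.
        "median_time_to_restore_hours": median(restore_hours) if restore_hours else None,
    }
```

Running the same summary over a window before and a window after the switch gives a like-for-like comparison, which is usually more convincing than a single headline number.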
Developer Experience (DevEx)
Developers are most productive when their tooling and processes support them in doing their best work. An investment in DevEx not only improves developer quality of life, but also has a positive impact on productivity and software quality.
The biggest hindrances to DevEx are tasks that disrupt flow. If a dev can’t get access to an environment shortly after pushing code, they’re forced to context switch, thus losing momentum. Ephemeral environments ensure devs have the infrastructure they need to test and preview their code on the spot, so they can keep developing within the same day (instead of a week later).
Teams can assess their DevEx quotient using the DevEx framework or SPACE metrics.
TL;DR: ephemeral environments are invaluable
In short, ephemeral environments benefit engineering processes in nearly every possible way. There’s a reason they’re quickly becoming so widespread: they have a massive ROI relative to their cost/effort.
A quick recap:
- They’re cost-effective
  Since they’re short-lived, you only pay for what you use, thus you can have as many environments as you need
- They promote collaboration
  Teams can review per-feature environments, getting to view and interact with code changes prior to staging
- They improve developer QoL
  On-demand infrastructure means that developers can stay in flow and test on the spot, instead of waiting for manual DevOps intervention
- They enable teams to ship faster
  Ephemeral environments eliminate environment bottlenecks, meaning that deployment lead times become significantly shorter
Looking for managed ephemeral environments? Try Shipyard. Your DORA metrics will thank you :)