Dishit Devasia

Posted on Dec 27, 2024 • Originally published at open.substack.com

End to End Testing No More

#programming #testing #contracttesitng #webdev

It’s a typical Tuesday afternoon, and your team is gearing up for a routine release—a feature here, a bug fix there.

Nothing groundbreaking, but it’s part of the rhythm that keeps your product evolving.

Developers have wrapped up their work, testers have validated the changes, and the release is queued up for deployment.

E2E tests are triggered to check for impacts on the overall system.

A few test cases fail.

Developers are called in to look into the issues.

Delivery leads and engineering managers are notified of the issues, and soon the whole team is pulled in.

The team investigates for a couple of days.

The result? 1 bug and 4 false positives.

Sometimes a couple of days turns into a week, which means pulling in even more people to understand the consequences of the issue.

Managers are left juggling timelines, explaining delays to stakeholders, and hoping the release goes out somehow.

Developers and testers context-switch from building things to fighting fires.

For teams operating in microservices architectures, this isn’t a one-off nightmare—it’s a recurring challenge.

Weekly or fortnightly releases get bogged down, and instead of delivering value to customers, you’re spending precious time firefighting.

Why Does This Happen?

The root of these challenges lies in the very thing that makes modern systems so powerful: microservices architecture.

In a microservices setup, your application is no longer a monolith; it’s a collection of smaller, independent services working together.

While this brings scalability and flexibility, it also introduces complexity—each service depends on others, and any mismatch or downtime can cause cascading failures, especially during E2E tests.

The main issues are:

Interdependencies - Microservices often rely on other services, databases, or external systems. If one of these isn’t available or has mismatched versions, tests fail.
Test Fragility - E2E tests need stable environments, but in dynamic systems, infrastructure, test data, or service availability can be inconsistent.
Troubleshooting Overhead - Pinpointing the root cause of a failure in a web of services takes time and coordination, delaying releases further.
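
To get a feel for why whole-stack tests become fragile as dependencies pile up, here is a back-of-the-envelope sketch. It assumes every dependency is healthy independently and with the same probability during a test run; the 99% figure and the function name are illustrative assumptions, not measurements from a real system.

```python
def e2e_pass_probability(per_dependency_reliability: float, dependency_count: int) -> float:
    """Chance that every dependency is healthy at once, assuming independence."""
    return per_dependency_reliability ** dependency_count


if __name__ == "__main__":
    # 0.99 is an assumed, illustrative reliability for each dependency.
    for count in (1, 5, 10, 20):
        chance = e2e_pass_probability(0.99, count)
        print(f"{count:>2} dependencies: {chance:.1%} chance of a fully healthy environment")
```

Under these assumptions, 20 dependencies that are each 99% reliable leave you with roughly an 82% chance of a clean run, before any real bug is involved.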

So, how do you address this?

Fixing the E2E tests is the obvious answer. But the tests themselves grow and change as more features are built, so fixing them becomes a moving target: there is bound to be some flakiness whenever you test the whole stack.

The idea here is to approach the problem from an extreme angle so that you reduce the dependency on E2E tests and thus reduce flakiness at the root.

Contract Testing

One way to reduce reliance on E2E tests is by adopting contract testing.

What is Contract Testing?

Contract testing focuses on verifying the interaction between two services. Instead of testing the entire system, it ensures that a service (the consumer) and its dependency (the provider) agree on how they will communicate.

Here’s how it works:

The consumer defines a contract specifying the API calls it will make and the responses it expects.

The provider validates this contract to ensure it meets the consumer’s expectations.
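
The two steps above can be sketched in plain Python. This is a minimal illustration, not how any particular tool works: the contract format, the `/users/42` endpoint, and the handler functions are all hypothetical. In practice you would more likely use a dedicated contract-testing tool such as Pact.

```python
from typing import Any, Callable

# Hypothetical contract written by the consumer: the one call it makes
# to a "users" provider and the response fields it relies on.
CONTRACT = {
    "request": {"method": "GET", "path": "/users/42"},
    "response": {
        "status": 200,
        "body": {"id": int, "email": str},  # field name -> expected type
    },
}

Handler = Callable[[str, str], tuple[int, dict[str, Any]]]


def verify_provider(contract: dict, handler: Handler) -> list[str]:
    """Replay the contract's request against the provider and list mismatches."""
    request, expected = contract["request"], contract["response"]
    status, body = handler(request["method"], request["path"])
    problems = []
    if status != expected["status"]:
        problems.append(f"status: expected {expected['status']}, got {status}")
    for field, field_type in expected["body"].items():
        if field not in body:
            problems.append(f"missing field: {field}")
        elif not isinstance(body[field], field_type):
            problems.append(f"wrong type for {field}: {type(body[field]).__name__}")
    return problems


def users_handler(method: str, path: str) -> tuple[int, dict[str, Any]]:
    """Stand-in for the provider's real request handler."""
    if method == "GET" and path.startswith("/users/"):
        user_id = int(path.rsplit("/", 1)[1])
        return 200, {"id": user_id, "email": "a@example.com", "name": "A"}
    return 404, {}


def renamed_field_handler(method: str, path: str) -> tuple[int, dict[str, Any]]:
    """A provider change that breaks the consumer: 'email' became 'mail'."""
    return 200, {"id": 42, "mail": "a@example.com"}


if __name__ == "__main__":
    print(verify_provider(CONTRACT, users_handler))          # no mismatches
    print(verify_provider(CONTRACT, renamed_field_handler))  # reports the missing field
```

Note that the provider is free to return extra fields (`name` here); the check only fails when something the consumer depends on goes missing or changes type.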

Benefits of Contract Testing

Faster Feedback - You can test interactions without spinning up the entire system.
Reduced Fragility - Tests don’t rely on all services being up and running.
Clear Ownership - Each team owns its contracts, making troubleshooting more focused.
Scalability - Contract tests scale better than E2E tests in large systems with many services.

By implementing contract testing, you can ensure your services work well together, even if they’re deployed independently.

But contract testing alone cannot catch logic changes in downstream systems.

For example, if a shared service like an authentication system updates its logic, it might still pass its contract tests but fail in actual integration scenarios.
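
Here is a small sketch of that gap, with hypothetical names throughout. The auth service keeps the same response shape, so a shape-only contract check still passes, but a rule change (a much shorter token lifetime in this made-up case) breaks a consumer that quietly relied on the old behaviour.

```python
def issue_token_v1(user_id: int) -> dict:
    """Original behaviour: tokens live for an hour."""
    return {"token": f"tok-{user_id}", "expires_in": 3600}


def issue_token_v2(user_id: int) -> dict:
    """Logic change: same response shape, much shorter lifetime."""
    return {"token": f"tok-{user_id}", "expires_in": 60}


def satisfies_contract(response: dict) -> bool:
    """Shape-only check: the agreed fields exist and have the agreed types."""
    return isinstance(response.get("token"), str) and isinstance(response.get("expires_in"), int)


def batch_job_can_finish(response: dict, job_seconds: int = 900) -> bool:
    """Consumer logic that assumes the token outlives a 15-minute job."""
    return response["expires_in"] >= job_seconds


if __name__ == "__main__":
    for issue in (issue_token_v1, issue_token_v2):
        response = issue(7)
        print(issue.__name__, satisfies_contract(response), batch_job_can_finish(response))
```

Both versions satisfy the contract, yet only the first lets the consumer's job finish. That difference only shows up in a test that exercises the two services together.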

Traceability Matrix

To address logic changes and ensure comprehensive testing, you need a traceability matrix.

What is a Traceability Matrix?

A traceability matrix maps requirements, features, or changes to the tests that validate them. In the context of microservices:

It links services and their dependencies to specific test cases.

It identifies which tests to run when a particular service or feature changes.

How It Works

When a shared service updates, the traceability matrix identifies all dependent services and relevant integration tests.

Instead of running the entire E2E suite, you execute only the impacted tests, saving time and effort.

Benefits of a Traceability Matrix

Targeted Testing - Run only the tests that matter, reducing execution time.
Simplified Debugging - Developers and testers focus on the exact areas impacted by changes.
Faster Releases - By avoiding full E2E runs, you streamline deployments.

That said, as business logic grows and tests multiply, maintaining the traceability matrix becomes increasingly complex.

Over time, there’s a real danger of maintainers adding everything into the matrix to cover all possible scenarios.

This leads to the same problem you were trying to solve: running all E2E tests for every release, negating the benefits of targeted testing.

Leverage Domains

One way to keep the traceability matrix manageable is by organizing it around domains.

A domain represents a logical grouping of services and their associated tests within your organization.

For example, you might have domains like Payments, User Management, or Notifications, each with its own traceability matrix.

How Domains Help

Focused Ownership - Each domain has clear owners—teams responsible for maintaining its traceability matrix and ensuring its accuracy.
Simplified Maintenance - By limiting the scope of each matrix to a domain, you reduce its complexity, making it easier to manage and update.
Better Visibility - Domain-specific matrices provide a clear view of dependencies and test coverage within that domain, helping teams identify gaps or redundancies.

Challenges of Using Domains

Boundary Definition - Defining clear boundaries between domains can be tricky, especially in systems with overlapping responsibilities.
Cross-Domain Dependencies - Interactions between domains may require additional coordination, especially when logic spans multiple areas.
Initial Overhead - Setting up domains and assigning ownership requires time and effort, particularly in large organizations with legacy systems.

The Combined Power of Contract Testing, Traceability Matrix, and Domains

Each solution—contract testing, traceability matrix, and domains—addresses a specific aspect of the E2E testing problem. However, their true strength lies in using them together.

Contract Testing ensures service-to-service communication is reliable, reducing the need for full-system tests.
Traceability Matrix targets logic changes, ensuring only the necessary tests run, saving time and effort.
Domains keep the matrix manageable, preventing it from becoming a monolithic structure that’s hard to maintain.

By combining these approaches, you create a robust deployment strategy that balances speed and reliability.

Challenges and Open Questions

While these solutions address many pain points, they aren’t without challenges:

Tooling - What tools can help automate and maintain traceability matrices across domains?
Training - How do you ensure all teams understand and adopt these practices effectively?
Governance - Who oversees cross-domain dependencies and ensures alignment?
Scalability - How do these approaches scale as the organization and system grow further?

These open questions highlight the need for continuous refinement and adaptation of your deployment strategy.

Conclusion

End-to-end testing failures shouldn’t derail your releases.

By adopting contract testing, implementing a traceability matrix, and leveraging domains, you can address the root causes of these delays and deliver faster, more reliable releases.

These approaches empower your teams to focus on delivering value to customers rather than firefighting test failures.

While challenges remain, taking the first step towards modernizing your testing strategy is the key to staying agile in today’s fast-paced development environment.

With this combination, you’re not just fixing a process—you’re laying the foundation for smoother, faster, and more confident releases.

This was first published on my newsletter.