Robin XUAN

Your 2025 TestCopilot: Cascade AI Agents, Data Assurance, and E2E Risks

This is a submission for the 2025 New Year Writing challenge: Predicting 2025.


There is no doubt that AI will have a deeper impact on test engineering by 2025. The 2024 World Quality Report highlights that 73% of respondents believe AI and machine learning will significantly advance test automation, indicating a strong industry-wide trend toward AI adoption in test engineering.

(image from WQR 2024)

Many years ago, I helped my wife fine-tune my first machine learning model to classify images, which was my introduction to AI. As a QA engineer, I never imagined AI would play such a significant role in testing, but the rapid pace of change in AI has shown me that a revolution is underway in test engineering.

Things have changed rapidly, and I believe that anyone reading this article will agree that we are witnessing a revolution in the way AI is being adopted in testing!

So in this article I would love to discuss the following practical scenarios in test engineering that I believe will materialize in 2025 and bring significant value toward one of QA's core problems - preventing financial loss early and continuously strengthening brand image.

  • 2 Types of AI agents will enhance Regression Testing
  • Cascade AI agents can streamline Test Management and Script Generation
  • Data Quality will receive increased focus
  • Risks - Relying solely on translating natural language into automated test scripts

At the end, I will share a general vision within the above scope.


2 Types of AI Agents Enhancing Regression Testing

Manual effort, sanity checks, running all E2E test scripts before release, and an expensive-to-maintain huge test suite :) - this may be our stereotype of regression testing.

The two AI agents I will introduce below may significantly enhance the experience of regression testing in 2025.

Agent#1 - The Universal Autonomous AI Testing Agent

Agent#1 mimics customer behavior, especially that of users who are unfamiliar with the product's features, by exploring the product intuitively and providing feedback based on user experience. Think of it this way: we never need to teach a human how to use an e-commerce webshop because the application is largely self-explanatory, so the AI agent can learn the product as a customer and then test it as a customer, applying basic common sense.

By using Agent#1 in regression testing, we essentially build up a virtual AI-powered offshore QA team, where each agent simulates a group of customers with different backgrounds exploring our product with basic knowledge. The agents interact with the product, observe the responses, and, at the end, provide general user-experience feedback plus bugs related to functionality and performance. However, Agent#1 may not be suitable for complex scenarios, or when you want to integrate it deeply with product-specific business logic.

The following picture shows the general idea of Agent#1 for a website.

(image: Agent#1 overview)

The most representative tool I know is Checkie.ai, led by Jason Arbon; see a demo presented by Jason here.
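To make the idea concrete, here is a minimal sketch of an Agent#1-style explorer, assuming a plain Playwright random walk stands in for the real model-driven decision making; the selectors, heuristics, and error-collection rule are all illustrative assumptions of mine, not Checkie.ai's actual implementation:

```typescript
// Minimal sketch of an Agent#1-style explorer: a naive "first-time customer"
// that wanders a site and records crude bug signals. In a real agent, the
// "choose next action" step would be a model call, not a random pick.
import { chromium } from 'playwright'

async function exploreAsCustomer(startUrl: string, maxSteps = 20): Promise<string[]> {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  const findings: string[] = []

  // Treat console errors as a rough proxy for "something looks broken".
  page.on('console', (msg) => {
    if (msg.type() === 'error') findings.push(`console error on ${page.url()}: ${msg.text()}`)
  })

  await page.goto(startUrl)
  for (let step = 0; step < maxSteps; step++) {
    // Scan the page for possible interactions the way a curious user would.
    const clickables = await page.locator('a:visible, button:visible').all()
    if (clickables.length === 0) break
    const target = clickables[Math.floor(Math.random() * clickables.length)]
    try {
      await target.click({ timeout: 3000 })
      await page.waitForLoadState('domcontentloaded')
    } catch {
      findings.push(`interaction failed on ${page.url()}`)
    }
  }

  await browser.close()
  return findings
}
```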

Agent#2 - The ML/Gen AI test agent built with user usage data

Agent#2 is a specialized variant of Agent#1, similar to an Agile QA expert with rich domain knowledge of your product: it is fine-tuned using historical real user usage data from production as inputs, and it generates executable test scripts as outputs.

By using Agent#2 in regression testing, we essentially build up a virtual AI-powered in-house customer-centric SDET team with a deep background in the product. Each agent can autonomously generate and maintain a test suite based on historical usage data. New test cases are automatically created for newly identified frequent usage paths, while outdated test cases are removed as user behavior evolves. Agent#2 also ensures that test-case priorities always align with real user usage, keeping the test focus on preventing failures that could result in significant financial or operational losses.

The following picture shows the general idea of Agent#2.

(image: Agent#2 overview)

The most representative tool at present is TrueTest by Katalon.
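As a thought experiment, Agent#2's maintenance loop could look roughly like the sketch below, assuming usage events have already been aggregated into navigation paths; the types, the threshold, and the function name are hypothetical, not TrueTest's API:

```typescript
// Rough sketch of Agent#2's suite-maintenance loop over aggregated usage paths.
interface UsagePath {
  steps: string[]          // e.g. ["open home", "search", "add to cart"]
  hitsLast30Days: number   // how often real users walked this path
}

function reconcileSuite(
  paths: UsagePath[],
  existingTests: Set<string>, // keys of paths the suite already covers
  minHits = 100,              // hypothetical "frequent path" threshold
) {
  const key = (p: UsagePath) => p.steps.join(' > ')
  const frequent = new Set(paths.filter((p) => p.hitsLast30Days >= minHits).map(key))

  // New frequent paths need generated tests; stale tests get retired.
  const toGenerate = [...frequent].filter((k) => !existingTests.has(k))
  const toRetire = [...existingTests].filter((k) => !frequent.has(k))

  // Priority follows real traffic, so the hottest paths run first in CI.
  const prioritized = paths
    .filter((p) => frequent.has(key(p)))
    .sort((a, b) => b.hitsLast30Days - a.hitsLast30Days)
    .map(key)

  return { toGenerate, toRetire, prioritized }
}
```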

Cascade AI agents streamline Test Management and Script Generation

I don't believe there is a single machine-learning agent that can resolve all test challenges, and I also believe that other machine-learning models (not only generative AI) can help with testing. For example, we can use different machine-learning approaches to classify and extract user paths from user usage data.

Cascading AI-powered test agents means making multiple machine-learning agents work in sequence, with each agent in the group responsible for a specific test task.

Let’s analyze a possible implementation case, illustrated in the following image: using cascaded AI agents to automate the generation of test scripts.

(image: cascaded AI agents)

  • Assume that we have already collected real user usage data from the production environment in a GDPR-compliant way.
  • The 1st agent processes historical user data, applying data mining techniques to classify and extract user paths based on interactions such as clicks, typing, and scrolling. This generates a list of user paths, which serves as input for the next agent.
  • The 2nd agent uses these user paths, along with domain knowledge (e.g., feature requirements from Jira/Confluence), to append assumed assertions.
  • It then passes the user paths, now containing assertions, to the 3rd AI test agent, which, using Playwright best practices, generates flakiness-free test cases. These test cases are directly executed in the CI/CD pipeline without manual intervention and are self-adjusted based on the test results and Playwright execution logs.
  • From test generation to execution, no manual maintenance is required for the generated test code, resulting in a self-maintained automated regression test suite that iterates automatically as the historical user data is updated (a minimal code skeleton of this cascade follows the list).
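Here is a minimal skeleton of that three-stage cascade; the stage bodies are trivial stand-ins for real model or service calls, and every interface name is an assumption made for illustration rather than an existing framework:

```typescript
// Skeleton of the three-stage cascade: mine paths -> add assertions -> emit specs.
interface RawEvent { sessionId: string; action: string; target: string }
interface UserPath { steps: string[] }
interface AssertedPath extends UserPath { assertions: string[] }

// Agent 1: mine ordered interaction paths per session (trivial grouping here;
// a real agent would deduplicate and cluster similar paths).
async function minePaths(events: RawEvent[]): Promise<UserPath[]> {
  const bySession = new Map<string, string[]>()
  for (const e of events) {
    const steps = bySession.get(e.sessionId) ?? []
    steps.push(`${e.action}:${e.target}`)
    bySession.set(e.sessionId, steps)
  }
  return [...bySession.values()].map((steps) => ({ steps }))
}

// Agent 2: enrich paths with assumed assertions from domain knowledge (stubbed;
// a real agent would consult Jira/Confluence requirements).
async function addAssertions(paths: UserPath[]): Promise<AssertedPath[]> {
  return paths.map((p) => ({
    ...p,
    assertions: [`page responds after ${p.steps[p.steps.length - 1]}`],
  }))
}

// Agent 3: emit Playwright spec sources for the CI/CD pipeline to run (stubbed).
async function generateSpecs(paths: AssertedPath[]): Promise<string[]> {
  return paths.map((p, i) => `// spec-${i}: ${p.steps.join(' > ')} | ${p.assertions.join('; ')}`)
}

export async function cascade(events: RawEvent[]): Promise<string[]> {
  return generateSpecs(await addAssertions(await minePaths(events)))
}
```

The point of the shared input/output shapes is that any single stage can be retrained or swapped, for example replacing the path miner with a different clustering model, without touching the other agents.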

Looks fine? But we still need to resolve some technical challenges:

  • How can data mining techniques be used to extract user paths while minimizing the extraction of duplicate paths?
  • What data is needed to append accurate test assertions, and how can our AI agent autonomously handle visual validation, if required?
  • How can the Playwright AI agent distinguish between failures caused by incorrectly generated tests and real issues? Reinforcement learning, maybe?

Even though we cannot resolve and implement all of this within this article, I still believe that layered processing with cascaded AI-powered test agents can maximize the value of AI in test engineering.

Data quality will receive increased focus

(image from WQR 2024)

The 2024 World Quality Report shows that more than 64% of respondents acknowledged that data quality is very important or even critical.

This is truly understandable, and it clearly shows that data is the critical foundation for developing AI and AI-based applications, and that it sits at the heart of business decisions.

In our classic testing model, we discuss unit testing, integration testing, end-to-end testing, and so on, but data quality testing is often overlooked. However, the approaches for ensuring data quality are not a new topic at all. I believe that in 2025 we will recap, standardize, improve, streamline, and monitor them, for example around data-layer validation, validation of data ETL processes, and validation of the data itself.
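As a small illustration of what such checks can look like in practice, here is a sketch of post-ETL validations over an in-memory table; the row shape, rules, and thresholds are all hypothetical:

```typescript
// Sketch of post-ETL data-quality checks: completeness, uniqueness, validity.
interface OrderRow { orderId: string; amount: number | null }

function validateOrders(rows: OrderRow[]): string[] {
  const issues: string[] = []
  if (rows.length === 0) return ['no rows loaded by the ETL step']

  // Completeness: null ratio of a critical column must stay under a budget.
  const nullRatio = rows.filter((r) => r.amount === null).length / rows.length
  if (nullRatio > 0.01) issues.push(`amount null ratio ${nullRatio.toFixed(3)} exceeds 1%`)

  // Uniqueness: the primary key must not repeat after the ETL join.
  if (new Set(rows.map((r) => r.orderId)).size !== rows.length) {
    issues.push('duplicate orderId detected')
  }

  // Validity: a simple business rule, amounts are never negative.
  if (rows.some((r) => r.amount !== null && r.amount < 0)) {
    issues.push('negative amount found')
  }

  return issues
}
```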

Risks - Relying solely on translating natural language into automated test scripts

In the past few years, there have been several great tools, experiments, and research projects around translating natural language into automated test scripts, as a way to follow ATDD/TDD.

It's a really wonderful milestone! In the meantime, I have also noticed challenges and risks in 2025 for teams who would like to entirely replace writing classic end-to-end test scripts with GenAI. I feel that:

  • Current GenAI can definitely help and inspire everyone to create test cases in natural language, but it may struggle to generate maintainable and executable test scripts for complicated scenarios from natural language alone.

  • An overwhelming volume of generated end-to-end tests could lead to challenging maintenance in the future, or even result in the complete abandonment of the auto-generated tests. You may end up stuck in a cycle of auto-generated tests → manual fixes → unstable tests.

  • Even if tests are generated, they still need to be executed effectively. If the test automation process is not yet mature, relying solely on AI-generated tests could lead to execution challenges and the need for manual intervention. Over time, you may also realize that the generated tests cannot scale, because typical automation situations were not considered up front, such as expanding a generated test to execute with different test data (see the sketch below).
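To illustrate that last point, here is a minimal data-driven Playwright sketch of the kind of parameterization that one-shot natural-language generation rarely plans for; the URL, locators, and test data are hypothetical:

```typescript
// Data-driven Playwright sketch: one test template expanded over test data.
import { test, expect } from '@playwright/test'

const searchTerms = ['laptop', 'coffee mug', 'desk lamp'] // illustrative data

for (const term of searchTerms) {
  test(`search returns results for "${term}"`, async ({ page }) => {
    await page.goto('https://example-shop.test') // hypothetical shop URL
    await page.getByRole('searchbox').fill(term)
    await page.keyboard.press('Enter')
    await expect(page.getByTestId('results')).toBeVisible()
  })
}
```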

Starting anything is always the hardest part! Let me share some recent great projects and research on translating natural language into test scripts, achieved by others:

```typescript
// an example from ZeroStep
import { test, expect } from '@playwright/test'
import { ai } from '@zerostep/playwright'

test.describe('Calendly', () => {
  test('book the next available timeslot', async ({ page }) => {
    await page.goto('https://calendly.com/zerostep-test/test-calendly')
    await ai('Verify that a calendar is displayed', { page, test })
    await ai('Dismiss the privacy modal', { page, test })
    await ai('Click on the first day in the month with times available', { page, test })
    await ai('Click on the first available time in the sidebar', { page, test })
    await ai('Click the Next button', { page, test })
    await ai('Fill out the form with realistic values', { page, test })
    await ai('Submit the form', { page, test })
    const element = page.getByText('You are scheduled')
    expect(element).toBeDefined()
  })
})
```
  • Generate test scripts with prompts and source code: research from Checkly about generating Playwright test scripts using Copilot, which shared the following conclusion:

AI tools have the potential to help with test generation but "normal AI consumer tools" aren't code-focused enough. High-quality results require too complex prompts to be a maintainable solution.


Conclusion

In this article, based on insights from other people's work together with my own experience and understanding, I’ve outlined several key areas where AI will likely impact test engineering, in order to address one of our core QA problems - preventing financial loss early and continuously strengthening brand image. I primarily discussed the role of AI test agents and of data quality in shaping the future testing landscape, and I analyzed the risks of building E2E tests by translating natural language into test scripts.

I also acknowledge that AI could play an important role in other testing domains such as code quality, test-execution results analysis, and bug classification and prediction. However, those are beyond the scope of this article.

I deeply believe in, or at least lean towards, the following practices shaping test engineering by 2025:

  • AI technology will be more closely connected and integrated with existing test frameworks, rather than simply generating random test cases. More machine-learning models may be applied in test engineering, not just generative AI.

  • A bold assumption: 50% of regression tests will be complemented by AI agents.

  • Based on the 80/20 rule, and boldly assuming that most customers consume just 20% of a product's features, the regression test suite could shrink to a fifth of its size, with regression tests dynamically updated based on actual customer usage, without significantly compromising quality.

  • Cascaded AI-powered, user-data fine-tuned test agents will significantly reduce manual effort by analyzing and covering real user use cases, enhancing both coverage and accuracy while reusing different AI agents for wider purposes. This collaborative approach between layered AI agents can improve testing efficiency and quality, driving higher levels of automation and standardization in the process. By focusing on scenarios that mirror actual user behavior, developers and product teams can more effectively validate functionality and play a more active role in the testing phase. This shifts testing earlier in the development lifecycle, fostering better cross-team collaboration and optimizing software delivery workflows.

  • Classic testing methods will continue to play a vital role, and when complemented by AI-powered testing, they will enhance the overall testing process. The goal will be to simplify the testing experience while leveraging AI to automate and optimize where it makes sense.

  • We will see more testers and data engineers working together to further strengthen and standardize data quality. We will also clarify our data-quality goals, making data, and the quality of that data, a critical pillar for the business.

  • We may see more vendors providing AI test agents, but choosing one or more of them will take time and money, as we compare and explore these services for our business. Developing these AI agents from scratch requires us to learn machine learning at a deeper level, not just how to use OpenAI's API :)
