Kevin Naidoo

Originally published at kevincoder.co.za

SaaS Founders: Satya Nadella is wrong about AI Agents!?

I have much respect and appreciation for Satya Nadella. He is likely one of the best, if not the best, modern CEOs on the planet, given how he pivoted Microsoft from an enterprise/Windows-based company into the software giant it is today.

In a recent interview, he mentioned that AI agents will collapse SaaS businesses. While I understand the context of his remarks, there's been quite a bit of misinterpretation and exaggeration of what he actually said.

Here's my take on the whole AI Agent revolution:

What is an AI Agent anyway?


An agent, in simplified terms, acts like an orchestra conductor. Just as a conductor guides the musicians to harmonize their performances, an agent will guide the inputs and outputs of a model to generate a more nuanced and complex result.

For example, you may have an application that takes voice calls to book an appointment. A typical LLM cannot access your calendar (unless through RAG) or make an API call to check whether the date is available.

An Agent, on the other hand, can analyze the model's response during the call. If the user mentions a date or time, the Agent can extract that information and query an external API.

It can generate a POST request, and then interpret the API response to feed back to the model.

During this process, the Agent will make multiple calls to the LLM, the API, and other tools to get the information it needs and perform the actions that need to happen.
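
To make that loop concrete, here's a minimal sketch in Python of the appointment example above. Everything here is hypothetical: `call_llm` and `check_availability` are stand-ins for a real model API and a real calendar endpoint, but the shape of the orchestration is the same: prompt, act, feed the result back, repeat.

```python
import json

def call_llm(messages):
    # Hypothetical stand-in for a real LLM call (OpenAI, Anthropic, etc.).
    # Once a tool result is in the conversation it answers the user;
    # until then it asks the agent to run the availability check.
    if any(m["role"] == "tool" for m in messages):
        return {"action": "respond", "content": "That slot is free, you're booked in."}
    return {"action": "check_availability", "date": "2025-06-01T10:00"}

def check_availability(date):
    # Hypothetical stand-in for the calendar API (e.g. a POST request).
    return {"date": date, "available": True}

def booking_agent(user_message):
    messages = [{"role": "user", "content": user_message}]
    # The agent loop: ask the model, run whichever tool it requests,
    # feed the result back, and repeat until the model answers directly.
    for _ in range(5):  # cap the number of round trips
        reply = call_llm(messages)
        if reply["action"] == "check_availability":
            result = check_availability(reply["date"])
            messages.append({"role": "tool", "content": json.dumps(result)})
        else:
            return reply["content"]
    return "Sorry, I couldn't complete that booking."

print(booking_agent("Can I book an appointment for the 1st of June at 10am?"))
```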

Agents are powerful but nowhere near a developer's capabilities

You might be tempted to think that an Agent is essentially replacing the job of a developer. Since it can make API calls, prompt models, and call functions, it might seem sensible to just feed the Agent all the business rules instead of architecting a complex app with "hard-coded" logic; the Agent can then dynamically build its workflow based on the incoming user request.

This sounds good in theory, but there's one major problem: LLMs hallucinate, and even with RAG there is still a percentage of queries that will produce invalid results.

Can you imagine telling a client: "We can automate and get rid of your entire web app by just using AI. Except for one problem: we can only guarantee 80% accuracy!"

Depending on the business, a 20% failure rate could result in thousands of dollars lost due to customer dissatisfaction or failure in the ordering process.

Which business is crazy enough to accept this? Even if they save money on development costs, and maybe hosting costs, is it worth the 20% loss?

Reading for meaning

Hallucination might be a big problem, but it's not the only problem. Take Claude Sonnet, for instance: the model's context window is 200k tokens.

This is big enough to fit a whole novel. You will find, though, that as you stuff more data into the context window, accuracy starts to drop.

Give the model a bullet-point list of, say, 30 rules and you'll find that, more often than not, it skips rules, ignores rules, or does the complete opposite. Why?

The model has no "worldview" or understanding of the context data. While this may be an oversimplification of the current generation of models, the model essentially relies on an advanced pattern-matching algorithm to generate its response.

There is no thinking or understanding of the context data; thus, an advanced model like Sonnet can miss a simple rule that even a seven-year-old would understand.

In the real world, we use forms to control user input. There are only so many ways to fill in a form, but natural language is unstructured, so the variations that can be entered are effectively infinite.

This variation then becomes difficult to manage, and the level of accuracy tends to drop significantly.

Here's a practical example prompt:

You are a service agent helping the user with their lunch order. You must adhere to these rules:

1) When retrieving menu items, you must only retrieve items from the context data provided below and nowhere else.
2) Ask the user whether they prefer to collect from the canteen or would like delivery to their room. If they want delivery, ask for their room number.
3) Menus can have sauce options, such as tomato sauce, white sauce, mustard sauce, etc... You must ask them which sauce they prefer if not mentioned.
...

A customer might say:
"Can I have the chicken burger, make sure you add tomato, onions, and lettuce?"

AI Response:
"Sure, I have recorded 1 X chicken burger with tomato sauce, onions, and lettuce"

A real person would interpret that as "a slice of tomato", which is not the same thing as "tomato sauce".

As you build more complex prompts, you will notice this more and more: the model struggles to comprehend even basic instructions when it has too many steps to follow.

Agents, using chain-of-thought and RAG, can greatly improve the accuracy of models, but yet again there's always going to be that 20%, or even 5%, of the time where they fail.
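
One partial mitigation (not a fix) is to have the Agent request a structured order from the model and then validate it in plain code before committing anything. The sketch below is hypothetical: the `MENU` data, the order schema, and the guardrail heuristic are my own assumptions, but it shows how the "tomato" mix-up can be caught and bounced back as a clarifying question instead of a wrong order.

```python
# Hypothetical guardrail: rather than trusting the model's free-text reply,
# ask it to emit a structured order and cross-check it before committing.
MENU = {"chicken burger": {"tomato sauce", "white sauce", "mustard sauce"}}

def review_order(order: dict, user_message: str) -> list[str]:
    """Return clarifying questions; an empty list means the order looks safe."""
    questions = []
    item = order.get("item", "").lower()
    if item not in MENU:
        return [f"Sorry, '{item}' isn't on our menu today."]
    sauce = order.get("sauce")
    if sauce and sauce not in MENU[item]:
        questions.append(f"We don't have {sauce}. Which sauce would you like?")
    # The "tomato" trap: the model picked a sauce the customer never said.
    if sauce and sauce.lower() not in user_message.lower():
        questions.append(f"Just to confirm: did you want {sauce}, or no sauce?")
    return questions

user_message = ("Can I have the chicken burger, make sure you add "
                "tomato, onions, and lettuce?")
model_order = {"item": "chicken burger", "sauce": "tomato sauce",
               "extras": ["tomato", "onions", "lettuce"]}
print(review_order(model_order, user_message))
# -> ["Just to confirm: did you want tomato sauce, or no sauce?"]
```

Guardrails like this catch some of the obvious mistakes, but they don't change the underlying argument: there is always a slice of requests where the model gets it wrong.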

In the real world

One of the reasons customers prefer SaaS providers over big corporations such as Microsoft is the personalized product features and dedicated support they offer.

While a chatbot is handy and having a copilot to prompt for reports and help with spreadsheets is great, most customers have no clue how to prompt the chatbot to achieve the results they need.

Now, I am not saying they are "stupid" or "incompetent"; I am saying people have different strengths and interests.

To give you an example: I know a little about plumbing and probably could figure out how to replace a geyser, but am I really going to do that the next time it bursts?

Absolutely not! Because:

A) I'd rather leave it to the professionals who know what they're doing.
B) It's going to take me a lot longer, because I'd probably need to Google it and watch YouTube.
C) I have no interest in fixing geysers.
D) I have better things to do with my time.

Most business owners don't want to fiddle with tech stuff, nor do they have the time. No matter how powerful agents get, there's always going to be some SaaS company that comes in and makes things even simpler for the non-technical, and they'll be happy to pay that $20-30 a month for the convenience.

System engineering

AI overwhelmed by systems
Often in a decent-sized SaaS, you will find more than just CRUD. There is usually a mix of complex and simple systems, stitched together across a sea of varying infrastructure: serverless functions, deploy pipelines, Docker containers, dedicated servers, single-page apps, and multi-page monoliths everywhere, not to mention that old legacy stack that nobody wants to touch!

All of these came into being because of customer demand and the need to provide a deep, feature-rich, diverse offering to ward off competitors.

While Agents can build an entire React app or generate a landing page, they do this in isolation. I have yet to see a dev team go on leave and throw the keys to an Agent to take over 🙂

At best, you could build a pipeline of sorts, like Zapier for infrastructure and code. The Agent can then check out the code, fix a bug, and push the change to a PR and a dev server.
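
As a rough illustration of what such a pipeline could look like, here's a hypothetical sketch: the Agent only ever works in a cloned branch, and a human still reviews the PR. The `propose_fix` function is a placeholder for whatever model or Agent framework you'd plug in.

```python
import subprocess

def run(*cmd: str) -> str:
    # Run a shell command and fail loudly if it errors.
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

def propose_fix(workdir: str, bug_report: str) -> None:
    """Placeholder: ask the Agent/LLM to edit files inside workdir."""
    ...

def agent_bugfix_pipeline(repo_url: str, bug_report: str,
                          branch: str = "agent/bugfix") -> None:
    run("git", "clone", repo_url, "workdir")               # 1. check out the code
    run("git", "-C", "workdir", "checkout", "-b", branch)  # 2. isolate the Agent's work
    propose_fix("workdir", bug_report)                     # 3. let the Agent edit files
    run("git", "-C", "workdir", "commit", "-am", f"agent: {bug_report[:50]}")
    run("git", "-C", "workdir", "push", "origin", branch)  # 4. CI deploys this branch
    # 5. A PR is opened against main, and a human still reviews and merges.
```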

Conclusion

So no, Satya Nadella! Just no! Agents are not going to squash SaaS apps; in fact, Agents are going to empower more tech founders to innovate more rapidly, which may end up increasing the number of SaaS apps on the market.

To be fair, I think Satya was referring more to the low-effort office tools like spreadsheets or business reporting tools that basically just dynamically build SQL.

Sure, in that instance, of course, Agents are going to be a game changer.

I would love to hear your thoughts: what do you think about AI Agents and their role in the future of SaaS applications?

PS: If you're looking for more in-depth WebDev and AI-related content, please consider visiting and following my blog at kevincoder.co.za. I would really appreciate your support 🙏.
