
Dustin

Posted on • Originally published at duske.me

Model Context Protocol (MCP) - Should I stay or should I go? 🎶

In this article, we'll explore the Model Context Protocol (MCP) briefly and help you decide whether it deserves your attention or can be safely ignored for now.
The AI landscape has been buzzing with excitement around Large Language Models (LLMs), and MCP has emerged as one of the key protocols in this rapidly evolving ecosystem.

As with any hype, it is important to take a step back and understand the basics before hopping on the train - choo choo 🚂!
So here is my shot at explaining MCP's use cases and their benefits.

Go to TL;DR section.

Getting external data into the LLM

Traditionally, LLMs are trained on a vast amount of data up to a certain point in time. That means there is a cut-off point for the data available to the LLM - anything newer than that is unknown to it.
However, in many cases, you want to use LLMs to process data that lives outside of the training data.
This can be, for example:

  • customer support chats
  • product descriptions
  • the latest memes on the internet

Smart people came up with a number of ways to do that:

  • Finetuning the model on the data. This is a very time-consuming process and does not scale well.
  • Use a so-called "Knowledge Base" to store the data and RAG (Retrieval-Augmented Generation) to answer questions. A good fit for knowledge retrieval tasks.
  • Function calling: Provide functions (e.g. custom code) with semantic meaning to the LLM. The LLM can then decide whether to call a function or not. For example, the prompt could be: "Please check if the user is eligible for a discount" and the function could be a check_discount_eligibility function (see the sketch after this list).
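
To make the last option concrete, here is a minimal sketch using OpenAI's chat completions tool-calling API - the `check_discount_eligibility` function and its placeholder logic are invented for illustration:

```python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical business logic - invented for this example.
def check_discount_eligibility(user_id: str) -> bool:
    return user_id.startswith("vip-")

# Describe the function to the model with semantic meaning.
tools = [{
    "type": "function",
    "function": {
        "name": "check_discount_eligibility",
        "description": "Check if a user is eligible for a discount.",
        "parameters": {
            "type": "object",
            "properties": {"user_id": {"type": "string"}},
            "required": ["user_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Is user vip-42 eligible for a discount?"}],
    tools=tools,
)

# The model only *requests* the call - our application executes it.
for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "check_discount_eligibility":
        args = json.loads(call.function.arguments)
        print(check_discount_eligibility(args["user_id"]))
```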

Options for Integrating External Data into LLM Apps

Design time vs run time

If we look at the 3 techniques above, we can see that they all have one thing in common: They are integrated at design time - meaning that an engineer needs to carefully integrate the data/code into an application before the user can use it.
This works well for many use cases where you - the developer - have control over the underlying LLM/agent and want to achieve the best results.
In addition, as long as LLM agents are still a bit clunky, building such an application requires fine-tuning and adjustments anyway.

However, once you become a user of such an application, you are no longer in control of the underlying LLM/agent. For example, when you use the Cursor editor, you use its agent to help you write code, but you don't rewire Cursor's internals.

This is where MCP servers come into play. These servers provide functionality according to a defined protocol - the Model Context Protocol - and can be integrated at runtime.
Imagine using Cursor, an AI-powered code editor, to write database queries. You don't control its internal agent, but with an MCP server, you can plug in your Postgres schema at runtime - no need to wait for Cursor's developers to build it in. This flexibility lets users extend apps instantly, bypassing the delays of design-time updates.
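
In practice, plugging in such a server is just configuration. A sketch along the lines of Cursor's `mcp.json` format, using the reference Postgres server and a placeholder connection string (check Cursor's docs for the exact file location and options):

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://localhost/mydb"
      ]
    }
  }
}
```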

Is this just an API?

Not quite. Unlike stateless REST APIs or design-time function calling, MCP is an open, standardized protocol for applications to provide context to LLMs, using stateful connections, a client-server architecture, and pre-defined capabilities and messages. It is powered by a JSON-RPC API (not necessarily over HTTP) that is defined in the MCP specification and inspired by the Language Server Protocol.
I think a lot of confusion arises from the fact that MCP is often compared to conventional REST APIs or function calling, especially when the use cases are trivial.
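
To get a feel for the server side, here is roughly what a minimal MCP server looks like with the official Python SDK's `FastMCP` helper - the `get_weather` tool and its canned answer are made up for illustration:

```python
# pip install mcp
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    # A real server would query a weather API here.
    return f"It is sunny in {city} today."

if __name__ == "__main__":
    mcp.run()  # speaks MCP's JSON-RPC over stdio by default
```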

For function calling, the integration is done at design time and the function is part of the application - so there is a mismatch.
Let's take a look at the API vs. MCP discussion: At first glance, the two look similar, as you could design an LLM-powered application that consumes any OpenAPI-compliant API and converts it into tools on the fly.
In fact, there is even an OpenAI cookbook for that.
Having a stateful, 1:1-mapped client-server connection as MCP defines it just to get the weather in a certain city is a bit of overkill. And if it is just a small stateless REST API, providing an OpenAPI runner is good enough.

But once you have a more complex use case that involves state or requires deep interaction between the LLM and the application, MCP can be a great fit.
For instance, sampling allows servers to request LLM completions through the client, keeping the client in control of data and model access, while roots let the client define the resources - such as filesystem directories - that the MCP server should work with.
Of course, such complex workflows require a powerful client, which might be missing in some users' applications. And as with any new technique, debugging and tooling are not as mature as for battle-proven HTTP APIs.
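
To illustrate, a sampling request on the wire is a JSON-RPC message shaped roughly like the following (structure per the MCP specification; the prompt text is invented):

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": { "type": "text", "text": "Summarize this log file." }
      }
    ],
    "maxTokens": 200
  }
}
```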

The network

No, not the network of the internet - but the people and companies behind a standard.
As MCP is designed for AI engineering, it attracts a fresh group of people who are passionate about the future of AI.
They can then participate in the development of an open standard - which suggests no lock-in.

What makes it even more interesting is that it is backed by Anthropic, which has a great standing in the developer community, thus providing visibility and trust in the long-term perspective of the standard.
The more people implement MCP servers, the more attractive it becomes for users to adopt them, as they know they will be supported in the future.
This will in turn drive the adoption of MCP, and the standard will become more robust and mature. Looking back at the last months, we can definitely see a sharp increase in the number of MCP servers (1,100) as well as clients and registries (per Why MCP Won).
Pair this with a fast-evolving roadmap and lessons from similar protocols like LSP (Language Server Protocol) and you (might?) have a recipe for success.

TL;DR

If you are not in control of the underlying Agentic LLM, MCP servers can be a great way to add external data and functionality to the application at runtime.
Think of plugins, that extend the capabilities of the LLM without the need to change the underlying model or the application.
Thus, if you are a user, MCP servers can supercharge your LLM-powered application with additional capabilities.

If you are a developer and design the actual system, MCP can be overkill if you just want to integrate a stateless (RESTful) API - which is quite common.
Relying on conventional tooling like OpenAPI, function calling, or third-party toolkits like LangChain's is good enough for many use cases. So far, tools need tailored agent logic to be truly useful anyway.

Still, APIs and standards are only as powerful as the people behind them, and MCP is growing and evolving fast while already having a large group of supporters.
Such network effects can make MCP the de-facto standard for LLM integration in the future - even if it is not perfect for every use case.
As with many topics in the AI space, take predictions with a grain of salt and enjoy the ride.

For an even deeper dive, check out the Why MCP Won article by Latent Space.

