Seenivasa Ramadurai

Building an AI Agent with Anthropic’s Model Context Protocol (MCP) and LangChain Adapters

Have you ever wanted to create your own AI agent that can write stories, generate images, and search the web all in one go? Well, that’s exactly what we’re going to build today! In this blog post, I’ll guide you through developing a powerful AI agent using Anthropic’s Model Context Protocol (MCP) with LangChain MCP adapters.

What We're Building

We’ll be creating a multi-functional AI agent that combines three specialized MCP servers to deliver the following capabilities:

  1. A Story Writer: This tool crafts engaging stories based on any topic you provide.
  2. A Story Image Generator: This tool creates images to accompany the stories generated by the Story Writer.
  3. A Google Search Tool: This tool helps the AI agent search the web to research any topic, gathering relevant information.

The best part? We’ll connect all these services using the MultiServerMCPClient, allowing us to seamlessly integrate and interact with each tool via LangChain. This approach makes it easy to build complex workflows while maintaining flexibility.

Getting Started: Required Packages

Before we dive into the implementation, we need to install a few Python packages. These will provide the necessary functionalities to interact with LangChain, MCP, and Google Search.

Here’s what you need:

  • langchain-openai: The adapter for OpenAI’s language models.
  • langchain-mcp-adapters: The adapters to interact with MCP servers (this also pulls in the mcp package that provides FastMCP).
  • langgraph: Provides the prebuilt ReAct agent we’ll use in the client.
  • googlesearch-python: A package to facilitate Google searches.
  • python-dotenv: Loads API keys from a .env file, which every script below relies on.

To install these packages, run the following command in your terminal:

pip install langchain-openai langchain-mcp-adapters langgraph googlesearch-python python-dotenv
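The servers and the client below all load their credentials from a .env file via python-dotenv, so create one in your project directory before running anything. Assuming you use OpenAI models as the code does, it only needs your API key (the value shown is a placeholder):

OPENAI_API_KEY=your-openai-api-key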

Setting Up Our MCP Servers

Now, let’s set up our three MCP servers. Each server will handle a different task in the pipeline: story writing, image generation, and Google search. Here's a breakdown of how we’ll implement each server.

1. Story Writer MCP Server

We’ll start by creating the Story Writer server. This tool will take a topic and generate a short story about it, providing the result in Markdown format.

from mcp.server.fastmcp import FastMCP
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

load_dotenv(override=True)
model = ChatOpenAI(model="gpt-4o-mini", verbose=True)
mcp = FastMCP("storywriter")

@mcp.tool()
async def write_story(topic: str) -> str:
    """Write a story.

    Args:
        topic: The story topic

    Returns:
        The written story as a string
    """
    try:
        messages = [
            (
                "system",
                "You are a talented story writer. Create an engaging short story on the given topic in a maximum of 100 words. Provide the output in markdown format only.",
            ),
            ("human", f"The topic is: {topic}"),
        ]
        ai_msg = await model.ainvoke(messages)
        return ai_msg.content
    except Exception as e:
        return f"An error occurred while writing story: {e}"

if __name__ == "__main__":
    mcp.run()
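By the way, if you install the MCP CLI (pip install "mcp[cli]"), you can test a server like this on its own with the MCP Inspector before wiring it into the agent. Assuming you saved the file as write_blog.py (the filename the client script below expects):

mcp dev write_blog.py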

2. Story Image Generator MCP Server

Next, we’ll build the Story Image Generator tool. This tool takes the story topic and creates images to complement the story. In this example we’ll use OpenAI’s image generation API (DALL·E), but you could swap in another model such as Stable Diffusion.

from mcp.server.fastmcp import FastMCP
import openai
from openai import AsyncOpenAI
from typing import List
from dotenv import load_dotenv

load_dotenv(override=True)
client = AsyncOpenAI()
mcp = FastMCP("image")

@mcp.tool()
async def generate_images(topic: str) -> List[str]:
    """Generate header images for a story.

    Args:
        topic: The story topic

    Returns:
        The list of image URLs
    """
    image_url_list = []
    try:
        images_response = await client.images.generate(
            model="dall-e-2",  # DALL·E 2 allows n > 1; DALL·E 3 only accepts n=1 (and the "style" parameter is DALL·E 3-only)
            prompt=f"Photorealistic image about: {topic}.",
            n=3,
            response_format="url",
        )
        for image in images_response.data:
            image_url_list.append(image.url)
        return image_url_list  # return after collecting all URLs, not inside the loop
    except openai.APIConnectionError as e:
        return [f"An error occurred while generating images: {e}"]

if __name__ == "__main__":
    mcp.run()

3. Google Search MCP Server

Finally, we’ll set up the Google Search server. This tool will allow our AI agent to search Google for relevant information on a given topic. It will return the results in Markdown format, making it easy to read.

from mcp.server.fastmcp import FastMCP
from dotenv import load_dotenv
from googlesearch import search

load_dotenv(override=True)

mcp = FastMCP("googlesearch")

@mcp.tool()
async def search_google(query: str) -> str:
    """Search Google for the query and return results as markdown formatted text.

    Args:
        query: The search query

    Returns:
        Search results formatted in markdown
    """
    try:
        # googlesearch's search() is synchronous; fine for a simple demo
        search_results = list(search(query, num_results=5))  # Limit to 5 results
        if not search_results:
            return "No results found."

        # Format the search results in Markdown
        markdown_results = "### Search Results:\n\n"
        for idx, result in enumerate(search_results, 1):
            markdown_results += f"**{idx}.** [{result}](<{result}>)\n"
        return markdown_results
    except Exception as e:
        return f"An error occurred while searching Google: {e}"

if __name__ == "__main__":
    mcp.run()

Connecting Everything with MultiServerMCPClient

Now that we’ve set up the individual servers, let’s connect them using the MultiServerMCPClient. This client allows us to integrate multiple MCP servers into one coherent workflow.

import asyncio
import sys
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_core.messages import AIMessage
from dotenv import load_dotenv

load_dotenv(override=True)

model = ChatOpenAI(model="gpt-4o")

# Launch each MCP server with the same Python interpreter running this script
python_path = sys.executable

async def main():
    async with MultiServerMCPClient() as client:
        await client.connect_to_server(
            "storywriter",
            command=python_path,
            args=["write_blog.py"],
            encoding_error_handler="ignore",
        )
        await client.connect_to_server(
            "imagegenerator",
            command=python_path,
            args=["image.py"],
            encoding_error_handler="ignore",
        )
        await client.connect_to_server(
            "googlesearch",
            command=python_path,
            args=["google_search.py"],
            encoding_error_handler="ignore",
        )

        # Build a ReAct agent that can call any tool exposed by the three servers
        agent = create_react_agent(model, client.get_tools(), debug=True)
        agent_response = await agent.ainvoke(
            input={"messages": "Write a story about Lord Krishna and Arjuna, generate images for it, and also search Google for it"},
            debug=True,
        )

        parsed_data = parse_ai_messages(agent_response)
        for ai_message in parsed_data:
            print(ai_message)
        print("Story written successfully")

def parse_ai_messages(data):
    """Pull out only the AI messages from the agent's final state and format them."""
    messages = dict(data).get('messages', [])
    formatted_ai_responses = []

    for message in messages:
        if isinstance(message, AIMessage):
            formatted_message = f"### AI Response:\n\n{message.content}\n\n"
            formatted_ai_responses.append(formatted_message)

    return formatted_ai_responses

if __name__ == "__main__":
    asyncio.run(main())

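To run the agent, save the three servers as write_blog.py, image.py, and google_search.py (the filenames referenced in the args above) alongside this client script, then run the client with Python. The client spawns each server as a subprocess, gathers their tools, and hands them all to the ReAct agent.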

Agent Output

(Screenshot of the agent’s console output.)

Beyond the Basics: Taking our AI Agent Further

This is just the beginning! We've already built a powerful AI agent capable of writing stories, generating images, and even searching the web. But there’s so much more we can do to enhance its capabilities. In this section, let’s explore some exciting ways to take our AI agent to the next level:

Adding a Text-to-Speech Server to Narrate our Stories

Imagine our AI agent could read the stories it generates aloud. By integrating a text-to-speech (TTS) server, we can bring stories to life with a human-like voice, adding an extra layer of interactivity and making the stories more immersive and engaging.

There are a variety of TTS tools to choose from, such as Google Cloud Text-to-Speech, Amazon Polly, or open-source options like pyttsx3. Any of these can be integrated into our existing workflow to narrate stories seamlessly.
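As a rough illustration, here’s a minimal sketch of what a TTS MCP server could look like using OpenAI’s text-to-speech API (the server name, tool name, model, voice, and output filename below are my own choices, not part of the original setup):

from mcp.server.fastmcp import FastMCP
from openai import AsyncOpenAI
from dotenv import load_dotenv

load_dotenv(override=True)
client = AsyncOpenAI()
mcp = FastMCP("narrator")

@mcp.tool()
async def narrate_story(story: str) -> str:
    """Convert a story to speech and save it as an MP3 file.

    Args:
        story: The story text to narrate

    Returns:
        The path to the saved audio file
    """
    try:
        # Model and voice are assumptions; swap in your preferred TTS provider
        response = await client.audio.speech.create(
            model="tts-1",
            voice="alloy",
            input=story,
        )
        with open("story.mp3", "wb") as audio_file:
            audio_file.write(response.content)
        return "story.mp3"
    except Exception as e:
        return f"An error occurred while narrating: {e}"

if __name__ == "__main__":
    mcp.run()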

Implementing a Story Editor to Refine Generated Stories

While our AI agent can generate compelling stories, sometimes a little polishing is needed. By implementing a story editor, we can give users the ability to refine the generated text. This could include features like sentence rephrasing, grammar correction, and suggestions to improve writing style.

Adding an editing layer allows users to tailor the stories to their personal preferences, making them even more customizable and aligned with their style.
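Since our servers all follow the same pattern, adding an editor is straightforward. Here’s a minimal sketch reusing the same ChatOpenAI setup as the Story Writer (the server name, tool name, and system prompt are my own):

from mcp.server.fastmcp import FastMCP
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

load_dotenv(override=True)
model = ChatOpenAI(model="gpt-4o-mini")
mcp = FastMCP("storyeditor")

@mcp.tool()
async def edit_story(story: str, instructions: str) -> str:
    """Refine a story: fix grammar, rephrase sentences, improve style."""
    try:
        messages = [
            (
                "system",
                "You are a careful editor. Revise the story per the instructions, fixing grammar and improving style. Return markdown only.",
            ),
            ("human", f"Instructions: {instructions}\n\nStory:\n{story}"),
        ]
        ai_msg = await model.ainvoke(messages)
        return ai_msg.content
    except Exception as e:
        return f"An error occurred while editing: {e}"

if __name__ == "__main__":
    mcp.run()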

Creating a Translation Server to Convert Stories to Multiple Languages

What if our AI agent could generate stories in multiple languages? Adding a translation server would allow the stories to be translated into various languages. This opens up the potential for a global audience. Tools like Google Translate API, DeepL, and Microsoft Translator Text can help achieve this.

This feature could make our AI agent more inclusive and accessible, offering stories in different languages and expanding our reach worldwide.
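A simple version can even reuse the LLM itself as the translator, following the same server pattern as before (a sketch; the names and prompt are mine, and a dedicated service like DeepL could sit behind the same tool interface instead):

from mcp.server.fastmcp import FastMCP
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

load_dotenv(override=True)
model = ChatOpenAI(model="gpt-4o-mini")
mcp = FastMCP("translator")

@mcp.tool()
async def translate_story(story: str, target_language: str) -> str:
    """Translate a story into the target language."""
    try:
        messages = [
            (
                "system",
                "You are a skilled literary translator. Translate the story faithfully, preserving tone and formatting. Return markdown only.",
            ),
            ("human", f"Translate into {target_language}:\n\n{story}"),
        ]
        ai_msg = await model.ainvoke(messages)
        return ai_msg.content
    except Exception as e:
        return f"An error occurred while translating: {e}"

if __name__ == "__main__":
    mcp.run()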

Building a Web Interface for User Interaction

To make our AI agent more accessible, consider building a web interface where users can interact with it directly. This could allow users to provide topics, read generated stories, view images, and even listen to stories being narrated.

Frameworks like FastAPI or Streamlit make it quick to build a web application, and a user-friendly interface would let people easily interact with our AI agent, enhancing the overall experience.
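As a rough sketch, a Streamlit front end might look like this, where run_agent is a hypothetical helper that wraps the MultiServerMCPClient setup from the client script above and returns the formatted AI responses:

import asyncio
import streamlit as st
from client import run_agent  # hypothetical helper wrapping the MultiServerMCPClient setup above

st.title("AI Story Agent")
topic = st.text_input("Enter a story topic")

if st.button("Generate") and topic:
    with st.spinner("Generating story, images, and search results..."):
        # Streamlit scripts run synchronously, so drive the async agent with asyncio.run
        responses = asyncio.run(
            run_agent(f"Write a story about {topic}, generate images for it, and search Google for it")
        )
    for response in responses:
        st.markdown(response)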

Conclusion

In this blog post, we’ve built a comprehensive AI agent using Anthropic’s Model Context Protocol (MCP) and LangChain. By combining multiple specialized MCP servers—story writing, image generation, and Google search—we’ve created a versatile agent that can write stories, generate relevant images, and gather web research, all seamlessly integrated into a single workflow.

This approach allows you to build complex AI applications with ease, using MCP to connect different tools and LangChain for powerful orchestration. The possibilities are endless—whether you’re building a content generation system, a research assistant, or something entirely new, MCP and LangChain provide the flexibility and power to bring your vision to life.

MCP: Millions Of Creations are Possible

Thanks
Sreeni Ramadorai
