Sunil Kumar Dash · Composio


Building an open-source deep research agent from scratch using LlamaIndex, Composio, & ExaAI 🔎🔥

Deep Research is perhaps the best launch from OpenAI in a while; it is blowing everyone's mind, even people who had stopped using ChatGPT.

NearCyan on Deep research agent

It can research complex tasks in tens of minutes, work that would take humans hours or days.

However, it is currently only available to ChatGPT Pro users at $200/month, which is too expensive for most people. It will eventually come to Plus users, but severely rate-limited.

So, I decided to build an open-source version using:

  • LlamaIndex for agent building and orchestration,
  • Composio for integrating Google Docs and ExaAI with the agents, to research the web and create documents.

Here’s the demo video of the project

Open-source deep research agent


What is OpenAI Deep research?

From OpenAI

OpenAI Screenshot

Crazy, isn't it? If you've used it once, you know how good it is. Hence, I made an open-source version of it. It won't be as good as the OG one, since that is powered by an unreleased version of o3; we will use DeepSeek R1 instead to approximate the performance.

Challenges

The biggest challenge here is integration. We need a way to integrate Google Docs and ExaAI with LlamaIndex agents. Google Docs has an OAuth-based auth flow, while ExaAI has API Key authentication. Building integrations for these will take days.

Here comes Composio; with a few lines of code, you can integrate almost any SaaS app, including Google Workspace apps, Slack, GitHub, and more.

Here, we will only use Google Docs to write the final report and ExaAI to power the agents' internet searches.

What is Composio?

Composio allows developers to connect AI agents to third-party services. There is a range of over 250 integrations your AI agents can use to automate real-world tasks.

You don't have to worry about user authentication anymore; Composio handles it all without you breaking a sweat.

Start with Composio Now


Requirements and Dependencies

For this, you’d need

  • Groq API key - visit the official Groq site and create an API key (or get one from TogetherAI).
  • DeepSeek API key - visit the official DeepSeek site and create an API key.
  • Composio account - visit this URL and sign up using any method.

The project has two parts

  • Frontend - simple and lean. (You can do a better job than this, but it works.)
  • Backend - the agents are built and orchestrated using LlamaIndex, and integrations are added using Composio.

Let’s start with building the agent and adding the integrations.

Backend

Before getting started, here is the Replit repository you can refer to.

The backend code is in Python. So, create a virtual environment and install the dependencies.

python -m venv deep-research
cd deep-research
source bin/activate

Install these dependencies:

pip install composio-llamaindex email-validator flask flask-sqlalchemy \
    gunicorn llama-index-core llama-index-llms-groq llama-index-llms-openai \
    openai psycopg2-binary python-dotenv

Setup Composio

First, log in to your Composio account, or add your Composio API key to the COMPOSIO_API_KEY environment variable.

You can get the key from the dashboard settings.

composio login

Now add integrations for Exa and Google Docs.

composio add googledocs
composio add exa

Finish the authentication flow as displayed on your screen. Once it is done, you can see it in the dashboard.

Composio Dashboard API Key

Like this. (I have a ton of them; you will only see the ones you've integrated.)

Composio Dashboard

Now, you’re ready to code your agents. Make sure you’ve added API keys in a .env file.

DEEPSEEK_API_KEY=""
GROQ_API_KEY=""
COMPOSIO_API_KEY=""

Import the libraries, load the environment variable, and enable logging.

import os  # used later to read DEEPSEEK_API_KEY from the environment

from composio_llamaindex import ComposioToolSet, App, Action
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.llms import ChatMessage
from llama_index.llms.groq import Groq
from llama_index.llms.openai import OpenAI
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

Now, define the toolsets and add them to the agent.

def create_research_agent():
    # Initialize toolset and LLM
    toolset = ComposioToolSet()
    tools = toolset.get_tools(actions=[
        Action.EXA_SEARCH, Action.EXA_SIMILARLINK,
        Action.GOOGLEDOCS_CREATE_DOCUMENT
    ])

    #function_calling_llm = OpenAI(model="o1")
    function_calling_llm = Groq(model="deepseek-r1-distill-llama-70b")
    # Setup chatbot-style prefix messages
    prefix_messages = [
        ChatMessage(
            role="system",
            content=("""
                You are a sophisticated research assistant. Perform comprehensive research on the given query and provide detailed analysis. Focus on:
                - Key concepts and main ideas
                - Current developments and trends
                - Important stakeholders and their roles
                - Relevant data and statistics
                - Critical analysis and implications

                Create a detailed report on the research and write it in google docs. Return the google doc url as well. 

                Ensure all information is accurate, up-to-date, and properly sourced. Present findings in a clear, structured format suitable for professional analysis.
                """),
        )
    ]

    return FunctionCallingAgentWorker(
        tools=tools,  # type: ignore
        llm=function_calling_llm,
        prefix_messages=prefix_messages,
        max_function_calls=10,
        allow_parallel_tool_calls=False,
        verbose=True,
    ).as_agent()


Okay, let’s go step-by-step. First, we define the toolset with the required Actions.

  • EXA_SEARCH: executes queries against the Exa search service, returning a curated list of results based on the provided search criteria.
  • EXA_SIMILARLINK: performs a search with Exa to find similar links and retrieves a list of relevant results; the search can optionally return contents.
  • GOOGLEDOCS_CREATE_DOCUMENT: creates new documents.

Then, we define the model we'll be using. Depending on your bank balance, you can pick any model here. I'm going with the distilled DeepSeek Llama 70B, which is good enough for tool calling.

function_calling_llm = Groq(model="deepseek-r1-distill-llama-70b")

Next, we define the agent's prefix messages, which give the agent an idea of what we expect it to do.

prefix_messages = [
        ChatMessage(
            role="system",
            content=("""
                You are a sophisticated research assistant. Perform comprehensive research on the given query and provide detailed analysis. Focus on:
                - Key concepts and main ideas
                - Current developments and trends
                - Important stakeholders and their roles
                - Relevant data and statistics
                - Critical analysis and implications

                Create a detailed report on the research and write it in google docs. Return the google doc url as well. 

                Ensure all information is accurate, up-to-date, and properly sourced. Present findings in a clear, structured format suitable for professional analysis.
                """),
        )
    ]

Finally, define the agent with prefixes and toolsets.

return FunctionCallingAgentWorker(
        tools=tools,  # type: ignore
        llm=function_calling_llm,
        prefix_messages=prefix_messages,
        max_function_calls=10,
        allow_parallel_tool_calls=False,
        verbose=True,
    ).as_agent()

Next, define a function that generates questions from the query. This is an essential trait of the deep research agent.

def generate_questions(topic: str, domain: str) -> list[str]:
    """Generate questions about the research topic."""
    function_calling_llm = OpenAI(model="deepseek-reasoner", base_url="https://api.deepseek.com", api_key=os.environ["DEEPSEEK_API_KEY"])

    questions_prompt = f"""
    Generate 5-6 specific questions about the topic to help guide the research agent to research about the topic: {topic} and this is the domain: {domain}, so don't ask too complex probing questions, keep them relatively simple. Focus on:
    Mostly make these yes or no questions.
    Do not ask the user for information, you are supposed to help him/her with the research, you can't ask questions about the topic itself, 
    you can ask the user about what he wants to know about the topic and the domain.
    Format your response as a numbered list, with exactly one question per line.
    Example format:
    1. [First question]
    2. [Second question]
    """

    questions_response = function_calling_llm.complete(questions_prompt)
    # Clean up the response to ensure proper formatting
    cleaned_questions = [
        q.strip() for q in questions_response.text.strip().split('\n')
        if q.strip() and any(q.startswith(str(i)) for i in range(1, 7))
    ]

    return cleaned_questions


We've used DeepSeek R1 as the LLM inside this function, as it reasons better about which questions to ask.
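To see concretely what the cleanup step does, here is a small standalone demo of the same list-filtering logic, run on a made-up model response (the raw text is hypothetical):

```python
# Hypothetical raw LLM output, mimicking the numbered-list format the prompt asks for
raw = """Sure, here are some questions:
1. Do you want a historical overview of the topic?
2. Should the report focus on recent trends?

3. Are statistics and data points important to you?"""

# The same cleanup used in generate_questions: keep only non-empty lines
# that start with a number from 1 to 6
cleaned = [
    q.strip() for q in raw.strip().split("\n")
    if q.strip() and any(q.startswith(str(i)) for i in range(1, 7))
]

print(len(cleaned))  # → 3
```

Note that the filter also drops preamble lines like "Sure, here are some questions:" that models often prepend, along with blank lines.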

Add the following code to run the script from your CLI, without the frontend.

def chatbot():
    print("🤖: Hi! I can help you research any topics. Let's start!")

    # Get the main research topic
    topic = input("What topic would you like to research: ")
    domain = input('What domain is this topic in: ')
    # Generate and ask probing questions
    cleaned_questions = generate_questions(topic, domain)

    # Show all questions at once and collect one response
    print("\n🤖: Please consider these questions about your research needs:")
    print("\n".join(cleaned_questions))

    answer = input(
        "\nPlease provide your response addressing these questions: ")

    # Combine all information for research
    research_prompt = f"""
    Topic: {topic}
    Domain: {domain}

    User's Response to Questions:
    {answer}

    Please research this topic thoroughly and create a comprehensive report in Google Docs.
    """

    print(
        "\n🤖: Thank you! I'll now conduct the research and create a detailed report..."
    )
    agent = create_research_agent()
    res = agent.chat(research_prompt)
    print("\n🤖: Here's your research report:")
    print(res.response)

if __name__ == "__main__":
    chatbot()

Frontend

The frontend is very lean and straightforward: a simple chatbot UI that connects to the backend.

There are only three files: index.html, main.js, and style.css.

Let’s see main.js

document.addEventListener('DOMContentLoaded', function() {
    const chatForm = document.getElementById('chatForm');
    const batchInput = document.getElementById('batchInput');
    const chatMessages = document.getElementById('chatMessages');
    let currentMessageDiv = null;

We wrap the entire file in a DOMContentLoaded listener to ensure the code only runs once all the DOM elements are ready.

Then, grab key elements

  • chatForm: The form for sending messages.
  • batchInput: The text input field.
  • chatMessages: The container for chat messages.
  • currentMessageDiv: Keeps track of the latest updated assistant message.

marked.setOptions({
    breaks: true,
    gfm: true
});

This sets up the marked library to convert Markdown text to HTML.

function addMessage(content, type) {
    const messageDiv = document.createElement('div');
    messageDiv.classList.add('message', `${type}-message`);

    if (type === 'assistant') {
        messageDiv.innerHTML = marked.parse(content);
    } else {
        messageDiv.textContent = content;
    }

    chatMessages.appendChild(messageDiv);
    messageDiv.scrollIntoView({ behavior: 'smooth' });
    return messageDiv;
}

Creates a new message element, styles it according to the sender, and adds it to the chat. Assistant messages are converted from Markdown to HTML, while user messages show plain text.

function addLoadingIndicator() {
    const loadingDiv = document.createElement('div');
    loadingDiv.classList.add('message', 'assistant-message', 'loading');
    loadingDiv.innerHTML = `
        Generating ideas
        <div class="loading-dots">
            <span></span>
            <span></span>
            <span></span>
        </div>
    `;
    chatMessages.appendChild(loadingDiv);
    loadingDiv.scrollIntoView({ behavior: 'smooth' });
    return loadingDiv;
}

Displays a “loading” message with animated dots to show the assistant is generating a response.

chatForm.addEventListener('submit', async function(e) {
    e.preventDefault();

    const batch = batchInput.value.trim();
    if (!batch) return;

    addMessage(batch, 'user');
    batchInput.value = '';

    const loadingDiv = addLoadingIndicator();

    const eventSource = new EventSource(`/stream?batch=${encodeURIComponent(batch)}`);
    let accumulatedContent = '';

    eventSource.onmessage = function(event) {
        const data = JSON.parse(event.data);

        if (loadingDiv) loadingDiv.remove();

        if (data.error) {
            addMessage(data.error, 'error');
            eventSource.close();
            return;
        }

        if (data.content) {
            if (!currentMessageDiv) {
                currentMessageDiv = addMessage('', 'assistant');
            }
            accumulatedContent += data.content;
            currentMessageDiv.innerHTML = marked.parse(accumulatedContent);
            currentMessageDiv.scrollIntoView({ behavior: 'smooth' });
        }

        if (data.done) {
            currentMessageDiv = null;
            accumulatedContent = '';
            eventSource.close();
        }
    };

    eventSource.onerror = function() {
        if (loadingDiv) loadingDiv.remove();
        addMessage('Error connecting to server', 'error');
        eventSource.close();
    };
});
  • Stops the form from refreshing the page.
  • Gets the user’s text, shows it, and clears the input.
  • Shows a loading indicator.
  • Opens a Server-Sent Events (SSE) connection to get the assistant’s response in real-time.
  • As chunks of text arrive (the onmessage handler), it removes the loading indicator and updates the assistant's message, rendering it as Markdown.
  • If there’s an error, it shows an error message and closes the connection.
  • When the server signals it’s done (data.done), it resets everything and closes the connection.
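One thing worth noting: main.js opens an EventSource to a /stream endpoint, which isn't among the Flask routes shown in this post, so you would need to wire one up (or adapt main.js to call the /research endpoint instead). Whatever the route looks like, each SSE message must be a JSON object on a single data: line, using the content / done / error keys this handler parses. A minimal formatter for those payloads might look like this (the sse_event helper name is my own, not from the project):

```python
import json

def sse_event(payload: dict) -> str:
    """Format a payload as a Server-Sent Events message.

    main.js calls JSON.parse(event.data), so the whole JSON object
    must sit on one `data:` line, followed by a blank line that
    terminates the event.
    """
    return f"data: {json.dumps(payload)}\n\n"

# The three payload shapes the frontend handler understands:
print(sse_event({"content": "partial report text"}), end="")
print(sse_event({"error": "something went wrong"}), end="")
print(sse_event({"done": True}), end="")
```

A streaming Flask route would yield these strings from a generator with mimetype text/event-stream.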

Okay, that’s all for our chatbot user interface.

Do check out the HTML and CSS files and add them accordingly.

Serve the app

Finally, serve the app

import logging
from flask import Flask, render_template, Response, request, jsonify
import json
from agent import create_research_agent, generate_questions
import time
import os

# Configure logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/generate_questions', methods=['POST'])
def get_questions():
    data = request.get_json()
    if not data or 'topic' not in data or 'domain' not in data:
        return jsonify({'error': 'Topic and domain are required'}), 400

    # Check for required environment variables
    if not os.environ.get("DEEPSEEK_API_KEY"):
        return jsonify({'error': 'Deepseek API key is not configured. Please set up your API keys.'}), 500

    try:
        questions = generate_questions(data['topic'], data['domain'])
        return jsonify({'questions': questions})
    except Exception as e:
        logger.error(f"Error generating questions: {str(e)}")
        return jsonify({'error': 'An error occurred while generating questions. Please try again later.'}), 500

@app.route('/research', methods=['POST'])
def research():
    data = request.get_json()
    if not data or 'topic' not in data or 'domain' not in data or 'answers' not in data:
        return jsonify({'error': 'Topic, domain and answers are required'}), 400

    # Check for required environment variables
    if not os.environ.get("GROQ_API_KEY"):
        return jsonify({'error': 'Groq API key is not configured. Please set up your API keys.'}), 500

    if not os.environ.get("COMPOSIO_API_KEY"):
        return jsonify({'error': 'Composio API key is not configured. Please set up your API keys.'}), 500

    try:
        agent = create_research_agent()
        research_prompt = f"""
        Topic: {data['topic']}
        Domain: {data['domain']}

        User's Response to Questions:
        {data['answers']}

        Please research this topic thoroughly and create a comprehensive report in Google Docs.
        """

        response = agent.chat(research_prompt)
        return jsonify({'content': str(response.response)})
    except Exception as e:
        logger.error(f"Error in research: {str(e)}")
        return jsonify({'error': 'An error occurred while conducting research. Please try again later.'}), 500

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)

Now run main.py, and you're sorted. (And pray it works without errors.)

Open-source Deep Research Agent


That is all. The complete code is in this Replit repository.


Thank you for reading. Do tell me about your experience building with Composio. We're consistently working on improving the product.

Top comments (2)

Anmol Baranwal

Impressive work Sunil. 🔥 I think Deep research is very similar to co-agents (not really sure), as both work independently without manual instructions. Such as:

GitHub logo assafelovic / gpt-researcher

LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.

🔎 GPT Researcher

GPT Researcher is an autonomous agent designed for comprehensive web and local research on any given task.

The agent produces detailed, factual, and unbiased research reports with citations. GPT Researcher provides a full suite of customization options to create tailor made and domain specific research agents. Inspired by the recent Plan-and-Solve and RAG papers, GPT Researcher addresses misinformation, speed, determinism, and reliability by offering stable performance and increased speed through parallelized agent work.

Our mission is to empower individuals and organizations with accurate, unbiased, and factual information through AI.

Why GPT Researcher?

  • Objective conclusions for manual research can take weeks, requiring vast resources and time.
  • LLMs trained on outdated information can hallucinate, becoming irrelevant for current research tasks.
  • Current LLMs have token limitations, insufficient for generating long research reports.
  • Limited web sources in existing services lead to misinformation and shallow…
Sunil Kumar Dash

It's an agent, but it is probably the best agent in production right now. They are using an unreleased o3 optimised for web browsing and analysis, and that's where the secret lies. OpenAI nailing consumer AI like no other, perks of having the best brain in the world.