DEV Community

shadowb
shadowb

Posted on

Building a Simple Chatbot with Llama2 [Chat with Excel]

In this post, I’ll explain how I built a chatbot using the Llama2 model to query Excel data intelligently.

Image description

What We’re Building

  1. Loads an Excel file.
  2. Splits the data into manageable chunks.
  3. Stores the data in a vector database for fast retrieval.
  4. Use a local Llama2 model to answer questions based on the content of the Excel file.

Prerequisites:

Python (≥ 3.8)
Libraries: langchain, pandas, unstructured, Chroma

Step 1: Install Dependencies

%pip install -q unstructured langchain
%pip install -q "unstructured[all-docs]"
Enter fullscreen mode Exit fullscreen mode

Step 2: Load the Excel File

import pandas as pd

excel_path = "Book2.xlsx"
if excel_path:
    df = pd.read_excel(excel_path)
    data = df.to_string(index=False)
else:
    print("Upload an Excel file")

Enter fullscreen mode Exit fullscreen mode

Step 3: Chunk the Data and Store in a Vector Database

Large text data is split into smaller, overlapping chunks for effective embedding and querying. These chunks are stored in a Chroma vector database.

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

text_splitter = RecursiveCharacterTextSplitter(chunk_size=7500, chunk_overlap=100)
chunks = text_splitter.split_text(data)

embedding_model = OllamaEmbeddings(model="nomic-embed-text", show_progress=False)
vector_db = Chroma.from_texts(
    texts=chunks, 
    embedding=embedding_model,
    collection_name="local-rag"
)

Enter fullscreen mode Exit fullscreen mode

Step 4: Initialize the Llama2 Model

We use ChatOllama to load the Llama2 model locally.

from langchain_community.chat_models import ChatOllama

local_model = "llama2"
llm = ChatOllama(model=local_model)

Enter fullscreen mode Exit fullscreen mode

Step 5: Create a Query Prompt

The chatbot will respond based on specific column names from the Excel file. We create a prompt template to guide the model

from langchain.prompts import PromptTemplate

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI assistant. Answer the user's questions based on the column names: 
    Id, order_id, name, sales, refund, and status. Original question: {question}"""
)
Enter fullscreen mode Exit fullscreen mode

Step 6: Set Up the Retriever

We configure a retriever to fetch relevant chunks from the vector database, which will be used by the Llama2 model to answer questions.

from langchain.retrievers.multi_query import MultiQueryRetriever

retriever = MultiQueryRetriever.from_llm(
    vector_db.as_retriever(), 
    llm,
    prompt=QUERY_PROMPT
)

Enter fullscreen mode Exit fullscreen mode

Step 7: Build the Response Chain

The response chain integrates:

  1. A retriever to fetch context.
  2. A prompt to format the question and context.
  3. The Llama2 model to generate answers.
  4. An output parser to format the response.
from langchain.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

Enter fullscreen mode Exit fullscreen mode

Step 8: Ask a Question

Now we’re ready to ask a question! Here’s how we invoke the chain to get a response:

raw_result = chain.invoke("How many rows are there?")
final_result = f"{raw_result}\n\nIf you have more questions, feel free to ask!"
print(final_result)

Enter fullscreen mode Exit fullscreen mode

Sample Output

When I ran the above code on a sample Excel file, here’s what I got:

Based on the provided context, there are 10 rows in the table.
If you have more questions, feel free to ask!

Enter fullscreen mode Exit fullscreen mode

Conclusion:

This approach leverages the power of embeddings and the Llama2 model to create a smart, interactive chatbot for Excel data. With some tweaks, you can extend this to work with other types of documents or integrate it into a full-fledged app!

Check working example with UI on my LinkedIn:

Introducing BChat Excel: A Conversational AI-Powered Tool for Excel File Interactions

Top comments (1)

Collapse
 
vinayak_mishra_147ff0a039 profile image
Vinayak Mishra

Hey folks, came across this post and thought it might be helpful for you! Rag Evaluation Metrics.