What you will find in this article
AI is becoming part of every aspect of our lives, especially at work. It can help us understand documents better, find the right information faster, and collaborate more effectively.
In this article, we will build a powerful AI assistant that lets you chat with your documents, ask questions, and get answers. We will use Next.js, the vercel/ai package, and OpenAI to build this application.
Papermark - the open-source DocSend alternative.
Before we kick it off, let me share Papermark with you. It's an open-source alternative to DocSend that helps you securely share documents and get real-time page-by-page analytics from viewers. Of course, an AI document assistant is included. And it's all open-source!
I would be grateful if you could give us a star! Don't forget to share your thoughts in the comments section ❤️
https://github.com/mfts/papermark
Set up the project
Let's set up our project environment. We will create a Next.js app, install Vercel's AI package, and configure OpenAI.
Set up Next.js with TypeScript and Tailwind CSS
We will use create-next-app to generate a new Next.js project. We will also be using TypeScript and Tailwind CSS, so make sure to select those options when prompted.
```shell
npx create-next-app

# ---
# you'll be asked the following prompts
What is your project named? my-app
Would you like to add TypeScript with this project? Y/N
# select `Y` for TypeScript
Would you like to use ESLint with this project? Y/N
# select `Y` for ESLint
Would you like to use Tailwind CSS with this project? Y/N
# select `Y` for Tailwind CSS
Would you like to use the `src/ directory` with this project? Y/N
# select `N` for `src/` directory
What import alias would you like configured? `@/*`
# enter `@/*` for import alias
```
Install Vercel's AI package
Next, we will install Vercel's AI package. It provides convenient, type-safe abstractions over the OpenAI API (and other LLM APIs) and makes it easy to use them in a serverless environment, including streaming chat responses. We'll also install the official openai package, which our API route uses directly:

```shell
npm install ai openai
```
I have to admit, ai is a pretty epic package name! 🎉
Set up OpenAI
If you haven't done so, create an account on OpenAI. Once you have created an account, you will need to create an API key on platform.openai.com. You can find your API key on the dashboard. We'll need that for later.
Building the application
Now that we have our setup in place, we are ready to start building our application. The main features we'll cover are:
- Configure the OpenAI Assistant API
- Creating a chat interface
#1 Configure the OpenAI Assistant API
Let's start by configuring the OpenAI Assistant API on the OpenAI platform.
Create a new assistant from the dashboard
Give it a name, an instruction prompt, a model (currently it has to be gpt-4-1106-preview), and make sure you enable Retrieval.
Finally, add a file to the assistant. This file will be the initial document to chat with. Let's upload a PDF file.
Get the Assistant ID
When saving the assistant, you will find the assistant ID in the dashboard or below the name of the assistant. We will need this ID to configure the API.
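Assistant IDs issued by OpenAI start with the `asst_` prefix, so a quick sanity check when loading the environment variable can catch copy-paste mistakes early. Here's a minimal sketch; the helper name is our own, not part of any SDK:

```typescript
// Assistant IDs issued by OpenAI start with the "asst_" prefix.
// This is a lightweight sanity check, not a guarantee the ID exists.
function isLikelyAssistantId(id: string | undefined): id is string {
  return typeof id === "string" && id.startsWith("asst_");
}
```

You could call this with `process.env.ASSISTANT_ID` at startup and fail fast with a clear message instead of a cryptic API error later.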
#2 Creating a chat interface for your document
Now that we have our assistant configured, let's create a chat interface to interact with it. We will be using the useAssistant hook (currently in beta) from the ai package to interact with the API.
Let's create the chat interface in app/page.tsx:
```tsx
// app/page.tsx
"use client";

import { Message, experimental_useAssistant as useAssistant } from "ai/react";
import { useEffect, useRef } from "react";

const roleToColorMap: Record<Message["role"], string> = {
  system: "red",
  user: "black",
  assistant: "green",
  data: "blue",
};

export default function Chat() {
  const { status, messages, input, submitMessage, handleInputChange, error } =
    useAssistant({
      api: "/api/assistant",
    });

  // When status changes to accepting messages, focus the input:
  const inputRef = useRef<HTMLInputElement>(null);
  useEffect(() => {
    if (status === "awaiting_message") {
      inputRef.current?.focus();
    }
  }, [status]);

  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {error != null && (
        <div className="relative bg-red-500 text-white px-6 py-4 rounded-md">
          <span className="block sm:inline">
            Error: {(error as any).toString()}
          </span>
        </div>
      )}
      {messages.map((m: Message) => (
        <div
          key={m.id}
          className="whitespace-pre-wrap"
          style={{ color: roleToColorMap[m.role] }}>
          <strong>{`${m.role}: `}</strong>
          {m.role !== "data" && m.content}
          {m.role === "data" && (
            <>
              {(m.data as any).description}
              <br />
              <pre className={"bg-gray-200"}>
                {JSON.stringify(m.data, null, 2)}
              </pre>
            </>
          )}
          <br />
          <br />
        </div>
      ))}
      {status === "in_progress" && (
        <div className="h-8 w-full max-w-md p-2 mb-8 bg-gray-300 dark:bg-gray-600 rounded-lg animate-pulse" />
      )}
      <form onSubmit={submitMessage}>
        <input
          ref={inputRef}
          disabled={status !== "awaiting_message"}
          className="fixed bottom-0 w-full max-w-md p-2 mb-8 border border-gray-300 rounded shadow-xl"
          value={input}
          placeholder="Ask a question about your document"
          onChange={handleInputChange}
        />
      </form>
    </div>
  );
}
```
And the corresponding API route in app/api/assistant/route.ts:
```ts
// app/api/assistant/route.ts
import { experimental_AssistantResponse } from "ai";
import OpenAI from "openai";
import { MessageContentText } from "openai/resources/beta/threads/messages/messages";
import { Run } from "openai/resources/beta/threads/runs/runs";

// Create an OpenAI API client (that's edge friendly!)
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY || "",
});

// IMPORTANT! Set the runtime to edge
export const runtime = "edge";

export async function POST(req: Request) {
  // Parse the request body
  const input: {
    threadId: string | null;
    message: string;
  } = await req.json();

  // Create a thread if needed
  const threadId = input.threadId ?? (await openai.beta.threads.create({})).id;

  // Add a message to the thread
  const createdMessage = await openai.beta.threads.messages.create(threadId, {
    role: "user",
    content: input.message,
  });

  return experimental_AssistantResponse(
    { threadId, messageId: createdMessage.id },
    async ({ threadId, sendMessage }) => {
      // Run the assistant on the thread
      const run = await openai.beta.threads.runs.create(threadId, {
        assistant_id:
          process.env.ASSISTANT_ID ??
          (() => {
            throw new Error("ASSISTANT_ID is not set");
          })(),
      });

      async function waitForRun(run: Run) {
        // Poll for status change
        while (run.status === "queued" || run.status === "in_progress") {
          // delay for 500ms:
          await new Promise((resolve) => setTimeout(resolve, 500));
          run = await openai.beta.threads.runs.retrieve(threadId!, run.id);
        }

        // Check the run status
        if (
          run.status === "cancelled" ||
          run.status === "cancelling" ||
          run.status === "failed" ||
          run.status === "expired"
        ) {
          throw new Error(run.status);
        }
      }

      await waitForRun(run);

      // Get new thread messages (after our message)
      const responseMessages = (
        await openai.beta.threads.messages.list(threadId, {
          after: createdMessage.id,
          order: "asc",
        })
      ).data;

      // Send the messages
      for (const message of responseMessages) {
        sendMessage({
          id: message.id,
          role: "assistant",
          content: message.content.filter(
            (content) => content.type === "text"
          ) as Array<MessageContentText>,
        });
      }
    }
  );
}
```
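The waitForRun helper in the route polls the run status every 500 ms until it leaves the queued/in_progress states. The same pattern can be factored into a standalone, testable helper by injecting the retrieval call as a callback. This is a sketch of the idea; the names are ours, not from the SDK:

```typescript
type RunStatus =
  | "queued"
  | "in_progress"
  | "completed"
  | "requires_action"
  | "cancelling"
  | "cancelled"
  | "failed"
  | "expired";

// Poll `retrieve` until the status leaves the pending states,
// waiting `intervalMs` between attempts.
async function pollUntilSettled(
  retrieve: () => Promise<RunStatus>,
  intervalMs = 500
): Promise<RunStatus> {
  let status = await retrieve();
  while (status === "queued" || status === "in_progress") {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    status = await retrieve();
  }
  return status;
}
```

Separating the polling loop from the OpenAI client makes it trivial to unit-test with a stubbed `retrieve` function, and gives you one place to later add a timeout or exponential backoff.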
And finally, add the OPENAI_API_KEY and ASSISTANT_ID from earlier to your environment variables:
```shell
# .env
OPENAI_API_KEY=<your-openai-api-key>
ASSISTANT_ID=<your-assistant-id>
```
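The route expects a JSON body of the shape `{ threadId, message }`. If you ever call it from somewhere other than the useAssistant hook, a small validator lets you fail fast on malformed input. This is a hypothetical helper, not something the ai package provides:

```typescript
interface AssistantRequest {
  threadId: string | null;
  message: string;
}

// Narrow an unknown JSON body to the shape the /api/assistant route expects.
function parseAssistantRequest(body: unknown): AssistantRequest {
  if (typeof body !== "object" || body === null) {
    throw new Error("body must be a JSON object");
  }
  const { threadId, message } = body as Record<string, unknown>;
  if (threadId !== null && typeof threadId !== "string") {
    throw new Error("threadId must be a string or null");
  }
  if (typeof message !== "string" || message.length === 0) {
    throw new Error("message must be a non-empty string");
  }
  return { threadId: threadId as string | null, message };
}
```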
Bonus: Add a new document to the assistant
If you want to add new documents to the assistant, you need to upload them to the files endpoint of the OpenAI API. It's important that you set the purpose of the file to assistants so that it can be used by the assistant.
Add the following backend code somewhere where you have access to uploaded files:
```ts
// ...
// Upload the file to OpenAI
const fileId = (
  await openai.files.create({
    file: await fetch(url_to_file), // the `file` variable accepts a File, Buffer or ReadableStream
    purpose: "assistants",
  })
).id;
// ...
```
Then, you need to give your assistant access to the file you have uploaded. You can do that by adding the file to the assistant:
```ts
// ...
// Add the file to the assistant
await openai.beta.assistants.files.create(assistantId, {
  file_id: fileId,
});
// ...
```
And that's it! You can now upload new documents to your assistant.
Conclusion
Congratulations! You have built a powerful AI assistant that lets you chat with your document.
Thank you for reading. I am Marc, an open-source advocate. I am building papermark.io - the open-source alternative to DocSend.
Have fun building!
Help me out!
If you found this article helpful and got to understand OpenAI's Assistant API, the vercel/ai package, and Next.js, I would be grateful if you could give us a star! And don't forget to share your thoughts in the comments ❤️