Kojiro Yanoᯅ / VR Educator

How to Use LlamaIndex.TS on Vercel

Disclaimer

This is an English translation of my original Japanese post on Qiita (https://qiita.com/yanosen_jp/items/db6e6f6d066d6bd06c68).

Introduction

LlamaIndex (originally written in Python) is an open-source data framework for connecting LLMs to your own data. LlamaIndex.TS brings that functionality to JavaScript/TypeScript environments, from Node.js and Deno to the Vercel Edge Runtime. It is aimed at server-side use (browsers lack some of the APIs it relies on), and it offers a lightweight, low-latency way to work with LLMs.

However, not all features from the Python version are available, and deploying in a serverless environment like Vercel can present additional challenges. On the flip side, once set up correctly, it’s incredibly responsive. Vercel even offers an AI SDK that integrates seamlessly with LlamaIndex.

In this post, we’ll look at the basic steps and considerations for using LlamaIndex.TS on Vercel. As I’m also relatively new to this, feel free to leave a comment with any corrections or feedback!


Key Resources

Before you begin, I recommend reviewing the official LlamaIndex.TS documentation and the Vercel AI SDK documentation.


Basic Setup

In this tutorial, we’ll be using Next.js and Vercel. Below are the commands to get started in your project’s root directory:

# Initialize a Next.js app
npx create-next-app@latest .

# Initialize a Vercel project
vercel

# (Optional) Install and use Node.js 22 via nvm
nvm install 22
nvm use 22

# Install LlamaIndex.TS core
npm install llamaindex
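# Depending on your llamaindex version, the OpenAI integration may ship as a separate
# provider package (used by the "@llamaindex/openai" import in the helper example below)
npm install @llamaindex/openai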

TypeScript Configuration

Make sure your tsconfig.json has the following setting:

{
  "compilerOptions": {
    // ...
    "moduleResolution": "bundler"
    // ...
  }
}

This is required so TypeScript can properly handle ESM, CJS, and conditional exports in a serverless environment. See the official note for more details.

Switching to next.config.mjs

Following the official setup instructions, you should delete next.config.ts and create a next.config.mjs instead:

import withLlamaIndex from "llamaindex/next";

/** @type {import('next').NextConfig} */
const nextConfig = {};

export default withLlamaIndex(nextConfig);

This helps Next.js adapt to ECMAScript Modules and avoids version conflicts.
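If your project already sets Next.js options, they simply live inside nextConfig before it is wrapped. Here is a minimal sketch; the reactStrictMode flag is only an illustration, not something the setup requires:

import withLlamaIndex from "llamaindex/next";

/** @type {import('next').NextConfig} */
const nextConfig = {
  // Your existing Next.js options go here; withLlamaIndex only augments them.
  reactStrictMode: true,
};

export default withLlamaIndex(nextConfig);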


Example: Using the OpenAI Chat Model

Below is a simplified version of the official OpenAI example for GPT-4o-mini.

1. Helper Module

Create a helper module for OpenAI calls (e.g., lib/openai.ts). This centralizes your OpenAI settings.

import { OpenAI } from "@llamaindex/openai";

export async function getChatResponse(message: string) {
  const llm = new OpenAI({ model: "gpt-4o-mini", temperature: 0.1 });
  const response = await llm.chat({
    messages: [{ content: message, role: "user" }],
  });
  return response.message.content;
}
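The chat() call accepts a list of messages, so you can prepend a system prompt to steer the model. Below is a small optional variation on the helper above; the function name and prompt text are my own examples, not part of the official sample:

import { OpenAI } from "@llamaindex/openai";

export async function getChatResponseWithSystemPrompt(message: string) {
  const llm = new OpenAI({ model: "gpt-4o-mini", temperature: 0.1 });
  const response = await llm.chat({
    messages: [
      // Example system prompt; adjust it to your use case
      { content: "You are a concise assistant. Answer in English.", role: "system" },
      { content: message, role: "user" },
    ],
  });
  return response.message.content;
}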

2. Backend API Endpoint

Set up an API route in app/api/openai/route.ts to handle requests and call your helper function:

import { NextResponse } from 'next/server';
import { getChatResponse } from '@/lib/openai';

export async function POST(request: Request) {
  try {
    const { content } = await request.json();
    const response = await getChatResponse(content);
    return NextResponse.json({ response });
  } catch (error) {
    console.error('API Error:', error);
    return NextResponse.json({ error: 'Error occurred' }, { status: 500 });
  }
}
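With the route in place, you can exercise it directly before building any UI, for example with curl against the local dev server (assuming the default port 3000):

curl -X POST http://localhost:3000/api/openai \
  -H "Content-Type: application/json" \
  -d '{"content": "Hello! What can you do?"}'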

3. Frontend UI Component

Finally, create a simple UI for testing in app/page.tsx:

'use client';

import { useState } from 'react';

export default function Home() {
  const [input, setInput] = useState('');
  const [response, setResponse] = useState('');

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    try {
      const res = await fetch('/api/openai', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ type: 'chat', content: input }),
      });
      const data = await res.json();
      setResponse(data.response);
    } catch (error) {
      console.error('Client Error:', error);
      setResponse('An error occurred.');
    }
  };

  return (
    <div className="p-8">
      <form onSubmit={handleSubmit}>
        <textarea
          value={input}
          onChange={(e) => setInput(e.target.value)}
          className="w-full p-2 border rounded"
        />
        <button
          type="submit"
          className="mt-2 px-4 py-2 bg-blue-500 text-white rounded"
        >
          Submit
        </button>
      </form>

      {response && (
        <div className="mt-4 p-4 bg-gray-100 rounded">
          {response}
        </div>
      )}
    </div>
  );
}

4. Environment Variables

In your local environment, make sure to place your OpenAI API key in .env.local:

OPENAI_API_KEY=sk-xxxx...

Also configure the key in the Vercel dashboard under Settings → Environment Variables so your deployed functions can access it.
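If you add the key in the Vercel dashboard first, the Vercel CLI can also pull it into your local file, which keeps the two environments in sync (optional):

vercel env pull .env.local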

5. Test Locally and Deploy

Run locally:

npm run dev

Check http://localhost:3000. If everything looks good, deploy to Vercel:

vercel deploy
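
# Note: "vercel deploy" creates a preview deployment; add --prod to ship to production
vercel deploy --prod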

Example: Using the Vercel AI SDK

Vercel provides an AI SDK that simplifies working with large language model responses, including streaming out tokens as they come in.

1. Install the SDK

npm install ai
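
# On AI SDK 4 or later, the React hooks (including useCompletion) live in a separate
# package and are imported from "@ai-sdk/react" instead of "ai/react"
npm install @ai-sdk/react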

2. Create the API Route

Below is an example using the LlamaIndex adapter. It closely follows the official example, with an extra console.log so you can confirm the route is being hit:

// app/api/aisdk/route.ts
import { OpenAI, SimpleChatEngine } from 'llamaindex';
import { LlamaIndexAdapter } from 'ai';

export const maxDuration = 60; // optional: allow the function to run for up to 60 seconds

export async function POST(req: Request) {
  console.log('ai sdk route');

  const { prompt } = await req.json();

  const llm = new OpenAI({ model: 'gpt-4o-mini' });
  const chatEngine = new SimpleChatEngine({ llm });

  const stream = await chatEngine.chat({
    message: prompt,
    stream: true,
  });

  return LlamaIndexAdapter.toDataStreamResponse(stream);
}

3. Frontend with useCompletion

In your page.tsx, you can significantly reduce boilerplate by using the useCompletion hook, which handles streaming:

'use client';

import { useCompletion } from 'ai/react';

export default function Home() {
  const { completion, input, handleInputChange, handleSubmit } = useCompletion({
    api: '/api/aisdk',
  });

  return (
    <div className="p-8">
      <form onSubmit={handleSubmit}>
        <textarea
          value={input}
          onChange={handleInputChange}
          className="w-full p-2 border rounded"
        />
        <button type="submit" className="mt-2 px-4 py-2 bg-blue-500 text-white rounded">
          Submit
        </button>
      </form>

      {completion && (
        <div className="mt-4 p-4 bg-gray-100 rounded">
          {completion}
        </div>
      )}
    </div>
  );
}

Because of streaming, you’ll see the answer come in gradually rather than in one chunk.
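
useCompletion also returns streaming state such as isLoading (plus a stop() helper), which you can use to disable the button while a response is in flight. A small optional tweak to the component above, shown as fragments:

// Destructure isLoading alongside the other values
const { completion, input, handleInputChange, handleSubmit, isLoading } = useCompletion({
  api: '/api/aisdk',
});

// ...and disable the submit button while the completion is streaming
<button
  type="submit"
  disabled={isLoading}
  className="mt-2 px-4 py-2 bg-blue-500 text-white rounded"
>
  {isLoading ? 'Streaming…' : 'Submit'}
</button>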


Wrapping Up

LlamaIndex.TS has less documentation and fewer examples compared to the Python version, but it builds quickly and feels snappy in a serverless environment. With Vercel’s AI SDK, you also get streaming out of the box, which makes for a better user experience.

If you run into issues or notice any mistakes here, please leave a comment. Let’s explore and improve on LlamaIndex.TS together!

Happy coding!
