Asaolu Elijah 🧙‍♂️

How to Integrate Ollama in Next.js

There’s no ignoring how AI and large language models (LLMs) have dominated conversations lately. Nearly every week, new models are released and new integrations (e.g., AI agents and operators) are developed. It’s hard not to want to jump on this exciting train.

In this article, we’ll explore how to integrate Ollama with Next.js to build LLM-powered applications. We’ll cover how to download and interact with open-source LLMs (such as Llama, DeepSeek, and Mistral), as well as how to send messages and process LLM responses in Next.js using ollama-js. Our final application will look like the image shown below.

Screenshot of a chat session with Ollama via a Next.js app

Prerequisites

To follow along, you should have a basic understanding of Next.js and how LLMs work behind the scenes.

What is Ollama?

Ollama is an open-source framework created by Michael Chiang and Jeffrey Morgan to simplify LLM integration with other applications. Ollama itself is not an LLM; instead, it allows you to download LLMs, interact with them, and create custom models. It also provides pre-configured API endpoints through which you can integrate downloaded or customized LLMs into other applications.

It’s also worth mentioning that Ollama should not be confused with Llama (without the "O"). Llama is an open-source LLM developed by Meta AI. In contrast, Ollama is an independent open-source tool that enables integration with various LLMs. With that clarified, let’s explore these integrations in practice.

Setting up Ollama

To get started, download Ollama from its official website. If you’re using a Unix/Linux distribution, you can also install it by running the following command.

curl -fsSL https://ollama.com/install.sh | sh

Once you’ve successfully downloaded and installed Ollama, run the following command to verify that it’s available from your terminal/command line.

ollama -v

If everything went well, you should see the version number printed in your console. Next, let’s install and interact with our first model.

Pull and run a model

As mentioned previously, Ollama provides access to multiple open-source models, including DeepSeek, Llama, Mistral, Qwen, and Phi. You can view the full list of supported models on the Ollama website.

Popular LLM models list on Ollama website

From the model list, Llama 3.2 (by Meta AI) presently stands out as a small-sized model with impressive benchmark stats. It is available in two variants: 1B parameters (1.3GB) and 3B parameters (2GB). For this tutorial, we’ll use the Llama 3.2 1B model, as its smaller size makes it more manageable. To proceed, download the model by running:

ollama run llama3.2:1b

This command will pull the Llama 3.2 (1B parameters) model from the Ollama registry. Once the download is complete, the model will launch, and you can start chatting with it right away, as shown in the screenshot below.

Screenshot of a chat with llama3.2 on the terminal
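
If you’d rather download a model without immediately starting a chat session, or check which models are already on your machine, the Ollama CLI also provides pull and list commands. Here’s a quick reference, reusing the same llama3.2:1b tag we chose above:

ollama pull llama3.2:1b   # download (or update) the model without opening a chat
ollama list               # show locally available models and their sizes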

Feel free to spend some time interacting with the LLM directly from your terminal. Once you're ready, let’s dive into bringing it into a Next.js application.

Interact with LLMs in Next.js with ollama-js

Ollama provides REST API endpoints that allow you to interact with downloaded models. Once the desktop app is running, it exposes an endpoint at http://localhost:11434/api, where you can send custom HTTP requests to perform interactions.
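
If you want to quickly confirm that the local server is reachable before sending any prompts, the API exposes a version endpoint you can hit as a sanity check (this assumes Ollama is running on its default port, 11434):

curl http://localhost:11434/api/version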

For example, to generate an answer to the question "What is Exorcism?", we can send a POST request to the /api/generate endpoint, specifying our preferred model and question, as shown below:

curl http://localhost:11434/api/generate -d '{
"model": "llama3.2",
"prompt": "What is Exorcism?"
}'

You should then see the response objects streamed to your command line or terminal in real-time:

Llama3.2 streaming response to the terminal
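
By default, /api/generate streams the answer back as a series of JSON objects, one chunk at a time. If you’d rather receive a single JSON payload (handy for quick tests), the API also accepts a stream flag that you can set to false. Here’s a minimal variation of the request above, using the exact tag we pulled earlier:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "What is Exorcism?",
  "stream": false
}'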

To simplify the process, the Ollama team created a JavaScript library (ollama-js) that provides ready-made methods for calling these endpoints and performing different tasks. We’ll leverage this library in our application.

To proceed, create a new Next.js app by running the following command:

npx create-next-app next-ollama

During the installation prompts, select Tailwind CSS and opt for the Pages Router (answer "No" when asked whether to use the App Router) to ensure consistency with this tutorial. All other configuration options can be customized to your preference.

Once your new Next.js app is successfully created, navigate to the project directory and install ollama-js by running the following commands:

cd next-ollama
npm install ollama

Next, let’s create a simple chat interface to interact with the LLM we installed earlier. Open the default pages/index.js file and replace its contents with the following code:

import { useState } from "react";

const Index = () => {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState("");
  const [isLoading, setIsLoading] = useState(false);

  const handleSubmit = async (e) => {
    e.preventDefault();

    const userMessage = { role: "user", content: input };
    setMessages((prevMessages) => [...prevMessages, userMessage]);
    setInput("");

    /* ====
    ⚠️ Main logic to be added later
    ==== */
  };

  return (
    <div className="max-w-2xl mx-auto p-4">
      <div className="bg-white min-h-[400px] p-4 mb-4 rounded-xl border-2 border-gray-500 overflow-y-auto">
        {messages.map((message, index) => (
          <div
            key={index}
            className={`mb-4 ${
              message.role === "user" ? "text-right" : "text-left"
            }`}
          >
            <div
              className={`inline-block p-3 rounded-2xl max-w-[80%] ${
                message.role === "user"
                  ? "bg-blue-600 text-white"
                  : "bg-gray-100 text-gray-800"
              }`}
            >
              {message.content}
            </div>
          </div>
        ))}
        {isLoading && (
          <div className="text-left">
            <div className="inline-block p-3 rounded-2xl bg-gray-100 text-gray-400">
              Thinking...
            </div>
          </div>
        )}
      </div>
      <form onSubmit={handleSubmit} className="flex gap-3">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          className="flex-1 p-3 rounded-xl border-2 border-gray-500"
          placeholder="Type your message..."
        />
        <button
          type="submit"
          className="bg-blue-600 text-white px-6 py-3 rounded-xl hover:bg-blue-700 transition-colors font-medium"
        >
          Send
        </button>
      </form>
    </div>
  );
};

export default Index;

The code above creates a basic chat interface that allows us to send messages and receive responses. It also includes a handleSubmit() function, which doesn’t do much at this point, but this is where we’ll add the logic to process our messages with Ollama.

Start your application by running:

npm run dev

You should then see a chat interface similar to the image below.

A simple chat interface

Now, let’s create an API endpoint to send our message to Llama 3.2 via Ollama and process the response. To do this, create a new chat.js file inside the default pages/api/ directory and paste the following code into it.

import ollama from "ollama";

export default async function handler(req, res) {
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Method not allowed" });
  }

  try {
    const { message } = req.body;

    const response = await ollama.chat({
      model: "llama3.2",
      messages: [{ role: "user", content: message }],
    });

    return res.status(200).json({ message: response.message.content });
  } catch (error) {
    console.error("Ollama API error:", error);
    return res.status(500).json({ error: "Failed to get response from LLM" });
  }
}

Here, we imported Ollama and created a Next.js API endpoint that accepts POST requests with a message in the request body. We then called Ollama’s .chat() method, specifying the LLM model we installed earlier (llama3.2:1b) and passing the received message as part of the request object.
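
Before wiring up the frontend, you can sanity-check this route on its own. With the dev server running (npm run dev, as shown earlier), a request like the one below should return a JSON body containing the model’s reply; the prompt is just a placeholder, and port 3000 assumes the Next.js default:

curl http://localhost:3000/api/chat \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"message": "Say hello in one sentence."}'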

To complete our integration, let’s update the main chat interface to communicate with the newly created endpoint. Open the pages/index.js file and replace the existing handleSubmit() function with the code below.

 const handleSubmit = async (e) => {
    e.preventDefault();
    if (!input.trim()) return;

    const userMessage = { role: "user", content: input };
    setMessages((prevMessages) => [...prevMessages, userMessage]);
    setInput("");
    setIsLoading(true);

    try {
      const response = await fetch("/api/chat", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ message: input }),
      });

      if (!response.ok) {
        throw new Error("Failed to get response");
      }

      const data = await response.json();
      const aiMessage = {
        role: "assistant",
        content: data.message,
      };
      setMessages((prevMessages) => [...prevMessages, aiMessage]);
    } catch (error) {
      console.error("Error:", error);
      const errorMessage = {
        role: "assistant",
        content: "Sorry, I encountered an error while processing your request.",
      };
      setMessages((prevMessages) => [...prevMessages, errorMessage]);
    } finally {
      setIsLoading(false);
    }
  };

At this stage, your pages/index.js file should match the one provided here. Finally, start your app by running:

npm run dev

Go to the URL displayed in your console (typically http://localhost:3000/), and you should now be able to chat with the Llama 3.2 LLM directly from your Next.js app, as shown in the image below.

Screenshot of a chat session with Ollama via a Next.js app

And we’re done! You can now start interacting with a locally downloaded version of Llama 3.2 directly from your Next.js application. You can also pull other open-source models and interact with them.

Conclusion

In this tutorial, we covered how to download and run open-source large language models (LLMs) with Ollama, as well as how to use the ollama-js library to integrate these LLMs into a Next.js application. You can find the complete code used in this tutorial in this GitHub repository.

In an upcoming follow-up article, we’ll explore how to stream responses and display them in real-time instead of waiting for the final computed response. We’ll also cover how to make the LLM remember our previous conversations, along with other ollama-js methods and community integrations. In the meantime, enjoy chatting with your custom-built LLM friend!

Thanks for reading!

Top comments (2)

Ravi Choudhary

Thanks a lot! I installed as you instructed above, and I was able to run it :)
Although the llama3.2:1b model was installed, I had to struggle a bit before finally updating the exact model+tag to use it correctly.

Asaolu Elijah 🧙‍♂️

Thanks for the wonderful feedback, Ravi! I'm really glad you found the article helpful.