DEV Community

Savvas Stephanides

Run Deepseek locally using Docker!

About

If you follow AI news even a little, you've probably heard of Deepseek: the new AI model from China that's supposed to rival anything else out there. You've probably also heard horror stories about apps from China collecting data on their users.

If that worries you, you're probably looking for a better alternative. And there is one: you can run a Deepseek client entirely on your local system, without any access to the Internet. Even better, you can do it all in Docker, so you don't need to install anything else. Sound good? If yes, read on!

What we're building

We're going to use Docker to create a web interface where you can ask questions to Deepseek.

📖 Docker, explained

To do this, we're going to create two apps:

  1. A Deepseek endpoint built with Ollama
  2. A simple static website which calls the endpoint and gets answers to questions.

When we're building multiple apps that communicate with each other, Docker Compose makes it easy to define and run them together.

Steps

🏃‍♂️ In a hurry? Just clone this repository and follow the instructions in the README file!

Step 1: Create docker-compose file

To work with Docker Compose, we first need to create a new file in our project's directory. Create a new directory for your project if you haven't done so already, and inside it create a new file called docker-compose.yml.

💡 A docker-compose.yml file is the place where we will be describing our two apps so we can run them.

Step 2: Add first app

Now that we've created our Docker Compose file, let's define our first app, the Deepseek Ollama app. In Docker Compose, apps are called "services", but it's essentially the same thing.

The Deepseek Ollama app is the endpoint we'll call to ask Deepseek questions and get answers.

💡 What is Ollama? Ollama is a lightweight framework that lets you easily run open source models, like Deepseek or Llama, locally on your computer.

To add the Deepseek endpoint to your docker-compose.yml file, add this to the file:

services:
  ollama:
    image: ollama/ollama
    volumes:
      - ./ollama-models:/root/.ollama
    ports:
      - 11434:11434

🤔 What is going on here?

  • This part tells Docker Compose to create a service (and container) called "ollama".
  • The container is based on the image (basically a blueprint) called ollama/ollama. This image comes with Ollama preinstalled, so you don't have to set anything else up!
  • The volumes part saves the data from the models we'll install later onto your local hard drive. Inside the container, this data lives in /root/.ollama, where it would disappear once the container is removed. Not exactly what we want. With the volume, whatever the container stores in that directory is kept permanently in the ollama-models directory in your project's root.
  • The ports part maps port 11434 inside the container to port 11434 on your machine. Without it, the Ollama app would only be reachable from inside the container, not from your browser.
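As an aside, the bind mount to ./ollama-models is handy because you can see the downloaded model files in your project, but a Docker-managed named volume would work just as well. A hypothetical variant (not used in the rest of this guide):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      # "ollama-models" here is a named volume managed by Docker,
      # rather than a directory inside your project
      - ollama-models:/root/.ollama
    ports:
      - 11434:11434

volumes:
  ollama-models:
```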

Now that we've added the Ollama service, let's start it so we can install Deepseek. To do this, just run this Docker Compose command in your terminal:

docker compose up -d ollama

Now go to your browser and check that the Ollama app is running, by pointing your browser to this address:

http://localhost:11434

Your browser should show the "Ollama is running" text like so:

A browser window showing "Ollama is running"

Step 3: Install Deepseek

Now that Ollama is up and running, we can install our Deepseek model. To do this, just run this Docker Compose command:

docker compose exec ollama ollama pull deepseek-r1:7b

🤔 What is going on here?

  • The exec command from Docker Compose runs any command within a given container, in this case the ollama container. The line above executes the command ollama pull deepseek-r1:7b, which downloads the Deepseek model. The basic structure of the exec command is: docker compose exec <containerName> <command>.

This command will take a few minutes (depending on the size of the model and your connection speed), but once it's done, it should populate the new ollama-models directory with the files needed for the Deepseek model.

💡 The Deepseek model comes in lots of different sizes. For this example, I've chosen 7b (which means 7 billion parameters), but you can choose a larger or smaller model depending on the capabilities of your system. You can see the full list here.

Step 4: Create a website

Now that we have our Deepseek app up and running, we can create a web interface to ask questions. We're going to create a simple site with HTML, CSS and Javascript. This is what we're creating:

Screenshot of the website

And here's how:

HTML

The HTML is going to define a simple page with a text box, a button to send the question and a space for the answer.

📖 HTML, explained

Create a new directory called web and inside that create a new file called index.html. Paste this HTML inside the file:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My AI</title>
    <script src="showdown.min.js"></script>
    <script src="ollama.js"></script>
    <link rel="stylesheet" href="style.css" />
</head>
<body>
    <h1>🤖 Local Deepseek</h1>

    <textarea id="question"></textarea>
    <button onclick="run()">Ask!</button>

    <div id="answer">

    </div>
</body>
</html>

🤔 What is going on here?

  • In the <head> part of the HTML, you'll notice that we're linking to a style.css file. We'll create this file next in order to style our website.
  • You'll also notice two Javascript files: ollama.js, which is where we'll talk to Deepseek, and showdown.min.js, which is Showdown, a Javascript library for converting Markdown (what we'll be getting back from Deepseek) to HTML.

CSS

📖 CSS, explained

To style our page, create a new file called style.css and paste the CSS below:

body{
    width: 600px;
    margin: auto;
}

#question{
    display: block;
    width: 100%;
    padding: 9px;
    font-size: 15px;
}

#answer{
    font-family: Arial, Helvetica, sans-serif;
    font-size: 15px;
    margin-top: 30px;
    line-height: 1.5;
}

#answer #think{
    border-left: 3px solid #eee;
    padding-left: 9px;
    color: #aaa;
    font-style: italic;
}

Javascript

📖 Javascript, explained

Now we're going to create the Javascript to talk to Deepseek and give us answers to our questions. Create a new file called ollama.js and paste this:

const converter = new showdown.Converter()

async function run(){
    let prompt = document.querySelector("#question").value

    const response = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
        },
        body: JSON.stringify({
            model: "deepseek-r1:7b",
            prompt: prompt,
            stream: true
        })
    })

    const reader = response.body.getReader()
    const decoder = new TextDecoder()

    let compiledResponse = ""
    let buffer = ""
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // A single read can contain several newline-delimited JSON
        // objects, or end partway through one, so buffer the text
        // and only parse complete lines
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split("\n")
        buffer = lines.pop()
        for (const line of lines) {
            if (line.trim() === "") continue
            compiledResponse += JSON.parse(line).response
        }
        compiledResponse = compiledResponse.replace("<think>", `<div id="think">`)
        compiledResponse = compiledResponse.replace("</think>", `</div>`)
        document.querySelector("#answer").innerHTML = converter.makeHtml(compiledResponse)
    }
}

🤔 What is going on here?

  • We're creating a Javascript function called run().
  • Within the function, we get the text from the text box in our HTML using document.querySelector("#question").value and store it in a variable called prompt.
  • We then use the built-in fetch() function to send a POST request to http://localhost:11434/api/generate which includes our prompt. The response is stored in a variable called response. Since we've set stream: true, the response arrives in small chunks.
  • To read each chunk individually, we call response.body.getReader() to get a reader, and initialise a new TextDecoder to turn the raw bytes into text.
  • We start with an empty string, let compiledResponse = "", and append each piece of the response to it.
  • Finally, the while loop runs until the stream runs out of chunks. For each chunk, we extract the response text, add it to compiledResponse, process it, and show it on the page with document.querySelector("#answer").innerHTML.
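Each line of the stream is a small JSON object with a response field holding the next piece of text. The accumulation and <think>-tag handling can be sketched in isolation, with hard-coded chunks standing in for the reader (the chunk strings below are made-up examples, not real Deepseek output):

```javascript
// Hypothetical chunks: each holds one or more newline-delimited
// JSON objects, as the Ollama /api/generate stream does
const chunks = [
    '{"response":"<think>","done":false}\n',
    '{"response":"The user said hello","done":false}\n{"response":"</think>","done":false}\n',
    '{"response":"**Hello!**","done":true}\n'
]

function compile(chunks) {
    let compiledResponse = ""
    for (const chunk of chunks) {
        for (const line of chunk.split("\n")) {
            if (line.trim() === "") continue   // skip the trailing empty line
            compiledResponse += JSON.parse(line).response
        }
    }
    // Swap the <think> tags for a styleable <div>, as in ollama.js
    return compiledResponse
        .replace("<think>", '<div id="think">')
        .replace("</think>", "</div>")
}

console.log(compile(chunks))
// → <div id="think">The user said hello</div>**Hello!**
```

In the real app, the same logic runs incrementally as each chunk arrives, updating the page each time.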

🤔 Why are we processing the response?

The response comes back from Deepseek looking like this:

<think>
The user says hello so I should say hello back
</think>

**Hello! How are you doing?**
Enter fullscreen mode Exit fullscreen mode

When we process the response, we replace <think> with <div id="think"> and </think> with </div>. This way we can style the thinking section however we like.

compiledResponse = compiledResponse.replace("<think>", `<div id="think">`)
compiledResponse = compiledResponse.replace("</think>", `</div>`)

We then convert the entire response from Markdown into HTML, using the ShowdownJS library:

document.querySelector("#answer").innerHTML = converter.makeHtml(compiledResponse)

Import showdown

Finally, we need to add ShowdownJS to our project. To do this, simply download this file and add it to the web directory of your project.

At the end of all this, the web directory should look like this:

📝 index.html
📝 ollama.js
📝 showdown.min.js
📝 style.css

Step 5: Add web page to Docker Compose

Once you're done creating your website, add it to the Docker Compose file as a second service. This one uses the nginx image to serve the static files in your web directory:

  web:
    image: nginx:1.27.3-alpine
    volumes:
      - ./web:/usr/share/nginx/html
    ports:
      - "3001:80"

Your entire docker-compose.yml file should look like this:

services:
  ollama:
    image: ollama/ollama
    volumes:
      - ./ollama-models:/root/.ollama
    ports:
      - 11434:11434

  web:
    image: nginx:1.27.3-alpine
    volumes:
      - ./web:/usr/share/nginx/html
    ports:
      - "3001:80"

Run the website

Now that we've created our website and added it to our docker-compose.yml file, we can run it with this command:

docker compose up -d web

Give it one or two seconds and then point your browser to this URL:

http://localhost:3001

If you see this, we're good to go:

Website screenshot

Let's test it!

Let's give our AI app a go! Write a question in the text box and click the Ask button.

Deepseek should soon start responding. First with some "thinking"...

The thinking box

The answer box


And that's it! You are now running Deepseek locally using just Docker!

Now every time you want to run your AI app, just run this command from your project directory in your terminal:

docker compose up -d


Any questions? Let me know here!
