Jonathan U

Running DeepSeek-R1 Locally - Use with Open WebUI, Chatbox, CodeGPT

What is DeepSeek-R1?

DeepSeek-R1 is an open-source LLM developed by the Chinese AI startup DeepSeek. It achieves performance comparable to OpenAI's o1 model but was created at a fraction of the cost. The model uses a "chain-of-thought" reasoning approach, enhancing the quality of its responses by systematically breaking down problems into logical steps. This methodology allows DeepSeek-R1 to excel in tasks requiring detailed reasoning and analysis.

[Image: DeepSeek-R1 vs. OpenAI o1 benchmark comparison]

Install and run Ollama

Ollama is a platform that allows users to interact with AI models in a way that is optimized for local, privacy-respecting, and customizable experiences.

There are a couple of ways to install Ollama:

  • Download the installer from the Ollama site: https://ollama.com
  • Alternatively, if you're on macOS, install it with Homebrew:

brew install ollama

Once Ollama is installed, start the server if it's not already running:

ollama serve
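
To confirm the server is up, you can hit its version endpoint (a quick sanity check, assuming Ollama's default port of 11434):

curl http://localhost:11434/api/version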

DeepSeek-R1 Models

If you have a powerful enough machine, you can try running the full 671B-parameter DeepSeek-R1 model:

ollama run deepseek-r1:671b

I do not have a computer capable of running this model. Fortunately, the DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, yielding better performance than the reasoning patterns discovered through RL on small models alone.

There are several distilled models you can run with Ollama. The options for DeepSeek-R1 are:

DeepSeek-R1-Distill-Qwen-1.5B

ollama run deepseek-r1:1.5b

DeepSeek-R1-Distill-Qwen-7B

ollama run deepseek-r1:7b

DeepSeek-R1-Distill-Llama-8B

ollama run deepseek-r1:8b

DeepSeek-R1-Distill-Qwen-14B

ollama run deepseek-r1:14b

DeepSeek-R1-Distill-Qwen-32B

ollama run deepseek-r1:32b

DeepSeek-R1-Distill-Llama-70B

ollama run deepseek-r1:70b
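
If you'd rather download a model without starting an interactive session, you can pull it ahead of time and then confirm it's available locally:

ollama pull deepseek-r1:7b
ollama list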

I'm on a MacBook Pro with the M1 Pro chip and 16 GB of RAM. I tried running the deepseek-r1:14b model, but it did not perform well on my hardware. I've had the best luck with the deepseek-r1:7b model.

The ollama run command is successful if you see a prompt like this:

[Image: terminal output from a successful ollama run]
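
You can also query the model through Ollama's REST API instead of the interactive prompt. A minimal sketch, assuming the default port and the 7b model:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'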

Using DeepSeek-R1 Locally

If you want to use something other than the terminal to interact with DeepSeek-R1, other options I've used are:

Chatbox AI

Chatbox AI is a simple download with a decent UI for interacting with your local LLMs out of the box. It has several features that I haven't tried yet, but I've found it to be the simplest way to get up and running.
To configure it to use your local DeepSeek-R1:

Settings -> Model tab -> select OLLAMA as the model provider -> the API Host should be correct by default -> select the model you've installed (deepseek-r1:7b) from the Model dropdown.
Then you should be all set!
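
By default, the API Host should be Ollama's local address (http://127.0.0.1:11434). If Chatbox reports a connection error, one thing to try (an assumption, only needed when a client is blocked by Ollama's origin checks) is restarting the server with relaxed origins:

OLLAMA_ORIGINS="*" ollama serve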

[Image: Chatbox AI model settings]

Open WebUI

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for RAG, making it a powerful AI deployment solution.

There are a couple ways to run Open WebUI:

  1. Using Python 3.11:

pip install open-webui
open-webui serve

This will start the Open WebUI server at http://localhost:8080
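
Since Open WebUI targets Python 3.11 specifically, a dedicated virtual environment can help avoid version conflicts. A minimal sketch, assuming python3.11 is on your PATH:

python3.11 -m venv open-webui-env
source open-webui-env/bin/activate
pip install open-webui
open-webui serve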

  2. Docker: Open WebUI has documentation on different ways to run it in Docker, which you can find here: https://docs.openwebui.com/getting-started/quick-start

In my case, I just ran this command:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

If you want to bypass the login page, you'll need to set the WEBUI_AUTH environment variable to False:

docker run -d -p 3000:8080 -e WEBUI_AUTH=False -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
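
If the container can't reach your local Ollama server, Open WebUI supports an OLLAMA_BASE_URL environment variable that points it at the host. A sketch, assuming Ollama is running on its default port:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main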

[Image: Open WebUI chat interface]

I use Open WebUI the most when interacting with my local LLMs.

CodeGPT

CodeGPT is an extension for VSCode/Cursor, as well as a plugin for JetBrains IDEs. It's an alternative to GitHub Copilot that lets you use your local LLMs, such as the DeepSeek-R1 models, for prompts, code completion, unit testing, and more.
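
CodeGPT talks to the same local Ollama server as the other tools, so once you select Ollama as the provider in the extension's settings, any model you've pulled should be selectable. To double-check which models are visible to it (assuming the default port):

curl http://localhost:11434/api/tags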

Conclusion

Thanks to Ollama, DeepSeek, and other open-source projects, running open-source LLMs on your local machine is simple and accessible.
Tools such as Chatbox AI, Open WebUI, and CodeGPT make interacting with these LLMs easy and intuitive.
These tools also give you the option to use the official hosted APIs if you have an API key.
