DeepSeek LLM is an advanced language model developed by the DeepSeek team. Launched in early 2024, it has quickly gained traction thanks to its strong reasoning, problem-solving, and factual-retrieval capabilities. With 67 billion parameters, support for both English and Chinese, and an open-source release, it has become a serious contender in the LLM space. The model was trained from scratch on a vast dataset of 2 trillion English and Chinese tokens. To foster research, the DeepSeek team has made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat openly available to the research community.
DeepSeek LLM vs DeepSeek R1
DeepSeek R1 is an open-weight AI model designed to run efficiently on personal GPUs and edge devices, unlike traditional LLMs (Large Language Models) that require cloud-based, high-power infrastructure.
DeepSeek LLM = A high-powered AI that runs best in the cloud.
DeepSeek R1 = A compact AI designed to run efficiently on local hardware.
Aspect | DeepSeek LLM | DeepSeek R1 |
---|---|---|
Focus | General-purpose language tasks | Specialized for reasoning, coding, and precision tasks
Design Philosophy | Broad versatility (text generation, Q&A) | Optimized for accuracy and logical reasoning in technical domains
Training Data | Diverse web text, books, general content | Enriched with code repositories, math datasets, and technical docs
Architecture | Standard transformer-based LLM | Enhanced with task-specific modules (e.g., symbolic reasoning)
Fine-Tuning | General conversational alignment | Domain-specific tuning for STEM, coding, and analysis
Reasoning | Good for everyday logic | Superior at math, code debugging, and complex problem-solving
Code Generation | Basic code snippets | Production-grade code with fewer hallucinations
Speed | Standard inference speed | Optimized for low-latency responses in technical tasks
Ideal For | Content creation, casual chatbots, summarization | Technical coding, mathematical proofs, data analysis
Enterprise Fit | General business automation | R&D teams, engineers, and data scientists
Model Size | Smaller variants (7B, 13B) for local use | Larger versions (e.g., 70B) for high-stakes tasks
Hardware Needs | Runs on consumer GPUs (RTX 3060) | May require enterprise GPUs (A100/H100)
How is it different from ChatGPT?
DeepSeek stands out for its cost-effectiveness. Its training and deployment costs are significantly lower than those of ChatGPT, enabling broader accessibility for smaller organizations and developers. For example, the DeepSeek R1 model, which rivals ChatGPT in reasoning and general capabilities, was developed for a fraction of the cost of OpenAI’s models.
Unlike industry leaders like OpenAI, DeepSeek LLM operates with a cost-effective approach, utilizing fewer advanced chips while still achieving state-of-the-art performance in tasks like mathematics, coding, and knowledge retrieval. Moreover, its integration capabilities with tools like Ollama, Open WebUI, and Docker make it a flexible and scalable solution for various AI applications.
🎯 Why Run DeepSeek R1 Locally?
Privacy 🔒: Cloud models require sending data to external servers, exposing sensitive information. With DeepSeek R1, all processing stays local—no third-party risks, no compliance headaches.
Speed ⚡: Skip API latency and network delays. DeepSeek R1 delivers instant inference directly on your hardware, perfect for real-time applications.
Cost 💸: Cloud providers charge per API call, which adds up fast. Running DeepSeek R1 locally is free after setup—no recurring fees, just one-time hardware investment.
Customization 🛠️: Cloud APIs limit fine-tuning and control. Locally, you own the stack: tweak the model, integrate with private data, and optimize for your unique needs.
Deployment 🌐: Dependence on cloud connectivity can cripple workflows in remote or secure environments. DeepSeek R1 works offline and on-premises, empowering edge computing and air-gapped use cases.
Running DeepSeek R1 locally isn’t just about cutting costs—it’s about reclaiming control. For developers, enterprises, and privacy-focused users, it’s the ultimate way to harness AI without compromises.
In this blog post, we will explore how to set up and use DeepSeek R1 with Ollama, Open WebUI, and Docker to build a robust AI-powered environment.
What Makes DeepSeek R1 Special?
Key Features:
- 67 Billion Parameters – Enables the model to process complex information and generate high-quality responses.
- Open-source – Available for research and development, fostering AI innovation.
- Bilingual Support – Supports both English and Chinese, enhancing accessibility for a global audience.
- Cost Efficiency – Uses fewer high-end chips, making it a budget-friendly alternative.
- Scalability – Easily adjustable based on user requirements for optimal performance.
DeepSeek R1 - The "Reasoning" Model
DeepSeek R1 follows an incremental response generation technique, mimicking human reasoning through multi-step problem-solving. This allows it to:
- Use less memory compared to competing models.
- Optimize performance for complex reasoning tasks.
- Provide structured and accurate responses in real-time.
Integrating DeepSeek R1 with Ollama, Open WebUI, and Docker
Prerequisites
- Install Docker Desktop on Mac
- Install Ollama
- Run Open WebUI using Docker
- Pull DeepSeek R1 Model
Minimal MacBook Requirements
For Basic Local Inference (Smaller Models, e.g., 7B-13B Parameter Models)
- Chip: Apple Silicon M1/M2/M3 (8-core CPU + 7-10-core GPU).
- RAM: 16GB unified memory (8GB might work for tiny models with heavy quantization, but expect instability).
- Storage: 40GB+ free space (for model weights, libraries, and dependencies).
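If you're not sure what your MacBook has, a quick terminal check works (macOS only; the exact labels in the output vary slightly between chip generations):

# Check chip and unified memory (output includes lines like "Chip: Apple M2" and "Memory: 16 GB")
$ system_profiler SPHardwareDataType | grep -E "Chip|Memory"

# Check free disk space on the system volume
$ df -h /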
Software/Setup
- Use frameworks like MLX (Apple’s machine learning framework) or llama.cpp for CPU/GPU-accelerated inference.
- Quantized model formats (e.g., 4-bit or 8-bit GGUF) to reduce memory usage.
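As a rough sketch of what that looks like in practice with llama.cpp (build steps and binary names differ between llama.cpp releases, and the GGUF filename below is only an example, not a specific recommended file):

# Clone and build llama.cpp (Metal acceleration is enabled by default on Apple Silicon builds)
$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp
$ make

# Run a 4-bit quantized GGUF model (replace the path with the model file you actually downloaded)
$ ./llama-cli -m ./models/deepseek-llm-7b-chat.Q4_K_M.gguf -p "Hello from my MacBook" -n 128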
Key Considerations
Performance Expectations:
- Even with M1/M2 Pro/Max chips, expect slower speeds compared to NVIDIA GPUs (no CUDA acceleration).
- Larger models (e.g., 70B parameters) will struggle on most MacBooks unless heavily quantized.
Optimization Tips:
- Use quantized models (e.g., DeepSeek-7B-GGUF instead of full-precision).
- Close background apps to free up RAM/GPU resources.
- For newer Macs (M3), leverage hardware-accelerated APIs like Metal for better GPU utilization.
Avoid Intel Macs:
- Older Intel-based MacBooks lack unified memory and GPU muscle, making them impractical for local LLM inference.
If your MacBook meets these specs, you can experiment with smaller models; for serious work (fine-tuning, larger models), consider cloud GPUs or a dedicated NVIDIA rig. 🚀
1. Install Docker Desktop
Download and install Docker Desktop for Mac from the official Docker website (https://www.docker.com/products/docker-desktop/).
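Once installed, you can confirm Docker is up and running from a terminal:

# Confirm the Docker CLI is installed and the daemon is running
$ docker --version
$ docker info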
2. Install Ollama
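On macOS you can either download the Ollama app from https://ollama.com or, if you use Homebrew, install it from the command line (shown here as one option, not the only one):

# Install Ollama via Homebrew (alternatively, download the app from https://ollama.com)
$ brew install ollama

# Verify the install and start the local Ollama server (the desktop app starts it automatically)
$ ollama --version
$ ollama serve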
3. Run Open WebUI
$ docker run -d -p 9783:8080 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
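Here, -d runs the container in the background, -p 9783:8080 maps the container's port 8080 to port 9783 on your Mac, -v open-webui:/app/backend/data stores chats and settings in a named Docker volume so they survive container restarts, and --name open-webui gives the container a friendly name.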
4. Pull the DeepSeek R1 model using Open WebUI
DeepSeek R1 is an open-source AI model optimized for reasoning, problem-solving, and factual retrieval.
You can access it via http://localhost:9783/
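If you prefer the terminal over the UI, the same model family can be pulled directly through Ollama; the 7B tag below is just one of several sizes published under the deepseek-r1 name, so pick whichever fits your RAM:

# Pull a DeepSeek R1 distilled model through Ollama (other sizes such as 1.5b, 8b, and 14b also exist)
$ ollama pull deepseek-r1:7b

# Quick sanity check from the terminal
$ ollama run deepseek-r1:7b "Summarize what a Dockerfile does in two sentences."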
5. Start querying and interacting with the DeepSeek model
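You can chat with the model directly in Open WebUI, or script queries against Ollama's local REST API from the terminal; the example below assumes the deepseek-r1:7b tag pulled in the previous step:

# Send a one-off prompt to the model through Ollama's HTTP API (default port 11434)
$ curl http://localhost:11434/api/generate -d '{
    "model": "deepseek-r1:7b",
    "prompt": "Write a docker-compose.yml for WordPress with a MySQL database.",
    "stream": false
  }'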
Result: Here's the output produced in Open WebUI with DeepThink (R1).
Thought for a few seconds
Alright, the user is asking me to write a compose file for WordPress. Let's break this down.
First, I need to figure out what exactly they're looking for. They might be a developer who wants to integrate custom plugins or themes into WordPress. Alternatively, they could be a user looking to customize their site using WordPress.
I should consider the structure of a standard WordPress compose file. It typically includes middleware extensions like .c2m and .linter, plugins that add functionality, and theme files for custom styling and behavior.
I'll start by outlining each section in the compose file. The .c2m file lists all the plugins installed, so I'll include their names and URLs. Then comes the .linter, which is a plugin used to detect syntax errors, but it's not necessary for composing unless they want more advanced features.
Next, the theme file where custom CSS or JavaScript can be added. This is important if the user wants to add their own styles or functionality beyond what WordPress provides.
I should also mention that the compose file format supports multiple extensions like .c2t and .jsl, but the main focus is on plugins, linter, and themes. Maybe I'll highlight how each can be used to extend WordPress's capabilities.
Finally, I'll remind them that if they want more advanced customization beyond this, they might need to use custom compose scripts or extensions. Providing links for further assistance would be helpful.
It seems like you're asking about creating a WordPress Compose File , which is used to customize the appearance and behavior of WordPress themes, plugins, and other components.
The standard structure of a WordPress Compose File (also known as a .c2m file) includes:
Conclusion
DeepSeek LLM is a powerful alternative to mainstream AI models, offering advanced reasoning capabilities at a fraction of the cost. By integrating it with Ollama, Open WebUI, and Docker, you can create a robust and scalable AI-powered environment for various applications, including research, coding assistance, and knowledge retrieval.
With its open-source nature and high efficiency, DeepSeek LLM presents an exciting opportunity for developers and AI enthusiasts. Start experimenting with it today and unlock new possibilities in AI-driven applications!