The cover of this article is from a video in which Sam Altman claims, "It is hopeless to compete with us!" He spoke too soon, because OpenAI has finally found a worthy opponent. A Chinese company, DeepSeek, has announced its flagship model, R1, which matches the performance of OpenAI's flagship models GPT-4o and o1.
Why is this huge?
There are a few pointers here:
• The computing power used to train the model was 1/10th that of similar LLMs.
• It cost them only $6 million to train, compared to the $100+ million spent on GPT-4 and newer models.
• It is open source.
• It was all done amid U.S. export restrictions on Nvidia chips.
The points above put into perspective how a significant player can emerge in the AI race seemingly out of nowhere. Enough of that; let's get our hands on this beast of a model.
Prerequisites
To run R1 locally, you must install Ollama. Download it for your OS and you're all set.
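To make sure the install worked, a quick sanity check from the terminal looks something like this (the exact version string will differ on your machine):

# Confirm the Ollama CLI is on your PATH
ollama --version
# Start the local server if it isn't already running in the background
ollama serve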
Downloading DeepSeek R1
Being open source, R1 is easy to download and start using right away, as long as you have the computing resources to run at least the lighter versions.
The original model has 671b parameters (a 404GB download), but there are six other distilled variants with the following parameter counts and sizes:
• 1.5b - 1.1GB
• 7b - 4.7GB
• 8b - 4.9GB
• 14b - 9.0GB
• 32b - 20.0GB
• 70b - 43GB
I'm running the 14b model on an i5-12600K with 32GB of RAM and an AMD Radeon 7800XT with 16GB of VRAM. It seems to be working fine, although GPU utilization goes over 90% when I run queries.
Don't try to run the models with 32b parameters and above locally unless you have the latest and greatest GPU from Nvidia and tons of processing power. If your GPU has less than 8GB of VRAM, stick to the 1.5b model.
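If you want to check how well your hardware is coping, recent versions of Ollama include a ps command that shows whether a loaded model is running on the GPU, the CPU, or split across both. A small sketch (run it while a model is loaded):

# Show loaded models, their memory footprint, and the CPU/GPU split
ollama ps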
To download the 14b parameter model, run:
ollama run deepseek-r1:14b
It starts the download immediately, and you should see the progress in your terminal. If it doesn't start, make sure Ollama is running by executing ollama serve.
To download the other models, simply change the parameter tag. For instance, if you want to download and run the 8b model, change 14b to 8b in the above command.
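If you end up downloading several variants, a couple of other Ollama commands help keep track of them. For example:

# List every model you've pulled so far, along with its size on disk
ollama list
# Remove a variant you no longer need to free up disk space
ollama rm deepseek-r1:8b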
After the download completes, the model starts immediately as well. I've read that this model performs extremely well at reasoning, especially in the field of mathematics.
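Beyond the interactive prompt, Ollama also serves a local REST API on port 11434, so you can query R1 from scripts. Here's a minimal sketch using curl against the generate endpoint (the prompt is just an example):

# Send a single prompt to the local DeepSeek R1 model and get one JSON response back
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:14b",
  "prompt": "What is (a + b)^2?",
  "stream": false
}'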
Output versus ChatGPT
I asked a very simple question: what is (a+b)^2?
The output from DeepSeek R1 was remarkably detailed, almost like a professor teaching.
>>> What is a + b the whole squared?
<think>
To determine \( (a + b)^2 \), I recall that squaring a binomial involves using the distributive property.
First, I apply the formula for the square of a sum: \( (a + b)^2 = a^2 + 2ab + b^2 \).
Next, I verify this by expanding \( (a + b)(a + b) \):
- Multiply the first terms: \( a \times a = a^2 \).
- Multiply the outer terms: \( a \times b = ab \).
- Multiply the inner terms: \( b \times a = ab \).
- Multiply the last terms: \( b \times b = b^2 \).
Adding these products together gives \( a^2 + 2ab + b^2 \), confirming that the expansion is correct.
</think>
To find the value of \( (a + b)^2 \), we can use the distributive property (also known as the FOIL method) to
expand the expression.
**Step-by-Step Solution:**
1. **Apply the Distributive Property:**
\[
(a + b)^2 = (a + b)(a + b)
\]
2. **Multiply Each Term in the First Parenthesis by Each Term in the Second Parenthesis:**
- **First Terms:** \( a \times a = a^2 \)
- **Outer Terms:** \( a \times b = ab \)
- **Inner Terms:** \( b \times a = ab \)
- **Last Terms:** \( b \times b = b^2 \)
3. **Add All the Products Together:**
\[
(a + b)^2 = a^2 + ab + ab + b^2
\]
4. **Combine Like Terms (\(ab + ab = 2ab\)):**
\[
(a + b)^2 = a^2 + 2ab + b^2
\]
**Final Answer:**
\[
\boxed{(a + b)^2 = a^2 + 2ab + b^2}
\]
When I asked ChatGPT the same question online, it gave me this:
Conclusion
This article is not about deciding which AI model is better, but a way to reflect on how AI has grown over the past couple of years. It is very easy to conclude that ignorance can hit you hard, or, in terms of the U.S. stock market, cause a $1.2 trillion drop in tech stocks! Anyway, have fun playing around with DeepSeek R1. Be sure to try the other open-source models available on Ollama and see what best suits your needs!