The cover of this article is from a video in which Sam Altman claims, "It is hopeless to compete with us!" He spoke too soon, because OpenAI has finally found a worthy opponent. A Chinese company, DeepSeek, has announced its flagship model, R1, which matches the performance of OpenAI's flagship models GPT-4o and o1.
Why is this huge?
There are a few pointers here:
• The computing power used to train the model was 1/10th that of similar LLMs.
• It cost them only $6 million to train, compared to the $100+ million spent on GPT-4 and newer models.
• It is open source.
• It was all done amid U.S. export restrictions on Nvidia chips.
The points above put into perspective how a significant player can emerge in the AI race seemingly out of nowhere. Enough of that; let's get our hands on this beast of a model.
Prerequisites
To run R1 locally, you must install Ollama. Download it for your OS and you're all set.
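To make sure the install worked, a quick sanity check from the terminal looks something like this (the exact version string will differ on your machine):

# Confirm the Ollama CLI is on your PATH
ollama --version
# Start the local server if it isn't already running in the background
ollama serve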
Downloading DeepSeek R1
Being open source, R1 is easy to download and start using right away, as long as you have the computing resources to run at least the lighter versions.
The original model has 671b parameters (a 404GB download), but there are six other distilled variants with the following parameter counts and sizes:
• 1.5b - 1.1GB
• 7b - 4.7GB
• 8b - 4.9GB
• 14b - 9.0GB
• 32b - 20.0GB
• 70b - 43GB
I'm running the 14b model on an i5-12600K with 32GB of RAM and an AMD Radeon 7800XT with 16GB of VRAM. It seems to be working fine, although GPU utilization goes over 90% when I run queries.
Don't try to run the models with 32b parameters and above locally unless you have the latest and greatest GPU from Nvidia and tons of processing power. If your GPU has less than 8GB of VRAM, stick to the 1.5b model.
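If you want to check how well your hardware is coping, recent versions of Ollama include a ps command that shows whether a loaded model is running on the GPU, the CPU, or split across both. A small sketch (run it while a model is loaded):

# Show loaded models, their memory footprint, and the CPU/GPU split
ollama ps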
To download the 14b parameter model, run:
ollama run deepseek-r1:14b
It starts the download immediately, and you should see the progress in your terminal. If it doesn't start, make sure Ollama is running by executing ollama serve.
To download the other models, simply change the parameter tag. For instance, if you want to download and run the 8b model, change 14b to 8b in the above command.
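If you end up downloading several variants, a couple of other Ollama commands help keep track of them. For example:

# List every model you've pulled so far, along with its size on disk
ollama list
# Remove a variant you no longer need to free up disk space
ollama rm deepseek-r1:8b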
After the download completes, the model starts immediately as well. I've read that this model performs extremely well at reasoning, especially in the field of mathematics.
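Beyond the interactive prompt, Ollama also serves a local REST API on port 11434, so you can query R1 from scripts. Here's a minimal sketch using curl against the generate endpoint (the prompt is just an example):

# Send a single prompt to the local DeepSeek R1 model and get one JSON response back
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:14b",
  "prompt": "What is (a + b)^2?",
  "stream": false
}'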
Output versus ChatGPT
I asked a very simple question: what is (a+b)^2?
The output from DeepSeek R1 was remarkably detailed, almost like a professor teaching.
>>> What is a + b the whole squared?
<think>
To determine \( (a + b)^2 \), I recall that squaring a binomial involves using the distributive property.
First, I apply the formula for the square of a sum: \( (a + b)^2 = a^2 + 2ab + b^2 \).
Next, I verify this by expanding \( (a + b)(a + b) \):
- Multiply the first terms: \( a \times a = a^2 \).
- Multiply the outer terms: \( a \times b = ab \).
- Multiply the inner terms: \( b \times a = ab \).
- Multiply the last terms: \( b \times b = b^2 \).
Adding these products together gives \( a^2 + 2ab + b^2 \), confirming that the expansion is correct.
</think>
To find the value of \( (a + b)^2 \), we can use the distributive property (also known as the FOIL method) to
expand the expression.
**Step-by-Step Solution:**
1. **Apply the Distributive Property:**
\[
(a + b)^2 = (a + b)(a + b)
\]
2. **Multiply Each Term in the First Parenthesis by Each Term in the Second Parenthesis:**
- **First Terms:** \( a \times a = a^2 \)
- **Outer Terms:** \( a \times b = ab \)
- **Inner Terms:** \( b \times a = ab \)
- **Last Terms:** \( b \times b = b^2 \)
3. **Add All the Products Together:**
\[
(a + b)^2 = a^2 + ab + ab + b^2
\]
4. **Combine Like Terms (\(ab + ab = 2ab\)):**
\[
(a + b)^2 = a^2 + 2ab + b^2
\]
**Final Answer:**
\[
\boxed{(a + b)^2 = a^2 + 2ab + b^2}
\]
When I asked ChatGPT the same question online, it gave me this:
Conclusion
This article is not about deciding which AI model is better, but a way to reflect on how AI has grown over the past couple of years. It is very easy to conclude that ignorance can hit you hard, or, in terms of the U.S. stock market, cause a $1.2 trillion drop in tech stocks! Anyway, have fun playing around with DeepSeek R1. Be sure to try the other open-source models available on Ollama and see what best suits your needs!