No. This is not another "hype post" about R1.
Or maybe it is... 🤔
Controversial or not, there's an undeniable fact. DeepSeek has created a "before and after" in artificial intelligence, and we'll quickly see the evidence.
If you're late to the news, let me tell you what DeepSeek-R1 is all about 🐋
In July 2023, Liang Wenfeng founded DeepSeek. His vision was clear: Challenge everything we thought we knew about artificial intelligence.
The industry was obsessed with a simple formula:
💰 More money = better models
Tech giants investing billions and burning resources in datacenters. Everyone playing the same expensive game.
Obviously, not everyone could participate.
DeepSeek bet on an efficiency-based approach.
In fact, this isn't the first time DeepSeek has made headlines. Their previous models, especially DeepSeek V3, have drawn attention for their capabilities and ease of use. Sure, these models weren't as powerful as current leading competitors.
That's over now.
DeepSeek has released a new AI model, called R1. A model that rivals o1, one of OpenAI's most powerful models to date (only behind o3).
R1 offers very similar results in its reasoning level but at a fraction of the cost.
We're not talking about small optimizations. We're talking about massive resources no longer being required.
How much does it really cost to build a world-class AI model? 🤑
Until today, the answer always included absurd amounts of money... but R1 has changed that.
New approach, new rules
DeepSeek R1 represents a huge leap in language model development, standing out not only for its performance comparable to first-line models but also for its focus on efficiency and accessibility.
This approach allows the model to maintain the processing capability of larger models while significantly optimizing computational resource usage.
Performance and Benchmarks
Benchmark results show R1 performing impressively against popular models: in mathematical reasoning and complex problem-solving tasks, it matches or surpasses well-established models like OpenAI's.
However, it wasn't its response capability (although quite advanced) that impressed the world.
We had already seen surprising results with "human reasoning" from OpenAI.
R1's true revolution lies in its costs.
| Model | Input cost (per 1M tokens) | Output cost (per 1M tokens) |
|---|---|---|
| DeepSeek-R1 | $0.55 | $2.19 |
| OpenAI o1-1217 | $15.00 | $60.00 |
We're talking about approximately 96% cost savings. It's totally insane! 🤯
R1 matches (and sometimes surpasses) the most powerful models in the market... at a cost 27 times lower. 27 times!
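A quick sanity check on those numbers: $15.00 ÷ $0.55 ≈ 27× on input tokens and $60.00 ÷ $2.19 ≈ 27× on output tokens, which works out to roughly a 96% price reduction.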
If that weren't enough, they've released the model as open source under an MIT license: open weights, ready to be studied, modified, and improved by anyone (including competitors).
Sounds strange, right?
This has important implications:
- It proves that high-level AI doesn't require billion-dollar investments
- It establishes a new efficiency standard in training new models
- Advanced AI models will be truly accessible to everyone
- R1 is perfectly replicable
Before trying to analyze the impact this will bring to the industry and the world, let's first see how it works 🤓
Architecture
DeepSeek R1 combines computational efficiency with advanced reasoning capabilities.
R1's core implements an MoE (Mixture-of-Experts) architecture. The interesting thing about this approach is that, during each operation, the model selectively activates only the parameters it needs.
This selective activation works like a system of "specialists," where different model components activate according to the specific task. The model decides in real-time which components it needs for each type of processing. This means significant savings in operational costs.
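To make the idea concrete, here's a deliberately tiny sketch of top-k expert routing in JavaScript. It illustrates the general Mixture-of-Experts concept only; it is not DeepSeek's implementation, and the experts and router scores are made up for the example.

```javascript
// Toy illustration of Mixture-of-Experts routing (not DeepSeek's real code).
// A small "router" scores every expert for the current input and only the
// top-k experts actually run, so most of the model's parameters stay idle
// for any given input.
function moeForward(input, experts, routerScores, k = 2) {
  // Rank experts by the router's score for this input and keep the top k.
  const topK = routerScores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);

  // Only the selected experts do any work; their outputs are blended
  // in proportion to the router's confidence in each of them.
  const totalScore = topK.reduce((sum, e) => sum + e.score, 0);
  return topK.reduce(
    (output, e) => output + (e.score / totalScore) * experts[e.index](input),
    0
  );
}

// Example: four tiny "experts", but only two of them run for this input.
const experts = [x => x * 2, x => x + 10, x => x * x, x => -x];
console.log(moeForward(3, experts, [0.1, 0.7, 0.15, 0.05])); // blends experts 1 and 2
```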
R1 was trained mostly with techniques common to large language models, but with a heavy dose of reinforcement learning that pushes it to reason with a Chain of Thought, the same style of step-by-step reasoning we can find in OpenAI's o1 model.
Instead of jumping straight to an answer, the model breaks the query into intermediate reasoning steps. Each step builds on the conclusions of the previous one, and the model can loop back and revise earlier steps when something doesn't add up.
It's like "thinking step by step" 🧠
In simple terms, when we interact with one of these models on a coding problem, the loop looks like this (a minimal sketch follows the list):
1. It analyzes the problem using a chain of thought
2. It proposes multiple candidate solutions, weighing time and space complexity
3. It implements the most efficient one
4. It verifies the solution
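Here's a minimal sketch of that loop in JavaScript. It's an illustration of the general idea, not DeepSeek's actual implementation; askModel is a hypothetical helper that sends a prompt to any chat model and returns its text, and the prompt format is invented for the example.

```javascript
// Minimal sketch of a chain-of-thought style loop. This is an illustration of
// the general idea, NOT DeepSeek's actual implementation. "askModel" is a
// hypothetical helper that sends a prompt to any chat model and returns text.
async function solveStepByStep(problem, askModel, maxSteps = 8) {
  const steps = [];
  for (let i = 0; i < maxSteps; i++) {
    // Each stage sees the problem plus every conclusion reached so far,
    // so earlier reasoning feeds directly into the next step.
    const thought = await askModel(
      `Problem: ${problem}\n` +
      `Reasoning so far:\n${steps.join("\n") || "(none)"}\n` +
      `Reply with the next step, with "RESTART" if the approach looks wrong, ` +
      `or with "DONE: <answer>" when finished.`
    );

    if (thought.startsWith("DONE:")) return thought.slice(5).trim();
    if (thought.trim() === "RESTART") {
      steps.length = 0; // the "aha moment": drop the current path and start over
      continue;
    }
    steps.push(thought);
  }
  return null; // no answer within the step budget
}
```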
This reasoning loop is what allows it to rival OpenAI's (no longer so novel) o1.
Ok, we know it's a powerful model. How did they optimize it so much?
No, they didn't create impossible methods from a sci-fi series 🎬
No, they're not aliens or engineers from the year 3000 👽
No, AI won't take our jobs 🙄
They used good old software engineering fundamentals. Mathematics, linear algebra, probability, and statistics.
They optimized how every bit of memory was used, leaning on lower-precision arithmetic. They built a specialized caching system. They wrote parts of the GPU code in PTX, NVIDIA's low-level, assembly-like instruction set, instead of relying only on CUDA, the higher-level language NVIDIA provides for programming its chips. (NVIDIA didn't take it well.)
They knew how to squeeze everything out of the chips they had: the H800, a cut-down, export-compliant version of NVIDIA's flagship H100, far cheaper and less capable than the hardware the giants train on.
Techniques that aren't out of this world. It's not a "magical" invention. That "simplicity" is what makes it so interesting.
R1's architecture proves it's possible to match, and even exceed, popular models' performance through intelligent optimizations and careful design.
It's a game-changer. It's simply a new way of thinking about artificial intelligence.
Its architecture, training, and other technical aspects are explained in detail in their Official Paper.
Practical capabilities
DeepSeek R1 particularly excels in mathematical tasks, code generation and debugging, as well as reasoning tasks.
For example, when trying to solve a mathematical problem:
R1 begins to solve the problem systematically:
1. It starts by squaring both sides
2. It performs algebraic manipulations
3. It reaches a point of "reflection"
The curious thing here is that in the middle of the process, R1 has what it calls an "aha moment" (a reflection or questioning). It interrupts its own process and decides to reevaluate its approach from the beginning.
This behavior is particularly interesting because it shows:
- Self-correction capability
- Transparency in the thought process
- Similarity to human behavior when we realize we might be taking an overly complicated path to solve a problem
The model "realized" it was unnecessarily complicating the solution and decided to step back to look for a more elegant or efficient approach.
This is thanks to its "simulation of human reasoning" through the chain of thought. The problem is broken down step by step, and the model can detect when the solution isn't heading in the right direction and go back to an earlier step in the process.
Of course, natural expressions like an "aha" come from the thousands of examples it was trained on.
Let's now try with a classic coding problem. The Fibonacci sequence.
I used the following prompt: "Write a JavaScript function that calculates the nth number in the Fibonacci sequence. The sequence starts with 0 and 1, and each subsequent number is the sum of the previous two."
I first asked ChatGPT, specifically its GPT-4o model, and this was its response:
```javascript
function fibonacci(n, memo = {}) {
  if (n in memo) return memo[n];
  if (n <= 1) return n;
  memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo);
  return memo[n];
}

// Example usage
console.log(fibonacci(10)); // 55
```
ChatGPT completed the task. It used a recursive approach (with memoization, which at least avoids the exponential cost of naive recursion), but it still builds a call stack and a cache that grow with n, so for large inputs it's less efficient, and less robust, than a simple loop ❌
Now, if we ask exactly the same question to DeepSeek R1, we can see something totally different.
The first thing we notice is the chain of thought, where it breaks down the task we've assigned ☁️
Shortly after, we can observe that it has concluded that using recursion is not a viable option ✅
After a few seconds, it finally gives us the result.
```javascript
function fibonacci(n) {
  if (n === 0) return 0;
  if (n === 1) return 1;
  let a = 0, b = 1;
  for (let i = 2; i <= n; i++) {
    const next = a + b;
    a = b;
    b = next;
  }
  return b;
}
```
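For reference, calling fibonacci(10) with this iterative version also returns 55, the same result as the memoized version above, but using only constant extra memory.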
While the code might not be perfect, the interesting thing about all this is that we can transparently observe the steps that led to this solution.
This gives us a lot of useful context when we're tackling tasks we don't fully understand, or when we simply want to see what the model took into account to reach its result.
I haven't been able to perform the same test with OpenAI's o1 model. Most likely, it would have reached the same conclusion.
We can experiment with R1 directly on its official site, or through its API (which is compatible with OpenAI's SDK and request format, so if you already have an application built on it, migration is very simple).
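As a minimal sketch of that migration path (assuming the official openai npm package, your DeepSeek key in an environment variable, and the base URL and model name that DeepSeek's docs list at the time of writing; double-check them before relying on this):

```javascript
// Sketch of calling R1 through DeepSeek's OpenAI-compatible API, using the
// official "openai" npm package. Base URL and model name are taken from
// DeepSeek's docs at the time of writing -- verify them before use.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.deepseek.com",
  apiKey: process.env.DEEPSEEK_API_KEY, // your DeepSeek API key
});

const response = await client.chat.completions.create({
  model: "deepseek-reasoner", // the R1 reasoning model
  messages: [
    { role: "user", content: "Write a JavaScript function for the nth Fibonacci number." },
  ],
});

console.log(response.choices[0].message.content);
```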
Or better yet. Install R1 on your own computer. Without depending on external servers.
DeepSeek R1 is available for local installation via Ollama (most people will run one of the smaller, distilled variants) and is compatible with Open WebUI.
(Technical knowledge required)
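If you go the local route, a nice detail is that the same client code from above can work against your own machine. This is a sketch assuming a default Ollama install, which exposes an OpenAI-compatible endpoint on port 11434, and that you've already pulled a deepseek-r1 variant:

```javascript
// Same "openai" client as above, but aimed at a local Ollama server.
// Assumes Ollama is running locally and a deepseek-r1 variant has been pulled.
import OpenAI from "openai";

const local = new OpenAI({
  baseURL: "http://localhost:11434/v1", // Ollama's OpenAI-compatible endpoint
  apiKey: "ollama", // Ollama ignores the key, but the client requires one
});

const reply = await local.chat.completions.create({
  model: "deepseek-r1", // or the specific tag you pulled (e.g. a smaller distill)
  messages: [{ role: "user", content: "Explain memoization in one paragraph." }],
});

console.log(reply.choices[0].message.content);
```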
The disruptive impact of R1
DeepSeek R1's launch has caused a significant impact on the industry, questioning established practices about the development and deployment of advanced AI models. This impact has manifested particularly dramatically in the stock market.
On January 27, 2025, NVIDIA suffered one of the biggest single-day drops in tech market history. Its shares fell around 17% in one session, erasing close to 600 billion dollars in market value 🔻
NVIDIA had dedicated itself to developing increasingly powerful chips to achieve impressive advances in artificial intelligence, receiving massive investments.
Investors' concern was clear: maybe such advanced and expensive chips aren't really necessary to get the same, or at least similar, results.
Cheaper chips translate to lower return on investment 💸
What can we expect? 🤔
NVIDIA's fall is just the beginning... we're at an inflection point in the AI industry.
Why? Because R1 doesn't just democratize access - it rewrites the rules of the game:
- Startups can compete with tech giants
- Developers can experiment on their own machines
- Small companies can offer AI services without going broke
This means that startups and small organizations, previously limited by operating costs, can now access advanced AI.
The result? A wave of innovation.
We'll see applications that were previously economically unfeasible. New use cases that nobody had imagined, or maybe they had, but couldn't create. Solutions that will change entire industries 🚀
Let's not forget that, being open source, R1 is replicable.
OpenAI will use R1-based models. Anthropic too. Google, Meta. Everyone.
There will be a new hype of AI-powered applications. Much bigger than what we've seen so far.
The question is no longer "who can afford to develop AI?" Now it's "what will YOU do with it?"
Advanced artificial intelligence models will no longer be a privilege...
OpenAI with ChatGPT and other giants won't disappear. It would be naive to think they will.
The market will be weird for a while. Yes.
There will continue to be new advances. R1 won't be trending forever, nor the only alternative we'll have.
But that's exactly what happens when a technology goes from being exclusive to being accessible.
Don't believe me? In the time it took me to write this article, Amazon announced that DeepSeek-R1 is already available on AWS.
Remember when Linux came out? Android? The web we know today?
The future is promising.