Gajanan Rajput

Posted on • Originally published at Medium

Everyone is freaking out about Chinese AI startup DeepSeek. Are its claims too good to be true?

Chinese AI startup DeepSeek shocked the U.S. tech industry

The Chinese AI company DeepSeek surprised the U.S. tech sector this weekend by introducing an advanced large language model that could rival leading players like OpenAI, all developed with limited financial resources. This announcement resulted in one of the largest single-day declines in U.S. tech stocks in recent history. Nevertheless, many prominent figures caution that the astonishing assertions made by the enigmatic lab should not be accepted without scrutiny.

DeepSeek, founded by the head of a Chinese quantitative trading firm, is making bold claims about its achievements. To begin with, the startup says it developed its new R1 model in just two months at a cost of less than $6 million, roughly 3–5% of what sources suggest OpenAI spent on its next-generation o1 model. DeepSeek also asserts that it used only Nvidia H800 chips, a downgraded product designed to comply with U.S. export regulations, along with older-generation A100s. In effect, it is arguing that it has matched or surpassed the best Western models while operating under significant constraints. Nevertheless, numerous experts contend that those claims are likely misleading or outright false.

One critic is Alexandr Wang, the CEO of Scale AI and the youngest self-made billionaire in the world, who has characterized the AI competition as a battle that the U.S. must prevail in. Recently, Wang stated that his company’s evaluations have shown that DeepSeek’s R1 model can either surpass or at least match the leading American alternatives. Importantly, he asserted that DeepSeek and other labs in China have access to more advanced H100 chips (which were Nvidia’s premier product until the Blackwell platform was introduced late last year) than they are revealing.

“My belief is that DeepSeek possesses around 50,000 H100s,” he mentioned to CNBC, “but they are unable to discuss it, of course, due to the export regulations imposed by the United States.”

Elon Musk replied to a clip of Wang’s comments in a post on X, the social media platform he owns.

“Obviously,” said Musk, the CEO of xAI, the world’s richest man, and a close ally of U.S. President Donald Trump.


Ted Mortonson, managing director and tech strategist at Baird, also questions whether DeepSeek could have constructed its model using reduced-capacity H800 chips. He believes the enthusiasm surrounding DeepSeek’s alleged innovations is exaggerated, arguing that American companies could have developed such a model if they had the desire.

“We have some of the top AI engineers in the world based in the U.S.,” he remarked, “and to suggest that they haven’t explored this through open source and optimization is quite absurd, if you consider it. Therefore, I personally do not place much faith in what Chinese firms claim.”

DeepSeek’s cost figures called misleading
Gavin Baker, managing partner and CIO at Boston hedge fund Atreides Management, said it’s true that DeepSeek’s R1 can perform inference (the stage in which an already-trained model generates outputs) faster and more cheaply than OpenAI’s o1. But he adds an important qualifier.

Pegging R1’s price tag at $6 million, he said, is wildly misleading. DeepSeek’s technical paper, he noted, said the figure did not include “costs associated with prior research and ablation experiments on architectures, algorithms, and data.”

“Other than that Mrs. Lincoln, how was the play?” Baker mused on X, referencing Abraham Lincoln’s assassination. “This means that it is possible to train an [R1] quality model with a $6m run if a lab has already spent hundreds of millions of dollars on prior research and has access to much larger clusters.”

1) DeepSeek r1 is real with important nuances. Most important is the fact that r1 is so much cheaper and more efficient to inference than o1, not from the $6m training figure. r1 costs 93% less to use than o1 per each API, can be run locally on a high end work station and…

— Gavin Baker (@GavinSBaker) January 27, 2025

Baker also said it probably would not have been possible for DeepSeek to train R1 without access to OpenAI’s ChatGPT-4o or o1. Permitting Chinese companies to peel off leading American models, he said, seemed to defeat the purpose of export restrictions.

“Interesting analysis,” Musk responded. “Best I’ve seen.”

Although there are many reasons to doubt some of DeepSeek’s assertions, it’s important to note that U.S. tech leaders are currently taking them quite seriously. That includes Meta CEO Mark Zuckerberg, who has reportedly assembled teams of engineers to investigate how DeepSeek might have leveled the playing field.

At the same time, well-known venture capitalist Marc Andreessen described DeepSeek’s emergence as a “Sputnik moment,” alluding to the Soviet Union beating the U.S. into space with the 1957 launch of the first artificial satellite.

Thank you for reading. I hope this article was helpful and informative.
