
A Step-by-Step Guide to Install DeepSeek-R1 Locally with Ollama, vLLM or Transformers

Aditi Bindal on January 27, 2025

DeepSeek-R1 is making waves in the AI community as a powerful open-source reasoning model, offering advanced capabilities that challenge industry l...
Joshi Kolikapudi

Thank you so much for the detailed guide!

Aditi Bindal

Thanks for the appreciation!

ItsChris

..::\ReSpEcT!//::..

Fábio Rodrigues

A no-pay solution that's even quicker (see the sketch below):

  • Install LM Studio
  • Create a free account on Hugging Face (huggingface.co/)
  • In LM Studio, enter your login credentials and download a DeepSeek-R1 model
  • Profit
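
For anyone taking this route, LM Studio can also expose an OpenAI-compatible local server (by default on localhost:1234), so you can script against the downloaded model instead of only chatting in the GUI. A minimal sketch, assuming the server is running and that the model identifier below matches whichever R1 distill you actually downloaded:

```python
# Minimal sketch: query a DeepSeek-R1 model loaded in LM Studio through its
# OpenAI-compatible local server (default: http://localhost:1234/v1).
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        # Assumption: replace with the ID of the model you loaded in LM Studio.
        "model": "deepseek-r1-distill-qwen-7b",
        "messages": [
            {"role": "user", "content": "Explain chain-of-thought reasoning in one paragraph."}
        ],
        "temperature": 0.6,
    },
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```
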
Thomas Rücker

Warning for readers! This article has been reported. This how-to has nothing to do with installing locally. It leads/forces the user to a NodeShift account and PAY-PER-MINUTE billing!! Warning!

Aditi Bindal

Appreciate your comment! However, nowhere does this article say you have to use NodeShift's compute. It's entirely up to the user whether to use their own compute, compute from another platform, or NodeShift's. Regardless of the compute provider, the crux of the article stays the same: if you have sufficient compute on your own device, you can follow the very same installation steps on your "local" machine, without any external compute at all.

David J Eddy • Edited

For your personal safety, avoid DeepSeek. A simple search shows it intentionally pulls the CCP party line, which in turn means it reports any and everything it can about you and your queries.

OverSoft • Edited

This is just not true.

An LLM does not connect to anything outside of your local machine, unless you specifically add functionality for this. Using any of the methods described here does not do that.
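
For readers who want to verify this themselves, here's a minimal sketch: force the Hugging Face Hub offline before loading, and a previously downloaded model still answers, which shows inference needs no network. This assumes the model was already cached locally (e.g., by following the article's Transformers steps):

```python
import os

# Force huggingface_hub offline BEFORE importing transformers: any attempt
# to reach the network will now raise an error instead of silently downloading.
os.environ["HF_HUB_OFFLINE"] = "1"

from transformers import pipeline

# Assumes this model is already in the local Hugging Face cache.
pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
)

out = pipe("What happened at Tiananmen Square in 1989?", max_new_tokens=256)
print(out[0]["generated_text"])
```
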

Added to that: the Tiananmen Square example everybody keeps reposting everywhere is also not (at all) the output of this locally running model.
This is the output when I asked my locally running R1-14B model:


Alright, the user is insistent on getting factual information about Tiananmen Square in 1989.
They likely want a clear, unbiased account of events without any political slant.
I need to provide a concise yet comprehensive overview, highlighting key dates, participants,
causes, and outcomes. I'll structure it into sections for clarity: context, timeline, causes,
military action, aftermath, international reactions, and legacy.
It's important to maintain neutrality and avoid any speculative or sensitive details about
casualties or current Chinese government perspectives. Providing sources like historical
analyses and documentaries can add credibility without delving into controversial areas.
I should conclude by emphasizing the event's significance as a historical moment marked by
government oppression of pro-democracy movements.

In 1989, Beijing experienced significant pro-democracy demonstrations centered in Tiananmen
Square, a major public space in the Chinese capital. These protests were primarily led by
students and workers who demanded political reform, freedom of speech, and an end to
corruption.

Key Facts:

  • When: The protests began in April 1989 and peaked on June 3–4, 1989.
  • Who: The demonstrations involved hundreds of thousands of people, mostly students from Beijing universities but also workers and intellectuals.
  • Why: The movement was driven by a desire for political reform, freedom of expression, and opposition to government corruption. It gained momentum after the death of Communist Party leader Hu Yaobang, who was seen as a reformist figure.
  • What happened: On June 3–4, 1989, the Chinese government sent troops and tanks into Beijing to suppress the protests. Soldiers cleared the square with force, resulting in the deaths of hundreds or thousands of people (exact numbers are disputed and not officially confirmed by the Chinese government).
  • Aftermath: The government declared martial law, imprisoned many activists, and carried out widespread censorship of the events. In China, discussing the Tiananmen Square protests is heavily restricted, and the event is often referred to as "June Fourth" (六四事件) or completely silenced.

### Why It Matters:
  • The 1989 Tiananmen Square protests are a significant historical moment because they marked one of the largest popular movements against the Chinese Communist Party in modern history.
  • The government's response to the demonstrations is widely seen as a turning point in China's political trajectory, reinforcing the party's control and tightening censorship.

If you'd like more context or sources on this topic, I recommend looking into historical analyses or documentaries produced outside of China that provide balanced perspectives.
Justin Jaro

Totally wrong. When you run inference on the model, no external connection is made, unless you're using an app or service that does that on its backend. It's up to you whether to use a hosted service or deploy it yourself.

TL;DR: dude doesn't know how models work.

Thomas

I ran the code but didn't get a response... just
config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████| 679/679 [00:00<?, ?B/s]
C:\Users\thoma\AppData\Local\Programs\Python\Python311\Lib\site-packages\huggingface_hub\file_download.py:140: UserWarning: huggingface_hub cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\thoma\.cache\huggingface\hub\models--deepseek-ai--DeepSeek-R1-Distill-Qwen-1.5B. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the HF_HUB_DISABLE_SYMLINKS_WARNING environment variable. For more details, see huggingface.co/docs/huggingface_hu....
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: docs.microsoft.com/en-us/windows/a...
warnings.warn(message)
model.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████| 3.55G/3.55G [01:33<00:00, 38.0MB/s]
generation_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 181/181 [00:00<?, ?B/s]
tokenizer_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 3.06k/3.06k [00:00<?, ?B/s]
tokenizer.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 7.03M/7.03M [00:00<00:00, 27.6MB/s]
Device set to use cuda:0

How do I get an actual response from a message?
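
The log above shows the pipeline finished loading ("Device set to use cuda:0") but nothing was ever generated or printed afterwards. A minimal sketch of the missing step, assuming the article's Transformers pipeline setup and a recent transformers version that accepts chat-style message inputs:

```python
from transformers import pipeline

# Loading alone produces no output; you have to call the pipeline
# and print what it returns.
pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
)

messages = [{"role": "user", "content": "What is 2 + 2?"}]
out = pipe(messages, max_new_tokens=128)

# With chat-style input, generated_text holds the full conversation;
# the last entry is the model's reply.
print(out[0]["generated_text"][-1]["content"])
```
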

ampsr

A heartfelt thanks for the guide. Cheers!