Many people are struggling to get access to DeepSeek r1 at the moment because of rate limits and restricted sign-ups. However, there are alternative providers that you can use to access DeepSeek r1 and its distill models.
deepseek-r1-distill-qwen-32b
Let’s start with deepseek-r1-distill-qwen-32b because it is the easiest model to get access to and probably offers the best balance of cost, performance, and speed.
deepseek-r1-distill-qwen-32b is a distilled version of r1: the knowledge of the larger model is transferred into a smaller Qwen base model through a process known as knowledge distillation. The 32b Qwen model in particular beats other models on several benchmarks, especially coding.
There is only one provider that currently makes this model available for anyone to use: Glama Gateway. Alternatively, you can self-host this model, but expect to need approximately 80 GB of VRAM.
The great thing about the 32b model is price and response time: it is currently cheaper than the official DeepSeek r1 and responds slightly faster.
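If you want to try it from code, here is a minimal sketch using the OpenAI Python SDK pointed at an OpenAI-compatible gateway. The base URL and model id below are assumptions on my part, so double-check them against the provider's documentation before relying on them.

```python
# Rough sketch: calling the 32b distill through an OpenAI-compatible gateway.
# The base_url and model id are assumptions; check your provider's docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://glama.ai/api/gateway/openai/v1",  # assumed gateway endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-32b",  # assumed model id
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
)

# r1-style models emit their reasoning inside <think>...</think> tags before
# the final answer, so you may want to strip that block when displaying output.
print(response.choices[0].message.content)
```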
deepseek-r1-distill-llama-70b
The 70b Llama version is also a distilled version of DeepSeek r1. Because it is based on Llama, more providers make it available.
Groq is one of the noteworthy providers that offer this model.
https://console.groq.com/docs/models
The benefit of using Groq is that it is extremely fast (upwards of 300 tokens per second for this model).
The downside is that the model is severely rate limited. Depending on what you are planning to do, the current rate limits (30k tokens per minute) might not be enough.
You can also access this model through Glama — deepseek-r1-distill-llama-70b. As a gateway provider, Glama has slightly elevated rate limits and can offer up to 60k tokens per minute.
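For a quick test against Groq, here is a similar sketch that assumes their OpenAI-compatible endpoint and the model id listed in their docs, with a naive retry loop for when you hit the per-minute token limit. Treat the endpoint and model id as assumptions and verify them before use.

```python
# Sketch: calling the 70b distill on Groq's OpenAI-compatible endpoint with a
# naive exponential backoff for rate-limit errors. Endpoint and model id are
# assumptions; verify them against https://console.groq.com/docs/models.
import time

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_GROQ_API_KEY",
)

def ask(prompt: str, retries: int = 5) -> str:
    delay = 2.0
    for _ in range(retries):
        try:
            response = client.chat.completions.create(
                model="deepseek-r1-distill-llama-70b",  # assumed model id
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            # Hit the tokens-per-minute cap; back off and try again.
            time.sleep(delay)
            delay *= 2
    raise RuntimeError("Still rate limited after several retries")

print(ask("Explain the difference between a mutex and a semaphore."))
```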
Other providers to evaluate:
- https://deepinfra.com/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
- https://novita.ai/models/llm/deepseek-deepseek-r1-distill-llama-70b
I will update this article as I discover other providers.
If you are planning to host this model yourself, bear in mind that it requires a lot of VRAM (roughly 140 GB). While it is possible to run it on a lower-spec machine, the performance will be subpar.
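The 140 GB figure follows from a simple rule of thumb: parameter count multiplied by bytes per parameter. A quick sketch of that back-of-the-envelope math (it ignores KV cache and activation overhead, so treat the result as a lower bound):

```python
# Back-of-the-envelope VRAM estimate: parameters x bytes per parameter.
# Ignores KV cache and activation memory, so the real requirement is higher.
def vram_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    return params_billion * bytes_per_param  # 1B params at 2 bytes ~ 2 GB

print(vram_gb(70))       # ~140 GB at FP16
print(vram_gb(70, 0.5))  # ~35 GB with 4-bit quantization (quality trade-offs apply)
```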
DeepSeek r1
Finally, if you are trying to get deepseek-r1, your best bet remains waiting for deepseek.com to clear the backlog of demand. Allegedly, they are currently experiencing a DDoS attack, which is why new user sign-ups are restricted.
A few other providers that claim to offer r1:
I explicitly mention “claim to offer” because many of them are oversubscribed at the moment and unable to meet the demand. Even if you manage to sign up, you might still get rate limited.
Unfortunately, hosting r1 yourself is not a viable option for most of us. It is a 671b parameter model, meaning that you would need at least 1,342 GB of VRAM to host it (671b parameters at 2 bytes each), which is beyond reach for any home user.
If you become aware of other providers, please leave a comment and I will add them to the list.
Other Distill Models
There are many other distilled versions available. If your goal is to run a model locally, you should evaluate them against the benchmarks published in the DeepSeek-R1 GitHub repository. Some of the small models (like the 1.5B and 7B) can reasonably be run on a local machine and perform decently well.