Agam Jain

Deploy Deepseek-R1: Guide to run multiple variants on AWS

Hi everyone!

DeepSeek-R1 is everywhere, so we have done the heavy lifting for you: configurations for running each variant on the cheapest, highest-availability GPUs. Every configuration has been tested with vLLM for high throughput and autoscales with the Tensorfuse serverless runtime.

The table below summarizes the configurations you can run.

(Table: supported GPU types for each variant of DeepSeek-R1)

Take it for an experimental spin

You can find the Dockerfile and all configurations in the GitHub repo below. Simply spin up a GPU VM on your cloud provider, clone the repo, build the image from the Dockerfile, and run it.

Github Repo: https://github.com/tensorfuse/tensorfuse-examples/tree/main/deepseek_r1
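Once the container is up, vLLM exposes an OpenAI-compatible API. Here is a minimal Python sketch to smoke-test the endpoint; the port and model name are assumptions, so match them to how you actually started the server:

```python
# Minimal smoke test for a locally running vLLM server.
# Assumptions (adjust to your setup): the server listens on localhost:8000
# and was launched with the DeepSeek-R1-Distill-Llama-8B variant.
import requests

BASE_URL = "http://localhost:8000/v1"  # vLLM's OpenAI-compatible API prefix
MODEL = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"  # illustrative variant

response = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "What is 17 * 23?"}],
        "max_tokens": 256,
        "temperature": 0.6,
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

If the request succeeds, the server is serving the model correctly and you can start load-testing throughput.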

Deploy a production-ready service on AWS using Tensorfuse

If you are looking to use DeepSeek-R1 models in your production application, follow our detailed guide to deploy them on your AWS account using Tensorfuse.

The guide covers all the steps necessary to deploy open-source models in production:

  1. Deploy with the vLLM inference engine for high throughput
  2. Autoscale based on incoming traffic
  3. Prevent unauthorized access with token-based authentication
  4. Configure a TLS endpoint with a custom domain (see the client sketch after this list)
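Once deployed, clients reach the service over HTTPS and authenticate with a bearer token. A minimal sketch of what a client call could look like; the domain, token environment variable, and model name are placeholders rather than values from the guide:

```python
# Calling the deployed service over a TLS endpoint with token-based auth.
# Assumptions: llm.example.com and TENSORFUSE_API_TOKEN are hypothetical;
# substitute the domain and token from your own Tensorfuse deployment.
import os
import requests

ENDPOINT = "https://llm.example.com/v1/chat/completions"  # hypothetical custom domain
TOKEN = os.environ["TENSORFUSE_API_TOKEN"]  # hypothetical env var holding the auth token

response = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",  # illustrative variant
        "messages": [{"role": "user", "content": "Summarize what DeepSeek-R1 is."}],
        "max_tokens": 512,
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Requests without a valid token should be rejected before they ever reach the model, which is what the token-based authentication step is for.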

Top comments (1)

Samagra Sharma

Wow, looks good!