Alexander Uspenskiy

Unlock DeepSeek R1 7B on Your Laptop—Experience the Smartest AI Model I Ever Tested!

To be honest, despite all the buzz surrounding DeepSeek R1, I didn't have high expectations. Still, I decided to test the 7B distilled model, deepseek-ai/deepseek-r1-distill-qwen-7b, and what I discovered truly amazed me.

So let's start.

This project uses a transformer model to generate responses to user queries. It leverages the transformers library from HuggingFace and torch for efficient model handling and inference.
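
The full script appears in the Source Code section below. Stripped to its core (device placement and sampling options omitted), the whole pipeline is just load, tokenize, generate, decode:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Core pattern: load once, then tokenize -> generate -> decode per query
    tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-r1-distill-qwen-7b")
    model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-r1-distill-qwen-7b")
    inputs = tokenizer("Question: ...\nAnswer:", return_tensors="pt")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))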

Setup

Prerequisites

  • Python 3.9 or higher (recent transformers releases no longer support older versions)
  • pip (Python package installer)

Installation

  1. Clone the repository:

    git clone https://github.com/alexander-uspenskiy/deepseek
    cd deepseek
    
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate
    
  3. Install the required packages:

    pip install transformers torch
    
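Optionally, verify the install before the multi-gigabyte model download starts. This one-liner only imports both packages and prints their versions:

    python -c "import torch, transformers; print('torch', torch.__version__, '| transformers', transformers.__version__)"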

Usage

  1. Run the main script:

    python deepseek.py
    
  2. Follow the prompts to enter your questions. Type 'quit' to exit the interaction.

Project Structure

  • deepseek.py: Main script containing the model setup and response generation logic.

Example

(venv) $ python deepseek.py

Enter your question (or 'quit' to exit): A bat and a ball costs 1 dollar and 10 cents in total. The bat costs 1 dollar more than a ball. How much does the ball costs?

Response: Question: A bat and a ball costs 1 dollar and 10 cents in total. The bat costs 1 dollar more than a ball. How much does the ball costs?
Answer: 5 cents.

But wait, that doesn't make sense. If the ball is 5 cents, then the bat is 1 dollar more, which would be $1.05, and together that's $1.10. So, the ball is 5 cents.

Wait, but that seems a bit tricky. Let me think again. Maybe I should set up some equations to solve this properly.

Let me denote the cost of the ball as \( x \) dollars. Then, according to the problem, the bat costs \( x + 1 \) dollars. The total cost of the bat and the ball is $1.10, so:

\( x + (x + 1) = 1.10 \)

Simplify the equation:

\( 2x + 1 = 1.10 \)

Subtract 1 from both sides:

\( 2x = 0.10 \)

Divide both sides by 2:

\( x = 0.05 \)

So, the ball costs 5 cents. That seems correct, but let me verify.

If the ball is 5 cents, the bat is $1.05. Adding them together: 5 + 105 = 110 cents, which is $1.10. Yes, that checks out.

Wait, but sometimes these problems can be tricky. Is there any other way to interpret the problem that could lead to a different answer? Maybe the phrasing could be ambiguous. Let me read it again:

"A bat and a ball costs 1 dollar and 10 cents in total. The bat costs 1 dollar more than a ball. How much does the ball costs?"

Hmm, no, the way it's phrased seems pretty straightforward. It says the total is $1.10, and the bat is $1 more than the ball. So, with the equations I set up, it leads to the ball being 5 cents.

As you can see, the response shows the whole reasoning process, which is amazing for a model that can run on your laptop.

Source Code

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

def setup_model():
    # Model ID from HuggingFace
    model_id = "deepseek-ai/deepseek-r1-distill-qwen-7b"

    # Initialize tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

    # Some checkpoints ship without a pad token; fall back to EOS so that
    # padding=True and pad_token_id in generate_response don't fail
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    # Load model with lower precision for memory efficiency
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # Use fp16 for efficiency
        device_map="auto",  # Automatically handle device placement
        trust_remote_code=True
    )

    return model, tokenizer

def generate_response(model, tokenizer, prompt, max_new_tokens=512):
    # Tokenize input
    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True).to(model.device)

    # Generate response
    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,  # Pass attention_mask
            max_new_tokens=max_new_tokens,  # Cap on generated tokens; max_length would also count the prompt
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Decode and return response
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response

def main():
    try:
        # Setup model and tokenizer
        model, tokenizer = setup_model()

        # Example QA interaction
        while True:
            question = input("\nEnter your question (or 'quit' to exit): ")
            if question.lower() == 'quit':
                break

            prompt = f"Question: {question}\nAnswer:"
            response = generate_response(model, tokenizer, prompt)
            print(f"\nResponse: {response}")

    except Exception as e:
        print(f"An error occurred: {str(e)}")

if __name__ == "__main__":
    main()
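
One note on the output: because the script decodes the entire sequence, the response echoes the prompt back (you can see the repeated "Question: ... Answer:" in the example above). If you prefer only the newly generated text, slice off the prompt tokens before decoding. A minimal tweak inside generate_response:

    # outputs[0] holds the prompt tokens followed by the new tokens,
    # so skip past the prompt before decoding
    new_tokens = outputs[0][inputs.input_ids.shape[1]:]
    response = tokenizer.decode(new_tokens, skip_special_tokens=True)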

Troubleshooting

If you encounter issues with the model download or execution, ensure that your internet connection is stable and try the following steps:

  1. Ensure the virtual environment is activated:

    source venv/bin/activate
    
  2. Reinstall the required packages:

    pip install --upgrade transformers torch
    
  3. Check the Python interpreter being used:

    which python
    
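If the model loads but runs very slowly or exhausts memory, check which device torch actually sees: in fp16 the 7B weights alone take roughly 15 GB, so device_map="auto" may silently offload layers to CPU on smaller GPUs. A quick check, using only plain torch calls:

    import torch

    # Report which accelerators torch can see on this machine
    print("CUDA available:", torch.cuda.is_available())         # NVIDIA GPU
    print("MPS available:", torch.backends.mps.is_available())  # Apple Silicon
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))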
