Running Qwen Locally on a MacBook Pro
Qwen (Qwen GitHub Repository) has been gaining attention recently as a powerful open-source large language model (LLM). I decided to give it a spin on my MacBook Pro using Ollama, a platform designed for running local LLMs. While Qwen2.5-Max boasts the highest performance, my setup could only handle the smaller Qwen2.5 (32B) model. Here's what I found!
Hardware and Software Environment
- Device: MacBook Pro (M3 Pro)
- Memory: 36GB RAM
- Software: Latest version of Ollama
Installing Ollama
To get started, I installed Ollama. With Homebrew, the installation process is straightforward:
brew install ollama
# Verify installation
ollama --version
# Output: ollama version is 0.5.7
Downloading and Running Qwen2.5 (32B)
Once Ollama was set up, I proceeded to download and run the Qwen2.5 (32B) model:
ollama run qwen2.5:32b
# Output during the first run:
pulling manifest
pulling eabc98a9bcbf... 100% ▕█████████████████████████████████████████████████████████████████████████████▏ 19 GB
...
success
The model weighs about 20GB, so depending on your internet speed, downloading may take some time.
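To get a feel for how long the download might take, here's a rough back-of-envelope estimate. The 100 Mbps link speed is an illustrative assumption, not a measurement from my setup:

```shell
# Rough download-time estimate for the ~20GB model pull.
# SIZE_GB and SPEED_MBPS are illustrative assumptions, not measured values.
SIZE_GB=20
SPEED_MBPS=100
# 1 GB ~= 8000 megabits (decimal units); divide by link speed for seconds.
SECONDS_NEEDED=$(( SIZE_GB * 8000 / SPEED_MBPS ))
echo "$(( SECONDS_NEEDED / 60 )) minutes"
```

At 100 Mbps that works out to roughly 26 minutes; halve or double it to match your own connection.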
Memory Usage Insights
Running the Qwen2.5 (32B) model requires substantial memory. With my MacBook Pro M3 and its 36GB RAM, these were the key observations:
- Memory Usage: Total system memory climbed to about 30-34GB while the model was running, up from a baseline of ~15GB before launch.
- Performance: Each response took around 1–2 minutes, varying based on the complexity of the prompt.
- Extended use made the fans spin up, but the overall experience remained stable.
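The numbers above can be sanity-checked with a quick back-of-envelope calculation; the 19GB figure is the weight size reported during the pull, and the other values are the rough figures I observed:

```shell
# Back-of-envelope memory headroom check (all values in GB, rough observations).
TOTAL_RAM=36      # MacBook Pro configuration
BASELINE=15       # system + other apps before launching the model
MODEL_WEIGHTS=19  # size reported by the ollama pull
HEADROOM=$(( TOTAL_RAM - BASELINE - MODEL_WEIGHTS ))
echo "Headroom left for KV cache and inference overhead: ${HEADROOM} GB"
```

Only a couple of gigabytes remain for the KV cache and inference overhead, which is why total usage peaked so close to the 36GB ceiling.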
Caveats to Keep in Mind
- Large Initial Download: Be prepared for the ~20GB download on the first launch.
- High Memory Usage: Ensure you manage other applications to free up resources.
- Slower Responses Without a GPU: Without GPU acceleration, expect noticeably longer inference times.
Example Task: Round-Trip Train Calculation
To test the model’s reasoning and computation abilities, I used the following prompt:
ollama run qwen2.5:32b
>>> A train travels from city A to city B at 60 km/h and takes 2 hours. On the return journey, due to headwind, it travels at 40 km/h.
... 1) What is the total distance between cities A and B?
... 2) How long does the return journey take?
... 3) What is the average speed for the entire round trip?
... Please show all calculations.
Qwen's Response
Below is the model's detailed response:
### 1) Total Distance Between Cities A and B
Distance = Speed × Time
= 60 km/h × 2 h
= **120 km**
The total distance is 120 km.
### 2) Duration of the Return Journey
Time = Distance ÷ Speed
= 120 km ÷ 40 km/h
= **3 hours**
The return journey takes 3 hours.
### 3) Average Speed for the Entire Round Trip
- Total Distance = 120 km (to B) + 120 km (return) = **240 km**
- Total Time = 2 hours (to B) + 3 hours (return) = **5 hours**
- Average Speed = Total Distance ÷ Total Time
= 240 km ÷ 5 h
= **48 km/h**
The average speed for the round trip is 48 km/h.
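The model's arithmetic checks out. As a quick way to reproduce the three steps, here is the same calculation in bash integer arithmetic:

```shell
# Reproduce the round-trip calculation from the prompt.
OUT_SPEED=60   # km/h, A -> B
OUT_TIME=2     # hours
RET_SPEED=40   # km/h, B -> A
DISTANCE=$(( OUT_SPEED * OUT_TIME ))                    # 1) 120 km
RET_TIME=$(( DISTANCE / RET_SPEED ))                    # 2) 3 hours
AVG_SPEED=$(( 2 * DISTANCE / (OUT_TIME + RET_TIME) ))   # 3) 240 km / 5 h
echo "distance=${DISTANCE} km, return=${RET_TIME} h, average=${AVG_SPEED} km/h"
```

Note that the average speed (48 km/h) is the harmonic mean of the two speeds, not the arithmetic mean (50 km/h), because the train spends more time at the slower speed.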
I cross-verified the results with Gemini, GPT-4, and Claude 3.5, all of which produced the same answers.
Uninstalling Qwen
If you no longer need the model, you can remove it with a single command:
ollama rm qwen2.5:32b
# Output: deleted 'qwen2.5:32b'
Summary
Running Qwen2.5 (32B) on a MacBook Pro (M3 Pro, 36GB RAM) was possible, albeit with tight memory constraints; I had to close other applications to free up resources. Despite the slow response times on laptop-class hardware, Qwen did a commendable job handling complex queries.
Although my benchmark was limited to Qwen2.5 (32B), the official Qwen2.5-Max benchmark suggests its performance is on par with models like DeepSeek-V3 and GPT-4. This makes Qwen an exciting open-source alternative worth watching!
Are you planning to try Qwen on your setup? Let me know your experience in the comments below! 🚀