Ok, it's been a week or two since the DeepSeek fever shook the AI world again (as often happens when a new model appears). The first instinct is usually to figure out how to get hands-on with it, though running it locally is only feasible with the distilled versions unless you opt for alternative methods.
What makes it particularly interesting is its ability to transfer reasoning capabilities to Small Language Models (SLMs) via distillation, from the massive 671B-parameter R1 down to 8B models built on Llama and Qwen architectures.
After testing these distilled models, which are still reasoning models at their core, you can see that they behave differently from traditional LLMs.
A reasoning model has a different goal: instead of jumping straight to the most likely answer, it works through the available options step by step to check that the answer it gives is correct.
What does this mean?
✅ Accuracy over speed: Reasoning models analyze multiple possibilities before generating a response, which means longer processing times but, ideally, more accurate outputs.
✅ Local testing: You can prototype quickly with the distilled models by installing Ollama and running them locally (see the sketch after this list). The same models also scale to on-prem deployments or behind an API.
❌ Open source?: Not quite. Despite the trend toward openness, DeepSeek R1 itself is not open-source. (More on that in this post).
🚀 Reasoning in SLMs: The most exciting takeaway: SLMs with reasoning capabilities could reshape how we think about efficient, intelligent AI at small scales.
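
To make the "local testing" point concrete, here's a minimal sketch of what prototyping can look like once Ollama is running: it calls Ollama's local REST API (which listens on http://localhost:11434 by default) with one of the distilled models. The model tag `deepseek-r1:8b` and the prompt are just examples; swap in whichever distilled variant you actually pulled.

```python
# Minimal sketch: query a locally running Ollama instance with a distilled
# DeepSeek R1 model. Assumes you already ran `ollama pull deepseek-r1:8b`
# (the model tag is an example; use whichever distilled variant you have).
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ask(prompt: str, model: str = "deepseek-r1:8b") -> str:
    """Send a single prompt and return the full (non-streamed) response text."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    response = requests.post(OLLAMA_URL, json=payload, timeout=300)
    response.raise_for_status()
    return response.json()["response"]


if __name__ == "__main__":
    answer = ask(
        "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
        "more than the ball. How much does the ball cost?"
    )
    # Distilled R1 models emit their reasoning inside <think>...</think> tags
    # before the final answer, so expect a longer, step-by-step output.
    print(answer)
```

If you'd rather skip the script, `ollama run deepseek-r1:8b` drops you into an interactive session with the same model.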
Did you try it?
Check the full article
Happy coding!