mehmet akar

Unsloth AI: Tutorial (Github: 1874 Stars Just for Today)

I have been watching the Unsloth AI team for several months, and it is time to explore their work. They gain more traction on GitHub every day: over 28k stars in total, with 1,874 added today alone (Feb. 12th). Why? It is not a surprise. They make LLM finetuning and reasoning training faster and lighter, which hits one of the big trends in AI development: doing more with fewer resources.

🚀 Comprehensive Tutorial on Unsloth AI: Finetuning LLMs 2x Faster with 80% Less Memory

With a rapidly growing user base, Unsloth is establishing itself as a high-performance finetuning framework for Large Language Models (LLMs), offering 2x faster training with 80% less memory usage.


🦥 What is Unsloth AI?

Unsloth AI is an open-source framework that allows users to finetune large language models (LLMs) like Llama 3, Mistral, Phi-4, Deepseek R1, Qwen 2.5, and Gemma at a significantly lower cost and with higher speed. It optimizes LoRA (Low-Rank Adaptation) finetuning, enabling researchers, startups, and AI enthusiasts to train powerful models on consumer-grade GPUs.
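
To see why LoRA cuts memory so dramatically, here is a minimal, self-contained sketch of the idea (an illustration only, not Unsloth's actual kernels): instead of updating a full weight matrix, LoRA trains two small low-rank matrices whose product forms the update. The shapes and rank below are made-up demo values.

import torch

d_out, d_in, r = 4096, 4096, 16            # hypothetical projection size and LoRA rank
alpha = 16                                 # LoRA scaling factor

W = torch.randn(d_out, d_in)               # frozen pretrained weight (never trained)
A = torch.randn(r, d_in) * 0.01            # trainable "down" projection
B = torch.zeros(d_out, r)                  # trainable "up" projection (starts at zero)

# Effective weight used in the forward pass: frozen W plus a scaled low-rank update
W_effective = W + (alpha / r) * (B @ A)

print(f"full-matrix params: {W.numel():,}")              # 16,777,216
print(f"LoRA params:        {A.numel() + B.numel():,}")  # 131,072 (< 1% of the full matrix)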

🔑 Key Features:

2x faster finetuning compared to current trainers

80% less memory usage, allowing models to run on smaller GPUs

Supports a wide range of LLMs: Llama 3.3, Phi-4, Mistral, Gemma, and Qwen

Export to GGUF, Ollama, vLLM, or upload to Hugging Face (a short export sketch follows Step 4 below)

Free, beginner-friendly Google Colab Notebooks

| Unsloth supports | Free Notebooks | Performance | Memory use |
| --- | --- | --- | --- |
| Llama 3.2 (3B) | ▶️ Start for free | 2x faster | 70% less |
| GRPO (reasoning) | ▶️ Start for free | 2x faster | 80% less |
| Phi-4 (14B) | ▶️ Start for free | 2x faster | 70% less |
| Llama 3.2 Vision (11B) | ▶️ Start for free | 2x faster | 50% less |
| Llama 3.1 (8B) | ▶️ Start for free | 2x faster | 70% less |
| Gemma 2 (9B) | ▶️ Start for free | 2x faster | 70% less |
| Qwen 2.5 (7B) | ▶️ Start for free | 2x faster | 70% less |
| Mistral v0.3 (7B) | ▶️ Start for free | 2.2x faster | 75% less |
| Ollama | ▶️ Start for free | 1.9x faster | 60% less |
| DPO Zephyr | ▶️ Start for free | 1.9x faster | 50% less |

🔧 How to Install and Use Unsloth AI

📥 Installation

Unsloth AI supports both pip and Conda installations. Below are the recommended methods:

1️⃣ Pip Installation (Recommended)

pip install unsloth

For the latest updates from GitHub:

pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

2️⃣ Conda Installation

conda create --name unsloth_env python=3.11 pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers -y
conda activate unsloth_env
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps trl peft accelerate bitsandbytes
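
After either install path, a quick sanity check (assuming an NVIDIA GPU and CUDA driver are present) confirms that the package imports and a GPU is visible:

import torch
print("CUDA available:", torch.cuda.is_available())   # should print True on a GPU machine

from unsloth import FastLanguageModel                  # should import without errors
print("Unsloth import OK")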

🚀 Using Unsloth AI: Finetuning Llama 3 (8B) in Minutes

Step 1: Load the Model

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # pre-quantized 4-bit Llama 3 8B from Unsloth's Hub
    max_seq_length=2048,                       # maximum sequence length used during training
    dtype=None,                                # None lets Unsloth auto-detect bf16/fp16
    load_in_4bit=True,                         # load weights in 4-bit to cut VRAM usage
)
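
As a quick check that 4-bit loading worked, you can inspect the model's memory footprint. get_memory_footprint() is a standard transformers method, so it should be available on the loaded model; the exact number will vary by setup.

print(f"model memory footprint: {model.get_memory_footprint() / 1e9:.1f} GB")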

Step 2: Apply LoRA for Efficient Training

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                    # LoRA rank: dimensionality of the low-rank update
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # attention and MLP projections
    lora_alpha=16,                           # LoRA scaling factor
    lora_dropout=0,                          # 0 is the optimized setting in Unsloth
    bias="none",                             # "none" is the optimized setting in Unsloth
    use_gradient_checkpointing="unsloth",    # Unsloth's memory-efficient checkpointing
)
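
Optionally, confirm how small the trainable portion is. print_trainable_parameters() comes from the underlying PEFT wrapper that get_peft_model returns, so it should be available here:

# Prints trainable vs. total parameters; with LoRA this is typically well under 1%
model.print_trainable_parameters()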

Step 3: Load Your Dataset

from datasets import load_dataset

# Each record should contain a "text" field, matching dataset_text_field below
dataset = load_dataset("json", data_files={"train": "your_data.json"}, split="train")
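
The trainer in Step 4 reads the column named by dataset_text_field="text", so each record in your_data.json needs a "text" field holding a fully formatted training example. A hypothetical way to produce such a file (one JSON object per line, a format load_dataset("json", ...) accepts):

import json

# Hypothetical records: prompt and response packed into a single "text" field
examples = [
    {"text": "### Instruction:\nSummarize: Unsloth speeds up finetuning.\n\n### Response:\nUnsloth makes LLM finetuning faster and lighter."},
    {"text": "### Instruction:\nTranslate 'good morning' to French.\n\n### Response:\nBonjour."},
]

with open("your_data.json", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")   # JSON Lines: one example per line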

Step 4: Train the Model

from transformers import TrainingArguments
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",              # dataset column that holds the training text
    max_seq_length=2048,
    tokenizer=tokenizer,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,      # effective batch size = 2 x 4 = 8
        max_steps=60,                       # short demo run; raise this for a real finetune
        output_dir="outputs",
        fp16=True,                          # or bf16=True on Ampere and newer GPUs
    ),
)

trainer.train()
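
Once training finishes, you can save the LoRA adapters and export the model, which covers the GGUF/Ollama/Hugging Face export mentioned in the feature list above. The helpers below (save_pretrained_gguf, push_to_hub_merged) follow Unsloth's documented API at the time of writing; treat the exact arguments as a sketch and check the current docs.

# Save just the LoRA adapters (small files, reloadable on top of the base model)
model.save_pretrained("lora_adapters")
tokenizer.save_pretrained("lora_adapters")

# Export a quantized GGUF file for llama.cpp / Ollama
# ("q4_k_m" is one example quantization choice)
model.save_pretrained_gguf("gguf_model", tokenizer, quantization_method="q4_k_m")

# Or merge the adapters into the base weights and push to the Hugging Face Hub
# model.push_to_hub_merged("your-username/llama-3-8b-finetuned", tokenizer, save_method="merged_16bit")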

📊 Performance Benchmarks: Faster & More Memory-Efficient

| Model | GPU VRAM | Unsloth Speed | Memory Reduction |
| --- | --- | --- | --- |
| Llama 3.3 (70B) | 80GB | 2x faster | 75% less VRAM |
| Llama 3.1 (8B) | 80GB | 2x faster | 70% less VRAM |

📌 Context Length Expansion:

  • Llama 3.1 (8B) supports up to 342K tokens of context with Unsloth, compared with its native 128K maximum.
  • Unsloth extends usable context length by up to 13x, making it a top choice for long-context training.
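
Per Unsloth's documentation, RoPE scaling is handled internally, so a longer context is requested simply by raising max_seq_length when loading the model. The value below is just an example; how far you can actually push it depends on available VRAM.

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=32768,    # example long-context value; the practical limit is set by GPU memory
    load_in_4bit=True,
)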

💰 Unsloth AI Valuation & Funding Status: Only $500K Pre-Seed, No Seed Round Yet!

Unsloth AI secured $500,000 in pre-seed funding through Y Combinator (YC) but has not yet raised a seed round. For early investors, this presents a golden opportunity. The team also has core advantages that are not easily copied: they do something different under the hood of LLM fine-tuning that accelerates training while lowering the hardware capacity needed.

🎯 Unsloth AI: The Final View

With its rapid growth, unmatched efficiency, and a still-open investment window, Unsloth AI is revolutionizing how LLMs are finetuned and trained to reason. Whether you're an AI developer, researcher, or investor, now is the perfect time to explore Unsloth.

👉 Try it out now: Unsloth Documentation

👉 Join the Community: Discord

👉 GitHub Stars: +28k (and counting!)

