In this deep-dive video, we zoom in on two popular techniques for parameter-efficient training: LoRA/QLoRA and Spectrum.
We discuss their mathematical foundations in detail, including Singular Value Decomposition (SVD). Then we look at benchmarks on two popular Small Language Models, Mistral-7B and Llama-3.1-8B. We conclude that Spectrum is the better choice in terms of both training speed and model quality, and that it is even competitive with the accuracy of full fine-tuning.
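To give a flavor of the SVD connection discussed in the video: LoRA replaces a full weight update dW with a low-rank product B·A, and SVD explains why a small rank can capture most of a matrix's structure. A minimal NumPy sketch (illustrative only; the dimensions and rank here are made up, not taken from the benchmarks):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 64, 8  # hypothetical layer dims and LoRA rank

# A synthetic weight update that happens to be low-rank,
# standing in for the update a fine-tuning run would learn.
dW = rng.standard_normal((d, r)) @ rng.standard_normal((r, k))

# SVD: dW = U diag(s) Vt. Keeping the top-r singular values gives the
# best rank-r approximation (Eckart-Young), i.e. the LoRA factors B and A.
U, s, Vt = np.linalg.svd(dW, full_matrices=False)
B = U[:, :r] * s[:r]   # shape (d, r)
A = Vt[:r, :]          # shape (r, k)

approx_error = np.linalg.norm(dW - B @ A) / np.linalg.norm(dW)
full_params = d * k
lora_params = d * r + r * k
print(f"relative error: {approx_error:.2e}")
print(f"trainable params: {lora_params} vs {full_params}")
```

With rank 8 on a 64x64 matrix, the factored form trains 1024 parameters instead of 4096, which is the core parameter-efficiency argument behind LoRA.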