
Rajendra Khanal

Building Code Grader Feedback System Using Meta LLaMA and Naga AI API

In software development, timely and actionable feedback on code quality is crucial for following best practices and maintaining high standards. The Code Grader Feedback project was developed to address this need: it evaluates code in multiple programming languages, including C, C++, Python, JavaScript, Rust, and SQL, and offers constructive feedback to developers. In this article, we'll dive into how we tackled the challenges, fine-tuned the model, and leveraged the Naga AI API for code grading.

Setting Things Up: Challenges and Resolutions

The development process wasn't without its hurdles. One significant challenge arose during the quantization of the fine-tuned Meta LLaMA model. The fine-tuning was performed in a Google Colab notebook utilizing CUDA for efficient processing. However, this posed a compatibility issue when attempting to run the quantized model on a Mac, which lacks native support for CUDA.

To address this challenge, we had to explore alternatives for running the model on macOS. This involved transitioning to a CPU-based execution environment, which resulted in slower performance compared to the CUDA-optimized version. The transition required adjustments in our model's architecture and inference code to accommodate the limitations of the Mac's hardware. We also had to carefully manage memory usage to ensure that the quantized model could operate efficiently without running into performance bottlenecks.
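
To make this concrete, here's a minimal sketch of the kind of device fallback involved (the checkpoint name is a placeholder, not our actual model path):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use CUDA when it's available; fall back to CPU on machines (like Macs)
# that don't support it.
device = "cuda" if torch.cuda.is_available() else "cpu"

# "your-org/code-grader-llama" is a placeholder for the fine-tuned checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "your-org/code-grader-llama",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    low_cpu_mem_usage=True,  # stream weights in to keep peak RAM manageable
).to(device)
tokenizer = AutoTokenizer.from_pretrained("your-org/code-grader-llama")
```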

The integration of the Meta LLaMA model into the code feedback system was a core part of the project. Meta LLaMA was used as the foundation to analyze and interpret various coding languages, but the model needed further optimization to handle this specific task effectively.

To solve this, we employed LoRA (Low-Rank Adaptation), which let us fine-tune the Meta LLaMA model in a resource-efficient way. LoRA freezes the base model's weights and trains only small low-rank adapter matrices, significantly speeding up training and reducing memory usage.
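
Roughly, the LoRA setup looked like the following sketch (the base checkpoint name and hyperparameters shown are typical choices, not necessarily our exact values):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# LoRA freezes the base weights and trains small low-rank adapter matrices.
lora_config = LoraConfig(
    r=16,                                 # rank of the adapter matrices
    lora_alpha=32,                        # scaling applied to adapter outputs
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights train
```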

Another challenge was setting up a user-friendly and seamless interface. The project utilizes React for the frontend and FastAPI for the backend, ensuring smooth interaction between the user and the system. We also incorporated modern tools like Tailwind CSS and Vite to optimize the development experience.
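
A stripped-down version of the kind of FastAPI endpoint the React frontend calls might look like this (the route and request shape are illustrative, and `run_grader` is a hypothetical stand-in for the real model call):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GradeRequest(BaseModel):
    language: str  # e.g. "python", "rust"
    code: str      # raw source submitted from the React frontend

def run_grader(language: str, code: str) -> str:
    # Placeholder: the real system invokes the grading model / Naga AI here.
    return f"Received {len(code)} characters of {language} code."

@app.post("/grade")
async def grade_code(req: GradeRequest):
    return {"language": req.language, "feedback": run_grader(req.language, req.code)}
```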

Comparing API Performance

For code grading, we used the Naga AI API, which provides efficient, scalable code grading across multiple languages. The API’s performance was impressive, offering quick responses and accurate evaluations. When compared to other APIs, we found Naga AI to be better suited for tasks where fast and scalable code evaluation is essential. However, when it came to generating more detailed and nuanced feedback, especially in areas like code structure, Meta LLaMA, fine-tuned using LoRA, provided a more robust and insightful analysis.
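
Assuming an OpenAI-compatible chat interface (the base URL and model identifier below are placeholders; check Naga AI's documentation for the real values), a grading call could be sketched like this:

```python
from openai import OpenAI

# Both values below are placeholders, not verified Naga AI endpoints/models.
client = OpenAI(api_key="YOUR_NAGA_AI_KEY", base_url="https://api.example-naga.ai/v1")

snippet = "def add(a, b):\n    return a + b"

response = client.chat.completions.create(
    model="llama-placeholder",  # placeholder model identifier
    messages=[
        {"role": "system",
         "content": "Grade the following code from 0-10 and briefly justify."},
        {"role": "user", "content": snippet},
    ],
)
print(response.choices[0].message.content)
```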

Guardrailing and Fine-Tuning

Fine-tuning was a key aspect of this project. We employed supervised fine-tuning (SFT) to customize the Meta LLaMA model specifically for code analysis tasks. This involved using the SFTTrainer class from the trl library to fine-tune the model on a custom dataset, optimizing it for instruction-following tasks such as code generation and completion.

Key fine-tuning details:

  • Learning Rate: 2e-4, carefully chosen for balanced optimization.
  • Batch Size: Each device handled a batch size of 4, with gradient accumulation for memory efficiency.
  • Precision: Mixed precision (FP16 or bf16) enhanced training speed and memory efficiency.
  • Optimizer: A memory-efficient AdamW variant, ideal for large models on GPUs with limited memory.

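Put together, the training setup resembled this sketch (argument names follow the transformers/trl APIs; `model` is the LoRA-wrapped model from earlier, `dataset` the custom instruction dataset, and any value not listed above is an assumption):

```python
from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="./code-grader-sft",
    learning_rate=2e-4,             # balanced optimization, as noted above
    per_device_train_batch_size=4,  # batch size of 4 per device
    gradient_accumulation_steps=4,  # trade extra steps for memory headroom
    fp16=True,                      # mixed precision (bf16=True on newer GPUs)
    optim="paged_adamw_8bit",       # memory-efficient AdamW variant
    logging_steps=25,
)

trainer = SFTTrainer(
    model=model,            # LoRA-wrapped Meta LLaMA model
    train_dataset=dataset,  # custom instruction-following dataset
    args=training_args,
)
trainer.train()
```
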
This process was streamlined by The Stack, a dataset from the BigCode project that contains over 6TB of permissively licensed source code covering 358 programming languages. This diversity enabled the model to handle a wide range of languages and code patterns effectively.
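
In practice, The Stack can be streamed one language subset at a time instead of downloading all 6TB; the sketch below follows the layout described on the dataset card (`data/<language>` directories, with file text in the `content` column):

```python
from datasets import load_dataset

# Stream the Python subset of The Stack; swap data_dir for other languages,
# e.g. "data/rust" or "data/sql".
dataset = load_dataset(
    "bigcode/the-stack",
    data_dir="data/python",
    split="train",
    streaming=True,
)

for example in dataset.take(1):
    print(example["content"][:200])  # raw source text of the first file
```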

Unique Endpoints and Features

Both the Naga AI API and the fine-tuned Meta LLaMA model have unique strengths:

  • Naga AI API: Ideal for fast code grading across multiple languages. It was particularly helpful in assessing code performance and structure against industry standards and in providing a comprehensive grading system.
  • Meta LLaMA: Excelled in generating detailed feedback for code simplification, refactoring, and modularity. Its insights were instrumental in guiding developers to improve their code's scalability, readability, and maintainability.

An interesting feature of the feedback provided by Meta LLaMA is its focus on:

  • Code simplification: Recommending ways to reduce complexity.
  • Refactoring: Offering suggestions to make code more maintainable.
  • Modularity and scalability: Ensuring the code can be reused and adapted for larger projects.
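
As a purely hypothetical illustration, those three angles can be baked into the prompt itself (this template is illustrative, not the exact one we used):

```python
FEEDBACK_PROMPT = """You are a code reviewer. For the {language} code below,
give feedback under exactly three headings:
1. Simplification - ways to reduce complexity.
2. Refactoring - changes that make the code more maintainable.
3. Modularity & Scalability - how to make the code reusable in larger projects.

Code:
{code}
"""

prompt = FEEDBACK_PROMPT.format(language="python", code="def f(x): return x * 2")
```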

Conclusion

The Code Grader Feedback project successfully merges the capabilities of the Naga AI API for grading and a fine-tuned Meta LLaMA model for detailed feedback generation. By combining these two tools, we were able to build a system that evaluates code performance while offering actionable insights on how to improve it.

Through effective use of fine-tuning techniques and leveraging a robust dataset like The Stack, we've created a tool that can be expanded to support multiple programming languages, helping developers improve their coding skills with early feedback loops.
