Have you ever wanted to combine your own data with AI to get instant insights? In this blog post, we’ll explore exactly how to do that by building a Retrieval-Augmented Generation (RAG) application using DeepSeek R1, Ollama, and Semantic Kernel. Our example scenario is a simple expense manager that tracks daily spending and lets AI answer natural-language questions like:
- "How much did I spend on coffee?"
- "Which day did I spend the most money overall?"
- "Which day did I spend the least money overall?"
By the end of this post, you’ll see how you can load your own data (e.g., a text file, PDF documents, or other sources) and perform natural-language Q&A over it using these tools.
For a more in-depth walk-through, be sure to watch my YouTube video, where I demonstrate everything step by step!
What is RAG (Retrieval-Augmented Generation)?
Retrieval-Augmented Generation (RAG) is an approach that combines:
- A Retriever: A component that fetches relevant documents or pieces of data from a larger collection (e.g., your expense records).
- A Language Model: A large language model (LLM) that can generate natural language responses based on prompts and retrieved context.
When you ask a question, RAG retrieves relevant snippets or documents from your dataset and uses them to ground the LLM’s response. This helps the AI give more accurate and context-aware answers by focusing on information you’ve provided. Essentially, you augment the AI’s knowledge by feeding it the specific data you care about.
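To make that flow concrete, here is a deliberately tiny, self-contained C# sketch. It fakes the retrieval step with simple keyword matching (real systems, including the one we build below, use vector embeddings instead), and the final LLM call is replaced by printing the grounded prompt:

```csharp
using System;
using System.Linq;

// Toy RAG flow: keyword retrieval + prompt assembly.
// Real RAG uses vector embeddings, but the overall shape is the same.
string[] facts =
{
    "2025-02-14 | Coffee | 4.50",
    "2025-02-14 | Rent | 800.00",
    "2025-02-17 | Coffee | 3.75",
};

string question = "How much did I spend on coffee?";

// 1. Retrieve: rank facts by how many question words they contain.
string[] words = question.ToLowerInvariant()
    .Split(' ', '?').Where(w => w.Length > 2).ToArray();
var retrieved = facts
    .OrderByDescending(f => words.Count(w => f.ToLowerInvariant().Contains(w)))
    .Take(2);

// 2. Augment: ground the model by putting the retrieved facts into the prompt.
string prompt = $"Answer using only these facts:\n{string.Join("\n", retrieved)}" +
                $"\n\nQuestion: {question}";

// 3. Generate: in a real app, this prompt is sent to the LLM.
Console.WriteLine(prompt);
```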
Use Cases of RAG
- Customer Support: Quickly retrieve knowledge base articles and feed them to an LLM for on-demand issue resolution.
- Content Search: Search large text corpora and summarize them or answer queries about them.
- Personal Knowledge Management: Organize personal notes or expense records, and use AI to query them or generate insights.
- Finance and Expense Tracking: Like in our example, store transaction history, then ask AI to identify spending patterns, daily totals, or anomalies.
1. Prerequisites
- Visual Studio 2022 (version 17.12 or later) with the .NET 9 SDK installed
- Ollama (for managing and running local models)
- DeepSeek R1 1.5B model (pulled via Ollama)
2. Installing Ollama
Ollama is a tool that lets you run large language models (LLMs) locally. It simplifies downloading, managing, and running open-source models such as Llama, Phi, and DeepSeek R1.
To install Ollama, visit its official website at https://ollama.com/download and run the installer for your operating system.
3. Installing DeepSeek R1
DeepSeek R1 is DeepSeek's first generation of reasoning models, with performance comparable to OpenAI o1. The family includes six dense models distilled from DeepSeek-R1, based on Llama and Qwen.
On the Ollama website, click Models, select deepseek-r1, and choose the 1.5b parameter option. Then open a command prompt and run the following command:

```
ollama run deepseek-r1:1.5b
```

This downloads the model and starts it automatically.
Once done, verify the model is available:

```
ollama list
```
That’s it! We’re ready to integrate DeepSeek locally.
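One more sanity check that will matter later: our application will talk to Ollama over its local HTTP API, which listens on http://localhost:11434 by default. You can confirm the server is reachable with:

```
curl http://localhost:11434
```

If everything is fine, Ollama replies with a short "Ollama is running" message.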
4. Creating .NET Console Application
- Launch Visual Studio and make sure the .NET 9 SDK is installed.
- Create a new project: File → New → Project… and pick Console App with .NET 9.
- Name your project, e.g., DeepSeekDemoApp or any name you prefer.
- Check the target framework: right-click your project → Properties and set Target Framework to .NET 9.
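If you prefer the command line, the equivalent .NET CLI commands are (the project name is just an example):

```
dotnet new console -n DeepSeekDemoApp -f net9.0
cd DeepSeekDemoApp
```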
5. Integrating DeepSeek R1 with Semantic Kernel
While you could call DeepSeek via direct HTTP requests to Ollama, using Semantic Kernel together with its Kernel Memory companion gives you a powerful abstraction for prompt engineering, retrieval, orchestration, and more.
Add Necessary NuGet Packages
```xml
<ItemGroup>
  <PackageReference Include="Microsoft.KernelMemory.AI.Ollama" Version="0.97.250211.1" />
  <PackageReference Include="Microsoft.KernelMemory.Core" Version="0.97.250211.1" />
  <PackageReference Include="Microsoft.SemanticKernel" Version="1.35.0" />
</ItemGroup>
```
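You can also add the same packages from the terminal, pinned to the versions above:

```
dotnet add package Microsoft.KernelMemory.AI.Ollama --version 0.97.250211.1
dotnet add package Microsoft.KernelMemory.Core --version 0.97.250211.1
dotnet add package Microsoft.SemanticKernel --version 1.35.0
```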
6. Source Code
```csharp
using Microsoft.KernelMemory;
using Microsoft.KernelMemory.AI.Ollama;

// Point Kernel Memory at the local Ollama server and use
// deepseek-r1:1.5b for both text generation and embeddings.
var config = new OllamaConfig
{
    Endpoint = "http://localhost:11434",
    TextModel = new OllamaModelConfig("deepseek-r1:1.5b", 131072),
    EmbeddingModel = new OllamaModelConfig("deepseek-r1:1.5b", 2048)
};

var memory = new KernelMemoryBuilder()
    .WithOllamaTextGeneration(config)
    .WithOllamaTextEmbeddingGeneration(config)
    .Build<MemoryServerless>();

Console.WriteLine("Processing document, please wait...");

// Chunk, embed, and index the expense file under the ID "DOC001".
await memory.ImportDocumentAsync("expenses.txt", documentId: "DOC001");

Console.WriteLine("Model is ready to take questions\n");

// Keep answering questions as long as the document is indexed.
// Note: this ID must match the one passed to ImportDocumentAsync above.
while (await memory.IsDocumentReadyAsync("DOC001"))
{
    Console.WriteLine("Ask your questions\n");
    var question = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(question)) continue;

    var answer = await memory.AskAsync(question);
    Console.WriteLine(answer.Result);

    Console.WriteLine("\nSources:");
    foreach (var x in answer.RelevantSources)
    {
        Console.WriteLine($"  {x.SourceName} - {x.SourceUrl} - {x.Link}");
    }
}
```
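A quick note on what this code does: ImportDocumentAsync splits expenses.txt into chunks, generates an embedding for each chunk with the Ollama embedding model, and indexes them in the in-process MemoryServerless store. AskAsync then embeds your question, retrieves the most relevant chunks, and passes them to deepseek-r1 as grounding context before generating the answer, which is exactly the RAG flow described at the start of this post.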
Daily expense data:

```
Date       | Category      | Amount
-----------|---------------|-------
2025-02-13 | Groceries     | 45.30
2025-02-13 | Transport     | 12.00
2025-02-14 | Coffee        | 4.50
2025-02-14 | Rent          | 800.00
2025-02-15 | Groceries     | 54.20
2025-02-16 | Entertainment | 30.00
2025-02-16 | Transport     | 10.00
2025-02-17 | Bills         | 120.00
2025-02-17 | Coffee        | 3.75
2025-02-18 | Groceries     | 62.40
```
Save this data as expenses.txt, place it in the application's output directory (the bin folder next to the executable), and run the application.
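Rather than copying the file by hand, you can have the build do it for you by adding this to your .csproj (assuming expenses.txt sits in the project root):

```xml
<ItemGroup>
  <!-- Copy the data file next to the executable on every build -->
  <None Update="expenses.txt">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </None>
</ItemGroup>
```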
Now you can ask questions like:
“How much did I spend on coffee?”
“Which day did I spend the most money overall?”
“Which day did I spend the least money overall?”
The AI will use the embedded expense data to calculate or deduce the answers, and it will also list its sources (e.g., the document ID containing the relevant info).
Support me
If you found this guide helpful, make sure to check out the accompanying YouTube video tutorial, in which I walk you through the process visually. Don’t forget to subscribe to my channel for more amazing tutorials!