
Building a Real-Time Streaming Chatbot with Kotlin and Ollama AI

In the ever-evolving landscape of artificial intelligence, creating a chatbot capable of handling real-time conversations with streaming responses is a fascinating challenge. In this blog post, we'll walk you through the process of building a Kotlin-based chatbot using Ollama AI. We'll cover everything from setting up the project to implementing and testing the streaming capabilities. Let's dive in!

Project Overview

Our goal is to create a chatbot that leverages Ollama AI to provide real-time streaming responses to user inputs. We'll use Kotlin, OkHttp, and MockWebServer for testing. The chatbot will handle streaming responses, displaying them as they are received, ensuring a smooth and interactive user experience.

1- Initialize the Kotlin Project:

  • Start by creating a new Kotlin project in IntelliJ IDEA.
  • Add the necessary dependencies in your build.gradle.kts file:
dependencies {
    implementation("org.jetbrains.kotlin:kotlin-stdlib")
    implementation("com.squareup.okhttp3:okhttp:4.9.1")
    implementation("org.json:json:20210307")
    implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.5.2")

    // Dependencies to test
    testImplementation(kotlin("test"))
    testImplementation("org.junit.jupiter:junit-jupiter-api:5.9.0")
    testRuntimeOnly("org.junit.jupiter:junit-jupiter-engine:5.9.0")
    testImplementation("org.jetbrains.kotlin:kotlin-test:1.9.10")
    testImplementation("org.jetbrains.kotlin:kotlin-test-junit:1.9.10")
    testImplementation("com.squareup.okhttp3:mockwebserver:4.9.1")
}
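
The dependencies above assume a standard Gradle Kotlin DSL project. For completeness, a minimal plugins/application setup might look like the sketch below; the Kotlin version and the main-class name are illustrative, so adjust them to your project.
plugins {
    kotlin("jvm") version "1.9.10"
    application
}

repositories {
    mavenCentral()
}

application {
    // Illustrative entry point: the file that contains your main() function
    mainClass.set("MainKt")
}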

2- Install Ollama and Download the Model:

  • Before running the chatbot, you need to install Ollama on your machine and download the necessary model. Follow the Ollama installation guide to set up Ollama.
  • Once installed, download the model using the following command:
ollama pull llama2-uncensored
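
If you want to confirm from code that the Ollama server is running and the model is installed, a small check against the /api/tags endpoint (which lists locally available models) can help. This is a standalone sketch meant to be run on its own while experimenting, not part of the chatbot itself:
import okhttp3.OkHttpClient
import okhttp3.Request
import org.json.JSONObject

fun main() {
    val client = OkHttpClient()
    val request = Request.Builder()
        .url("http://localhost:11434/api/tags") // Ollama's default local port
        .get()
        .build()

    client.newCall(request).execute().use { response ->
        check(response.isSuccessful) { "Ollama is not reachable: HTTP ${response.code}" }
        val models = JSONObject(response.body!!.string()).getJSONArray("models")
        println("Installed models:")
        for (i in 0 until models.length()) {
            println(" - ${models.getJSONObject(i).getString("name")}")
        }
    }
}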

3- Create the OllamaClient:

  • This class will handle sending requests to Ollama AI and processing the streaming responses.
import okhttp3.*
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.RequestBody.Companion.toRequestBody
import okio.BufferedSource
import org.json.JSONObject
import java.io.IOException

class OllamaClient {
    private val client = OkHttpClient()
    private val baseUrl = "http://localhost:11434/api/generate"

    fun streamResponse(prompt: String, onResponse: (String) -> Unit, onComplete: () -> Unit, onError: (Exception) -> Unit) {
        val requestBody = JSONObject()
            .put("model", "llama2-uncensored")
            .put("prompt", prompt)
            .put("stream", true)
            .toString()
            .toRequestBody("application/json".toMediaType())

        val request = Request.Builder()
            .url(baseUrl)
            .post(requestBody)
            .build()

        client.newCall(request).enqueue(object : Callback {
            override fun onFailure(call: Call, e: IOException) {
                onError(e)
            }

            override fun onResponse(call: Call, response: Response) {
                if (!response.isSuccessful) {
                    onError(IOException("Unexpected code $response"))
                    return
                }

                response.body?.use { responseBody ->
                    val source: BufferedSource = responseBody.source()
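                    // Ollama streams one JSON object per line; forward each "response" fragment as it arrives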
                    while (!source.exhausted()) {
                        val line = source.readUtf8Line()
                        if (line != null) {
                            val jsonResponse = JSONObject(line)
                            if (jsonResponse.has("response")) {
                                onResponse(jsonResponse.getString("response"))
                            }
                        }
                    }
                    onComplete()
                }
            }
        })
    }
}
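
A note on what the client is parsing: with "stream": true, Ollama's /api/generate endpoint writes its answer as one JSON object per line, each carrying a partial "response" string and a final object marked with "done": true, which is why reading the body line by line works. As a minimal standalone usage sketch (run it as a temporary main() before wiring up the chat loop in the next steps; the prompt and the sleep are placeholders):
fun main() {
    val client = OllamaClient()
    client.streamResponse(
        prompt = "Say hello in one sentence.",
        onResponse = { fragment -> print(fragment) },   // print fragments as they arrive
        onComplete = { println("\n[stream finished]") },
        onError = { e -> System.err.println("Request failed: ${e.message}") }
    )
    // streamResponse uses OkHttp's enqueue(), so keep the process alive while the response streams in
    Thread.sleep(30_000)
}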

4- Create the ConversationHandler:

  • This class manages the conversation loop: it processes user input, displays responses in real time as they stream in, and waits for each answer to finish before prompting again.
import kotlinx.coroutines.*

class ConversationHandler(private val ollamaClient: OllamaClient) {
    private val conversationHistory = mutableListOf<String>()

    fun start() = runBlocking {
        while (true) {
            print("You: ")
            val userInput = readLine()
            if (userInput.isNullOrEmpty()) break
            conversationHistory.add("You: $userInput")
            val context = conversationHistory.joinToString("\n")

            // streamResponse is asynchronous (OkHttp enqueue), so wait for the
            // full answer before prompting for the next input.
            val done = CompletableDeferred<Unit>()
            var completeResponse = ""
            ollamaClient.streamResponse(
                context,
                onResponse = { responseFragment ->
                    completeResponse += responseFragment
                    print("\rOllama: $completeResponse")
                },
                onComplete = {
                    println() // Move to the next line after completion
                    conversationHistory.add("Ollama: $completeResponse")
                    done.complete(Unit)
                },
                onError = { e ->
                    println("\nOllama: Error - ${e.message}")
                    done.complete(Unit)
                }
            )
            done.await()
        }
    }
}

5- Main Function:

  • This will serve as the entry point for your application.
fun main() {
    val ollamaClient = OllamaClient()
    val conversationHandler = ConversationHandler(ollamaClient)
    conversationHandler.start()
}
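
One practical note: Gradle's run task does not forward standard input by default, so readLine() would see end-of-input immediately if you launch the chatbot with ./gradlew run. Running main() from the IDE works as-is; for the command line, one option (assuming the application plugin from the earlier build sketch) is:
tasks.named<JavaExec>("run") {
    standardInput = System.`in` // let the chatbot read from the terminal
}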

Testing the Streaming Response
To ensure that our OllamaClient handles streaming responses correctly, we'll write unit tests using MockWebServer.

  1. Set up the Test Class:
import okhttp3.mockwebserver.MockResponse
import okhttp3.mockwebserver.MockWebServer
import org.json.JSONObject
import org.junit.After
import org.junit.Before
import org.junit.Test
import kotlin.test.assertEquals
import kotlin.test.assertNotNull

class OllamaClientTest {

    private lateinit var mockWebServer: MockWebServer
    private lateinit var ollamaClient: OllamaClient

    @Before
    fun setUp() {
        mockWebServer = MockWebServer()
        mockWebServer.start()
        ollamaClient = OllamaClient().apply {
            val baseUrlField = this::class.java.getDeclaredField("baseUrl")
            baseUrlField.isAccessible = true
            baseUrlField.set(this, mockWebServer.url("/api/generate").toString())
        }
    }

    @After
    fun tearDown() {
        mockWebServer.shutdown()
    }

    @Test
    fun `test streamResponse returns expected response`() {
        // The client makes a single request, so the streamed chunks have to be
        // served as one response body with one JSON object per line.
        val streamedBody = listOf(
            JSONObject().put("response", "Hello").toString(),
            JSONObject().put("response", " there").toString(),
            JSONObject().put("response", ", how are you?").toString()
        ).joinToString("\n")

        mockWebServer.enqueue(MockResponse().setBody(streamedBody).setResponseCode(200))

        val completeResponse = StringBuilder()
        val onCompleteCalled = arrayOf(false)

        ollamaClient.streamResponse(
            prompt = "hello",
            onResponse = { responseFragment ->
                completeResponse.append(responseFragment)
            },
            onComplete = {
                onCompleteCalled[0] = true
            },
            onError = { e ->
                throw AssertionError("Error in streaming response", e)
            }
        )

        Thread.sleep(1000)
        assertEquals(true, onCompleteCalled[0])
        assertEquals("Hello there, how are you?", completeResponse.toString())
    }

    @Test
    fun `test streamResponse handles error`() {
        mockWebServer.enqueue(MockResponse().setResponseCode(500).setBody("Internal Server Error"))

        var errorCalled = false

        ollamaClient.streamResponse(
            prompt = "hello",
            onResponse = { _ ->
                throw AssertionError("This should not be called on error")
            },
            onComplete = {
                throw AssertionError("This should not be called on error")
            },
            onError = { e ->
                errorCalled = true
                assertNotNull(e)
            }
        )

        Thread.sleep(1000)
        assertEquals(true, errorCalled)
    }
}
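
One last remark on the test setup: setUp() points the client at MockWebServer by overwriting the private baseUrl field through reflection. That works, but a cleaner alternative (a suggestion, not what the repository snippet above shows) is to make baseUrl a constructor parameter whose default is the production URL, so the tests can simply construct OllamaClient(mockWebServer.url("/api/generate").toString()).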

Repository

https://github.com/josmel/ChatbotKotlinOllama

Conclusion
Building a real-time streaming chatbot using Kotlin and Ollama AI is a rewarding challenge that showcases the power of modern AI and streaming capabilities. By following this guide, you can create a chatbot that not only responds quickly but also handles conversations smoothly. Remember to install Ollama and download the necessary model before running your project.

Happy coding! Feel free to reach out with any questions or comments. If you found this guide helpful, please share it with others and follow me for more Kotlin and AI tutorials!
