Josh Mo

Implementing Design Patterns for Agentic AI with Rig & Rust

In this tutorial, we'll be re-implementing some of the common design patterns you might find in agentic AI, using Rig. These examples primarily follow Anthropic's article on building effective agents, so you can expect to see these patterns used whatever kind of agent you're building!

By the end of this article, you'll have a good understanding of how to create the following agentic AI design patterns in Rust using the rig framework:

  • Prompt chaining
  • Routing
  • Parallelization
  • Orchestrator-worker
  • Evaluator-optimizer
  • Autonomous agent

Note that all of the full code examples can be found in the rig repo.

Prompt chaining

Prompt chaining simply decomposes a task into a list of smaller steps, where we make LLM calls one after the other, piping the result of each LLM call into the next (the first call takes the user's prompt as its input). You can additionally add gates between LLM calls to exit early if an answer from the LLM is unsatisfactory or falls outside the bounds of what the answer should be - we'll sketch one of these gates after the pipeline below.


To get started, we'll create our OpenAI client then create our agents:

use std::env;

use rig::providers::openai::Client;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create OpenAI client
    let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
    let openai_client = Client::new(&openai_api_key);

    let rng_agent = openai_client.agent("gpt-4")
        .preamble("
            You are a random number generator designed to output a single whole integer that is either 0 or 1. Only return the number.
        ")
        .build();

    let adder_agent = openai_client.agent("gpt-4")
        .preamble("
            You are a mathematician who adds 1000 to every number passed into the context, except if the number is 0 - in which case don't add anything. Only return the number.
        ")
        .build();

    // .. more code down here!
}

Next, we'll create our pipeline and put the agents in it one after the other. Simple!

use rig::pipeline::{self, Op};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // .. previously entered code goes here!

    let chain = pipeline::new()
        // Generate a whole number that is either 0 or 1
        .prompt(rng_agent)
        // Unwrap the result from the first agent before passing it to the next prompt
        .map(|x| x.unwrap())
        .prompt(adder_agent);

    // Prompt the agent and print the response
    let response = chain
        .call("Please generate a single whole integer that is 0 or 1".to_string())
        .await;

    println!("Pipeline result: {response:?}");

    Ok(())
}
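
As mentioned above, you can also add a gate between the two prompts to bail out early when the first agent's output isn't what you expect. Below is a minimal sketch of what that could look like with the same two agents - the extra validation step is our own addition rather than part of the original example:

    let gated_chain = pipeline::new()
        .prompt(rng_agent)
        .map(|x| x.unwrap())
        // Gate: only let "0" or "1" through to the next agent, otherwise produce an error
        .map(|x: String| match x.trim() {
            "0" | "1" => Ok(x),
            other => Err(format!("Unexpected output from the RNG agent: {other}")),
        })
        // For brevity we just unwrap here; see the routing example below for propagating
        // errors properly with map_ok and try_call
        .map(|x| x.unwrap())
        .prompt(adder_agent);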

Although this is a relatively basic technique, it's very common and can be applied to a wide variety of use cases.

Prompt routing

Prompt routing classifies the user's statement into one of a fixed set of topics and directs it to a follow-up step for that topic; constraining the input like this can also help mitigate prompt injection. To illustrate, our example will categorise a user's statement as either sheep, cow or dog and then change the prompt based on the given topic.


As before, we'll create our agents:

use std::env;
use rig::providers::openai::Client;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create OpenAI client
    let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
    let openai_client = Client::new(&openai_api_key);

    // Note that you can also create your own semantic router for this
    // that uses a vector store under the hood
    let animal_agent = openai_client.agent("gpt-4")
        .preamble("
            Your role is to categorise the user's statement using the following values: [sheep, cow, dog]

            Return only the value.
        ")
        .build();

    let default_agent = openai_client.agent("gpt-4").build();

    // .. more code to go down here!
}

Note that while we use a default agent as the final step to illustrate how to use this pattern, you can essentially put whatever you want there - be it an AI agent with tools, another mapping function or something else.

use rig::pipeline::{self, Op, TryOp};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // previous code goes here

    let chain = pipeline::new()
        // Use our classifier agent to classify the statement under a number of fixed topics
        .prompt(animal_agent)
        // Change the prompt depending on the category returned by the classifier
        .map_ok(|x: String| match x.trim() {
            "cow" => Ok("Tell me a fact about the United States of America.".to_string()),
            "sheep" => Ok("Calculate 5+5 for me. Return only the number.".to_string()),
            "dog" => Ok("Write me a poem about cashews".to_string()),
            message => Err(format!("Could not process - received category: {message}")),
        })
        // Unwrap the prompt result, then our routing result
        .map(|x| x.unwrap().unwrap())
        // Send the prompt back into another agent with no preamble
        .prompt(default_agent);

    // Prompt the agent and print the response
    let response = chain.try_call("Sheep can self-medicate").await?;

    println!("Pipeline result: {response:?}");

    Ok(())
}

This technique is quite helpful as it lets you categorise a prompt by topic and then generate a response (or do something else) appropriate to that topic.
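
If you'd rather route to entirely different agents (say, specialised agents with their own tools) instead of just swapping the prompt, you can also do the dispatch by hand outside of a pipeline. Here's a minimal sketch using the same classifier as above - the cow_agent, sheep_agent and dog_agent are hypothetical agents you'd build yourself, one per topic:

use rig::completion::Prompt;

    // Classify the statement first, then dispatch to a per-topic agent
    let category = animal_agent.prompt("Sheep can self-medicate").await?;

    let response = match category.trim() {
        "cow" => cow_agent.prompt("Tell me a fact about cows.").await?,
        "sheep" => sheep_agent.prompt("Tell me a fact about sheep.").await?,
        "dog" => dog_agent.prompt("Write me a poem about dogs.").await?,
        other => anyhow::bail!("Could not process - received category: {other}"),
    };

    println!("Routed response: {response}");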

Parallelization

In some workloads, you may want to parallelize your API calls by sending them at the same time, then collect the results from all of them and present them together.


Below is an example of how you can create a pipeline using Rig that parallelizes API calls to extract the information we want as JSON objects, then maps the three results into a single result that we present to the user.

First, let's define our JSON output. We can do this by declaring a struct that derives schemars::JsonSchema, which allows a JSON schema to be generated for the struct.

use serde::{Deserialize, Serialize};
use schemars::JsonSchema;

#[derive(Deserialize, JsonSchema, Serialize)]
struct DocumentScore {
    /// The score of the document
    score: f32,
}

Next, let's define our agents. In this example, we define three different agents to get three different types of sentiment for a given statement (note these are arbitrary characteristics - you can use basically whatever you want):

  • How manipulative it sounds
  • How depressive it sounds
  • How intelligent it sounds

use std::env;
use rig::providers::openai::Client;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create OpenAI client
    let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
    let openai_client = Client::new(&openai_api_key);

    let manipulation_agent = openai_client
        .extractor::<DocumentScore>("gpt-4")
        .preamble(
            "
            Your role is to score a user's statement on how manipulative it sounds between 0 and 1.
        ",
        )
        .build();

    let depression_agent = openai_client
        .extractor::<DocumentScore>("gpt-4")
        .preamble(
            "
            Your role is to score a user's statement on how depressive it sounds between 0 and 1.
        ",
        )
        .build();

    let intelligent_agent = openai_client
        .extractor::<DocumentScore>("gpt-4")
        .preamble(
            "
            Your role is to score a user's statement on how intelligent it sounds between 0 and 1.
        ",
        )
        .build();

    // .. more code down here
}
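
As an aside, each extractor can also be called on its own outside of a pipeline. Here's a minimal sketch of a single call, reusing the manipulation_agent defined above (the statement is just an illustrative input):

    // Extract a DocumentScore directly from a statement
    let score = manipulation_agent
        .extract("Nice project you have there. It would be a shame if something happened to it.")
        .await
        .unwrap();

    println!("Manipulation score: {}", score.score);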

Once that's done, we'll create our pipeline and run it. We use the parallel!() macro, which lets us run multiple operations in a single step and then map their outputs as a tuple in the .map() function. Note that we also use passthrough(), which simply passes the original prompt through to the next step in the chain.

use rig::pipeline::agent_ops::extract;
use rig::{
    parallel,
    pipeline::{self, passthrough, Op},
};

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // .. previous code goes here!

    let chain = pipeline::new()
        .chain(parallel!(
            passthrough(),
            extract(manipulation_agent),
            extract(depression_agent),
            extract(intelligent_agent)
        ))
        .map(|(statement, manip_score, dep_score, int_score)| {
            format!(
                "
                Original statement: {statement}
                Manipulation sentiment score: {}
                Depression sentiment score: {}
                Intelligence sentiment score: {}
                ",
                manip_score.unwrap().score,
                dep_score.unwrap().score,
                int_score.unwrap().score
            )
        });

    // Prompt the agent and print the response
    let response = chain
        .call("I hate swimming. The water always gets in my eyes.")
        .await;

    println!("Pipeline run: {response:?}");

    Ok(())
}

Orchestrator-worker

Many AI-assisted production workloads use a central orchestrator to manage the execution of agents across a pipeline. The orchestrator decides which agent to trigger based on factors like the current task’s progress, complexity, priority, and resource availability. For instance, it may start with an agent handling data preprocessing, then move on to model training, and finally route the data for post-processing or analysis, depending on the results.

The orchestrator can also monitor system performance, ensuring that tasks are executed efficiently and that delays or failures in one part of the pipeline don’t affect the rest of the workflow. In production, it ideally should dynamically adjust the pipeline based on real-time metrics, scaling or optimizing agent operations as needed to maintain performance and avoid bottlenecks.


To get started, we'll define our structs for extracting JSON output from a response:

use schemars::JsonSchema;

#[derive(serde::Deserialize, JsonSchema, serde::Serialize, Debug)]
struct Specification {
    tasks: Vec<Task>,
}

#[derive(serde::Deserialize, JsonSchema, serde::Serialize, Debug)]
struct Task {
    original_task: String,
    style: String,
    guidelines: String,
}

#[derive(serde::Deserialize, JsonSchema, serde::Serialize, Debug)]
struct TaskResults {
    style: String,
    response: String,
}


Next, we'll create our agents. The preamble (system prompt) for the classify agent is relatively long, so we'll give it its own constant to help with readability.

use std::env;
use rig::providers::openai::Client;

const CLASSIFY_PREAMBLE: &str = "Analyze the given task and break it down into 2-3 distinct approaches.

            Provide an Analysis:
            Explain your understanding of the task and which variations would be valuable.
            Focus on how each approach serves different aspects of the task.

            Along with the analysis, provide 2-3 approaches to tackle the task, each with a brief description:

            Formal style: Write technically and precisely, focusing on detailed specifications
            Conversational style: Write in a friendly and engaging way that connects with the reader
            Hybrid style: Tell a story that includes technical details, combining emotional elements with specifications

            Return only JSON output.";

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create OpenAI client
    let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
    let openai_client = Client::new(&openai_api_key);

    // create an agent that gives a specification for a task
    // given a couple of different provided approaches
    let classify_agent = openai_client.extractor::<Specification>("gpt-4")
        .preamble(CLASSIFY_PREAMBLE)
        .build();

    // using the previous agent, we'll extract a specification from this prompt
    let specification = classify_agent.extract("
        Write a product description for a new eco-friendly water bottle.
        The target_audience is environmentally conscious millennials and key product features are: plastic-free, insulated, lifetime warranty
        ").await.unwrap();

    // finally, we'll generate content based on the generated response
    let content_agent = openai_client
        .extractor::<TaskResults>("gpt-4")
        .preamble(
            "
                Generate content based on the original task, style, and guidelines.

                Return only your response and the style you used as a JSON object.
                ",
        )
        .build();

    // .. more code goes down here
}

For each task in the specification, we want to generate some content using the style, the original task and the guidelines. We'll then collect the results into a vector of task results and use an LLM to figure out which one is most appropriate for our task.

use std::env;
use rig::providers::openai::Client;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // .. previous code goes here

    let mut vec: Vec<TaskResults> = Vec::new();

    for task in specification.tasks {
        let results = content_agent
            .extract(&format!(
                "
            Task: {},
            Style: {},
            Guidelines: {}
            ",
                task.original_task, task.style, task.guidelines
            ))
            .await
            .unwrap();

        vec.push(results);
    }

    // .. more code goes here
}

Finally, now that we have the content results we want, we use an LLM-as-judge to figure out which piece of writing was best and to give us its reasoning. It'll return the style it has chosen as well as the corresponding material.

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // .. previous code goes here
    let judge_agent = openai_client
        .extractor::<TaskResults>("gpt-4")
        .preamble(
            "
            Analyze the given written materials and decide the best one, giving your reasoning.

            Return the style as well as the corresponding material you have chosen as a JSON object.
            ",
        )
        .build();

    let task_results_raw_json = serde_json::to_string_pretty(&vec).unwrap();

    let results = judge_agent.extract(&task_results_raw_json).await.unwrap();

    println!("Results: {results:?}");

    Ok(())
}

Although this pattern can be a little long-winded when it comes to writing all the code out, it's a very effective design for large systems where you don't really want a person to have to intervene in the loop.

Evaluator-optimizer

For coding use cases and other similar tasks where you need your agent to carry out some work without human guidance, you may want to use the evaluator-optimizer pattern, which asks the LLM to complete the given task based on user input. If there is any feedback from previous generations, the AI agent should reflect on it and use it to improve the solution.


This workflow is quite similar to Chain of Thought prompting, where you use multiple intermediate steps to increase the LLM's reasoning capabilities.

To start with, we'll create our structs for extracting JSON from an LLM output:

use schemars::JsonSchema;

#[derive(serde::Deserialize, JsonSchema, serde::Serialize, Debug)]
struct Evaluation {
    evaluation_status: EvalStatus,
    feedback: String,
}

#[derive(serde::Deserialize, JsonSchema, serde::Serialize, Debug, PartialEq)]
enum EvalStatus {
    Pass,
    NeedsImprovement,
    Fail,
}

Once done, we'll need to create our agents as usual. The preamble for each agent has been extracted into its own constant for readability - note that we provide specific details on what the AI agent should output.

use std::env;

use rig::{completion::Prompt, providers::openai::Client};

const TASK: &str = "Implement a Stack with:
1. push(x)
2. pop()
3. getMin()
All operations should be O(1).
";

const GEN_AGENT_PREAMBLE: &str = "Your goal is to complete the task based on <user input>. If there is feedback
            from your previous generations, you should reflect on it to improve your solution

            Output your answer concisely in the following format:

            Thoughts:
            [Your understanding of the task and feedback and how you plan to improve]

            Response:
            [Your code implementation here]";

const EVAL_AGENT_PREAMBLE: &str = "Evaluate this following code implementation for:
            1. code correctness
            2. time complexity
            3. style and best practices

            You should be evaluating only and not attempting to solve the task.

            Only output \"PASS\" if all criteria are met and you have no further suggestions for improvements.

            Provide detailed feedback if there are areas that need improvement. You should specify what needs improvement and why.

            Only output JSON.";

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create OpenAI client
    let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
    let openai_client = Client::new(&openai_api_key);

    let generator_agent = openai_client
        .agent("gpt-4")
        .preamble(GEN_AGENT_PREAMBLE)
        .build();

    let evaluator_agent = openai_client.extractor::<Evaluation>("gpt-4")
        .preamble(EVAL_AGENT_PREAMBLE)
        .build();
    // .. more code below
}

Next, we'll need to prompt our generator agent with the task. Initially there will be no chat history, so we simply prompt it and add the response to a list of memories. We then loop, extracting an evaluation from our evaluator agent and checking whether the struct (deserialized from JSON) has passed. If it has, we break the loop - if not, we add the feedback to the context and prompt the generator agent again, adding its response back to the list of memories.

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // .. previous code goes here

    let mut memories: Vec<String> = Vec::new();

    let mut response = generator_agent.prompt(TASK).await.unwrap();
    memories.push(response.clone());

    loop {
        let eval_result = evaluator_agent
            .extract(&format!("{TASK}\n\n{response}"))
            .await
            .unwrap();

        if eval_result.evaluation_status == EvalStatus::Pass {
            break;
        } else {
            let context = format!("{TASK}\n\n{}", eval_result.feedback);

            response = generator_agent.prompt(context).await.unwrap();
            memories.push(response.clone());
        }
    }

    println!("Response: {response}");

    Ok(())
}
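
One thing to be aware of: the loop above runs until the evaluator returns Pass, which in theory could be forever. Below is a minimal variation that caps the number of refinement rounds - the MAX_ROUNDS constant is our own addition rather than part of the original example:

    const MAX_ROUNDS: usize = 5; // arbitrary safety cap - tune this for your workload

    for _ in 0..MAX_ROUNDS {
        let eval_result = evaluator_agent
            .extract(&format!("{TASK}\n\n{response}"))
            .await
            .unwrap();

        if eval_result.evaluation_status == EvalStatus::Pass {
            break;
        }

        // Feed the evaluator's feedback back into the generator and try again
        let context = format!("{TASK}\n\n{}", eval_result.feedback);
        response = generator_agent.prompt(context).await.unwrap();
        memories.push(response.clone());
    }

If the cap is hit without a Pass, you could fall back to the latest response or surface the evaluator's last feedback for a human to review.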

The evaluator-optimizer pattern is useful because it gets your agents to do work autonomously and self-improve their output without any human interaction. In comparison to autonomous agents, this design pattern revolves around using an LLM as a judge to decide when the work is considered "done", rather than an arbitrary condition in the environment.

Autonomous agent

The last workflow to showcase is the autonomous agent, which carries out a task in a loop until a goal has been achieved. This pattern is relatively simple and requires no human intervention, which can be pretty helpful.


In this example, we will simply have an LLM add a random number between 1 and 64 to a number that we give it, using only whole numbers. To start with, we'll define a Counter struct:

use schemars::JsonSchema;

#[derive(Debug, serde::Deserialize, JsonSchema, serde::Serialize)]
struct Counter {
    /// The current number
    number: u32,
}


Next, we'll create our Counter agent, which will extract the number as the Counter struct. This is relatively simple as there isn't much to it - we just tell the LLM to add a random number between 1 and 64 to the given number.

use rig::providers::openai::Client;
use std::env;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Create OpenAI client
    let openai_api_key = env::var("OPENAI_API_KEY").expect("OPENAI_API_KEY not set");
    let openai_client = Client::new(&openai_api_key);

    let agent = openai_client.extractor::<Counter>("gpt-4")
        .preamble("
            Your role is to add a random number between 1 and 64 (using only integers) to the previous number.
        ")
        .build();

    // .. more code below
}

Finally, we'll create our starting number and an interval for how long to wait between loop iterations, then loop until we reach the target number (in this case we've set it to 2000, but this can be whatever number you want).

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // previous code
    let mut number: u32 = 0;

    let mut interval = tokio::time::interval(std::time::Duration::from_secs(1));

    // Loop the agent and allow it to run autonomously. If it hits the target number (2000 or above)
    // we then terminate the loop and return the number
    // Note that the tokio interval is to avoid being rate limited
    loop {
        // Prompt the agent with the current number and take the result as our new number
        let response = agent.extract(&number.to_string()).await.unwrap();
        number = response.number;

        if number >= 2000 {
            break;
        }

        interval.tick().await;
    }

    println!("Finished with number: {number:?}");

    Ok(())
}

Autonomous agents are less complex than some of the other design patterns and quite simple to implement (being essentially LLM prompts in a loop), which makes them useful for straightforward tasks.

Beyond this article

Thanks for reading! I hope this has helped you gain a better understanding of how to use design patterns with agentic AI to significantly improve the effectiveness of your AI-assisted applications.

For additional Rig resources and community engagement, check out the rig repo.
