In this tutorial, I'll explain how to build a simple appointment scheduler AI agent from scratch in JavaScript, without using any agentic AI frameworks.
So you will understand how agents work without any abstractions!
I've published the same tutorial as a video on YouTube as well. You can check that here.
Architecture of the AI Agent
You already know what an LLM is. It is similar to a human brain: it can think and make decisions, but it cannot execute things without the help of external tools. Just like how the brain uses hands, legs, and other organs to perform tasks!
So, an LLM combined with external tools, plus a process for coordinating them, could be defined as an "AI agent".
In our case, the Node program will act as the controller that interacts with the LLM and interfaces it with the external tools and the end user.
Prerequisites
- JavaScript and TypeScript
- An OpenAI API key
- The readline module for accepting input from the command line
Building the Agent
Note: The finished source code of this agent is available on GitHub.
Import the dependencies and initialize them
import OpenAI from "openai";
import readline from "readline";

// Read the API key from the environment instead of hard-coding it.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout
});
Define the `messages` array
We'll define a global `messages` array for storing the inputs from the end user and the responses from the LLM. We'll feed the LLM this array on every request so that the LLM has the entire context of the conversation!
const messages = [] as any;
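If you'd rather avoid `any` here, the openai package (assuming you're on v4 of the SDK) exports a type for chat messages; a typed alternative:

import type { ChatCompletionMessageParam } from "openai/resources/chat/completions";

// Same global conversation history, but typed.
const messages: ChatCompletionMessageParam[] = [];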
Define the `SYSTEM_PROMPT`
The system prompt is what makes the LLM capable of doing these tasks. We're simply instructing the LLM to act as a scheduler agent, defining its duties, and finally listing the functions the agent can use!
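One note before the prompt: it interpolates a getCurrentTimeInTimeZone() helper that isn't defined in the article (it's in the finished source on GitHub). A minimal sketch that would satisfy the call:

// Returns the current date and time rendered in the given IANA time zone,
// e.g. getCurrentTimeInTimeZone("Asia/Kolkata") -> "3/14/2025, 9:30:00 AM".
function getCurrentTimeInTimeZone(timeZone: string): string {
  return new Date().toLocaleString("en-US", { timeZone });
}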
const SYSTEM_PROMPT = `
You are an appointment scheduler AI agent. You're always interacting with a system. You have the ability to do function calls.
Your response can be either a reply to the user, or to the system to do a function call. But you cannot reply to the user and the system in the same response.
So your response should be in JSON format as specified below -
{
  "to": "",
  "message": "",
  "function_call": {
    "function": "",
    "arguments": []
  }
}
I will explain the keys -
1. to - values could be "system" or "user", depending on whom you are replying to
2. message - plain text message. Use this only if you are replying to the user, not the system
3. function_call - Use this only if you are replying to the system. It is a JSON object that determines which function to call and its arguments.
3 a. function - name of the function
3 b. arguments - an array of arguments for the function call, where each array item is the value for the corresponding argument.
Available functions:
function name - check_appointment_availability
arguments - datetime (ISO 8601 format, UTC timezone)
function name - schedule_appointment
arguments - datetime (ISO 8601 format, UTC timezone), name (string), email (string)
function name - delete_appointment
arguments - datetime (ISO 8601 format, UTC timezone), name (string), email (string)
Here are some instructions -
Chat with the user who wants to schedule an appointment with your owner.
Ask if they have any preference for the appointment time.
You must be able to understand that users might be in a different time zone.
Always use their time zone when discussing times and dates with the user.
Before scheduling the appointment, you must ask for their name and email.
Your owner is in IST timezone (+05:30)
Time and date now for your owner is ${getCurrentTimeInTimeZone("Asia/Kolkata")}
`;
Key points to note here:
- We're instructing the LLM to always respond in JSON format (see the example responses below).
- The LLM can respond either to the system (this Node program) or to the end user.
- The LLM must think and decide which functions to call based on the end user's requirement.
- If a function call is requested, the system (this Node program) will execute the function and feed the response back into the LLM.
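To make the format concrete, here are two hypothetical responses (the values are made up): the first replies to the user, the second asks the system to call a function.

{
  "to": "user",
  "message": "Sure! What time works best for you?",
  "function_call": null
}

{
  "to": "system",
  "message": "",
  "function_call": {
    "function": "check_appointment_availability",
    "arguments": ["2025-03-14T09:30:00Z"]
  }
}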
Push the system prompt to the `messages` array
As mentioned, we need to push the system prompt to our `messages` array.
messages.push({
  role: 'system',
  content: SYSTEM_PROMPT
});
Define the functions
As we told the LLM, we now need to define the available functions. For simplicity, each one just logs its arguments and returns `true`.
function check_appointment_availability(datetime: string){
  console.log("Calling check_appointment_availability ", datetime);
  return true;
}

function schedule_appointment(datetime: string, name: string, email: string){
  console.log("Calling schedule_appointment ", datetime, name, email);
  return true;
}

function delete_appointment(datetime: string, name: string, email: string){
  console.log("Calling delete_appointment ", datetime, name, email);
  return true;
}
We will also define a function map so we can call these functions dynamically by name, given as a string.
const function_map = {
  'check_appointment_availability': check_appointment_availability,
  'schedule_appointment': schedule_appointment,
  'delete_appointment': delete_appointment
} as any;
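With this map in place, the system can dispatch a call when it only has the function name as a string. For example (hypothetical values):

const fn = 'check_appointment_availability';
// Equivalent to calling check_appointment_availability("2025-03-14T09:30:00Z") directly.
function_map[fn]("2025-03-14T09:30:00Z");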
Define the `send_to_llm` function
This is one of the core components of this script. The function accepts a message, adds it to the `messages` array, and finally passes the entire array to the LLM and waits for it to respond.
async function send_to_llm(content: string){
  messages.push({
    role: 'user',
    content
  });
  const response = await client.chat.completions.create({
    messages,
    model: 'gpt-4o'
  });
  // Store the assistant's reply so the conversation history stays complete.
  messages.push(response.choices[0].message);
  return response.choices[0].message.content;
}
You can also see that, once the response is received, we push it back onto the `messages` array so the entire conversation history is maintained.
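Since the whole protocol depends on the model emitting valid JSON, one optional hardening step (not part of the original tutorial) is to enable the SDK's JSON mode, which gpt-4o supports. A sketch:

const response = await client.chat.completions.create({
  messages,
  model: 'gpt-4o',
  // Ask the API to guarantee syntactically valid JSON output. This mode
  // requires the word "JSON" to appear in the prompt, which our
  // SYSTEM_PROMPT already satisfies.
  response_format: { type: 'json_object' }
});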
Define the `process_llm_response` function
When we receive a response from the LLM, it could be for one of two purposes:
- Send a message to the end user
- Execute a function call
The `process_llm_response` function will parse the received response and take the appropriate action.
async function process_llm_response(response: any){
  const parsedJson = JSON.parse(response);
  if(parsedJson.to == 'user'){
    console.log(parsedJson.message);
  }else if(parsedJson.to == 'system'){
    const fn = parsedJson.function_call.function;
    const args = parsedJson.function_call.arguments;
    const functionResponse = function_map[fn](...args);
    // The ternary must be parenthesized: without the parentheses, the string
    // concatenation binds first and the condition is always truthy.
    await process_llm_response(await send_to_llm('response is ' + (functionResponse ? 'true' : 'false')));
  }
}
Here, if the message is addressed to the `user`, we simply print it using `console.log()`.
However, if the message is for the `system`, we extract the function name and arguments, then call the function through the `function_map` we defined before.
The function's response is fed back into the LLM in a specific format, and we call the same `process_llm_response` function recursively to process the LLM's next reply!
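One thing to keep in mind: JSON.parse() will throw if the model ever returns malformed JSON. A defensive wrapper (my addition, not part of the original code) could catch the error and ask the LLM to resend:

async function process_llm_response_safe(response: any){
  try {
    await process_llm_response(response);
  } catch (err) {
    // Hypothetical recovery path: tell the LLM its last reply was not
    // valid JSON and process whatever it sends back.
    const retry = await send_to_llm('Your last response was not valid JSON. Please resend it in the specified JSON format.');
    await process_llm_response(retry);
  }
}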
Combining everything
Finally, we will write the `main()` function.
/**
 * main function
 */
async function main(){
  while(true){
    // Wrap readline's callback API in a promise so we can await user input.
    const input: string = await new Promise((resolve)=>{
      rl.question("Say something: ", resolve);
    });
    const response = await send_to_llm(input);
    await process_llm_response(response);
  }
}

main();
Here we first accept an input from the user using the `readline` library. The input is then fed into the LLM using the `send_to_llm` function.
The response is processed using the `process_llm_response` function, which decides whether to print the response or to execute a function!
Final thoughts
Building an agent is not rocket science. It's just instructing the LLM to coordinate with a system program (such as this Node script) to execute function or API calls and return the responses accordingly!
As mentioned earlier, this tutorial is also available as a video on YouTube. You can check that here.