DEV Community

Gianfranco Demarco

Enhancing LLM Tool Usage with Converso

Introduction

Since their introduction, tools have significantly expanded the capabilities of LLMs.
The ability to execute arbitrary code, with the LLM deciding when to call a tool and which parameters to pass, has enabled many real-world applications.

However, because conversations are stateless, LLMs are forced to rely only on the conversation history to manage tool calls.
This makes it difficult to implement complex tools that take more than a couple of parameters and involve some decision logic.
Moreover, having so little control over the LLM's actions is a bit scary when implementing tools with real-world side effects, such as sending an email or placing an order.

Converso is a LangChain extension that tries to overcome this limitation by making the conversation stateful and guiding the LLM.

A running example

Let's say we want to implement a simple tool for users to send emails through the LLM.

This Pydantic model describes the tool input:

from pydantic import BaseModel, Field, field_validator

class SendEmailPayload(BaseModel):

    recipient: str = Field(
        description="Recipient email"
    )

    subject: str = Field(
        description="Email subject"
    )

    body: str = Field(
        description="Email body"
    )

    @field_validator("recipient")
    @classmethod
    def validate_recipient(cls, v):
        if not v:
            raise ValueError("Email must be set")
        if "@" not in v:
            raise ValueError("Invalid email")
        return v

The email validation is quite basic, but it can be easily extended.
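
As one possible extension, the bare "@" test could be swapped for a regular expression. The sketch below is an assumption, not part of the original tool: the helper name `check_recipient` and the pattern are hypothetical, and the pattern is deliberately loose rather than a complete email grammar. A function like this could be called from inside `validate_recipient`.

```python
import re

# Hypothetical, deliberately loose pattern: something@domain.tld, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_recipient(value: str) -> str:
    """Return the stripped address, or raise ValueError if it looks invalid."""
    if not value or not value.strip():
        raise ValueError("Email must be set")
    value = value.strip()
    if not EMAIL_PATTERN.match(value):
        raise ValueError("Invalid email")
    return value
```

With this check, an input like "john" or "john@localhost" is rejected, while "john@example.com" passes.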

The old way

Implementing the tool with LangChain is pretty straightforward:

from typing import Type

from pydantic import BaseModel

from langchain_core.tools import BaseTool

class SendEmail(BaseTool):
    # Type annotations are required when overriding fields on Pydantic v2 models
    name: str = "SendEmail"
    description: str = """Send an email to a recipient"""
    args_schema: Type[BaseModel] = SendEmailPayload


    def _run(
        self,
        *args,
        **kwargs
    ) -> str:
        print(f"Tool called with args: {args}, kwargs: {kwargs}")
        return "Email sent" # Short circuit the actual sending

Now let's put it to the test.

To do so, we'll use a simple agent built using LangGraph.

Agent implementation

import os

from langchain_core.messages import AIMessage, HumanMessage

from converso import FormAgentExecutor

os.environ["OPENAI_API_KEY"] = "sk-proj-xxx"

graph = FormAgentExecutor(
    tools=[
        SendEmail()
    ]
)

history = []
active_form_tool = None

while True:
    human_input = input("Human: ")
    if not human_input:
        break

    inputs = {
        "input": human_input,
        "chat_history": history,
        "intermediate_steps": [],
        "active_form_tool": active_form_tool
    }

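    # Stream the graph and keep the value emitted by the last step
    for output in graph.app.stream(inputs, config={"recursion_limit": 25}):
        for key, value in output.items():
            pass

    # Carry the active form (if any) over to the next turn
    active_form_tool = value.get("active_form_tool")

    print(output)  # Debug: raw output of the last step
    output = graph.parse_output(output)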
    print(f"Human: {human_input}")
    print(f"AI: {output}")

    history = [
        *history,
        HumanMessage(content=human_input),
        AIMessage(content=output)
    ]

Let's test it

This is how the conversation goes:

Human: send an email to john to announce that i finished my website
AI: I have sent an email to John to announce that you have finished your website.

It is not lying. It did call the tool:

Tool called with args: (), kwargs: 
{
   "recipient":"john@example.com",
   "subject":"Completion of Website",
   "body":"Hi John,\n\nI am excited to announce that I have finished my website! It's been a great journey, and I can't wait for you to see the final product. Please let me know if you have any feedback or suggestions.\n\nBest regards,\n[Your Name]"
}

You can obviously see the problem here.
Not only did it not ask for the subject or the body of the email, but it also sent the email to a made-up recipient!

Considerations

The problem shown above is enough to limit any implementation of tools in real-life products. The lack of control over LLM actions poses a significant risk.

One might argue that a smarter prompt or a more advanced model could mitigate this issue, and they would likely be correct.
However, this approach is more costly (better models cost more) and still solely relies upon the LLM to understand what to do with little guidance.

Also, relying on the conversation history is still a serious limitation. Most applications cap the length of the history carried through the conversation to reduce costs and avoid exceeding the model's context window. And what if the tool requires a lot of data and some of it falls out of the history? Besides, asking the model to extract all of the input data from the entire textual history is asking a lot.
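
To make the truncation problem concrete, here is a small sketch of a capped history. Everything in it is an assumption made up for illustration: the cap of four messages, the `(role, text)` tuples and the sample conversation are hypothetical, and a real application might trim by tokens rather than by message count.

```python
from collections import deque

# Hypothetical cap: keep only the last 4 messages to save tokens.
MAX_MESSAGES = 4

history = deque(maxlen=MAX_MESSAGES)

conversation = [
    ("human", "send an email to john@gmail.com"),
    ("ai", "Sure, what is the subject?"),
    ("human", "I've finished my website!"),
    ("ai", "Got it. What should the body say?"),
    ("human", "generate it for me please"),
    ("ai", "Here is a draft. Shall I send it?"),
]

for role, text in conversation:
    history.append((role, text))  # the oldest message is dropped when full

# The first message, the only one containing the recipient, is gone.
recipient_still_visible = any("john@gmail.com" in text for _, text in history)
print(recipient_still_visible)  # False
```

By the time the model is asked to call the tool, the recipient is no longer anywhere in the text it can see.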

There must be a better way.

Meet Converso

Converso is a LangChain extension that introduces statefulness to better guide the LLM through the data acquisition process for more complex tools.
In practice, it defines FormTools, which derive from LangChain tools, and the FormAgentExecutor, a LangGraph implementation capable of managing standard LangChain tools, FormTools and error correction.

With FormTools, the LLM:

  • actively guides the user toward the completion of the action
  • persists the data instead of relying only on the history
  • asks for confirmation before executing actions
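
The three behaviours above can be pictured with a tiny state object. This is only a sketch of the idea, not Converso's actual implementation: `FormState`, `REQUIRED_FIELDS` and `next_step` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical: the fields of the email form from the running example.
REQUIRED_FIELDS = ("recipient", "subject", "body")


@dataclass
class FormState:
    """Hypothetical stand-in for the state an active form keeps between turns."""

    values: Dict[str, str] = field(default_factory=dict)
    confirmed: bool = False

    def update(self, **new_values: str) -> None:
        # Persist what was collected this turn; any change resets confirmation.
        self.values.update({k: v for k, v in new_values.items() if v})
        self.confirmed = False

    def missing(self) -> List[str]:
        return [name for name in REQUIRED_FIELDS if name not in self.values]

    def next_step(self) -> str:
        # Ask for missing data first, then for confirmation, then execute.
        missing = self.missing()
        if missing:
            return f"ask:{missing[0]}"
        if not self.confirmed:
            return "confirm"
        return "execute"


state = FormState()
state.update(recipient="john@gmail.com")
print(state.next_step())  # ask:subject
state.update(subject="I've finished my website!", body="Hello John, ...")
print(state.next_step())  # confirm
state.confirmed = True
print(state.next_step())  # execute
```

Because the collected values live in the state object, they survive even if the messages that produced them are trimmed from the history.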

FormTool and FormAgentExecutor

A FormTool is very similar to a LangChain tool.
If no customization is required, the definitions are almost identical:

from typing import Type

from pydantic import BaseModel

from converso import FormTool

class SendEmail(FormTool):
    # Type annotations are required when overriding fields on Pydantic v2 models
    name: str = "SendEmail"
    description: str = """Send an email to a recipient"""
    args_schema: Type[BaseModel] = SendEmailPayload


    def _run_when_complete(
        self,
        *args,
        **kwargs
    ) -> str:
        return "OK"
Enter fullscreen mode Exit fullscreen mode

The FormAgentExecutor is where most of the magic happens.
It is a LangGraph graph, which includes error correction and tool calling. It is capable of using both standard tools and FormTools.

When the LLM executes a FormTool, the tool enters an active state (hence the statefulness).
From then on, the conversation flow and the prompts change to guide the user into providing all of the information needed to execute the action associated with the tool.

When a FormTool is active, other FormTools are hidden to reduce errors and costs.
However, standard tools can be used even when a FormTool is active; for example, the user can use a tool that searches the internet to provide a required piece of data.
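
The visibility rule can be sketched as a simple filter. Again, this is an illustration under assumptions and not Converso's code: `ToolInfo`, `visible_tools` and the `PlaceOrder` and `WebSearch` tools are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ToolInfo:
    """Hypothetical minimal tool descriptor used only for this sketch."""

    name: str
    is_form: bool


def visible_tools(tools: List[ToolInfo], active_form: Optional[str]) -> List[ToolInfo]:
    """Return the tools the LLM should see on this turn."""
    if active_form is None:
        # No form in progress: every tool is available.
        return list(tools)
    # A form is in progress: hide the other forms, keep standard tools.
    return [t for t in tools if not t.is_form or t.name == active_form]


tools = [
    ToolInfo("SendEmail", is_form=True),
    ToolInfo("PlaceOrder", is_form=True),
    ToolInfo("WebSearch", is_form=False),
]
names = [t.name for t in visible_tools(tools, active_form="SendEmail")]
print(names)  # ['SendEmail', 'WebSearch']
```

Fewer candidate tools means a shorter prompt and less room for the model to start a second form halfway through the first.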

Let’s test it

The agent is the same as defined before.

Human: send an email to john to announce that i finished my website
AI: It seems there was an issue with the email recipient address. Could you please provide me with the correct email address for John so that I can send the email successfully?
Human: john@gmail.com
AI: The recipient email address has been updated to john@gmail.com. Could you please provide me with the subject for the email you want to send to John?
Human: i've finished my website!
AI: The subject for the email has been updated. Now, could you please provide me with the body of the email that you would like to send to John to announce that you have finished your website?
Human: generate it for me please

AI: Here is the email content that will be sent to John:
- Recipient: john@gmail.com
- Subject: I've finished my website!
- Body:

Hello John,

I wanted to let you know that I have completed my website. It's been a great journey, and I'm excited to share it with you. Please feel free to visit the website and let me know your thoughts.

Best regards,
[Your Name]

Please confirm if this information is correct and if you would like to send this email to John.

Human: Replace [Your Name] with Gianfranco

AI: The email content has been updated with the name "Gianfranco":
- Recipient: john@gmail.com
- Subject: I've finished my website!
- Body:

Hello John,

I wanted to let you know that I have completed my website. It's been a great journey, and I'm excited to share it with you. Please feel free to visit the website and let me know your thoughts.

Best regards,
Gianfranco

Please confirm if this information is correct and if you would like to send this email to John.
Human: yes
AI: The email has been sent to John with the announcement that you have finished your website. If you need any more assistance or have any other requests, feel free to let me know!


Considerations

This time, the conversation is indeed a lot longer (and more costly), but also a lot more correct.

Not only does the LLM avoid hallucinating parameters, it also guides the user through the acquisition of all of the needed data.

Data is stored in an object and doesn't have to be extracted from the conversation history each time. In fact, the history can be a lot shorter and the data won't be forgotten.

Moreover, before any action is executed, the user is asked for confirmation and shown all of the collected data. If the LLM (or the user) made a mistake or a wrong assumption, there is room for correction.

Conclusions

Tools are a great addition to LLMs since they can be used to perform actions in the real world and break the limits of a text-only conversation.

However, allowing LLMs to access potentially harmful tools without guidance can be both dangerous and counterproductive.
Converso is a LangChain-based library that lets you define tools and agents that guide the user through data acquisition and ask for confirmation before executing any action.
This allows developers to build complex tools while maintaining control over the unpredictability of LLMs.
