In the rapidly evolving field of Artificial Intelligence (AI), Agentic AI workflows mark a fundamental shift in AI system design, emphasizing autonomy, adaptability, and decision-making. Unlike traditional zero-shot prompting, Agentic AI enables the development of more sophisticated, autonomous, and capable systems. The goal is not just to enhance AI intelligence but to create systems that operate independently, make data-driven decisions, and continuously optimize their performance. In this article, we will explore Agentic AI, its key components, and their AWS service mappings, with code examples.
What is an Agent?
The "agent" is essentially a Large Language Model (LLM) with a predefined prompt and access to a specific set of “tools”, which are self-contained functions designed to perform a specific task. For instance, these tools could include a text summarizer or an image generator. Agent AWS leverages custom-built tools to query AWS documentation, generate code, and create architectural diagrams. The concept of agents revolves around leveraging a Large Language Model (LLM) as a reasoning engine to determine the appropriate actions needed to accomplish a task. To enhance its capabilities, an agent utilizes various tools that enable it to perform specific functions. For instance, if an agent needs to analyze financial market trends, it could be equipped with a real-time stock market data API to fetch and interpret the latest market information, allowing it to make informed decisions.
Core Components of an AI Agent
Foundation Model (FM)
At the heart of an AI agent lies a foundation model (FM), which serves as the primary reasoning engine. This model is responsible for:
- Understanding user inputs and intent
- Decomposing tasks into actionable steps
- Generating responses and determining follow-up actions
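On AWS, this reasoning engine is typically a model served through Amazon Bedrock, which an application can also call directly. A minimal sketch using the Converse API (the model ID and prompt are illustrative):

import boto3

# Call a foundation model directly -- this is the reasoning engine an agent wraps
bedrock_runtime = boto3.client("bedrock-runtime")

response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{
        "role": "user",
        "content": [{"text": "Break 'plan a 3-day trip to Rome' into actionable steps."}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])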
Update from re:Invent: Amazon Bedrock now supports multi-agent collaboration (preview).
While the foundation model is often the most recognized aspect of a generative AI system, it is only one piece of a larger, more complex architecture.
Instructions & Prompts
To shape the agent’s behavior, developers provide carefully crafted instructions in the form of prompts. These prompts define the agent’s role and interaction style and are categorized into two types:
- System Prompts: These set overarching guidelines for the AI’s behavior. They include structured instructions such as response formatting, citation rules, and even personality traits. Typically, system prompts remain hidden from end users but can be customized when configuring an agent.
- User Prompts: These are direct inputs from users, ranging from simple queries to more complex analytical or content-generation requests. User prompts guide the agent in performing specific tasks within a given interaction and help the model generate suitable responses.
Structure of the Prompt
The prompt is structured broadly into two parts:
- Role: Defines how the agent should behave and explains the concept of tools.
- Instructions: Provides examples of tasks and their solutions.
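As a sketch of how the two prompt types map onto an API call, Amazon Bedrock's Converse API accepts a separate system block (the hidden system prompt) alongside the user message; the model ID and prompt text below are illustrative:

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

# System prompt: role, tool explanation, and formatting rules (hidden from end users)
system_prompt = [{
    "text": "Role: You are a travel-planning assistant with access to booking tools. "
            "Instructions: Answer in three bullet points. Example task: 'find flights' -> "
            "solution: 'query the flight-search tool and summarize the top results'."
}]

response = bedrock_runtime.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    system=system_prompt,
    # User prompt: the direct request for this interaction
    messages=[{"role": "user", "content": [{"text": "Find me flights from NYC to Rome."}]}],
)
print(response["output"]["message"]["content"][0]["text"])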
Action Modules & Tools
An AI agent extends its capabilities by using action modules, which define the tasks it can execute. These modules function as a bridge between the agent’s reasoning ability and real-world actions, such as API calls or service interactions.
For instance, if an agent is designed to assist with travel planning, it could include a flight booking module that queries airline APIs for ticket availability and pricing. Action modules typically follow structured formats, such as OpenAPI specifications, and are executed through cloud-based functions like AWS Lambda.
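A minimal sketch of such a module's Lambda side, assuming the event/response contract Bedrock Agents uses for OpenAPI-defined action groups (the /flights/search operation and its parameters are hypothetical):

# lambda_function.py -- handler for a hypothetical flight-search action group
import json

def lambda_handler(event, context):
    # Bedrock passes the matched API operation and its parameters
    api_path = event["apiPath"]  # e.g. "/flights/search"
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}

    if api_path == "/flights/search":
        # Placeholder result; a real handler would call an airline API here
        body = {"flights": [{"from": params.get("origin"),
                             "to": params.get("destination"),
                             "price": 420}]}
    else:
        body = {"error": f"Unknown operation {api_path}"}

    # Response shape the agent expects back from an action-group Lambda
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "apiPath": api_path,
            "httpMethod": event["httpMethod"],
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }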
Knowledge Base
To enhance its accuracy and contextual understanding, an agent can integrate with external knowledge bases. These sources provide supplemental information, enabling the agent to:
- Retrieve relevant data from enterprise systems
- Use Retrieval Augmented Generation (RAG) to improve response quality
- Incorporate domain-specific insights into its outputs
Memory Capabilities
AI agents benefit from both short-term and long-term memory:
- Short-term memory allows the agent to retain context within an ongoing conversation, ensuring coherent responses.
- Long-term memory stores key information from past interactions, enabling personalized and context-aware user experiences over time.
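A sketch of both kinds of memory in practice, assuming an existing agent and alias (IDs are placeholders) and that long-term memory has been enabled on the agent:

import boto3
import uuid

runtime = boto3.client("bedrock-agent-runtime")

# Short-term memory: reuse the same sessionId so the agent keeps
# conversational context across turns
session_id = str(uuid.uuid4())
for question in ["What's your return policy?", "Does that apply to sale items?"]:
    response = runtime.invoke_agent(
        agentId="AGENT_ID",
        agentAliasId="ALIAS_ID",
        sessionId=session_id,
        inputText=question,
        # Long-term memory: a stable memoryId lets the agent recall summaries
        # of past sessions for this user (requires memory enabled on the agent)
        memoryId="user-123",
    )
    answer = "".join(
        event["chunk"]["bytes"].decode("utf-8")
        for event in response["completion"]
        if "chunk" in event
    )
    print(answer)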
Prompt Templates & Workflow Orchestration
Developers can fine-tune an agent’s operation through customizable prompt templates, which influence various stages of processing, including:
- Pre-processing: Refining user input before passing it to the FM
- Decision-making & orchestration: Structuring tasks and invoking action modules
- Knowledge retrieval: Enhancing responses with external data
- Post-processing: Formatting and refining final outputs
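A sketch of what such an override might look like for a Bedrock agent; the template text is illustrative, and the promptOverrideConfiguration structure shown here is passed to create_agent or update_agent:

# Hypothetical override of the pre-processing step; other promptType values
# include ORCHESTRATION, KNOWLEDGE_BASE_RESPONSE_GENERATION, and POST_PROCESSING
prompt_override = {
    "promptConfigurations": [
        {
            "promptType": "PRE_PROCESSING",
            "promptCreationMode": "OVERRIDDEN",  # use our template, not the default
            "promptState": "ENABLED",
            "basePromptTemplate": "Rewrite the user's request as a clear, "
                                  "self-contained task before planning: $question$",
            "inferenceConfiguration": {
                "temperature": 0.0,
                "topP": 1.0,
                "maximumLength": 2048,
            },
        }
    ]
}
# Passed as: bedrock_agent.create_agent(..., promptOverrideConfiguration=prompt_override)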
Guardrails & Feedback Loops
AWS enforces safety and compliance through:
- Content filters to block harmful inputs/outputs.
- Contextual grounding checks to reduce hallucinations.
- Feedback mechanisms where user ratings trigger model retraining or prompt adjustments.
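A minimal sketch of configuring the first two of these with the CreateGuardrail API (the name, blocked messages, filter choices, and grounding threshold are illustrative):

import boto3

bedrock = boto3.client("bedrock")

guardrail = bedrock.create_guardrail(
    name="agent-guardrail",
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
    # Content filters block harmful inputs/outputs by category and strength
    contentPolicyConfig={
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
        ]
    },
    # Contextual grounding check: flag responses not grounded in retrieved sources
    contextualGroundingPolicyConfig={
        "filtersConfig": [{"type": "GROUNDING", "threshold": 0.75}]
    },
)
print(guardrail["guardrailId"])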
By leveraging these components, AI agents can handle complex workflows and provide intelligent, context-aware interactions tailored to diverse use cases.
Creating a Knowledge Base in Amazon Bedrock
To set up a Knowledge Base, users must first prepare their data sources, which can be:
- Unstructured (e.g., documents stored in an Amazon S3 bucket)
- Structured (e.g., databases in Amazon Redshift or AWS Glue Data Catalog)
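Once the Knowledge Base itself exists (see the creation code later in this article), an unstructured source such as an S3 bucket is attached as a data source and ingested. A minimal sketch, with placeholder IDs and ARNs:

import boto3

bedrock_agent = boto3.client("bedrock-agent")

# Attach an S3 bucket of documents as a data source
ds = bedrock_agent.create_data_source(
    knowledgeBaseId="KB_ID",
    name="policy-docs",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::YOUR_BUCKET"},
    },
)

# Start an ingestion job to chunk, embed, and index the documents
bedrock_agent.start_ingestion_job(
    knowledgeBaseId="KB_ID",
    dataSourceId=ds["dataSource"]["dataSourceId"],
)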
During setup, users select an embedding model to convert their data into vector embeddings and choose a vector store to index them. Amazon Bedrock simplifies this process by offering automatic creation and management of a vector store in Amazon OpenSearch Serverless. Supported vector stores include:
- Amazon OpenSearch Serverless
- Amazon Aurora PostgreSQL
- Third-party stores such as Pinecone and Redis Enterprise Cloud
Using a Knowledge Base in Amazon Bedrock
Once created, a Knowledge Base can be leveraged through multiple operations:
- Retrieve: Fetches relevant information based on user queries.
- RetrieveAndGenerate: Enhances AI responses by combining retrieved data with generative AI models.
- SQL Query Conversion: For structured data sources, Bedrock can translate natural language queries into SQL statements, enabling seamless access to enterprise data without requiring complex migrations or preprocessing.
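A minimal sketch of the first two operations via the bedrock-agent-runtime client (the knowledge base ID and model ARN are placeholders):

import boto3

runtime = boto3.client("bedrock-agent-runtime")

# Retrieve: fetch the top matching chunks for a query
chunks = runtime.retrieve(
    knowledgeBaseId="KB_ID",
    retrievalQuery={"text": "What is the refund window?"},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 3}},
)
for result in chunks["retrievalResults"]:
    print(result["content"]["text"])

# RetrieveAndGenerate: ground a generated answer in the retrieved chunks
answer = runtime.retrieve_and_generate(
    input={"text": "What is the refund window?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB_ID",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
        },
    },
)
print(answer["output"]["text"])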
Key Benefits of Using Amazon Bedrock Knowledge Bases
Integrating a Knowledge Base into generative AI applications provides several advantages:
- Fully Managed RAG Workflow: Automates data ingestion, retrieval, and augmentation, reducing operational complexity.
- Improved Accuracy & Relevance: Enhances response precision by grounding AI-generated content in authoritative sources.
- Transparency & Source Attribution: Reduces AI hallucinations by linking responses to verified data.
- Multimodal Support: Enables processing and analysis of both text-based and visual data, broadening AI capabilities.
How an Agent Works Behind the Scenes
An agent is designed to intelligently handle user queries, such as questions about return policies. It not only provides answers but also autonomously determines how to retrieve the relevant information.
Example of Agent Operation
Understanding the Task:
- When a user submits a request, the agent analyzes multiple inputs, including the user prompt, conversation history, available actions, and relevant knowledge sources.
- It then determines the best approach to fulfill the request.
Planning the Execution:
- The agent leverages a generative AI model that employs "chain of thought" reasoning to break down the task into logical steps.
- These steps may involve API calls, database queries, or retrieving information from a Knowledge Base.
Executing the Steps:
The agent performs each step sequentially, such as searching company policies, retrieving structured data, or invoking external tools.
Generating the Final Response:
- Once the necessary data is collected, another Amazon Bedrock model synthesizes the results and formulates a coherent, human-like response.
- This ensures that the output is accurate, well-structured, and contextually relevant.
Example Scenario: AI Agent Assisting a Customer
User Query: "Can I get a refund for an item I purchased last week?"
Agent Analyzes Request: Checks conversation history and identifies relevant data sources.
Knowledge Base Search: Retrieves the company's refund policy from an S3-stored document.
API Call to Order System: Queries the user’s order history via AWS Lambda.
Response Generation: Synthesizes refund policy and order details into a final answer.
Final Response: "You are eligible for a refund within 30 days. Would you like me to process it for you?"
Multi-Agent Orchestration
Multi-agent orchestration is an advanced approach to building complex AI systems that leverages multiple specialized agents working together to solve intricate problems and execute multi-step tasks. This collaborative framework enhances the capabilities of individual AI agents by combining their strengths and expertise.
Key Components
- Supervisor Agent: A central agent that coordinates the overall workflow by:
  - Breaking down complex tasks into manageable subtasks
  - Delegating work to specialized agents
  - Consolidating outputs from various agents
- Specialist Agents: Multiple AI agents with specific areas of expertise, designed to handle particular aspects of a given problem.
- Inter-Agent Communication: A standardized protocol allowing agents to exchange information and coordinate their actions efficiently.
Benefits
- Enhanced Problem-Solving: Tackles complex, multi-step tasks more effectively than single-agent systems
- Improved Accuracy: Combines specialized knowledge from multiple agents
- Increased Efficiency: Enables parallel processing of subtasks
- Scalability: Allows for the addition of new specialized agents as needed
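A sketch of wiring this up with the multi-agent collaboration preview announced at re:Invent; because this API is in preview, parameter names such as agentCollaboration and the collaborator descriptor should be treated as assumptions that may change:

import boto3

bedrock_agent = boto3.client("bedrock-agent")

# Create the supervisor; agentCollaboration="SUPERVISOR" enables delegation
supervisor = bedrock_agent.create_agent(
    agentName="SupervisorAgent",
    foundationModel="anthropic.claude-3-sonnet-20240229-v1:0",
    instruction="Break user requests into subtasks and route them to specialist agents.",
    agentResourceRoleArn="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_ROLE_NAME",
    agentCollaboration="SUPERVISOR",
)

# Register an existing specialist agent (by its alias ARN) as a collaborator
bedrock_agent.associate_agent_collaborator(
    agentId=supervisor["agent"]["agentId"],
    agentVersion="DRAFT",
    agentDescriptor={"aliasArn": "arn:aws:bedrock:us-east-1:YOUR_ACCOUNT_ID:agent-alias/SPECIALIST_ID/ALIAS_ID"},
    collaboratorName="ValidationAgent",
    collaborationInstruction="Use this agent to validate user input.",
)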
Build the AI Agents
First, define the agent definitions file:
# agent_definitions.py
AGENTS = [
    {
        "agentName": "ValidationAgent",
        "instruction": "Use this agent to validate user input.",
        "knowledgeBaseId": "validation_kb"
    },
    {
        "agentName": "ProcessingAgent",
        "instruction": "Use this agent to process data.",
        "knowledgeBaseId": "processing_kb"
    },
    {
        "agentName": "ReportingAgent",
        "instruction": "Use this agent to generate reports.",
        "knowledgeBaseId": "reporting_kb"
    }
]
Next, create an agent backed by a knowledge base and invoke it:
import boto3
import uuid

# Bedrock Agents uses a control-plane client (bedrock-agent) for resource
# creation and a runtime client (bedrock-agent-runtime) for invocation
bedrock_agent = boto3.client('bedrock-agent')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime')

def create_simple_agent():
    # Create a vector knowledge base; the embedding model, OpenSearch Serverless
    # collection, and IAM role ARNs below are placeholders. In production, wait
    # for each resource to reach an active status before the next step.
    kb_response = bedrock_agent.create_knowledge_base(
        name="SimpleKnowledgeBase",
        description="A simple knowledge base for our agent",
        roleArn="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_ROLE_NAME",
        knowledgeBaseConfiguration={
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
            }
        },
        storageConfiguration={
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration": {
                "collectionArn": "arn:aws:aoss:us-east-1:YOUR_ACCOUNT_ID:collection/YOUR_COLLECTION_ID",
                "vectorIndexName": "bedrock-kb-index",
                "fieldMapping": {
                    "vectorField": "embedding",
                    "textField": "text",
                    "metadataField": "metadata"
                }
            }
        }
    )
    knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']

    # Create an agent
    agent_response = bedrock_agent.create_agent(
        agentName="SimpleAgent",
        description="A simple Bedrock agent",
        instruction="You are a helpful assistant that provides information about the AWS ecosystem and solutions.",
        foundationModel="anthropic.claude-v2",
        idleSessionTTLInSeconds=300,
        agentResourceRoleArn="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_CUSTOM_ROLE_NAME"
    )
    agent_id = agent_response['agent']['agentId']

    # Associate the knowledge base with the agent's working (DRAFT) version
    bedrock_agent.associate_agent_knowledge_base(
        agentId=agent_id,
        agentVersion="DRAFT",
        knowledgeBaseId=knowledge_base_id,
        description="Knowledge base for the simple agent"
    )

    # Prepare the agent so the DRAFT version can be aliased and invoked
    bedrock_agent.prepare_agent(agentId=agent_id)

    # Create an agent alias
    alias_response = bedrock_agent.create_agent_alias(
        agentId=agent_id,
        agentAliasName="SimpleAgentAlias"
    )
    agent_alias_id = alias_response['agentAlias']['agentAliasId']
    return agent_id, agent_alias_id

def invoke_agent(agent_id, agent_alias_id, input_text):
    session_id = str(uuid.uuid4())
    response = bedrock_agent_runtime.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,
        inputText=input_text
    )
    # invoke_agent returns an event stream; concatenate the completion chunks
    completion = ""
    for event in response['completion']:
        if 'chunk' in event:
            completion += event['chunk']['bytes'].decode('utf-8')
    return completion

# Create the agent
agent_id, agent_alias_id = create_simple_agent()

# Invoke the agent
input_text = "What is Amazon Managed Service for Apache Flink?"
result = invoke_agent(agent_id, agent_alias_id, input_text)
print(result)
Conclusion
Amazon Bedrock Agents revolutionize AI-driven automation by streamlining complex workflows, intelligently retrieving data, and executing multi-step tasks autonomously. This enables businesses to enhance efficiency and deliver smarter, context-aware interactions at scale.