At the core of Microsoft.Extensions.AI is IChatClient, which serves as a unified abstraction for working with various AI service providers. It takes a list of messages (IList<ChatMessage>) as the input to CompleteAsync(). Each message has a property called Contents, allowing each message to carry multiple pieces of content sequentially, making it a multi-modal message.
```text
IList<ChatMessage>
1. ChatMessage
   - Role: User
   - Contents:
     - Text: "Hello, what is in my image?"
     - Image: "[Image1.jpg]"
2. ChatMessage
   - Role: Assistant
   - Contents:
     - Text: "There is a BMW car in your image"
     - Audio: "[Voice of above sentence.]"
3. ChatMessage
   - Role: User
   - Contents:
     - Text: "What's its price?"
4. ChatMessage
   - Role: Tool
   - Contents:
     - FunctionCall: "GetPrice("BMW")"
```
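The conversation above can be sketched in code as follows. This is illustrative only, assuming the preview Microsoft.Extensions.AI API; the image URL is a placeholder:

```csharp
using Microsoft.Extensions.AI;

// Build the multi-modal conversation history shown above.
IList<ChatMessage> messages =
[
    new ChatMessage(ChatRole.User,
    [
        new TextContent("Hello, what is in my image?"),
        new ImageContent(new Uri("https://www.example.com/Image1.jpg"), "image/jpeg")
    ]),
    new ChatMessage(ChatRole.Assistant, "There is a BMW car in your image"),
    new ChatMessage(ChatRole.User, "What's its price?")
];

// The entire history is sent to the provider in a single call:
// ChatCompletion response = await chatClient.CompleteAsync(messages);
```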
In this blog, I will clarify the content types currently supported. All of these types derive from a base class called AIContent. The inheritance hierarchy of AIContent is as follows:
Different AI Contents
- AIContent
  - TextContent
  - DataContent
    - ImageContent
    - AudioContent
  - UsageContent
  - FunctionCallContent
  - FunctionResultContent
AIContent
Every content is an AIContent. It contains the functionality shared between the different content types.
- AdditionalProperties: A set of key-value properties.
- RawRepresentation: The original raw representation of the content from the underlying implementation. It holds an object of the appropriate type for the underlying technology, whether that is OpenAI, Azure OpenAI, Ollama, or another provider.
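A quick sketch of these shared members (usage is illustrative; the "source" key is an arbitrary example, not a library convention):

```csharp
using Microsoft.Extensions.AI;

AIContent content = new TextContent("Hi");

// Attach arbitrary key-value metadata to any content:
content.AdditionalProperties = new AdditionalPropertiesDictionary
{
    ["source"] = "example"
};

// RawRepresentation is populated by the provider implementation,
// e.g. the corresponding OpenAI SDK object; it may be null here:
object? raw = content.RawRepresentation;
```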
TextContent
It's simple text content with a single property: Text.

```csharp
new TextContent("Hello, what can you do for me?");
```
ImageContent & AudioContent
Both of these types are DataContent and are very similar to each other.

```csharp
new ImageContent(
    uri: new Uri("https://www.example.com/image"),
    mediaType: "image/png"
);

new AudioContent(
    uri: new Uri("https://www.example.com/voice"),
    mediaType: "audio/wav"
);
```
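Because both types derive from DataContent, the data can also be supplied inline as raw bytes instead of by URI. A sketch, assuming the byte-based constructor overload from the preview API and a hypothetical local file:

```csharp
using Microsoft.Extensions.AI;

// "car.png" is a hypothetical local file used for illustration.
byte[] bytes = File.ReadAllBytes("car.png");

// The bytes are carried inline with the message rather than referenced by URI.
var image = new ImageContent(bytes, mediaType: "image/png");
```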
UsageContent
This type of content includes data about the token consumption of the request.

```csharp
new UsageContent(
    new UsageDetails
    {
        InputTokenCount = 10,
        OutputTokenCount = 20,
        TotalTokenCount = 30
    }
);
```
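On the consuming side, usage can be picked out of a response message's contents. A sketch, assuming a ChatCompletion response from CompleteAsync as in the preview API:

```csharp
using Microsoft.Extensions.AI;

// ChatCompletion response = await chatClient.CompleteAsync(messages);
foreach (AIContent item in response.Message.Contents)
{
    // UsageContent exposes the token counts via its Details property.
    if (item is UsageContent usage)
    {
        Console.WriteLine($"Total tokens: {usage.Details.TotalTokenCount}");
    }
}
```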
FunctionCallContent
It indicates a function call that the AI model has requested to be evaluated.

```csharp
new FunctionCallContent(
    callId: "fx12",
    name: "GetFoodMenu",
    arguments: new Dictionary<string, object?>
    {
        ["mood"] = "Happy"
    }
);
```
FunctionResultContent
It indicates that a function call has been invoked by the client, and the result is ready to be reported back to the AI.

```csharp
new FunctionResultContent(
    callId: "fx12",
    name: "GetFoodMenu",
    result: "Pizza, Burger, Ice Cream"
);
```
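Putting the last two types together, a typical round trip appends the function result to the history as a Tool-role message. A sketch, assuming ChatRole.Tool and CompleteAsync from the preview API:

```csharp
using Microsoft.Extensions.AI;

// The model previously asked for GetFoodMenu via a FunctionCallContent
// with callId "fx12". After invoking the function locally, report the
// result back in a Tool-role message so the model can continue:
messages.Add(new ChatMessage(ChatRole.Tool,
[
    new FunctionResultContent(
        callId: "fx12",
        name: "GetFoodMenu",
        result: "Pizza, Burger, Ice Cream")
]));

// ChatCompletion response = await chatClient.CompleteAsync(messages);
```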
Summary
In conclusion, the Microsoft.Extensions.AI library offers a versatile, unified approach to working with various AI service providers through its IChatClient abstraction. By utilizing a list of messages (IList<ChatMessage>), each with a Contents property, it supports multi-modal messages that can include text, images, audio, and function calls. This flexibility allows for more dynamic and interactive AI-driven applications.
The different types of content supported by Microsoft.Extensions.AI
all inherit from the base class AIContent
, which provides shared functionalities. These content types include TextContent
, ImageContent
, AudioContent
, UsageContent
, FunctionCallContent
, and FunctionResultContent
, each with specific properties and functionalities. The provided code examples demonstrate how these content types can be effectively used in practice, making the library a powerful tool for developers looking to integrate AI capabilities into their applications.