Johnny Z

Lightweight AI Evaluation with Semantic Kernel

For quick and easy evaluation or comparison of AI responses in .NET applications, particularly in tests, we can leverage autoevals' excellent 'LLM-as-a-Judge' prompts with the help of Semantic Kernel.

Sample code

Note that you need to set up Semantic Kernel with chat completion first. Setting 'Temperature' to 0 is also recommended, so that the judge's output is as repeatable as possible.
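
As a minimal sketch of that setup, assuming the OpenAI connector from the Microsoft.SemanticKernel NuGet package (the post does not say which provider it uses), the kernel and execution settings could be created roughly like this. The model ID and the OPENAI_API_KEY environment variable name are placeholders, and it is an assumption that the Run extension used in the sample accepts OpenAIPromptExecutionSettings.

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Assumption: the API key is stored in an environment variable named OPENAI_API_KEY.
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("OPENAI_API_KEY is not set.");

// Register a chat completion service on the kernel.
// "gpt-4o-mini" is a placeholder; use whichever chat model you have access to.
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(modelId: "gpt-4o-mini", apiKey: apiKey);
var kernel = builder.Build();

// Temperature 0 keeps the judge's answers as deterministic as the model allows.
var executionSettings = new OpenAIPromptExecutionSettings
{
    Temperature = 0
};
```

Other connectors (for example Azure OpenAI) should work the same way, as long as a chat completion service is registered on the kernel.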

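// Each top-level key names an evaluator ("humor" here);
// "output" is the text to be judged.
var json = 
    """
    {
        "humor" : {
            "output" : "this maybe funny"
        }
    }
    """;
// One entry comes back per evaluator: Key is the evaluator name,
// Value.Item1 is the judge's result and Value.Item2 is the score.
await foreach (var result in 
        kernel.Run(json, executionSettings: executionSettings))
{
    Console.WriteLine($"[{result.Key}]: result: {result.Value?.Item1}, score: {result.Value?.Item2}");
}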

Source

Microsoft.Extensions.AI.Evaluation is in the making, but it currently involves a little too much ceremony for simple use cases.

Please feel free to reach out on Twitter: @roamingcode

Top comments (1)

Vinayak Mishra

This topic always excites me. I came across something interesting: "Using a Jury of LLMs Instead of a Single Judge to Evaluate LLM Generations".