Daniel Elegberun

Posted on Feb 19

Building a personalized workout recommender with ML.NET: A step-by-step guide

#csharp #ai #machinelearning #dotnet

We are currently experiencing the Artificial Intelligence (AI) revolution. The explosion of large language models (LLMs) and machine learning has transformed multiple industries, including the health and fitness space.

One of the most exciting applications of AI in fitness is the creation of personalized workout plans. With tools like ML.NET, developers can leverage the power of machine learning in the C# and .NET ecosystem. In this article, we’ll explore how to build a simple workout recommender application using content filtering in ML.NET.

What is content filtering?

Content-based filtering is a recommendation technique that suggests items (e.g., workouts, movies, products) to users based on the similarity between each item’s features and the preferences the user supplies. It is used in recommendation systems and in information retrieval (e.g., searching for similar documents).

In our workout recommendation, the user would enter the query “beginner chest exercises with dumbbells” and get exercise recommendations that are most similar to the query.

How does content-based filtering work?

Content-based filtering works by analyzing the features of the items in a dataset, comparing them to the user’s query, and then recommending the items that are most similar.

1. Exercise representation

Each exercise in our dataset is represented by its features, such as the body part(s) it targets (chest, legs, arms), equipment needed (dumbbells, resistance bands), or difficulty level (beginner, intermediate, advanced).

2. User query as input

The user enters a query, such as "beginner exercises for the chest with dumbbells".

This query is processed and transformed into a feature vector that represents the user’s intent. For example:

Query: "Beginner exercises for the chest with dumbbells"
Features: {"chest": 1, "equipment": "dumbbell", "difficulty": "beginner"}

3. Similarity measurement

The system compares the user’s query vector with the feature vectors of all exercises in the database.

For this project, we’ll be making use of cosine similarity. Cosine similarity is a common metric used to measure how similar two vectors are. It calculates the cosine of the angle between the two vectors. In general the cosine ranges from -1 to 1, but because our feature vectors contain no negative values, the score falls between 0 and 1, where 1 indicates a perfect match and 0 indicates no match.

4. Recommendation generation

Based on the similarity scores, the system ranks the exercises and recommends the top-N exercises that best match the user’s query.

Cosine similarity in depth

Cosine similarity is particularly useful for comparing the user’s query with the exercise features because it focuses on the direction of the vectors rather than their magnitude. This makes it ideal for comparing sparse or unevenly weighted features.

To compute cosine similarity, we first need to represent the text as numerical vectors. One common way is to use the term frequency or TF-IDF (Term Frequency-Inverse Document Frequency).

Let's consider a simple user query looking for exercises targeting the chest. The original query is “recommend chest exercises”. From the query, we can extract the word: chest. We want to compare this to existing chest workouts in our database for similarity.

User query: "Chest"
Database entry: "Chest press"

We can represent these exercises as vectors based on their features. For simplicity, let’s use “chest” and “press” as the key features:

Words	Chest	Press
Chest	1	0
Chest press	1	1

The resulting vectors are:
Vector A (Chest): [1, 0]
Vector B (Chest press): [1, 1]
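
To make the table concrete, here is a minimal sketch of how such term-frequency vectors could be built in plain C#. The vocabulary array and the TermFrequency helper are illustrative assumptions and are not part of ML.NET or the project code; the app itself relies on ML.NET's FeaturizeText, which is more elaborate.

```csharp
using System;
using System.Linq;

// Vocabulary: one vector slot per known word (an assumption for this sketch).
string[] vocabulary = { "chest", "press" };

// Count how often each vocabulary word occurs in the text (term frequency).
int[] TermFrequency(string text, string[] vocab)
{
    var tokens = text.ToLowerInvariant()
        .Split(' ', StringSplitOptions.RemoveEmptyEntries);
    return vocab.Select(word => tokens.Count(t => t == word)).ToArray();
}

Console.WriteLine(string.Join(", ", TermFrequency("Chest", vocabulary)));       // 1, 0
Console.WriteLine(string.Join(", ", TermFrequency("Chest press", vocabulary))); // 1, 1
```

Each position in the output corresponds to one vocabulary word, so a word that is repeated in the text simply produces a larger count in its slot.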

The angle between the two vectors determines the cosine similarity. In this case, the angle is 45°, and the cosine of 45° is approximately 0.707.

Cosine similarity ignores the frequency of the words

Words	Chest	Press
Chest chest chest	3	0
Chest press	1	1

On a graph, the point for “Chest chest chest” lies further out along the X axis, but the angle between the two vectors is still 45°, so the cosine similarity is unchanged (about 0.707). Cosine similarity is determined by the angle between the vectors and ignores their magnitude.

Cosine similarity when the words are the same

Words	Chest	Press
Chest press	1	1
Chest press	1	1

Since the two phrases are identical, the angle between the vectors is 0°, and the cosine similarity is cos 0° = 1.

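While these examples work well in a 2-dimensional space, real-world scenarios often involve higher-dimensional data (e.g., 5 or more features). In such cases, we use the cosine similarity formula:

$$\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$

Here A and B are the two feature vectors and n is the number of features.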
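
As a rough illustration, cosine similarity is the dot product of the two vectors divided by the product of their lengths, and that works for any number of dimensions. The sketch below is plain C# with a hypothetical CosineSimilarity helper (not an ML.NET API), and it re-checks the three 2-dimensional examples above.

```csharp
using System;
using System.Linq;

// Cosine similarity: dot(a, b) / (|a| * |b|), for vectors of equal length.
double CosineSimilarity(double[] a, double[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double dot = a.Zip(b, (x, y) => x * y).Sum();
    double normA = Math.Sqrt(a.Sum(x => x * x));
    double normB = Math.Sqrt(b.Sum(x => x * x));

    // Treat an all-zero vector as "no match" to avoid dividing by zero.
    return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
}

// "Chest" vs "Chest press": approximately 0.707
Console.WriteLine(CosineSimilarity(new double[] { 1, 0 }, new double[] { 1, 1 }));
// "Chest chest chest" vs "Chest press": still approximately 0.707
Console.WriteLine(CosineSimilarity(new double[] { 3, 0 }, new double[] { 1, 1 }));
// "Chest press" vs "Chest press": approximately 1
Console.WriteLine(CosineSimilarity(new double[] { 1, 1 }, new double[] { 1, 1 }));
```

The zero-vector guard is a design choice for this sketch: a query that matches no known feature has no direction, so scoring it as 0 is one reasonable convention.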

How content-based filtering works in our fitness app

Let’s break it down step by step:

1. Define Item features

The first step in the process is determining what the item’s features are. In our dataset, our item features are:

BodyPart: chest, legs, abs.
Level: beginner, intermediate, expert.
Equipment: dumbbell, barbell, bodyweight.

Example:
Dumbbell Bench Press → Features: BodyPart=chest, Level=beginner, Equipment=dumbbell.

2. Create a user profile

When the user types a query like “beginner chest exercises with dumbbells”, the system:

Extracts keywords: beginner, chest, dumbbells.
Expands synonyms: Maps chest to ["chest", "pectoral"] and legs to ["legs", "glutes", "quads", "hamstrings"].
Encodes preferences: Converts these terms into a numerical vector (a list of numbers representing the user’s preferences). User vector: [0.6, 0.3, 0.1, 1, 0, 0] {chest, beginner, dumbbell}

3. Choose a similarity metric

As discussed previously, we use cosine similarity to compare the user’s preferences to exercises.

Example:

User vector: [0.6, 0.3, 0.1, 1, 0, 0] {chest, beginner, dumbbell}
Exercise vector: [0.5, 0.3, 0.2, 1, 0, 0] {chest, beginner, dumbbell}
Similarity Score: approximately 0.99 (nearly identical!)

4. Score and rank exercises

The system calculates the similarity between the user’s vector and every exercise’s vector, then ranks them from most to least similar.

Exercise	BodyPart	Level	Equipment	Similarity Score
Dumbbell Bench Press	chest	beginner	dumbbell	0.99
Push-ups	chest	beginner	bodyweight	0.85
Barbell Squats	legs	intermediate	barbell	0.12

The user gets the top recommendations: Dumbbell Bench Press and Push-ups.

The Dataset

We’ll be using the Gym Exercise Dataset from Kaggle. This dataset contains the following columns:

Title: The name of the exercise (e.g., Barbell Squats).
BodyPart: The muscle group targeted (e.g., legs, chest).
Equipment: The equipment required (e.g., dumbbell, barbell).
Level: The difficulty level (e.g., beginner, intermediate).
Type: The type of exercise (e.g., strength, cardio).

Building the code

Prerequisites

VS Code or any code editor of your choice
Download .NET (I'm using .NET 8)
Full GitHub code here

Step 1: Setting up the project

Create a .NET console app:

dotnet new console -n WorkoutRecommender
cd WorkoutRecommender

Install the ML.NET package:

dotnet add package Microsoft.ML

Download the dataset: Place the megaGymDataset.csv file in a Data folder within your project.

Step 2: Loading and preprocessing the data

The first step is to load the dataset and preprocess it for machine learning.

Create a class to represent the exercises: I also added a ProcessedExercise class, which adds a property for the body part synonyms, and an ExerciseVector class, which holds the numerical feature vector.

using Microsoft.ML.Data;

public class Exercise
{
   [LoadColumn(1)] public string Title { get; set; }
   [LoadColumn(2)] public string Desc { get; set; }
   [LoadColumn(3)] public string Type { get; set; }
   [LoadColumn(4)] public string BodyPart { get; set; }
   [LoadColumn(5)] public string Equipment { get; set; }
   [LoadColumn(6)] public string Level { get; set; }
}

public class ProcessedExercise : Exercise
{
   // Comma-separated synonyms for the body part, used for synonym mapping
   public string BodyPartSynonyms { get; set; }
}

public class ExerciseVector
{
   [VectorType] // Indicates this is a numerical vector
   public float[] Features { get; set; }
}

Create a body part synonym dictionary: Add a recommendation class file and add a dictionary to it. Since users might enter terms like “legs” instead of “glutes”, we’ll create a synonym dictionary to map related terms.

private static readonly Dictionary<string, List<string>> BodyPartSynonyms = new()
   {
       { "chest", new List<string> { "chest", "pectoral" } },
       { "legs", new List<string> { "legs", "glutes", "quads", "hamstrings" } },
       { "abs", new List<string> { "abs", "core", "abdominals" } },
       { "arms", new List<string> { "arms", "biceps", "triceps" } }
   };

Load the CSV file: Use ML.NET’s LoadFromTextFile to read the dataset and convert it to a list of Exercise objects. The method below then maps each exercise to a ProcessedExercise that carries its body part synonyms.

// Method to load exercises from CSV
   public List<ProcessedExercise> LoadExercises()
   {
       // Step 1: Load raw CSV data using ML.NET
       var mlContext = new MLContext();
       var dataPath = Path.Combine(Directory.GetCurrentDirectory(), "Data", "megaGymDataset.csv");
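       var dataView = mlContext.Data.LoadFromTextFile<Exercise>(
           path: dataPath,
           separatorChar: ',',
           hasHeader: true, // The first row of the CSV contains the column names
           allowQuoting: true // Keep quoted fields (e.g., descriptions containing commas) in one column
       );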

       // Convert to list of Exercise objects
       var exercises = mlContext.Data.CreateEnumerable<Exercise>(dataView, reuseRowObject: false).ToList();

       // Convert to ProcessedExercise and add synonyms
       var processedExercises = exercises.Select(e => new ProcessedExercise
       {
           Title = e.Title,
           Desc = e.Desc,
           BodyPart = e.BodyPart,
           Equipment = e.Equipment,
           Level = e.Level,
           Type = e.Type,
           BodyPartSynonyms = string.Join(",", GetSynonymsForBodyPart(e.BodyPart)) // Join synonyms into a single string
       }).ToList();


       return processedExercises;
   }


   private List<string> GetSynonymsForBodyPart(string bodyPart)
   {
       // Normalize body part to lowercase
       var normalizedBodyPart = bodyPart.Trim().ToLower();


       // Find the synonym group that contains this body part
       var matchingGroup = BodyPartSynonyms
           .FirstOrDefault(kvp => kvp.Value.Contains(normalizedBodyPart));

       return matchingGroup.Value ?? new List<string> { normalizedBodyPart };
   }

Step 3: Building the ML pipeline

The ML pipeline is the heart of our application. It transforms raw data into a format the computer can understand. Here’s how we build the ML pipeline in the Program.cs file.

 var pipeline = mlContext.Transforms.Text.FeaturizeText(
       outputColumnName: "BodyPartFeatures",
       inputColumnName: nameof(ProcessedExercise.BodyPartSynonyms))
   .Append(mlContext.Transforms.Categorical.OneHotEncoding(
       outputColumnName: "LevelFeatures",
       inputColumnName: nameof(Exercise.Level)))
   .Append(mlContext.Transforms.Categorical.OneHotEncoding(
       outputColumnName: "EquipmentFeatures",
       inputColumnName: nameof(Exercise.Equipment)))
   .Append(mlContext.Transforms.Concatenate(
       outputColumnName: "Features",
       "BodyPartFeatures",
       "LevelFeatures",
       "EquipmentFeatures"));

FeaturizeText: Converts body part synonyms into numerical vectors.
OneHotEncoding: Converts categorical features like Level and Equipment into binary vectors.
Concatenate: Combines all features into a single vector for each exercise.
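
To show conceptually what one-hot encoding followed by concatenation produces, here is a small sketch in plain C#. It does not use ML.NET; the category lists and the OneHot helper are hypothetical, and in the real pipeline the categories are learned from the data when the pipeline is fitted.

```csharp
using System;
using System.Linq;

// Hypothetical category lists, fixed by hand for this sketch.
string[] levels = { "beginner", "intermediate", "expert" };
string[] equipment = { "dumbbell", "barbell", "bodyweight" };

// One-hot: 1 in the slot of the matching category, 0 elsewhere.
float[] OneHot(string value, string[] categories) =>
    categories.Select(c => c == value ? 1f : 0f).ToArray();

// Concatenate: join the per-column vectors into one feature vector.
float[] features = OneHot("beginner", levels)
    .Concat(OneHot("dumbbell", equipment))
    .ToArray();

Console.WriteLine(string.Join(", ", features)); // 1, 0, 0, 1, 0, 0
```

The first three slots encode the level and the last three encode the equipment, so two exercises that share a level or a piece of equipment share a 1 in the same position.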

Step 4: Preprocessing and Prediction Engine

var preprocessedData = pipeline.Fit(dataView);
var predictionEngine = mlContext.Model.CreatePredictionEngine<ProcessedExercise, ExerciseVector>(preprocessedData);

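Fit: Trains the pipeline on the exercise data (dataView is the IDataView containing the processed exercises) and returns the fitted model, stored here in preprocessedData.
predictionEngine: Generates the feature vector for a single input, including new user queries.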

Step 5: Handling User Queries

To make the system user-friendly, we’ll parse natural language queries and extract keywords.

// Get user input
Console.WriteLine("Enter your query:");
var query = Console.ReadLine();


var userQuery = helper.ParseInput(query); // e.g., query = "Recommend leg workouts for intermediates"

1. ParseInput: Extracts key components (e.g., BodyParts, Level, Equipment) from the user’s query using regular expressions.

public UserQuery ParseInput(string query)
   {
       // Requires: using System.Text.RegularExpressions;
       var userQuery = new UserQuery();

       // Case-insensitive regex patterns. The optional singular/plural forms let queries such as
       // "leg workouts", "intermediates", or "dumbbells" match; Groups[1] holds the base term.
       const string bodyPartPattern = @"(?i)\b(chest|legs?|abs|arms?|core|glutes|back|traps|neck|shoulders?)\b";
       const string levelPattern = @"(?i)\b(beginner|intermediate|expert|advanced)s?\b";
       const string equipmentPattern = @"(?i)\b(dumbbell|barbell|kettlebells|bodyweight|bands|cable|machine|body)s?\b";

       // Extract body parts, normalizing singular forms to the plural used elsewhere
       userQuery.BodyParts = Regex.Matches(query, bodyPartPattern)
           .Select(m => m.Value.ToLowerInvariant())
           .Select(p => (p is "leg" or "arm" or "shoulder") ? p + "s" : p)
           .Distinct()
           .ToList();

       // Extract fitness level (default to "beginner" if unspecified)
       var levelMatch = Regex.Match(query, levelPattern);
       userQuery.Level = levelMatch.Success ? levelMatch.Groups[1].Value.ToLowerInvariant() : "beginner";

       // Extract equipment (optional)
       var equipmentMatch = Regex.Match(query, equipmentPattern);
       userQuery.Equipment = equipmentMatch.Success ? equipmentMatch.Groups[1].Value.ToLowerInvariant() : null;

       return userQuery;
   }

2. Expand synonyms (ExpandQuery): Maps user-friendly terms like "legs" to dataset-specific terms like "glutes".

public string ExpandQuery(List<string> userBodyParts)
   {
       var expandedTerms = new List<string>();
       foreach (var term in userBodyParts)
       {
           var normalizedTerm = term.Trim().ToLower();
           if (BodyPartSynonyms.ContainsKey(normalizedTerm))
           {
               expandedTerms.AddRange(BodyPartSynonyms[normalizedTerm]);
           }
           else
           {
               var matchingGroup = BodyPartSynonyms
                   .FirstOrDefault(kvp => kvp.Value.Contains(normalizedTerm));
               if (matchingGroup.Value != null)
               {
                   expandedTerms.AddRange(matchingGroup.Value);
               }
               else
               {
                   expandedTerms.Add(normalizedTerm);
               }
           }
       }
       return string.Join(",", expandedTerms.Distinct()); // Join into a single string
   }

Step 6: Generating Recommendations

Finally, we’ll compare the user’s input to the dataset and recommend exercises. Here, userVector is the feature vector that the prediction engine generated for the user’s query.

var recommendations = exercises
   .Select(e => new
   {
       Exercise = e,
       Similarity = helper.ComputeSimilarity(userVector, predictionEngine.Predict(e).Features)
   })
   .OrderByDescending(x => x.Similarity)
   .Take(5);

ComputeSimilarity: Calculates cosine similarity between the user’s vector and each exercise’s vector.
OrderByDescending: Ranks exercises by similarity score.
Take(5): Returns the top 5 recommendations.
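
The article does not list ComputeSimilarity itself, so the following is only a sketch of what such a helper could look like, together with the same rank-and-take pattern. The local ComputeSimilarity function, the sample exercises, and their feature vectors are made up for illustration; in the real app the vectors come from the prediction engine.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for helper.ComputeSimilarity (cosine similarity).
float ComputeSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    float dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // Zero vectors have no direction; score them as 0.
    if (normA == 0 || normB == 0) return 0;
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Made-up vectors: [chest, legs, beginner, intermediate, dumbbell, barbell]
float[] userVector = { 1, 0, 1, 0, 1, 0 };
var exercises = new (string Title, float[] Features)[]
{
    ("Barbell Squats",       new float[] { 0, 1, 0, 1, 0, 1 }),
    ("Dumbbell Bench Press", new float[] { 1, 0, 1, 0, 1, 0 }),
    ("Barbell Bench Press",  new float[] { 1, 0, 1, 0, 0, 1 }),
};

// Score every exercise, sort by score, and keep the best two.
var top = exercises
    .Select(e => (e.Title, Score: ComputeSimilarity(userVector, e.Features)))
    .OrderByDescending(x => x.Score)
    .Take(2)
    .ToList();

foreach (var (title, score) in top)
    Console.WriteLine($"{title}: {score}");
```

With these sample vectors, Dumbbell Bench Press ranks first because it matches the query on all three features, Barbell Bench Press comes second with two matches, and Barbell Squats is dropped because it shares none.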

Full Program.cs class here.

Output

In this article, we built a personalized workout recommendation engine with ML.NET. We created a system that understands user queries like "leg workouts for intermediates" and ranks exercises by the cosine similarity between the query’s features and each exercise’s features.

Follow me on Dev.to and Medium for more AI, .NET, and fitness content.