Maxwell Ugochukwu


Automating Sports Highlights Backup with AWS ECS, DynamoDB, and S3

Introduction

In this technical blog, we will explore the SportsDataBackup project, which automates fetching sports highlights, storing data in Amazon S3 and DynamoDB, processing videos, and running on a schedule using AWS ECS Fargate and EventBridge. This guide will walk you through the setup, configuration, and deployment of this cloud-native automation.

Project Overview

The SportsDataBackup system is designed to:

  • Retrieve sports highlights from RapidAPI.
  • Store metadata in Amazon DynamoDB.
  • Save highlight videos in Amazon S3.
  • Process videos using AWS MediaConvert.
  • Schedule execution using AWS EventBridge and ECS Fargate.
  • Monitor logs via Amazon CloudWatch.

Prerequisites

Before we begin, make sure the following accounts, tools, and credentials are in place:

1. Create a RapidAPI Account

  1. Register on RapidAPI.

  2. Get your API key to access sports highlights.

2. Install Required Tools

  • Docker (Pre-installed in most environments)
    docker --version

  • AWS CLI (Pre-installed in AWS CloudShell)
    aws --version

  • Python 3
    python3 --version

  • gettext (provides envsubst, used for environment variable substitution)
    Install on Ubuntu/Debian: sudo apt install gettext
    Install on macOS (Homebrew): brew install gettext
    Install on Windows (Chocolatey): choco install gettext
    Alternatively, follow the gettext installation instructions for your platform.
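
The version checks above can be combined into one pass. The following is a minimal sketch, not part of the repository: missing_tools is a hypothetical helper name, and the tool list is taken from the prerequisites in this section.

```shell
# Print the name of each required tool that is not found on PATH.
# missing_tools is a hypothetical helper, not part of the project.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools docker aws python3 envsubst)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools are installed."
fi
```

Install anything it reports before continuing with the setup steps.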

3. Retrieve AWS Account ID

  • Run the following command: aws sts get-caller-identity --query "Account" --output text

  • Save your AWS Account ID for later.
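
AWS account IDs are 12-digit numbers, so a quick sanity check on the saved value can catch copy and paste mistakes early. This is a small sketch with a hypothetical helper name (is_account_id) and a placeholder ID.

```shell
# Succeed only if the argument looks like a 12-digit AWS account ID.
# is_account_id is a hypothetical helper, not part of the project.
is_account_id() {
  case "$1" in
    *[!0-9]*|"") return 1 ;;   # reject empty strings and non-digit characters
  esac
  [ "${#1}" -eq 12 ]
}

# Placeholder ID for illustration; in practice, pass the output of
# aws sts get-caller-identity --query "Account" --output text
if is_account_id "123456789012"; then
  echo "looks like a valid account ID"
fi
```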

4. Retrieve AWS Access Keys

  1. In the IAM console, go to Users, select your user, and open the Security credentials tab.
  2. Create an access key and save both the Access Key ID and the Secret Access Key.

Step-by-Step Setup

Step 1: Clone the Repository

git clone https://github.com/princemaxi/SportsDataBackup
cd SportsDataBackup/src

Step 2: Configure Environment Variables

Modify the .env file with the relevant values:

AWS_ACCOUNT_ID=your-account-id
AWS_ACCESS_KEY=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-s3-bucket
RAPIDAPI_KEY=your-rapidapi-key
MEDIA_CONVERT_ENDPOINT=your-mediaconvert-endpoint
SUBNET_ID=subnet-xxx
SECURITY_GROUP_ID=sg-xxx

To find the value for MEDIA_CONVERT_ENDPOINT, run:

aws mediaconvert describe-endpoints --query "Endpoints[0].Url" --output text

Steps for getting the subnet ID and security group ID:

  1. In the GitHub repository, open the resources folder and copy the entire contents of the setup script it contains.
  2. In AWS CloudShell or the VS Code terminal, create a file named vpc_setup.sh, paste the script into it, and run it.
  3. The script prints the IDs in its output. Paste these values into SUBNET_ID and SECURITY_GROUP_ID in the .env file.

Step 3: Load Environment Variables

set -a
source .env
set +a
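
The set -a wrapper matters because tools such as envsubst run as child processes and only see exported variables. The sketch below illustrates the difference with a throwaway file and a made-up variable name (DEMO_VAR). It uses the portable "." command, which behaves like source in bash.

```shell
# Create a throwaway .env-style file (DEMO_VAR is a made-up name).
demo_env=$(mktemp)
echo 'DEMO_VAR=hello' > "$demo_env"

# Without set -a: the variable is set in the shell but not exported,
# so a child process cannot see it.
without=$( . "$demo_env"; sh -c 'echo "${DEMO_VAR:-unset}"' )

# With set -a: every assignment made while it is active is exported.
with=$( set -a; . "$demo_env"; set +a; sh -c 'echo "${DEMO_VAR:-unset}"' )

echo "without set -a: $without"
echo "with set -a:    $with"
rm -f "$demo_env"
```

The first line reports unset and the second reports hello, which is why plain source .env alone would not be enough for the later envsubst step.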

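Verify the variables. Each command should print a non-empty value. AWS_LOGS_GROUP and TASK_FAMILY do not appear in the listing above, so they are assumed to be defined in the repository's .env file: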

echo $AWS_LOGS_GROUP
echo $TASK_FAMILY
echo $AWS_ACCOUNT_ID
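
Rather than echoing variables one at a time, the whole set can be checked in a loop. This is a minimal sketch: require_vars is a hypothetical helper, and the variable names are the ones shown in the .env listing in Step 2. It uses printenv so that it only counts variables that are actually exported.

```shell
# Print each named variable that is missing from the environment or empty.
# Returns non-zero if any variable is missing.
# require_vars is a hypothetical helper, not part of the project.
require_vars() {
  rc=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name"
      rc=1
    fi
  done
  return "$rc"
}

require_vars AWS_ACCOUNT_ID AWS_REGION S3_BUCKET_NAME RAPIDAPI_KEY \
  MEDIA_CONVERT_ENDPOINT SUBNET_ID SECURITY_GROUP_ID \
  || echo "Fix the .env file and reload it before continuing."
```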

Step 4: Generate JSON Configuration Files

Use envsubst to replace placeholders in template files:

envsubst < taskdef.template.json > taskdef.json
envsubst < s3_dynamodb_policy.template.json > s3_dynamodb_policy.json
envsubst < ecsTarget.template.json > ecsTarget.json
envsubst < ecseventsrole-policy.template.json > ecseventsrole-policy.json
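
One caveat: envsubst replaces a reference to an unset variable with an empty string and does not report an error, so a missing value can slip silently into the generated JSON. The sketch below lists the variables a template refers to and flags any that are not exported. It assumes the templates use the braced ${VAR} form (the repository's templates may differ), and template_vars and check_template are hypothetical helper names.

```shell
# List the distinct ${VAR} names referenced in a template file.
# Assumes the braced ${VAR} form; bare $VAR references are not detected.
template_vars() {
  grep -o '\${[A-Za-z_][A-Za-z0-9_]*}' "$1" | tr -d '${}' | sort -u
}

# Print every referenced variable that is not exported or is empty.
# Returns non-zero if any such variable is found.
check_template() {
  rc=0
  for name in $(template_vars "$1"); do
    if [ -z "$(printenv "$name")" ]; then
      echo "unset: $name"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check a template before rendering it, if it is present.
if [ -f taskdef.template.json ]; then
  check_template taskdef.template.json && echo "taskdef.template.json: all variables set"
fi
```

Running check_template on each template before the envsubst commands gives an early warning instead of a confusing AWS CLI error later.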

Step 5: Build and Push Docker Image

  • Create an ECR Repository

aws ecr create-repository --repository-name sports-backup

  • Login to AWS ECR

aws ecr get-login-password --region ${AWS_REGION} | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com

  • Build Docker Image

docker build -t sports-backup .

  • Tag and Push the Image

docker tag sports-backup:latest ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/sports-backup:latest
docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/sports-backup:latest
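
The same ECR image URI appears in both the tag and push commands, so building it once reduces the chance of a typo. This is a small sketch, assuming a hypothetical helper named ecr_image_uri and placeholder account and region values.

```shell
# Build the fully qualified ECR image URI from its parts.
# Arguments: account ID, region, repository name, optional tag (default: latest).
# ecr_image_uri is a hypothetical helper, not part of the project.
ecr_image_uri() {
  echo "$1.dkr.ecr.$2.amazonaws.com/$3:${4:-latest}"
}

# Placeholder values for illustration; use your own account ID and region.
IMAGE_URI=$(ecr_image_uri "123456789012" "us-east-1" "sports-backup")
echo "$IMAGE_URI"

# The tag and push commands then become:
#   docker tag sports-backup:latest "$IMAGE_URI"
#   docker push "$IMAGE_URI"
```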

Step 6: Create AWS Resources

  • Register the ECS Task Definition

aws ecs register-task-definition --cli-input-json file://taskdef.json --region ${AWS_REGION}

  • Create CloudWatch Log Group

aws logs create-log-group --log-group-name "${AWS_LOGS_GROUP}" --region ${AWS_REGION}

  • Attach S3/DynamoDB Policy to ECS Task Execution Role

aws iam put-role-policy --role-name ecsTaskExecutionRole --policy-name S3DynamoDBAccessPolicy --policy-document file://s3_dynamodb_policy.json

  • Create ECS Events Role

aws iam create-role --role-name ecsEventsRole --assume-role-policy-document file://ecsEventsRole-trust.json

  • Attach ECS Events Role Policy

aws iam put-role-policy --role-name ecsEventsRole --policy-name ecsEventsPolicy --policy-document file://ecseventsrole-policy.json
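
If any of these commands rejects its input file, a useful first check is whether the generated files parse as JSON at all. The sketch below uses Python's built-in json.tool module (Python 3 is already a prerequisite). valid_json is a hypothetical helper name, and the file names are the ones generated in Step 4.

```shell
# Succeed if the file parses as JSON (uses Python's built-in json.tool).
# valid_json is a hypothetical helper, not part of the project.
valid_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

for f in taskdef.json s3_dynamodb_policy.json ecsTarget.json ecseventsrole-policy.json; do
  if [ ! -f "$f" ]; then
    echo "$f: not found"
  elif valid_json "$f"; then
    echo "$f: ok"
  else
    echo "$f: invalid JSON"
  fi
done
```

This only checks syntax. A file can be valid JSON and still contain empty values if a variable was unset when the template was rendered.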

Step 7: Schedule the Task with AWS EventBridge

  • Create the EventBridge Rule

aws events put-rule --name SportsBackupScheduleRule --schedule-expression "rate(1 day)" --region ${AWS_REGION}

  • Add Target to the Rule

aws events put-targets --rule SportsBackupScheduleRule --targets file://ecsTarget.json --region ${AWS_REGION}
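
When changing the schedule, note that EventBridge rate expressions are strict about units: a value of 1 takes a singular unit, as in rate(1 day), and larger values take a plural unit, as in rate(12 hours). The following sketch checks an expression locally before it is passed to put-rule. valid_rate_expr is a hypothetical helper and covers only rate expressions, not cron expressions.

```shell
# Succeed if the argument is a well-formed EventBridge rate expression.
# A value of 1 requires a singular unit; larger values require a plural unit.
# valid_rate_expr is a hypothetical helper, not part of the project.
valid_rate_expr() {
  case "$1" in
    "rate(1 minute)"|"rate(1 hour)"|"rate(1 day)") return 0 ;;
  esac
  echo "$1" | grep -Eq '^rate\(([2-9]|[1-9][0-9]+) (minutes|hours|days)\)$'
}

for expr in "rate(1 day)" "rate(12 hours)" "rate(1 days)" "rate(2 day)"; do
  if valid_rate_expr "$expr"; then
    echo "$expr: ok"
  else
    echo "$expr: invalid"
  fi
done
```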

Step 8: Manually Test the ECS Task

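The command below assumes that an ECS cluster named sports-backup-cluster already exists. If it does not, create it first with aws ecs create-cluster --cluster-name sports-backup-cluster.

aws ecs run-task \
  --cluster sports-backup-cluster \
  --launch-type FARGATE \
  --task-definition ${TASK_FAMILY} \
  --network-configuration "awsvpcConfiguration={subnets=[\"${SUBNET_ID}\"],securityGroups=[\"${SECURITY_GROUP_ID}\"],assignPublicIp=\"ENABLED\"}" \
  --region ${AWS_REGION}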
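
The escaped quotes in the network configuration are easy to get wrong. The AWS CLI shorthand syntax also accepts the same value without inner quotes, so one option is to build the string separately. This is a sketch with a hypothetical helper name (awsvpc_config) and placeholder IDs.

```shell
# Build the --network-configuration value in AWS CLI shorthand syntax.
# Arguments: subnet ID, security group ID.
# awsvpc_config is a hypothetical helper, not part of the project.
awsvpc_config() {
  echo "awsvpcConfiguration={subnets=[$1],securityGroups=[$2],assignPublicIp=ENABLED}"
}

# Placeholder IDs for illustration only.
NETWORK_CONFIG=$(awsvpc_config "subnet-xxx" "sg-xxx")
echo "$NETWORK_CONFIG"

# It would then be passed to run-task as:
#   --network-configuration "$NETWORK_CONFIG"
```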

Key Learnings

  • Using templated JSON files for automated AWS configurations.
  • Storing and backing up sports highlight data in DynamoDB and S3.
  • Deploying event-driven ECS tasks with AWS Fargate and EventBridge.
  • Monitoring logs and task execution in CloudWatch.

Future Enhancements

  • Automated backup of DynamoDB tables to S3.
  • Batch processing of JSON files to handle multiple videos per execution.

Conclusion

The SportsDataBackup project demonstrates how AWS services can be combined to automate data ingestion, storage, processing, and scheduling. With a fully automated setup, this solution ensures reliable backup and processing of sports highlights using AWS cloud-native tools.

If you're interested in contributing, feel free to fork the repository and submit pull requests! 🚀
