Gbenga Ojo-Samuel

Sports Data Backup - DynamoDB, ECS, AWS MediaConvert & EventBridge

Introduction
Managing and processing sports data efficiently requires a robust and scalable infrastructure. This project integrates Amazon DynamoDB to store game data fetched from RapidAPI, providing persistent storage, scalability, and efficient querying for processing and analytics. AWS Systems Manager (SSM) Parameter Store securely manages the API keys, improving security and maintainability, and AWS MediaConvert transcodes the game highlight videos stored in an S3 bucket, ensuring consistent video quality and compatibility across multiple platforms.

Solution Architecture

(Solution architecture diagram)

  1. Fetch NCAA game data from RapidAPI:
    • stores the raw JSON sports data in an S3 bucket
    • stores structured sports data in Amazon DynamoDB for quick retrieval and long-term storage.

  2. Containerized Processing with ECS:
    • ECS (Fargate) hosts a containerized service that processes the game data.
    • retrieves the secured API keys from AWS Systems Manager Parameter Store (see the sketch after this list)
    • AWS ECR stores and manages the Docker images

  3. Video Transcoding with AWS MediaConvert:
    • AWS MediaConvert enhances the quality of the video clips and saves the output back to S3.

  4. Monitoring and Alerting with CloudWatch:
    • CloudWatch Logs capture application and infrastructure logs.
    • CloudWatch Alarms notify on failures or performance degradation.
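
As a minimal sketch of the SSM step above, the container can read the RapidAPI key at startup with boto3. The parameter name and region here are placeholders, not necessarily the values used by this project's Terraform configuration:

```python
import boto3

def get_rapidapi_key(param_name: str = "/sports-backup/rapidapi-key") -> str:
    """Fetch the RapidAPI key stored as a SecureString in SSM Parameter Store."""
    ssm = boto3.client("ssm", region_name="us-east-1")  # region is an assumption
    response = ssm.get_parameter(Name=param_name, WithDecryption=True)
    return response["Parameter"]["Value"]
```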

Prerequisites

Before we get started, ensure you have the following:

  1. RapidAPI Account: Sign up for a RapidAPI account and subscribe to an API that provides NCAA game highlights.
  2. AWS Account: AWS access with the permissions required for the services used in this project.
  3. Docker Installed: Install Docker on your system to run the containerized workflow.
  4. Terraform Installed: Ensure Terraform is installed for infrastructure deployment.
  5. Basic CLI Knowledge: Familiarity with using command-line tools for API requests, AWS configurations, and Terraform commands.

Tech Stack

  • Python
  • AWS ECR, ECS, DynamoDB, & Elemental MediaConvert
  • Docker
  • Terraform

Step 1: Clone the Project Repository

Use the git clone command to clone the project repository to your local machine.

git clone https://github.com/OjoOluwagbenga700/sports-data-backup.git

Ensure that you have Git installed and configured on your system.

Navigate to the repository: cd sports-data-backup

Step 2: Reviewing the Repository Contents

Let's take a closer look at the file structure and its contents.


app folder: Contains the main application code, with each Python script serving a specific function.

  • config.py: Loads environment variables, assigning defaults where needed.

  • fetch.py: Retrieves highlights from the API and stores them as a JSON file in S3 and as structured data in DynamoDB (a sketch follows this list).

  • process_videos.py: Extracts video URLs from the JSON data file in S3, downloads them, and saves them to S3 under the videos/ folder, logging each step.

  • mediaconvert_process.py: Processes videos using MediaConvert, configuring codec, resolution, bitrate, and audio settings before storing the output back in S3 (see the sketch after this list).

  • run_all.py: Executes the scripts sequentially, adding buffer time between steps so each task completes before the next begins (a short sketch follows this list).

  • Dockerfile: Provides the step-by-step instructions to build the Docker image.
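
Below is a condensed sketch of the fetch.py flow, assuming a hypothetical RapidAPI endpoint, bucket name, and table schema that may differ from the actual repository:

```python
import json
import boto3
import requests

def fetch_and_store(api_key: str, bucket: str, table_name: str) -> None:
    # Call a RapidAPI highlights endpoint (URL, headers, and params are placeholders).
    response = requests.get(
        "https://sport-highlights-api.p.rapidapi.com/basketball/highlights",
        headers={"x-rapidapi-key": api_key},
        params={"league": "NCAA", "limit": 10},
        timeout=30,
    )
    response.raise_for_status()
    highlights = response.json()

    # Persist the raw JSON payload in S3 for auditing and reprocessing.
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key="highlights/basketball_highlights.json",
        Body=json.dumps(highlights),
        ContentType="application/json",
    )

    # Store each record as a structured item in DynamoDB for fast lookups.
    table = boto3.resource("dynamodb").Table(table_name)
    with table.batch_writer() as batch:
        for item in highlights.get("data", []):
            batch.put_item(Item={
                "id": str(item.get("id")),
                "title": str(item.get("title", "")),
                "video_url": str(item.get("url", "")),
            })
```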
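
A pared-down sketch of how mediaconvert_process.py might submit a transcode job is shown next; the role ARN, destination, and quality settings are placeholders, and the repository's job settings are more detailed:

```python
import boto3

def create_transcode_job(role_arn: str, input_uri: str, output_prefix: str) -> str:
    # MediaConvert calls must go to an account-specific endpoint.
    endpoint = boto3.client("mediaconvert").describe_endpoints()["Endpoints"][0]["Url"]
    mc = boto3.client("mediaconvert", endpoint_url=endpoint)

    job = mc.create_job(
        Role=role_arn,  # IAM role MediaConvert assumes to read/write S3
        Settings={
            "Inputs": [{
                "FileInput": input_uri,
                "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
                "VideoSelector": {},
            }],
            "OutputGroups": [{
                "OutputGroupSettings": {
                    "Type": "FILE_GROUP_SETTINGS",
                    "FileGroupSettings": {"Destination": output_prefix},
                },
                "Outputs": [{
                    "ContainerSettings": {"Container": "MP4", "Mp4Settings": {}},
                    "VideoDescription": {
                        "CodecSettings": {
                            "Codec": "H_264",
                            "H264Settings": {
                                "RateControlMode": "QVBR",
                                "MaxBitrate": 5000000,
                                "QvbrSettings": {"QvbrQualityLevel": 8},
                            },
                        },
                    },
                    "AudioDescriptions": [{
                        "CodecSettings": {
                            "Codec": "AAC",
                            "AacSettings": {
                                "Bitrate": 96000,
                                "CodingMode": "CODING_MODE_2_0",
                                "SampleRate": 48000,
                            },
                        },
                    }],
                }],
            }],
        },
    )
    return job["Job"]["Id"]
```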
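
As for run_all.py, one simple way to sequence the stages looks like this (the repository may orchestrate them differently, for example by importing the modules directly):

```python
import subprocess
import time

# Run each stage in order; the pause gives S3/DynamoDB writes and the
# MediaConvert queue a buffer before the next stage starts.
for script in ("fetch.py", "process_videos.py", "mediaconvert_process.py"):
    subprocess.run(["python", script], check=True)
    time.sleep(30)
```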

container_definitions.tpl: A template file for defining ECS container definitions in Terraform.

Terraform Infrastructure Files: These .tf files define the AWS infrastructure using Terraform.

  • dynamodb.tf: Configures the Amazon DynamoDB table to store sports data.
  • ecr.tf: Defines the AWS Elastic Container Registry (ECR) for storing Docker images.
  • ecs.tf: Configures AWS Elastic Container Service (ECS) to run the application.
  • iam.tf: Defines IAM roles and permissions required for AWS services.
  • networking.tf: Sets up networking components (VPC, subnets, security groups).
  • provider.tf: Specifies the AWS provider configuration for Terraform.
  • s3.tf: Manages S3 buckets for storing raw and processed videos.
  • scheduler.tf: Defines the EventBridge Scheduler schedule used to run the containerized workflow.
  • ssm.tf: Configures AWS Systems Manager (SSM) Parameter Store for secret management.
  • terraform.tfvars: Contains variable values used in Terraform.
  • variable.tf: Declares input variables for Terraform configurations.

Step 3: Running Terraform Commands

Initialize Terraform: Run terraform init to initialize the Terraform working directory and download the necessary provider plugins.

Plan the Deployment: Run terraform plan to preview the resources that Terraform will create.

Apply the Configuration: Run terraform apply --auto-approve to deploy the infrastructure.

Step 4: Verify the Application Deployment

Confirm that the following resources have been created in the AWS console:

  • SSM Parameter Store
  • EventBridge Scheduler
  • S3 bucket
  • AWS ECR
  • AWS ECS
  • AWS Elemental MediaConvert
  • DynamoDB
  • AWS CloudWatch Logs

Conclusion

This solution fetches game data from RapidAPI and stores it durably in Amazon DynamoDB, processes it in containers on ECS, refines the video workflow with MediaConvert, and adds robust monitoring with CloudWatch. AWS SSM Parameter Store ensures secure management of API keys and secrets, reducing exposure risks and improving security. By using Terraform, we ensure a repeatable and scalable infrastructure.
