Welcome to the AWS Audio Transcription Automation project! This CloudFormation stack automates transcription of audio files (MP4, MP3, and WAV) using Amazon Transcribe. Easily upload your audio files to S3, trigger transcription jobs, and store results in an output S3 bucket — all automated! 🎉
Architecture Diagram
GitHub repo: https://github.com/Warner-Bell/audio-transcription-build
🚀 Features
- Automatic Transcription: Supports MP4, MP3, and WAV audio files, using Amazon Transcribe.
- Secure Storage: AES-256 encryption for S3 buckets.
- Lifecycle Management: Automatically expire input files after 1 day and log files after 2 days.
- Logging: Logs Lambda function execution and S3 access.
- Event-Driven Transcription: Automatically triggers transcription jobs upon file upload.
- Final HTML Output: Transcription results are processed into an HTML file, viewable on any browser for easy reading.
🛠️ Technology Stack
- Amazon S3: Storage for audio files, transcription results, and access logs.
- AWS Lambda: Event-driven function to trigger transcription jobs.
- Amazon Transcribe: Speech-to-text transcription service.
- Amazon CloudWatch: Logs Lambda function activity.
- IAM Roles: Manages permissions for Lambda and S3.
🎯 How It Works
- Upload audio files (MP4, MP3, or WAV) to the designated S3 input bucket.
- Lambda Triggered: Upon upload, a Lambda function triggers an Amazon Transcribe job.
- Transcription Results: Transcription results are stored in the specified S3 output bucket, and formatted as an HTML file for easy viewing.
🧩 CloudFormation Resources
This CloudFormation stack provisions the following resources:
- Input S3 Bucket: For audio files awaiting transcription.
- Output S3 Bucket: Stores completed transcription results.
- Logging S3 Bucket: Logs access events for both input and output buckets.
- AWS Lambda Function: Automatically triggers transcription jobs.
- IAM Role: Provides necessary permissions for the Lambda function.
- CloudWatch Log Group: Logs Lambda function execution.
📦 Installation & Setup
Prerequisites
- AWS CLI: Installed and configured on your local machine.
- AWS Account: Ensure permissions to create CloudFormation stacks, Lambda functions, and S3 buckets.
-
Edit Deploy File: Edit the
STACK_NAME
constant to a unique name in thedeploy-transcription.sh
file. Update the region in thes3-trigger.sh
file if your region is notus-east-1
.
Deploy the Stack
- Clone the repository:
git clone https://github.com/Warner-Bell/audio-transcription-build.git
cd audio-transcription-automation
Update the s3-trigger.sh
file with you stack name
- Run the deployment script:
./bin/deploy-transcription.sh
This script deploys the CloudFormation stack using a template file, setting up the required S3 buckets, Lambda function, and permissions.
-
Deploy S3 Trigger for Lambda:
After deploying the CloudFormation stack, set up the S3 bucket trigger using
s3-trigger.sh
:
./bin/s3-trigger.sh
Note: The S3 trigger configuration script (
s3-trigger.sh
) is created as a separate script to avoid circular dependencies between the S3 bucket, Lambda function, and logging configuration.
Verify Deployment: Monitor the deployment progress in the AWS CloudFormation Console to ensure all resources are created successfully.
Start Transcribing: Once deployed, simply upload audio files to the designated input S3 bucket to start transcription. After running the S3 bucket trigger, upload your files to S3 using
upload-to-s3.sh
(Edit the file variables with your info.)
./bin/upload-to-s3.sh
Note on Output: Transcription output files are stored in both JSON and HTML format in the output S3 bucket. The HTML format can be viewed directly in any web browser for easy reading, eliminating the need to manually copy text from JSON files.
📂 Project Structure
.
├── audio-samples # 3 sample audio files in mp3, mp4, and wav format
├── bin/
│ ├── deploy-transcription.sh # Deployment script
│ ├── empty-buckets.sh # Empty all Project Buckets
│ ├── s3-trigger.sh # Lambda role configuration and S3 bucket notifications
│ ├── upload-to-s3.sh # Upload selected audio file to s3
├── cfn/audio-transcription.yaml # CloudFormation template
├── README.md # Project documentation
📊 Monitoring & Logging
- CloudWatch Logs: View Lambda execution logs in the AWS CloudWatch console.
- S3 Access Logs: Access logs for input and output S3 buckets are stored in the logging bucket.
⚙️ Recommended Practices
- Adjust Bucket Retention: Modify lifecycle policies to suit your data retention requirements.
-
Configure Notifications: Install and configure
notify-send
on Linux for deployment notifications (optional). - Review IAM Policies: Ensure permissions are as restrictive as possible for production use.
🤝 Contributing
We welcome contributions! To contribute, follow these steps:
How to Contribute
- Fork the repo 🍴
- Create a new branch:
git checkout -b feature/awesome-feature
- Commit your changes 💻
- Push your branch and submit a PR 🛠️
🌍 License
This project is licensed under the MIT License. See the LICENSE file for details.
🏆 Acknowledgements
- Thanks to AWS for their robust services. 💪
- Special thanks to OpenAI for inspiring innovation with AI-based tools. 🙌
📬 Contact
Warner Bell - Tap In!
✨ Happy Transcribing! ✨
Top comments (0)