This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.
What I Built
VideoToBlogAI is an AI-powered web application that automates the creation of technical blog posts from video and audio. The application converts uploaded media into text and then processes the transcription to generate a well-structured technical blog post.
Key Features:
Credit-Based System: The platform uses a credit-based system: before processing any media, it verifies the user's remaining credit balance.
Code Extraction: The app automatically extracts code snippets from video and audio files, saving time and effort for developers and technical content creators.
Advanced Analysis: It offers word count, character count, and a dynamic table of contents to help users refine their blog posts.
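As a rough illustration of the analysis features, the sketch below computes a word count, a character count, and a table of contents from a generated Markdown post. The names `analyzePost`, `slugify`, `TocEntry`, and `PostStats` are hypothetical and not taken from the project, and the counting rules (whitespace-separated words, Markdown markers included) are assumptions.

```typescript
// Hypothetical helpers for illustration; not the project's actual code.
interface TocEntry {
  level: number; // 1 for "#", 2 for "##", and so on
  title: string;
  slug: string; // anchor that a rendered heading could link to
}

interface PostStats {
  wordCount: number;
  characterCount: number;
  toc: TocEntry[];
}

// Three backticks, built at runtime so this sample stays easy to embed.
const FENCE_MARKER = "`".repeat(3);

// Turn a heading title into a URL-friendly anchor.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "")
    .replace(/\s+/g, "-");
}

function analyzePost(markdown: string): PostStats {
  const toc: TocEntry[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    // Skip fenced code so "# comment" lines are not mistaken for headings.
    if (line.trim().startsWith(FENCE_MARKER)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    const match = /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (match) {
      const title = match[2];
      toc.push({ level: match[1].length, title, slug: slugify(title) });
    }
  }
  // Assumption: a "word" is any whitespace-separated token, markers included.
  const trimmed = markdown.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  return { wordCount, characterCount: markdown.length, toc };
}
```

Because the table of contents is derived from the text on every call, it can be recomputed whenever the post is edited, which is one way a "dynamic" table of contents could be kept in sync.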
How it Works:
1. Upload Your Content: Users can register and upload MP4 videos or MP3 audio files (up to 30 MB), which are sent to the backend. Before processing any media, the platform verifies the user's credit balance and proceeds only if enough credit remains.
User Schema:
- email: String, required, unique
- password: String, required
- username: String, required, unique
- secondsRemaining: Number, default: 1200
- role: String, enum: ["user", "admin"], default: "user", required
Blog Post Schema:
- blogPostId: String, required, unique
- userId: ObjectId (ref: "User"), required
- videoUrl: String, required
- text: String, required
- createdAt: Date, default: Date.now
- status: String, default: "completed"
2. AI-Powered Transcription: The backend stores the media in an uploads folder and sends it to the AssemblyAI speech-to-text API, which converts it into text.
3. Blog Generation: The text is then sent to the Google Gemini language model, which processes the transcription to generate the technical blog post. The blog post is saved in the MongoDB database.
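The three steps can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the interfaces mirror the schema fields listed above (with an added `id` and without `password`), the `Services` interface and every function name are hypothetical, the external APIs are injected as plain functions so no real SDK calls are implied, and the idea that credit is deducted by media duration is inferred from the `secondsRemaining` field.

```typescript
// Field names follow the schemas above; everything else is hypothetical.
interface User {
  id: string; // stands in for the MongoDB ObjectId
  email: string;
  username: string;
  secondsRemaining: number; // credit balance, 1200 by default in the schema
  role: "user" | "admin";
}

interface BlogPost {
  blogPostId: string;
  userId: string;
  videoUrl: string;
  text: string;
  createdAt: Date;
  status: string;
}

interface Upload {
  path: string; // e.g. a file inside the uploads folder
  mimeType: string;
  sizeBytes: number;
  durationSeconds: number;
}

// Injected stand-ins for AssemblyAI, Gemini, and MongoDB.
interface Services {
  transcribe(path: string): Promise<string>;
  generatePost(transcript: string): Promise<string>;
  savePost(post: BlogPost): Promise<void>;
  newId(): string;
}

const MAX_UPLOAD_BYTES = 30 * 1024 * 1024; // the 30 MB limit from the post
const ALLOWED_MIME_TYPES = ["video/mp4", "audio/mpeg"]; // MP4 and MP3

function validateUpload(upload: Upload): void {
  if (ALLOWED_MIME_TYPES.indexOf(upload.mimeType) === -1) {
    throw new Error("Only MP4 video and MP3 audio are supported");
  }
  if (upload.sizeBytes > MAX_UPLOAD_BYTES) {
    throw new Error("File exceeds the 30 MB limit");
  }
}

// Assumption: one second of media costs one second of credit.
function hasEnoughCredit(user: User, upload: Upload): boolean {
  return user.secondsRemaining >= upload.durationSeconds;
}

async function processUpload(user: User, upload: Upload, services: Services): Promise<BlogPost> {
  validateUpload(upload);
  if (!hasEnoughCredit(user, upload)) {
    throw new Error("Insufficient credit");
  }
  const transcript = await services.transcribe(upload.path); // step 2
  const text = await services.generatePost(transcript); // step 3
  const post: BlogPost = {
    blogPostId: services.newId(),
    userId: user.id,
    videoUrl: upload.path,
    text,
    createdAt: new Date(),
    status: "completed",
  };
  await services.savePost(post);
  // Deduct credit only after the post is saved; a real app would persist this.
  user.secondsRemaining -= upload.durationSeconds;
  return post;
}
```

Checking the balance before calling the transcription service means a user without credit never triggers a paid API call, which seems to be the point of verifying credit up front.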
Demo
Project Link: https://shark-app-n5snu.ondigitalocean.app
Demo Video:
Source code:
bakkeshks / VideoToBlogAI
Convert your video to technical blog post using AI
VideoToBlogAI
Project Proposal for the AssemblyAI Challenge
Overview
VideoToBlogAI is a project designed to generate technical blog posts from sources such as local video and audio files (with a 30 MB limit). It uses the Google Gemini AI API for language model tasks and the AssemblyAI API for speech-to-text.
Features
- User registration and sign-in with error handling.
- Admin analytics: blog post generation, total hours processed.
- Content uploads: MP4 videos, MP3 audio (max 30 MB).
- Credit check: Verify user credits before generating blog posts and manage available time for processing.
- AI-generated blog posts: view, edit, delete, save.
- Automatic code extraction from videos, audio files, and YouTube URLs.
- Analysis tools: word count, character count, dynamic table of contents, semantic analysis.
- Rendering: Markdown format with Next.js for a user-friendly interface.
- Transcription services:
  - Video/audio: AssemblyAI's speech-to-text API.
  - Google Gemini API: Transform transcriptions into blog posts.
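The post does not say how code extraction is implemented. One plausible approach, sketched below purely as an assumption, is to let the language model emit fenced code blocks in its Markdown output and then collect them. `extractCodeSnippets` and `CodeSnippet` are hypothetical names.

```typescript
// Hypothetical sketch: collect fenced code blocks from generated Markdown.
interface CodeSnippet {
  language: string; // info string after the opening fence, or "text"
  code: string;
}

// Three backticks, built at runtime so this sample stays easy to embed.
const CODE_FENCE = "`".repeat(3);

function extractCodeSnippets(markdown: string): CodeSnippet[] {
  const snippets: CodeSnippet[] = [];
  let current: { language: string; lines: string[] } | null = null;
  for (const line of markdown.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.startsWith(CODE_FENCE)) {
      if (current === null) {
        // Opening fence: remember the language tag, if any.
        const tag = trimmed.slice(CODE_FENCE.length).trim();
        current = { language: tag === "" ? "text" : tag, lines: [] };
      } else {
        // Closing fence: emit the collected block.
        snippets.push({ language: current.language, code: current.lines.join("\n") });
        current = null;
      }
      continue;
    }
    if (current !== null) {
      current.lines.push(line);
    }
  }
  // A fence that is never closed is dropped rather than guessed at.
  return snippets;
}
```

The extracted `language` value could then be passed to a highlighter such as Highlight.js, which the project lists in its frontend stack.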
Technologies Used
Frontend: Next.js, Shadcn/UI, Tailwind CSS, Highlight.js
Backend: Node.js, Express.js, MongoDB
AI APIs: Google Gemini AI API (for language model tasks), AssemblyAI API (for speech-to-text)
Authentication: JWT (JSON Web Tokens)
Journey
Building VideoToBlogAI has been a great project. The most challenging part was implementing video-to-text conversion that could accurately turn videos into text. Once I achieved this, I was able to use the Google Gemini API to generate technical blog posts.
VideoToBlogAI uses Universal-2, AssemblyAI's state-of-the-art speech-to-text model, to accurately transcribe audio and video content. This integration enhances the platform's ability to process diverse media formats and generate high-quality blog posts.
Prize categories:
Sophisticated Speech-to-Text
Team Members:
@bakkeshks
Top comments (7)
An impressive project that transforms videos and audio into structured technical blog posts perfect for developers!
The edit post design was great and very similar to Dev.to, making it user-friendly for content creators!
Thanks!
Nice UI! Great project idea. Try integrating WordPress so users can publish posts from your platform.
Sure! I have noted your feature request. Thanks for your feedback.
Great work! Best of luck!
Thank you!
Very interesting entry!