This is a submission for the AssemblyAI Challenge: No More Monkey Business.
Submission for Realtime Audio Prompt
What I Built
For the AssemblyAI Challenge, I developed a real-time audio transcription and note-taking application. This project combines the power of AssemblyAI's Streaming API with a user-friendly interface to provide instant transcription, live note-taking, and AI-assisted content generation.
The application consists of three main components:
- A Chrome extension that captures audio from any browser tab or from the microphone and displays subtitles
- A server-side component for handling WebSocket connections and interacting with AssemblyAI's API
- A frontend web application for displaying transcriptions and managing notes. Users can rewrite and generate notes from a recorded session
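To illustrate the hand-off between the extension and the server, here is a minimal sketch of one step such a pipeline typically needs: converting the floating-point samples that the browser's Web Audio API produces into 16-bit PCM before they are streamed. The post does not say which audio encoding the project uses, so the 16-bit PCM format and the function name `floatTo16BitPCM` are assumptions made for illustration only.

```typescript
// Hypothetical helper, not taken from the project's source code.
// Converts Web Audio samples (floats in the range -1..1) to signed 16-bit PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  return Int16Array.from(samples, (sample) => {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, sample));
    // Negative values scale by 32768 and positive values by 32767,
    // which together cover the full int16 range.
    return s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  });
}
```

The underlying `buffer` of the returned array could then be sent to the server as a binary WebSocket message, assuming the server forwards raw audio chunks to the Streaming API.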
Demo
Source Code
Screenshots
Review and edit transcribed sessions with additional context and notes
AI-assisted note generation based on the transcribed content
The dashboard interface showing live transcription and note-taking stats
Journey
Integrating AssemblyAI's LeMUR API was an interesting part of this project:
API Integration: I added server-side actions to interact with LeMUR for summarization, question answering, and action item generation. This involved learning the API endpoints and response structures.
User Interface: Incorporating LeMUR's features into the frontend required some UI/UX considerations to make the AI-generated content accessible and useful to users.
Learning Curve: Getting familiar with LeMUR's capabilities took some time. I experimented with different prompts and parameters to understand how best to use the API for this use case.
Added Value: LeMUR allowed the application to go beyond simple transcription, offering users more insights from their audio content.
Integrating LeMUR had its challenges, but it gave users AI-powered analysis of their transcribed content.
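As a rough sketch of what a server-side action for the three note types mentioned above might prepare, the snippet below builds a prompt and request body from a transcript. Everything in it is hypothetical: the `NoteKind` type, the instruction texts, the `buildNoteRequest` function, and the `prompt` and `input_text` field names are assumptions for illustration and should be checked against the LeMUR API reference. It makes no network call.

```typescript
// Hypothetical types and names; not taken from the project's source code.
type NoteKind = "summary" | "questions" | "actionItems";

interface NoteRequest {
  prompt: string;     // instruction for the model (assumed field name)
  input_text: string; // transcript text to analyze (assumed field name)
}

// One instruction per note type described in the post.
const INSTRUCTIONS: Record<NoteKind, string> = {
  summary: "Summarize the transcript in a few short paragraphs.",
  questions: "Answer the user's questions using only the transcript.",
  actionItems: "List the action items mentioned in the transcript.",
};

function buildNoteRequest(
  kind: NoteKind,
  transcript: string,
  context: string = "",
): NoteRequest {
  const text = transcript.trim();
  if (text.length === 0) {
    throw new Error("transcript is empty");
  }
  const extra = context.trim();
  // Append user-supplied context only when there is some.
  const prompt = extra.length > 0
    ? `${INSTRUCTIONS[kind]}\n\nAdditional context: ${extra}`
    : INSTRUCTIONS[kind];
  return { prompt, input_text: text };
}
```

A server action could call `buildNoteRequest("summary", sessionText)` and send the result as the JSON body of an authenticated request. Keeping the prompt construction in a pure function like this makes it easy to experiment with different prompts without touching the network code.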
Together, these tools make the application more than a transcriber: the Chrome extension captures audio from any web content, and the AI-assisted content generation gives users summaries and insights that make note-taking more efficient.
Throughout this project, I gained valuable experience in working with real-time audio processing, WebSocket communication, and integrating AI capabilities into a web application. The AssemblyAI Streaming API proved to be robust and reliable, enabling me to create a responsive and accurate transcription experience for users.