shashank agarwal

Posted on Oct 22, 2024

Most affordable Whisper API

#whisper #openai #api #ai

🎙️ Whisper Speech-to-Text API by NeevCloud

Try it here -> https://api.market/store/neevcloud/whisper

The Whisper Speech-to-Text API by NeevCloud is priced as the most affordable speech-to-text option available online: up to 6x cheaper than OpenAI’s Whisper API and 3x cheaper than other providers. It scales from single audio files to large volumes of multimedia data while keeping accuracy and performance high.

🔑 Key Features:

💸 Most Affordable Solution: Up to 6x cheaper than OpenAI’s Whisper and 3x cheaper than other providers.
📝 High-Accuracy Transcriptions: Leverages advanced machine learning to provide accurate speech-to-text conversion.
🗣️ Speaker Diarization: Identify and separate different speakers in an audio file.
🌍 Multilingual Support: Transcribe audio in multiple languages or translate spoken language into English.
⚡ Synchronous or Async Processing: Get the result in the same request, or set `is_async` to run the task asynchronously and check its status later.
📂 Flexible Input: Submit audio files via URLs or upload files directly.

💡 About Whisper:

Whisper is an automatic speech recognition (ASR) system developed by OpenAI and trained on 680,000 hours of multilingual and multitask supervised data collected from the web. It can transcribe audio in multiple languages and translate spoken content into English. Whisper is particularly useful in scenarios such as:

🎧 Transcription Services: Easily transcribe meetings, interviews, lectures, and other spoken content.
📝 Subtitling and Closed Captioning: Generate subtitles for videos, improving accessibility for deaf or hard-of-hearing viewers.
🌐 Language Learning and Translation: Use Whisper for language learning, pronunciation practice, and cross-lingual communication.
📱 Accessibility Tools: Integrate Whisper into assistive technologies for those with speech impairments or disabilities.
🔍 Content Searchability: Transcribe multimedia content into text to allow efficient search and analysis.
🎙️ Voice-Controlled Applications: Use Whisper to build voice-driven applications and interact naturally with technology.
📞 Customer Support Automation: Transcribe and analyze calls in real time to automate customer support.
🎙️ Podcasting and Journalism: Transcribe interviews and podcasts quickly for faster content creation.

💰 API Pricing:

The Whisper Speech-to-Text API charges per second of audio processed, with 1 second = 1 API unit. This pricing is highly competitive for developers working with audio and video transcription at scale.

| Audio Length | Conversion to Seconds | API Units Consumed |
| --- | --- | --- |
| 12.9 minutes of audio | 12.9 × 60 = 774 seconds | 774 API units |
| 5 minutes of audio | 5 × 60 = 300 seconds | 300 API units |
| 30 seconds of audio | 30 seconds | 30 API units |

Important:

✅ No charges for checking task status.
❌ No charges if the task fails (only charged for 200 HTTP responses).
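The per-second billing above is easy to estimate before submitting a job. The following is a minimal Python sketch, not part of the API: the function names are made up for illustration, and rounding partial seconds up is an assumption, since the pricing notes do not say how fractional seconds are billed.

```python
import math


def api_units(duration_seconds: float) -> int:
    """Estimate API units for a clip at 1 unit per second of audio.

    Assumption: partial seconds are rounded up. The pricing notes do not
    say how fractional seconds are billed, so treat this as an estimate.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Round to milliseconds first so float noise (e.g. 774.0000000001)
    # does not push the estimate up by a whole unit.
    return math.ceil(round(duration_seconds, 3))


def units_for_minutes(minutes: float) -> int:
    """Convert a length in minutes to estimated API units."""
    return api_units(minutes * 60)


if __name__ == "__main__":
    # The three rows of the pricing table.
    for minutes in (12.9, 5, 0.5):
        print(f"{minutes} min -> {units_for_minutes(minutes)} units")
```

Summing `api_units` over a batch of files gives a rough budget for a whole transcription run.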

📤 API Endpoints:

1. Process Audio via URL 🌐

Submit an audio file by URL for transcription or other tasks like speaker diarization.

Endpoint: POST /neevcloud/whisper/process_url/

| Parameter | Description | Example |
| --- | --- | --- |
| `url` | The URL of the audio file. | `"https://example.com/audio.wav"` |
| `task` | The task to perform: "transcribe", "translate". | `"transcribe"` |
| `language` | Language of the audio (or set to "None" for no detection). | `"None"` |
| `batch_size` | Size of the audio chunks to process (in seconds). | `64` |
| `timestamp` | Whether to generate timestamps ("none" or "chunk"). | `"chunk"` |
| `diarise_audio` | Whether to separate speakers in the audio. | `false` |
| `is_async` | Choose asynchronous processing (true or false). | `false` |

Request Example:

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/neevcloud/whisper/process_url/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: <your-api-key>' \
  -H 'Content-Type: application/json' \
  -d '{
        "url": "https://example.com/audio.wav",
        "task": "transcribe",
        "language": "None",
        "batch_size": 64,
        "timestamp": "chunk",
        "diarise_audio": false,
        "is_async": false
    }'
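For readers calling the endpoint from code rather than the shell, here is a minimal Python sketch of the same request using only the standard library. The helper names `build_process_url_request` and `send` are hypothetical. The URL, headers, and JSON fields are copied from the curl example above, and nothing is assumed about the shape of the response.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
BASE_URL = "https://api.magicapi.dev/api/v1/neevcloud/whisper"


def build_process_url_request(api_key, audio_url, *, task="transcribe",
                              language="None", batch_size=64,
                              timestamp="chunk", diarise_audio=False,
                              is_async=False):
    """Build (but do not send) the POST request for /process_url/."""
    payload = {
        "url": audio_url,
        "task": task,
        "language": language,
        "batch_size": batch_size,
        "timestamp": timestamp,
        "diarise_audio": diarise_audio,
        "is_async": is_async,
    }
    return urllib.request.Request(
        f"{BASE_URL}/process_url/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "x-magicapi-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=120):
    """Send a prepared request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `send(build_process_url_request("<your-api-key>", "https://example.com/audio.wav"))`. Keeping the builder separate from `send` makes it possible to inspect the request without touching the network.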

2. Process Audio via File Upload 📂

Upload an audio file directly for transcription and other tasks.

Endpoint: POST /neevcloud/whisper/process_file/

| Parameter | Description | Example |
| --- | --- | --- |
| `file` | The audio file to upload. | `@your-audio-file.mp3` |
| `task` | The task to perform: "transcribe", "translate". | `"transcribe"` |
| `language` | The language of the audio file. | `"None"` |
| `batch_size` | Size of the audio chunks to process (in seconds). | `64` |
| `timestamp` | Generate timestamps for the chunks ("none" or "chunk"). | `"chunk"` |
| `diarise_audio` | Whether to separate speakers in the audio. | `false` |
| `is_async` | Choose asynchronous processing (true or false). | `false` |

Request Example:

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/neevcloud/whisper/process_file/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: <your-api-key>' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@your-audio-file.mp3' \
  -F 'task=transcribe' \
  -F 'language=None' \
  -F 'batch_size=64' \
  -F 'timestamp=chunk' \
  -F 'diarise_audio=false' \
  -F 'is_async=false'
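The upload can also be sketched in Python with the standard library by encoding the multipart body by hand. This is an illustration under assumptions, not an official client: `encode_multipart` and `build_process_file_request` are hypothetical helpers, the file part is sent as `application/octet-stream`, and the form field names and default values mirror the curl example above.

```python
import urllib.request
import uuid

# Base URL taken from the curl example above.
BASE_URL = "https://api.magicapi.dev/api/v1/neevcloud/whisper"


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one part named "file" as multipart/form-data.

    Returns (body, content_type). Values are sent as plain strings, so
    booleans are passed as "true"/"false", as in the curl example.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        parts.append((header + f"{value}\r\n").encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_process_file_request(api_key, filename, file_bytes, **options):
    """Build (but do not send) the POST request for /process_file/."""
    fields = {
        "task": "transcribe",
        "language": "None",
        "batch_size": "64",
        "timestamp": "chunk",
        "diarise_audio": "false",
        "is_async": "false",
    }
    fields.update(options)  # e.g. is_async="true"
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        f"{BASE_URL}/process_file/",
        data=body,
        headers={
            "accept": "application/json",
            "x-magicapi-key": api_key,
            "Content-Type": content_type,
        },
        method="POST",
    )
```

The resulting request can be passed to `urllib.request.urlopen`. In practice a library with built-in multipart support would be shorter, but the hand-rolled version shows exactly what goes over the wire.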

3. Check Task Status 📊

Retrieve the status and result of a specific transcription task using its task ID.

Endpoint: GET /neevcloud/whisper/status/{task_id}

| Parameter | Description | Example |
| --- | --- | --- |
| `task_id` | The unique ID for the task you wish to check. | `"ad371472-e6e9-4ecf-b20f-10884230a09e"` |

Request Example:

curl -X 'GET' \
  'https://api.magicapi.dev/api/v1/neevcloud/whisper/status/ad371472-e6e9-4ecf-b20f-10884230a09e' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: <your-api-key>'

Response Example:

{
  "status": "completed",
  "output": {
    "text": "Many people think that the best way to escape war is to dwell upon its horrors...",
    "chunks": [
      {
        "timestamp": [0, 7],
        "text": "Many people think that the best way to escape war..."
      },
      {
        "timestamp": [9.84, 16.5],
        "text": "Them vividly upon the minds of the younger generation..."
      }
    ]
  },
  "task_id": "ad371472-e6e9-4ecf-b20f-10884230a09e",
  "audio_duration_seconds": 30.589
}
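For asynchronous tasks, the status endpoint is typically polled until the task finishes. The sketch below shows one way to do that in Python and to print the timestamped chunks from a response like the one above. Only the `"completed"` status appears in this post; the failure statuses, the polling interval, and the helper names are assumptions to adjust against the real API.

```python
import time


def wait_for_task(fetch_status, task_id, *, poll_seconds=2.0, max_polls=60,
                  failure_statuses=("failed", "error"), sleep=time.sleep):
    """Poll until the task reports "completed", then return the response dict.

    fetch_status: callable taking a task_id and returning the decoded JSON
        from GET /status/{task_id}.
    failure_statuses: hypothetical values; only "completed" is shown in the
        documented response, so replace these with what the API returns.
    """
    for _ in range(max_polls):
        result = fetch_status(task_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status in failure_statuses:
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} not completed after {max_polls} polls")


def format_chunks(result):
    """Render each timestamped chunk as "[start-end] text"."""
    lines = []
    for chunk in result.get("output", {}).get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{start:.2f}-{end:.2f}] {chunk['text']}")
    return lines
```

Passing `fetch_status` in as a callable keeps the polling logic independent of the HTTP client. Since status checks are not charged, polling every couple of seconds does not add to the bill.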

📢 Error Handling:

If a request fails, the API returns a standard HTTP status code:

| Error Code | Description |
| --- | --- |
| 400 Bad Request | Invalid parameters or missing fields. |
| 401 Unauthorized | Invalid or missing API key. |
| 500 Server Error | An issue occurred on the server while processing. |

Important: No API units will be charged in case of task failures or errors.
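A client can turn the error table into a small lookup. The Python sketch below is illustrative only: the helper names are hypothetical, the messages are copied from the table, and retrying only on 5xx responses is a design assumption rather than documented behaviour.

```python
# Meanings copied from the error table above.
ERROR_MEANINGS = {
    400: "Bad Request: invalid parameters or missing fields.",
    401: "Unauthorized: invalid or missing API key.",
    500: "Server Error: an issue occurred on the server while processing.",
}


def describe_status(status_code):
    """Return a human-readable message for an HTTP status code."""
    if status_code == 200:
        return "OK"
    return ERROR_MEANINGS.get(status_code, f"Unexpected HTTP status {status_code}")


def should_retry(status_code):
    """Assumption: only server-side errors (5xx) are worth retrying.

    A 400 or 401 will fail again until the request or key is fixed.
    """
    return 500 <= status_code <= 599


def is_billed(status_code):
    """Per the pricing notes, only 200 responses consume API units."""
    return status_code == 200
```

For example, `describe_status(401)` points at the API key, while `should_retry(500)` suggests trying the same request again after a short pause.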

Try it here -> https://api.market/store/neevcloud/whisper
