shashank agarwal

Most affordable Whisper API

πŸŽ™οΈ Whisper Speech-to-Text API by NeevCloud

Try it here -> https://api.market/store/neevcloud/whisper

The Whisper Speech-to-Text API is the most affordable speech-to-text solution available online. It is priced up to 6x lower than OpenAI's Whisper and 3x lower than other providers, and it scales from transcribing individual audio files to processing large volumes of multimedia data with high accuracy and performance.


🔑 Key Features:

  • 💸 Most Affordable Solution: Up to 6x cheaper than OpenAI's Whisper and 3x cheaper than other providers.
  • 📝 High-Accuracy Transcriptions: Leverages advanced machine learning to provide accurate speech-to-text conversion.
  • 🗣️ Speaker Diarization: Identify and separate different speakers in an audio file.
  • 🌍 Multilingual Support: Transcribe audio in multiple languages or translate spoken language into English.
  • ⚡ Real-Time or Async Processing: Choose between real-time and asynchronous processing for flexibility.
  • 📂 Flexible Input: Submit audio files via URLs or upload files directly.

💡 About Whisper:

Whisper is a state-of-the-art automatic speech recognition (ASR) system trained on 680,000 hours of multilingual data from the web. It supports transcribing audio in multiple languages and translating spoken content into English. Designed by OpenAI, Whisper is particularly useful in scenarios such as:

  1. 🎧 Transcription Services: Easily transcribe meetings, interviews, lectures, and other spoken content.
  2. πŸ“ Subtitling and Closed Captioning: Generate subtitles for videos, improving accessibility for deaf or hard-of-hearing viewers.
  3. 🌐 Language Learning and Translation: Use Whisper for language learning, pronunciation practice, and cross-lingual communication.
  4. πŸ“± Accessibility Tools: Integrate Whisper into assistive technologies for those with speech impairments or disabilities.
  5. πŸ” Content Searchability: Transcribe multimedia content into text to allow efficient search and analysis.
  6. πŸŽ™οΈ Voice-Controlled Applications: Use Whisper to build voice-driven applications and interact naturally with technology.
  7. πŸ“ž Customer Support Automation: Transcribe and analyze calls in real-time to automate customer support.
  8. πŸŽ™οΈ Podcasting and Journalism: Transcribe interviews and podcasts quickly for faster content creation.

💰 API Pricing:

The Whisper Speech-to-Text API charges per second of audio processed, with 1 second = 1 API unit. This pricing is highly competitive for developers working with audio and video transcription at scale.

| Audio Length | Conversion to Seconds | API Units Consumed |
|---|---|---|
| 12.9 minutes of audio | 12.9 * 60 = 774 seconds | 774 API units |
| 5 minutes of audio | 5 * 60 = 300 seconds | 300 API units |
| 30 seconds of audio | 30 seconds | 30 API units |
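
The conversion in the table is a one-line calculation. Here is a minimal Python sketch of it. The helper name `api_units` is made up for illustration, and rounding fractional seconds up to the next whole unit is an assumption, since the post does not say how partial seconds are billed.

```python
import math


def api_units(duration_seconds: float) -> int:
    """Estimate API units for a clip at 1 unit per second of audio.

    Assumption: partial seconds round up. The rounding rule is not
    documented in the post, so treat the result as an estimate.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # round() first so float noise (e.g. 774.0000000001) does not add a unit
    return math.ceil(round(duration_seconds, 6))


# The three rows of the pricing table
for label, seconds in [("12.9 min", 12.9 * 60), ("5 min", 5 * 60), ("30 s", 30)]:
    print(f"{label}: {api_units(seconds)} API units")
```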

Important:

  • ✅ No charge for checking task status.
  • ❌ No charge if the task fails (you are billed only for responses with HTTP status 200).

📤 API Endpoints:

1. Process Audio via URL 🌐

Submit an audio file by URL for transcription or other tasks like speaker diarization.

Endpoint: POST /neevcloud/whisper/process_url/

| Parameter | Description | Example |
|---|---|---|
| url | The URL of the audio file. | "https://example.com/audio.wav" |
| task | The task to perform: "transcribe" or "translate". | "transcribe" |
| language | Language of the audio (or set to "None" for no detection). | "None" |
| batch_size | Size of the audio chunks to process (in seconds). | 64 |
| timestamp | Whether to generate timestamps ("none" or "chunk"). | "chunk" |
| diarise_audio | Whether to separate speakers in the audio. | false |
| is_async | Choose asynchronous processing (true or false). | false |

Request Example:

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/neevcloud/whisper/process_url/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: <your-api-key>' \
  -H 'Content-Type: application/json' \
  -d '{
        "url": "https://example.com/audio.wav",
        "task": "transcribe",
        "language": "None",
        "batch_size": 64,
        "timestamp": "chunk",
        "diarise_audio": false,
        "is_async": false
    }'
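
The same request can be built from Python using only the standard library. This is a sketch and not an official client: the function names are invented for illustration, the defaults simply mirror the curl example above, and the response is assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "https://api.magicapi.dev/api/v1/neevcloud/whisper"


def build_process_url_request(api_key, audio_url, *, task="transcribe",
                              language="None", batch_size=64,
                              timestamp="chunk", diarise_audio=False,
                              is_async=False):
    """Build (but do not send) the POST request for /process_url/."""
    # Allowed values are taken from the parameter table in the post
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    if timestamp not in ("none", "chunk"):
        raise ValueError(f"unsupported timestamp mode: {timestamp!r}")
    payload = {
        "url": audio_url,
        "task": task,
        "language": language,
        "batch_size": batch_size,
        "timestamp": timestamp,
        "diarise_audio": diarise_audio,
        "is_async": is_async,
    }
    return urllib.request.Request(
        f"{BASE_URL}/process_url/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "x-magicapi-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=300):
    """Send a prepared request and parse the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send(build_process_url_request("<your-api-key>", "https://example.com/audio.wav"))` performs the request. Keeping the builder separate from `send` lets you inspect or log the payload before any billable call is made.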

2. Process Audio via File Upload 📂

Upload an audio file directly for transcription and other tasks.

Endpoint: POST /neevcloud/whisper/process_file/

| Parameter | Description | Example |
|---|---|---|
| file | The audio file to upload. | @your-audio-file.mp3 |
| task | The task to perform: "transcribe" or "translate". | "transcribe" |
| language | The language of the audio file. | "None" |
| batch_size | Size of the audio chunks to process (in seconds). | 64 |
| timestamp | Generate timestamps for the chunks ("none" or "chunk"). | "chunk" |
| diarise_audio | Whether to separate speakers in the audio. | false |
| is_async | Choose asynchronous processing (true or false). | false |

Request Example:

curl -X 'POST' \
  'https://api.magicapi.dev/api/v1/neevcloud/whisper/process_file/' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: <your-api-key>' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@your-audio-file.mp3' \
  -F 'task=transcribe' \
  -F 'language=None' \
  -F 'batch_size=64' \
  -F 'timestamp=chunk' \
  -F 'diarise_audio=false' \
  -F 'is_async=false'
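
For callers who prefer Python, the upload can be approximated with the standard library by encoding the multipart body by hand. The sketch below makes several assumptions: the helper names are hypothetical, the form fields and defaults are copied from the curl example, and booleans are sent as the lowercase strings `true`/`false` because that is what the curl command sends.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

UPLOAD_URL = "https://api.magicapi.dev/api/v1/neevcloud/whisper/process_file/"


def form_value(value):
    """Render booleans the way the curl example does ("true"/"false")."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one "file" part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = bytearray()
    for name, value in fields.items():
        body += f"--{boundary}\r\n".encode()
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        body += form_value(value).encode() + b"\r\n"
    # The audio itself goes in the part named "file", as in the curl example
    body += f"--{boundary}\r\n".encode()
    body += (
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    body += file_bytes + b"\r\n"
    body += f"--{boundary}--\r\n".encode()
    return bytes(body), f"multipart/form-data; boundary={boundary}"


def build_process_file_request(api_key, path, **overrides):
    """Build (but do not send) the POST request for /process_file/."""
    fields = {
        "task": "transcribe",
        "language": "None",
        "batch_size": 64,
        "timestamp": "chunk",
        "diarise_audio": False,
        "is_async": False,
    }
    fields.update(overrides)
    audio = Path(path)
    body, content_type = encode_multipart(fields, audio.name, audio.read_bytes())
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        headers={
            "accept": "application/json",
            "x-magicapi-key": api_key,
            "Content-Type": content_type,
        },
        method="POST",
    )
```

The returned request can be passed to `urllib.request.urlopen`. Note that the whole file is read into memory, which is fine for short clips but not for very large uploads.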

3. Check Task Status 📊

Retrieve the status and result of a specific transcription task using its task ID.

Endpoint: GET /neevcloud/whisper/status/{task_id}

| Parameter | Description | Example |
|---|---|---|
| task_id | The unique ID for the task you wish to check. | "ad371472-e6e9-4ecf-b20f-10884230a09e" |

Request Example:

curl -X 'GET' \
  'https://api.magicapi.dev/api/v1/neevcloud/whisper/status/ad371472-e6e9-4ecf-b20f-10884230a09e' \
  -H 'accept: application/json' \
  -H 'x-magicapi-key: <your-api-key>'

Response Example:

{
  "status": "completed",
  "output": {
    "text": "Many people think that the best way to escape war is to dwell upon its horrors...",
    "chunks": [
      {
        "timestamp": [0, 7],
        "text": "Many people think that the best way to escape war..."
      },
      {
        "timestamp": [9.84, 16.5],
        "text": "Them vividly upon the minds of the younger generation..."
      }
    ]
  },
  "task_id": "ad371472-e6e9-4ecf-b20f-10884230a09e",
  "audio_duration_seconds": 30.589
}
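
For asynchronous tasks, a client would typically poll the status endpoint until the result is ready. The sketch below shows one way to do that and to print the timestamped chunks from the response above. It rests on several assumptions: `"completed"` is the only status value the post documents, so any other value is treated as "not ready yet"; the function names, polling interval, and timeout are illustrative; and every chunk is assumed to carry a numeric start and end time.

```python
import json
import time
import urllib.request

STATUS_URL = "https://api.magicapi.dev/api/v1/neevcloud/whisper/status/{task_id}"


def fetch_status(api_key, task_id):
    """GET the status endpoint once and return the parsed JSON body."""
    request = urllib.request.Request(
        STATUS_URL.format(task_id=task_id),
        headers={"accept": "application/json", "x-magicapi-key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_result(fetch, *, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch() until it reports status "completed" or time runs out.

    fetch is a zero-argument callable returning the status JSON as a dict.
    A failed task would also end in TimeoutError here, because the post
    does not document what a failure status looks like.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result.get("status") == "completed":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not complete before the timeout")
        sleep(interval)


def format_chunks(result):
    """Turn output.chunks into lines like "[0.00s - 7.00s] text"."""
    lines = []
    for chunk in result["output"]["chunks"]:
        start, end = chunk["timestamp"]
        lines.append(f"[{start:.2f}s - {end:.2f}s] {chunk['text'].strip()}")
    return lines
```

A typical call is `wait_for_result(lambda: fetch_status("<your-api-key>", task_id))`. Since status checks are not billed, polling every few seconds costs nothing in API units.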

📢 Error Handling:

When a request fails, the API returns a standard HTTP error code:

| Error Code | Description |
|---|---|
| 400 Bad Request | Invalid parameters or missing fields. |
| 401 Unauthorized | Invalid or missing API key. |
| 500 Server Error | An issue occurred on the server while processing. |

Important: No API units will be charged in case of task failures or errors.
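
On the client side, these codes can be mapped to a single exception type so that calling code knows whether a retry makes sense. The following is a sketch only: `WhisperAPIError` and the helper names are hypothetical, and treating 5xx responses as retryable while 4xx responses are not is a common convention rather than something the post specifies.

```python
import urllib.error

# Descriptions taken from the error table in the post
ERROR_MEANINGS = {
    400: "Invalid parameters or missing fields",
    401: "Invalid or missing API key",
    500: "An issue occurred on the server while processing",
}


class WhisperAPIError(Exception):
    """Hypothetical wrapper for HTTP errors returned by the API."""

    def __init__(self, status, message, retryable):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.retryable = retryable


def classify_error(status):
    """Map an HTTP status code to a WhisperAPIError.

    Assumption: only server-side (5xx) errors are worth retrying.
    """
    message = ERROR_MEANINGS.get(status, "Unexpected HTTP status")
    return WhisperAPIError(status, message, retryable=status >= 500)


def call_api(send):
    """Run send() and convert urllib HTTP errors into WhisperAPIError.

    send is a zero-argument callable that performs the request, for
    example: lambda: urllib.request.urlopen(request)
    """
    try:
        return send()
    except urllib.error.HTTPError as exc:
        raise classify_error(exc.code) from exc
```

Because failed requests are not billed, retrying after a 500 response should not add to the cost, though a short delay between attempts is still advisable.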


Try it here -> https://api.market/store/neevcloud/whisper
