Overview of My Submission
Translate-io: one command records audio for n minutes and gives you both a recording file and a transcription of it. Another command translates an existing file. The transcription is implemented in Java/Spring. Deepgram has SDKs for Node.js, Python, and C#, so this could open a path for Java.
Submission Category:
Accessibility Advocates
Github
mahadev-k / translate-io
A language translation project
This is a Spring Shell project offering Deepgram speech-to-text translation.
mvn spring-boot:run
shell command
sptt {mins} {seconds} {language(Optional)} - Starts recording audio for the given minutes and seconds; the whole recording is then transcribed and shown in Notepad (Windows).
stop-sptt {id} - Stops the recording and translation.
translate-file {filepath} {language(Optional)} - Ex: translate-file recordings/record.wav translates the file to English and shows you the transcription.
Additional Resources / Info
translate-file recordings/song.mp3 it
-> Bella ciao transcript.
My Deepgram Use-Case
Hey everyone, welcome to spring-shell with the Deepgram API. Deepgram provides SDKs for Python, Node.js, and C#, I believe, but where’s Java? Another idea was to transcribe the audio in its own language and then translate that transcript into a language the listener understands. For example, when watching a Spanish movie, we get a Spanish transcript and send it to some service that translates it into English.
I didn’t get the time for this.
I tried to implement speech-to-text translation with Java and ran into a lot of challenges. I hadn’t worked with multithreading before, but in the past two days I learned a lot.
This application is built with spring-boot and spring-shell.
Spring-shell lets you type commands to run, which I found cool, so I wanted to build something on it. I read about this hackathon on Friday and thought, why not?
The application structure I had in mind is as follows.
Controller – Takes the command and passes it to TranslateService.
TranslateService – Offers multiple services: speechToText, LiveTranslation (not working 🤷♂️), etc.
Sub services – Some APIs support both speech-to-text and live translation, while others may support only one, so there are two interfaces, speechToText and LiveTranslation, and Deepgram implements both.
The main service calls whichever sub service we choose. Right now that is Deepgram; to switch, we only need another class implementing the interfaces, which gives good abstraction and loose coupling. That was the plan, and you will see how it actually turned out when you go through the code.
The sub services perform the task and return the result.
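To make the layering above concrete, here is a minimal sketch of how the two interfaces and the main service could fit together. Every name in it (SpeechToText, LiveTranslation, FakeDeepgramProvider, translateFile) is a placeholder chosen for illustration and may not match the classes in the repository, and the provider only returns a dummy string instead of calling Deepgram.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical capability interfaces: a provider implements only what it supports.
interface SpeechToText {
    String transcribe(Path audioFile, String language);
}

interface LiveTranslation {
    void start(String language);
    void stop();
}

// Stand-in for a provider that supports both capabilities (Deepgram in the post).
class FakeDeepgramProvider implements SpeechToText, LiveTranslation {
    private boolean live;

    @Override
    public String transcribe(Path audioFile, String language) {
        // A real provider would send the audio to the remote API here.
        return "[" + language + "] transcript of " + audioFile.getFileName();
    }

    @Override
    public void start(String language) { live = true; }

    @Override
    public void stop() { live = false; }

    boolean isLive() { return live; }
}

// The main service depends only on the interface, so providers can be swapped.
class TranslateService {
    private final SpeechToText speechToText;

    TranslateService(SpeechToText speechToText) {
        this.speechToText = speechToText;
    }

    String translateFile(String filePath, String language) {
        // The language argument is optional; fall back to English when it is missing.
        String lang = (language == null || language.isEmpty()) ? "en" : language;
        return speechToText.transcribe(Paths.get(filePath), lang);
    }
}
```

Swapping providers then means passing a different SpeechToText implementation to the TranslateService constructor, without touching the controller.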
I think implementing this in JS would have been the easiest route, but Java made me work hard on multithreading.
Multithreading is mainly needed to capture the audio in parallel while the program carries out its other functions.
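One way to run a capture loop in the background, stop it after a fixed duration, and still allow stopping it early by id is sketched below. This is not the project's actual code: RecordingJobs and its methods are made-up names, and the capture loop is passed in as a plain Runnable so the sketch works without a microphone. A real loop would read from something like a javax.sound.sampled.TargetDataLine and exit when its thread is interrupted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry of background recordings, keyed by an id the shell can print.
class RecordingJobs {
    // Capture loops run here, one thread per active recording.
    private final ExecutorService workers = Executors.newCachedThreadPool();
    // A separate timer thread, so a busy capture loop can never block its own stop.
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    private final Map<Long, Future<?>> running = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    // Starts the loop in the background and schedules an automatic stop.
    long start(Runnable captureLoop, long minutes, long seconds) {
        long id = nextId.getAndIncrement();
        running.put(id, workers.submit(captureLoop));
        long totalSeconds = TimeUnit.MINUTES.toSeconds(minutes) + seconds;
        timer.schedule(() -> { stop(id); }, totalSeconds, TimeUnit.SECONDS);
        return id;
    }

    // Interrupts the capture thread; returns false if the id is unknown or already stopped.
    boolean stop(long id) {
        Future<?> job = running.remove(id);
        return job != null && job.cancel(true);
    }

    // Releases both executors so the JVM can exit.
    void shutdown() {
        timer.shutdownNow();
        workers.shutdownNow();
    }
}
```

Under these assumptions, a command like sptt would call start and print the returned id, and stop-sptt would call stop with that id. The capture loop has to honor interruption, for example by catching InterruptedException or checking Thread.currentThread().isInterrupted(), or the cancel will have no effect.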
To summarize the functionality, run the project with mvn spring-boot:run and see the following commands.
sptt <minutes> <seconds> <language (Optional)>
Captures audio for n minutes and m seconds and then sends the recording, which is saved in the /recordings folder of the project, for transcription.
stop-sptt <threadid> to stop a running transcription.
translate-file <filepath> <language>
Generates a transcript for a file. For Hindi audio you can provide “hi” as the language.
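Since there is no Java SDK, a file transcription like translate-file comes down to a plain HTTP call. The sketch below only builds the request with Java 11's java.net.http and does not send it. The endpoint, the language query parameter, and the "Token" authorization scheme reflect my understanding of Deepgram's pre-recorded audio API, so treat them as assumptions and check the official docs. DeepgramRequests and its method are hypothetical names, not code from the repository.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that prepares a transcription request for pre-recorded audio.
class DeepgramRequests {
    static HttpRequest transcription(byte[] audio, String mimeType, String language, String apiKey) {
        // Default to English when no language code is given.
        String lang = (language == null || language.isEmpty()) ? "en" : language;
        // Assumed endpoint and query parameter; verify against Deepgram's API reference.
        URI uri = URI.create("https://api.deepgram.com/v1/listen?language=" + lang);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Token " + apiKey) // assumed auth scheme
                .header("Content-Type", mimeType)           // e.g. audio/wav
                .POST(HttpRequest.BodyPublishers.ofByteArray(audio))
                .build();
    }
}
```

Sending it would look like HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), after which the transcript has to be pulled out of the JSON response body.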
Watch the video and check out the project to get a better understanding.
I saw a lot of cool projects from a lot of people. Kudos to you all. Happy coding!