When someone talks about NLU, or even AI, in the context of chatbots or conversational apps, they usually mean one of these concepts: ASR, NLP, NLU, NLG or TTS.
These are probably the most confusing terms when it comes to understanding the inner workings of conversational apps. Let’s quickly demystify them (in really simple words):
ASR or Automatic Speech Recognition is the process of taking the speech (voice) signal as an input and then finding out the words that were actually spoken.
NLP or Natural Language Processing is an umbrella term that describes the ability of the machine to manipulate human language (English, French, etc.), for example through syntactic parsing or text categorization.
NLU or Natural Language Understanding is a subset of NLP that is responsible for semantic parsing and analysis, entity extraction, etc. NLU tries to structure the input data so it can easily be understood by the machine.
NLG or Natural Language Generation is the step in which the machine transforms the structured data (from the NLU step) into human-readable language.
TTS or Text-to-Speech Synthesis, put simply, takes the generated text from the NLG and converts it to speech.
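To make the relationship between these steps concrete, here is a minimal Python sketch of how they could be chained together. Everything in it is an assumption made for illustration: the function names (`asr`, `nlu`, `nlg`, `tts`, `handle`), the `Parsed` type, and the `get_weather` intent are all hypothetical, and the ASR and TTS steps are placeholders that simply pass text through as bytes instead of processing real audio. Only the NLU and NLG steps do any (toy) work.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Parsed:
    """Structured result produced by the toy NLU step (hypothetical type)."""
    intent: str
    entities: dict = field(default_factory=dict)


def asr(audio: bytes) -> str:
    # Placeholder: a real ASR engine would decode a speech signal.
    # Here we assume the bytes already contain the transcript.
    return audio.decode("utf-8")


def nlu(text: str) -> Parsed:
    # Toy semantic parsing: a single hard-coded pattern for a weather request.
    match = re.search(r"weather in (?P<city>[A-Za-z ]+)", text, re.IGNORECASE)
    if match:
        return Parsed("get_weather", {"city": match.group("city").strip()})
    return Parsed("unknown")


def nlg(parsed: Parsed) -> str:
    # Template-based generation from the structured data.
    if parsed.intent == "get_weather":
        return f"Here is the weather forecast for {parsed.entities['city']}."
    return "Sorry, I did not understand that."


def tts(text: str) -> bytes:
    # Placeholder: a real TTS engine would synthesize a waveform.
    return text.encode("utf-8")


def handle(audio: bytes) -> bytes:
    # Full pipeline: speech in, speech out.
    return tts(nlg(nlu(asr(audio))))


if __name__ == "__main__":
    print(handle(b"What is the weather in Paris?").decode("utf-8"))
```

In a real conversational app, each of these functions would typically be backed by a dedicated engine or service. The point of the sketch is only the order of the steps and the kind of data that flows between them: audio, then text, then structured data, then text, then audio again.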
That's all you need to know to get started building your next conversational app.
Follow me at @manekinekko to learn more about chatbots and conversational apps.
Top comments (3)
Can you present the libraries that are considered mature enough for each part?
Last time I checked (1-2 years ago), it was difficult to see the steps needed to produce a decent-sounding TTS voice on a local machine, i.e. without using a paid web service.
What I want to achieve is to take enough samples of a celebrity's voice and produce the data required to make it credible enough to speak notifications.
The open-source software I found either produces a very unnatural voice or is too cryptic about what data it needs and how to produce it.
Thanks :)
Do you actively develop natural language interpreting applications in your daily job?
I was fortunate enough to work on conversational apps for many enterprises, so I'm quite familiar with the concepts. However, I am neither a data scientist nor a linguistic engineer. But I'm happy to help.