Developers - those who provide solutions to computer problems, establish base procedures, program, and maintain solutions - are indeed programmers, but that doesn't automatically make them experts in everything related to code. Take, for example, the creation of an ML-dependent function: it necessitates familiarity with models and algorithm training, knowledge that isn't common among all programmers.
Fortunately, there are ready-to-use APIs that leverage existing, previously trained models to execute ML functions, and they can be used without the need for ML knowledge. Additionally, they ensure the security of the information shared with them. Up next, I'll introduce you to some specific ML API services and four use cases to get you familiar with them and let your imagination run wild.
How Do Ready-to-Use ML-Function APIs Work? Just Follow These 3 Simple Steps:
- Define the input: either text or the location of an object in an Amazon S3 bucket.
- Invoke the API with this input.
- Read the output, which is returned in JSON format.
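The three steps above can be sketched in Python. This is only an illustration, assuming Amazon Comprehend's `detect_sentiment` call as the example API and a client object created with boto3; the helper name `detect_sentiment` and the keys of the returned dictionary are my own choices, not part of any AWS SDK.

```python
def detect_sentiment(comprehend_client, text, language_code="en"):
    """Walk through the three steps using Amazon Comprehend's DetectSentiment.

    `comprehend_client` is assumed to be a boto3 Comprehend client
    (or any object exposing the same `detect_sentiment` method).
    """
    # Step 1: define the input (plain text here; other services take an S3 location).
    request = {"Text": text, "LanguageCode": language_code}

    # Step 2: invoke the API with that input.
    response = comprehend_client.detect_sentiment(**request)

    # Step 3: the output arrives as JSON (a dict in Python); keep the useful fields.
    return {
        "sentiment": response["Sentiment"],
        "scores": response["SentimentScore"],
    }
```

With boto3 installed and AWS credentials configured, calling `detect_sentiment(boto3.client("comprehend"), "I love ready-to-use APIs!")` should return the detected sentiment together with its confidence scores.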
Let's Take a Look at the APIs
AWS offers a variety of ML and AI services designed to speed up adding these capabilities to your applications. They range from services that give you the infrastructure to train your own models to ready-to-use, pre-trained services that you invoke with an API call. Let's focus on some examples of the latter:
API Type | What you can do | Service Name |
---|---|---|
Analysis of images (.png, .jpg) and videos (.mp4) | | Amazon Rekognition |
Detection and analysis of text in documents (PNG, JPG, PDF, or TIFF) | | Amazon Textract |
Natural Language Processing (NLP) and text analysis | Processes documents and extracts information | Amazon Comprehend |
Text to speech | | Amazon Polly |
Speech to text | | Amazon Transcribe |
Translation | Translates unstructured text (UTF-8) documents and helps you build applications that work in multiple languages | Amazon Translate |
Use Cases
The most effective way to learn programming is by solving problems through code development. The same principle applies when learning how to use a service: you need to actively use it to understand it. The following four use cases are examples of both real and hypothetical problems that I tackled during my learning process.
If you're passionate about utilizing video as a tool for education, it would be ideal to reach as many people as possible. One common barrier to this is language. This application enables you to create subtitles and translate them into any desired language to remove this barrier.
- Upload the .mp4 video to an Amazon S3 bucket.
- A Lambda function calls the Transcribe API.
- The subtitle file in the original language is saved to the S3 bucket.
- A Lambda function calls the Translate API.
- The subtitle file in the new language is saved to the S3 bucket.
Here's the code to create this solution.
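Separately from the linked solution, here is a minimal sketch of the two API calls behind this flow. It assumes boto3 clients for Transcribe and Translate are passed in and that subtitles are produced in SRT format; the function names, the job name, and the bucket layout are hypothetical.

```python
def start_subtitle_job(transcribe_client, job_name, bucket, video_key,
                       language_code="en-US"):
    """Start a transcription job that also writes an .srt subtitle file to S3."""
    return transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": f"s3://{bucket}/{video_key}"},
        MediaFormat="mp4",
        LanguageCode=language_code,
        OutputBucketName=bucket,           # assumption: output goes to the same bucket
        Subtitles={"Formats": ["srt"]},
    )


def translate_srt(translate_client, srt_text, source_lang, target_lang):
    """Translate only the caption lines, leaving cue numbers and timestamps intact."""
    translated_lines = []
    for line in srt_text.splitlines():
        stripped = line.strip()
        # Blank lines, cue numbers, and timestamp lines are copied as they are.
        # (A caption consisting only of digits would also be skipped.)
        if not stripped or stripped.isdigit() or "-->" in stripped:
            translated_lines.append(line)
            continue
        response = translate_client.translate_text(
            Text=line,
            SourceLanguageCode=source_lang,
            TargetLanguageCode=target_lang,
        )
        translated_lines.append(response["TranslatedText"])
    return "\n".join(translated_lines)
```

Translating line by line keeps the timing information untouched at the cost of one API call per caption line; batching several captions per call would be a reasonable refinement.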
Many people keep piles of documents at home, from letters from past lovers to medical records, children's school memorabilia, and bank statements. Wouldn't it be convenient to store them neatly in the cloud? Explore and learn about the functionality of Textract and Comprehend with this app.
- Upload the document (PNG, JPG, PDF, or TIFF) to an S3 bucket.
- A Lambda function calls the Textract API.
- With the response from Textract, the Lambda function calls the Comprehend API.
- A Lambda function calls the Translate API.
- The response is saved in an S3 bucket.
Here's the code to create this solution.
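Independently of the linked solution, the chain of calls can be sketched as follows. The sketch assumes boto3 clients are passed in, a single-page document (multi-page PDFs would need Textract's asynchronous API), text short enough for the size limits of Comprehend and Translate, and a language that Comprehend's entity detection supports. The function names, the `results/` key prefix, and the shape of the saved JSON are hypothetical.

```python
import json


def lines_from_textract(textract_response):
    """Return the text of every LINE block in a Textract response."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def process_document(textract, comprehend, translate, s3, bucket, key,
                     target_lang="en"):
    """Extract text, analyze it, translate it, and save the result to S3."""
    # Textract reads the document straight from its S3 location.
    document = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    text = "\n".join(lines_from_textract(document))

    # Comprehend: pick the most likely language, then extract entities.
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    source_lang = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]
    entities = comprehend.detect_entities(
        Text=text, LanguageCode=source_lang
    )["Entities"]

    # Translate the extracted text into the target language.
    translation = translate.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode=target_lang,
    )["TranslatedText"]

    result = {
        "source_language": source_lang,
        "text": text,
        "entities": [{"text": e["Text"], "type": e["Type"]} for e in entities],
        "translation": translation,
    }
    # Hypothetical output location: results/<original key>.json in the same bucket.
    s3.put_object(
        Bucket=bucket,
        Key=f"results/{key}.json",
        Body=json.dumps(result).encode("utf-8"),
    )
    return result
```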
- Use case 3: Make Polly Talk
I was curious how an Italian speaker would sound speaking Chinese, and since Polly has native voices for each language, I created this notebook to play around.
- From a Jupyter notebook, call the Polly API.
- Polly stores the result in an S3 bucket.
- Retrieve the audio.
Here's the code to create this solution.
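Apart from the linked notebook, a minimal sketch of these steps could look like this. It assumes a boto3 Polly client is passed in and uses Polly's asynchronous synthesis task, which writes its output to S3; the function names, the `polly/` key prefix, and the polling settings are hypothetical. `Bianca` is one of Polly's Italian voices.

```python
import time


def start_polly_task(polly_client, text, voice_id, bucket, key_prefix="polly/"):
    """Ask Polly to synthesize `text` and store the MP3 in an S3 bucket."""
    response = polly_client.start_speech_synthesis_task(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        OutputS3BucketName=bucket,
        OutputS3KeyPrefix=key_prefix,
    )
    return response["SynthesisTask"]["TaskId"]


def wait_for_task(polly_client, task_id, poll_seconds=5, max_polls=60):
    """Poll until the task completes or fails, then return its description."""
    for _ in range(max_polls):
        task = polly_client.get_speech_synthesis_task(TaskId=task_id)["SynthesisTask"]
        if task["TaskStatus"] in ("completed", "failed"):
            return task
        time.sleep(poll_seconds)
    raise TimeoutError(f"Polly task {task_id} did not finish in time")
```

Once the task is completed, the `OutputUri` field of the returned task description points to the audio file in S3. In a notebook, `start_polly_task(boto3.client("polly"), "你好", "Bianca", "my-bucket")` would submit Chinese text to an Italian voice, where `my-bucket` stands for a bucket you own.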
- Use case 4: Video content moderation
I'm a fan of action movies and wanted to try Rekognition with the trailer of Die Hard, so I created this application and wow! Each dataframe is pure violence... I invite you to try it with a trailer of your favorite movie.
- Upload the .mp4 video to an S3 bucket.
- A Lambda function calls the Rekognition API.
- Once the video review is finished, a new Lambda function retrieves the result and stores it in an S3 bucket.
Here's the code to create this solution.
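As a rough sketch that is separate from the linked solution, the two Rekognition calls behind this flow might look as follows. It assumes a boto3 Rekognition client is passed in and that the second function runs only after the job has finished; the function names, the default confidence threshold, and the flattened output format are hypothetical.

```python
def start_moderation(rekognition_client, bucket, video_key, min_confidence=60.0):
    """Start an asynchronous content moderation job for a video stored in S3."""
    response = rekognition_client.start_content_moderation(
        Video={"S3Object": {"Bucket": bucket, "Name": video_key}},
        MinConfidence=min_confidence,
    )
    return response["JobId"]


def collect_moderation_labels(rekognition_client, job_id):
    """Fetch every page of results for a finished job as a flat list of dicts."""
    labels = []
    next_token = None
    while True:
        kwargs = {"JobId": job_id}
        if next_token:
            kwargs["NextToken"] = next_token
        page = rekognition_client.get_content_moderation(**kwargs)
        if page["JobStatus"] != "SUCCEEDED":
            raise RuntimeError(f"Job {job_id} is not ready: {page['JobStatus']}")
        for item in page["ModerationLabels"]:
            labels.append({
                "timestamp_ms": item["Timestamp"],  # offset from the start of the video
                "name": item["ModerationLabel"]["Name"],
                "confidence": item["ModerationLabel"]["Confidence"],
            })
        next_token = page.get("NextToken")
        if not next_token:
            return labels
```

A flat list of dictionaries like this loads directly into a pandas dataframe or serializes to JSON for storage in S3.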
Conclusion
You've now learned that AI/ML can be used via an API call to perform a variety of tasks, such as analyzing images and videos, detecting and analyzing text in scanned documents, and using Natural Language Processing (NLP) to detect the dominant language and sentiment of a text, among many other things. You can also convert text to speech and vice versa, and translate between languages, each within the reach of a single API call.
This just scratches the surface of what can be achieved by leveraging AI/ML applications via API calls.
No doubt, there's a real or hypothetical problem you'd like to address using one of these services. Even if you don't have one in mind, I've provided these links for you to continue experimenting and learning:
- Amazon Translate Code Samples, plus more code samples
- Amazon Transcribe and Amazon Comprehend Code Samples
- Amazon Polly Code Samples
- Amazon Rekognition Code Samples