How my kids learned English with Serverless and GenAI

#genai #serverless #lambda #programming

I'm a father of two kids; currently, they are 8 and 6 years old.

Last year, I received an invite to participate in the AWS Global Heroes Summit, a yearly meeting of all AWS Heroes from around the world.

Happily, the meeting was scheduled to take place during the kids' vacation here in Brazil, so it was an easy decision to have our family vacation together in Seattle, WA—the city where the event was held.

My kids love learning new languages and having new experiences when traveling. When we traveled to Argentina, they wanted to learn some words and sentences in Spanish. When we told them we were traveling to the US, they wanted to learn more English.

The school where they study has English classes every day, but we knew they needed to practice more before the trip to be better prepared.

Here is where I, as the nerd that I am, stepped in: how could I use technology to help with that? Especially Serverless and GenAI. And that’s how this project started.

The Idea

The idea was very simple. What if I created a memorization app for my kids? But I didn’t want to manually think about the words, images, or speech. First, because I wanted all words with no limitations, including not needing to translate every word. I didn't want to find an image for each of these words. Second, the biggest restriction I had was time. As a busy executive, I didn't have enough time to do all of these tasks and also use the solution with my kids.

With this in mind, I built an app that generates everything automatically. I just need to specify the type of words I want, for example, "animals" or "vehicles," and everything is created—including the words (both in English and Portuguese), the images, and the speech in both languages.

The app works like this: my kids see an image and an English word. When they click on it, they hear the word in English. If they click on the "card," the word changes to Portuguese, and clicking it again plays the pronunciation in Portuguese. That simple.

It’s a low-cost way to memorize English words while also learning more about AWS services.

Loading Words into the Database

First of all, to use the app, you need to load words into the database.

I created an endpoint that generates words based on what I request using GenAI. Because I personally load these words into the database, a simple REST endpoint is good to trigger the process. A Lambda function receives an invocation from an API Gateway and calls the Bedrock service to generate words in English along with their Portuguese translations, so I get a pair of words—a word and its translation. Basically, I have a prompt that asks Bedrock for a JSON with these words, concatenating the instructions I send in the payload of the endpoint. I'm using the Claude 3 Haiku model from Anthropic for that.

I receive a JSON like this:

{substantivos: [{ 'português': 'mesa', 'inglês': 'table' },{ 'português': 'cadeira', 'inglês': 'chair' }]}

Because I want these words to be generated and saved quickly in my database, I send each word pair as an event to EventBridge. The idea of using EventBridge is leveraging its power to parallelize the generation. Also, I can later have other applications listening to these events, for example, translating into another language (maybe for a future trip with the kids).

Then, I use Step Functions to handle the rest. In that workflow, I create an image for each word as well as speech for both words. Here, I have a classic necessity of doing things step by step. I need to create an image for the word, then generate speech for it, and in the end, save all the data. When I explain this, it’s clear to me that a workflow is necessary. Step Functions is great for that. One good thing was that I could parallelize the generation of images and speech by putting them side by side in my workflow. Once both are finished, I save the data. No fancy controls are needed—just a clean workflow. Easy to build, easy to read, and easy to understand what is happening.

For image generation, I used GenAI through Bedrock. The model is Amazon Titan Image—simple, cheap, and perfect for this use case. No need for complex images, just simple and clean visuals that are easily recognized by a child.

For speech, I used Polly.

Both images and audio files are saved in S3 buckets.

Afterward, all the data is saved in a DynamoDB table. Here’s an important detail: I have two tables. One to save all word data and another to store the number of words available. I do this to easily retrieve a random word in the app. I’ll explain more about that.

With that, I can load as many words as I want into my app.

Now, let’s talk about the app itself.

Here’s the app architecture:

As I explained, the idea of the app is to show random images to my kids so they can memorize and practice speaking each word.

So, I have an endpoint that retrieves a random word from the database and displays it to the user. Since I need to fetch a random word, I use a Lambda function between API Gateway and the DynamoDB table. This Lambda function gets the number of words available and generates a random number based on that. With this number, it finds the word in a DynamoDB table.

I'm serving images and audio from S3 through CloudFront with caching enabled.

Finally, I have a simple Angular page hosted on S3 and served via CloudFront.

You can try it out on our demo website, where everything is running: https://mee-moo.com

The code of this solution you can find in this repository in my GitHub: https://github.com/epiresdasilva/learn-english-with-serverless-genai

Conclusions

I like using challenges from my daily life with my family to demonstrate how we can use these services to solve real problems—usually with low cost and in a quick way.

It’s not different from the challenges we face in our companies. Why not use more of this approach to solve everyday problems? Sometimes, this is the first step in showcasing the value of these kinds of services, which can later be applied to core solutions.

This solution is basically free, running within the AWS Free Tier. Even during development, when I ran multiple tests to improve the solution, I spent no more than 10 dollars.