DynamoDB is used as the chat history backend, along with the AWS Lambda Web Adapter for response streaming.
In a previous blog, I demonstrated how to use Redis (ElastiCache Serverless, as an example) as a chat history backend for a Streamlit app using LangChain. It was deployed to EKS and also made use of EKS Pod Identity to manage the application Pod's
permissions for invoking Amazon Bedrock.
The use case here is a similar one: a chat application. I will switch back to implementing things in Go using langchaingo (I used Python for the previous one) and continue to use Amazon Bedrock. But there are a few unique things you can explore in this blog post:
- The chat application is deployed as an AWS Lambda function along with a Function URL.
- It uses DynamoDB as the chat history store (aka Memory) for each conversation - I extended langchaingo to include this feature.
- Thanks to the AWS Lambda Web Adapter, the application is built as a (good old) REST/HTTP API using a familiar library (in this case, Gin).
- And the other nice add-on was being able to combine the Lambda Web Adapter's response streaming feature with the Amazon Bedrock streaming inference API.
As always, a diagram helps...
Deploy using SAM CLI (Serverless Application Model)
Make sure you have the Amazon Bedrock prerequisites taken care of and the SAM CLI installed.
git clone https://github.com/abhirockzz/chatbot-bedrock-dynamodb-lambda-langchain
cd chatbot-bedrock-dynamodb-lambda-langchain
Run the following commands to build the function and deploy the entire app infrastructure (including the Lambda Function, DynamoDB, etc.)
sam build
sam deploy -g
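For orientation, here is a minimal sketch of what the SAM template could contain. Resource names, the Web Adapter layer version, and the policies are illustrative, not the repository's actual template.yaml (for instance, the function would also need permission to invoke Amazon Bedrock):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  ChatFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: bootstrap
      Runtime: provided.al2
      Environment:
        Variables:
          # Tells Lambda to route through the Web Adapter layer
          AWS_LAMBDA_EXEC_WRAPPER: /opt/bootstrap
      Layers:
        # AWS-managed Lambda Web Adapter layer (version number is illustrative)
        - !Sub arn:aws:lambda:${AWS::Region}:753240598075:layer:LambdaAdapterLayerX86:17
      FunctionUrlConfig:
        AuthType: NONE
        InvokeMode: RESPONSE_STREAM  # required for streaming responses
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref ChatHistoryTable

  ChatHistoryTable:
    Type: AWS::Serverless::SimpleTable
    Properties:
      TableName: langchain_chat_history
      PrimaryKey:
        Name: chat_id
        Type: String
```

Note the InvokeMode: RESPONSE_STREAM on the Function URL; without it, the streamed chunks would be buffered into a single response.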
Once deployed, you should see the Lambda Function URL in your terminal. Open it in a web browser and start conversing with the chatbot!
Inspect the DynamoDB table to verify that the conversations are being stored (each conversation ends up as a new item in the table with a unique chat_id):
aws dynamodb scan --table-name langchain_chat_history
The Scan operation is used here for demonstration purposes. Using Scan in production is not recommended.
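For targeted reads in a real application, a Query on the partition key is the better fit. Assuming chat_id is the table's partition key (the key value below is a placeholder), a sketch would look like:

```shell
# Fetch a single conversation by its partition key instead of
# scanning the whole table
aws dynamodb query \
  --table-name langchain_chat_history \
  --key-condition-expression "chat_id = :id" \
  --expression-attribute-values '{":id": {"S": "YOUR_CHAT_ID"}}'
```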
Quick peek at the good stuff....
- Using DynamoDB as the backend chat history store: Refer to the GitHub repository if you are interested in the implementation. To summarize, I implemented the required functions of the schema.ChatMessageHistory interface.
- Lambda Web Adapter streaming response + LangChain streaming: I used the chains.WithStreamingFunc option with the chains.Call invocation and let Gin's Stream do the heavy lifting of handling the streaming response.
Here is a sneak peek of the implementation (refer to the complete code here):
```go
_, err = chains.Call(c.Request.Context(), chain, map[string]any{"human_input": message},
	chains.WithMaxTokens(8191),
	chains.WithStreamingFunc(func(ctx context.Context, chunk []byte) error {
		c.Stream(func(w io.Writer) bool {
			// Fprint (not Fprintf) so that any % characters in the
			// model output are not interpreted as format verbs
			fmt.Fprint(w, string(chunk))
			return false
		})
		return nil
	}))
```
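To give a feel for what the chat history store boils down to, here is a minimal, self-contained sketch of how a conversation could be modeled and serialized into a table item. It deliberately avoids the AWS SDK and langchaingo imports, so the attribute names (chat_id, messages, type, text) and method shapes are illustrative, not the repository's actual schema; the real implementation satisfies langchaingo's schema.ChatMessageHistory interface and persists messages via DynamoDB calls.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// storedMessage mirrors how a single chat message might be persisted
// inside a DynamoDB item (attribute names are illustrative).
type storedMessage struct {
	Type string `json:"type"` // "human" or "ai"
	Text string `json:"text"`
}

// chatHistory is an in-memory stand-in for the DynamoDB-backed store;
// a real implementation would read/write the table keyed by chat_id.
type chatHistory struct {
	chatID   string
	messages []storedMessage
}

func (h *chatHistory) AddUserMessage(text string) {
	h.messages = append(h.messages, storedMessage{Type: "human", Text: text})
}

func (h *chatHistory) AddAIMessage(text string) {
	h.messages = append(h.messages, storedMessage{Type: "ai", Text: text})
}

// item renders the whole conversation as the JSON payload that would
// back the table item for this chat_id.
func (h *chatHistory) item() (string, error) {
	b, err := json.Marshal(map[string]any{
		"chat_id":  h.chatID,
		"messages": h.messages,
	})
	return string(b), err
}

func main() {
	h := &chatHistory{chatID: "demo-123"}
	h.AddUserMessage("hello")
	h.AddAIMessage("hi, how can I help?")
	s, _ := h.item()
	fmt.Println(s)
}
```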
Closing thoughts...
I really like the extensibility of LangChain. While langchaingo may not be as popular as the original Python version (I hope it will get there in due time 🤞), it's nice to be able to use it as a foundation and build extensions as required. Previously, I had written about how to use the AWS Lambda Go Proxy API to run existing Go applications on AWS Lambda. The AWS Lambda Web Adapter offers similar functionality, but it has lots of other benefits, including response streaming and the fact that it is language agnostic.
Oh, and one more thing - I also tried a different approach to building this solution using the API Gateway WebSocket. Let me know if you're interested, and I would be happy to write it up!
If you want to explore how to use Go for Generative AI solutions, you can read up on some of my earlier blogs:
- Building LangChain applications with Amazon Bedrock and Go - An introduction
- Serverless Image Generation Application Using Generative AI on AWS
- Generative AI Apps With Amazon Bedrock: Getting Started for Go Developers
- Use Amazon Bedrock and LangChain to build an application to chat with web pages
Happy building!