LangChain and FastAPI working in tandem provide a strong setup for the asynchronous streaming endpoints that LLM-integrated applications need. Modern chat applications live or die by how effectively they handle live data streams and how quickly they can respond.
Introduction to LangChain
LangChain is a library that simplifies the incorporation of language models into applications. It provides an abstracted layer over various components such as large language models (LLMs), data retrievers, and vector storage solutions. This abstraction allows developers to integrate and switch between different backend providers or technologies seamlessly.
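For instance, LangChain's chat model classes share a common interface, so swapping one provider for another is typically a one-line change. A minimal sketch, assuming the langchain-openai and langchain-anthropic packages are installed and the matching API keys are set (model names are illustrative):

from langchain_openai import ChatOpenAI
# from langchain_anthropic import ChatAnthropic

chat = ChatOpenAI(model="gpt-4o-mini")
# chat = ChatAnthropic(model="claude-3-5-sonnet-20240620")  # same interface

# Both classes expose the same invoke/stream/astream methods.
response = chat.invoke("Who are you?")
print(response.content)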
Introduction to FastAPI
FastAPI is a modern, high-performance web framework for building APIs with Python 3.7+ based on standard Python type hints. It is designed for creating RESTful APIs quickly and efficiently, with automatic interactive API documentation provided by Swagger UI and ReDoc.
Combining LangChain with FastAPI
By combining LangChain with FastAPI, developers can create robust, asynchronous streaming APIs that handle real-time data efficiently. This integration is particularly useful for applications that require live updates, such as chat applications or real-time analytics dashboards.
Setting Up the FastAPI Project
First, install the necessary packages:
pip install fastapi langchain langchain-openai pydantic uvicorn
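Since the endpoint below uses ChatOpenAI, also make sure your OpenAI API key is available in the environment (the key value here is a placeholder):

export OPENAI_API_KEY=your-key-here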
Next, define the FastAPI router and Pydantic models to structure and validate incoming messages.
from fastapi import FastAPI, APIRouter
from pydantic import BaseModel
from typing import List

app = FastAPI()
router = APIRouter()

class Message(BaseModel):
    role: str
    content: str

class ChatPayload(BaseModel):
    messages: List[Message]

    class Config:
        schema_extra = {
            "example": {
                "messages": [{"role": "user", "content": "Who are you?"}]
            }
        }
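A note on versions: in Pydantic v2, the Config key schema_extra was renamed to json_schema_extra; the snippet above uses the Pydantic v1 spelling. In either version, the example payload will appear pre-filled in the interactive API documentation.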
Creating the Streaming API Endpoint
Create an endpoint that receives chat messages and streams the model's response back to the client using LangChain, emitting the output as server-sent events (SSE).
from fastapi import Request
from fastapi.responses import StreamingResponse
from langchain_openai import ChatOpenAI
import json

@router.post("/api/completion")
async def stream(request: Request, payload: ChatPayload):
    chat = ChatOpenAI()
    # Convert the validated Pydantic models into (role, content) tuples,
    # one of the message formats LangChain chat models accept as input.
    messages = [(m.role, m.content) for m in payload.messages]
    return StreamingResponse(
        send_completion_events(messages, chat=chat),
        media_type="text/event-stream",
    )

async def send_completion_events(messages, chat):
    async for patch in chat.astream_log(messages):
        for op in patch.ops:
            # astream_log emits JSON Patch operations; new output chunks
            # arrive as "add" operations appended to /streamed_output.
            if op["op"] == "add" and op["path"] == "/streamed_output/-":
                content = op["value"] if isinstance(op["value"], str) else op["value"].content
                json_dict = {"type": "llm_chunk", "content": content}
                json_str = json.dumps(json_dict)
                # Each SSE event is a "data:" line followed by a blank line.
                yield f"data: {json_str}\n\n"

app.include_router(router)
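Note the event framing in the final yield: each server-sent event is a data: line terminated by a blank line (two newline characters). On the wire, the client receives something like this (chunk contents are illustrative):

data: {"type": "llm_chunk", "content": "I"}

data: {"type": "llm_chunk", "content": " am"}

data: {"type": "llm_chunk", "content": " an AI assistant."}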
Running the FastAPI Application
Run the FastAPI application using Uvicorn, an ASGI server implementation for Python (the command below assumes the code above lives in a file named main.py):
uvicorn main:app --reload
Navigate to http://127.0.0.1:8000/docs to see the interactive API documentation generated by FastAPI. This documentation provides an easy way to test the API endpoints.
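Swagger UI will not render the stream chunk by chunk, though. To watch tokens arrive in real time, you can consume the endpoint programmatically. Below is a minimal client sketch, assuming the third-party httpx package is installed (the URL and payload match the example above):

import asyncio
import httpx

async def main():
    payload = {"messages": [{"role": "user", "content": "Who are you?"}]}
    async with httpx.AsyncClient(timeout=None) as client:
        # Stream the response instead of buffering it all in memory.
        async with client.stream(
            "POST", "http://127.0.0.1:8000/api/completion", json=payload
        ) as response:
            async for line in response.aiter_lines():
                # SSE events arrive as "data: ..." lines separated by blank lines.
                if line.startswith("data: "):
                    print(line[len("data: "):])

asyncio.run(main())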
Why JSON Patch?
LangChain's astream_log method streams the state of a run as a sequence of JSON Patch (RFC 6902) operations, so a basic understanding of JSON Patch is essential for implementing this integration effectively. JSON Patch provides an efficient way to update parts of a JSON document incrementally without resending the entire document, which is particularly useful in real-time applications where data changes frequently.
Brief Overview of JSON Patch
JSON Patch supports several operation types, including:
- Add: Inserts a new value into the JSON document at the specified path.
- Remove: Removes the value at the specified path.
- Replace: Replaces the value at the specified path with a new value.
- Move: Moves a value from one path to another in the document.
- Copy: Copies a value from one path to another.
- Test: Tests that the value at the specified path matches a specified value.
Consider the original document:
{
  "baz": "qux",
  "foo": "bar"
}
Applying the patch:
[
  { "op": "replace", "path": "/baz", "value": "boo" },
  { "op": "add", "path": "/hello", "value": ["world"] },
  { "op": "remove", "path": "/foo" }
]
Results in:
{
  "baz": "boo",
  "hello": ["world"]
}
JSON Patch allows for efficient, incremental updates, making it ideal for applications that require frequent or real-time updates to their data.
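To see these semantics in Python, the example above can be replayed with the third-party jsonpatch package (an assumption of this sketch; it is not required by the server code earlier in the article):

import jsonpatch

document = {"baz": "qux", "foo": "bar"}
patch = [
    {"op": "replace", "path": "/baz", "value": "boo"},
    {"op": "add", "path": "/hello", "value": ["world"]},
    {"op": "remove", "path": "/foo"},
]

# apply_patch applies the operations in order and returns the new document.
result = jsonpatch.apply_patch(document, patch)
print(result)  # {'baz': 'boo', 'hello': ['world']}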
Conclusion
By integrating LangChain with FastAPI, developers can build efficient asynchronous streaming APIs capable of handling real-time data. This setup is ideal for applications like chatbots, where timely responses and data processing are crucial. FastAPI's ease of use and LangChain's abstraction capabilities, combined with the efficiency of JSON Patch, make this combination a powerful tool for modern web development.
Want to learn more about building Responsive LLMs? Check out my course on newline: Responsive LLM Applications with Server-Sent Events
I cover:
- How to design systems for AI applications
- How to stream the answer of a Large Language Model
- Differences between Server-Sent Events and WebSockets
- Importance of real-time for GenAI UI
- How asynchronous programming in Python works
- How to integrate LangChain with FastAPI
- What problems Retrieval Augmented Generation can solve
- How to create an AI agent ... and much more.
Worth checking out if you want to build your own LLM applications: the course provides extensive code examples and helps you go from concept to deployment.