What is LangChain?
LangChain is a framework that helps you build LLM-driven applications. One of the patterns it makes much easier to implement is RAG, or Retrieval-Augmented Generation.
RAG or Retrieval Augmented Generation
RAG is an architecture that pairs an LLM with an external knowledge source. When a user sends a query, a retriever module selects the information most relevant to that query. The retrieved information is then passed as context to a generator module (the LLM), which uses it to produce a grounded response.
Ingestion
Ingestion is the process of transforming source documents into numeric representations, called vectors, that computers can compare. The whole process, sketched in code after the list, includes:
- Loading the source document: This is done with the help of loaders. The source can be a PDF, CSV, or JSON file, or a web page.
- Splitting the document's text into chunks: This is done with the help of splitters, which break the text into smaller pieces that fit within the model's context window.
- Converting each of those chunks into vectors: The chunks are converted into vectors known as embeddings, which are then stored in a vector store.
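Here is a minimal sketch of this pipeline with LangChain.js. The import paths, the file name sample.pdf, and the chunking parameters are assumptions that vary by project and LangChain version:

```typescript
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// 1. Load the source document (a local PDF in this example).
const loader = new PDFLoader("sample.pdf");
const docs = await loader.load();

// 2. Split the text into smaller, overlapping chunks.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);

// 3. Embed each chunk and store the vectors in an in-memory vector store.
const vectorStore = await MemoryVectorStore.fromDocuments(
  chunks,
  new OpenAIEmbeddings() // requires OPENAI_API_KEY in the environment
);
```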
What happens when you query the LLM?
Once the data is ingested into the vector store, a user's query is converted into an embedding behind the scenes. The vector store is searched for the most similar vectors, and the text chunks associated with them are returned. Those chunks are then passed to the LLM as context, and the LLM generates the text of the response to the user.
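The sketch below illustrates that query path, reusing the vectorStore built in the ingestion example. The model name, the prompt wording, and the choice of four results are illustrative assumptions:

```typescript
import { ChatOpenAI } from "@langchain/openai";

const question = "What does the document say about ingestion?";

// The query is embedded and the vector store is searched for the
// four most similar chunks.
const relevantChunks = await vectorStore.similaritySearch(question, 4);
const context = relevantChunks.map((d) => d.pageContent).join("\n\n");

// The retrieved text, not the raw vectors, is handed to the LLM as context.
const llm = new ChatOpenAI({ model: "gpt-4o-mini" });
const response = await llm.invoke(
  `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`
);
console.log(response.content);
```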
Architecture of a chatbot in LangChain
In LangChain, one common chatbot architecture is the one described below. It is structured so that it can handle follow-up questions.
First, the chat history and the new question are passed to the LLM, which is asked to rephrase them as a single standalone question. That standalone question is used to retrieve the relevant documents from the vector store. Finally, the LLM uses both the relevant documents and the standalone question to generate a response to the user.
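Below is a hedged sketch of this two-step flow, reusing the llm and vectorStore from the earlier examples. The prompt wording and the sample chat history are illustrative, not LangChain's built-in prompts:

```typescript
import { PromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const condensePrompt = PromptTemplate.fromTemplate(
  `Given the chat history and a follow-up question, rephrase the follow-up
as a standalone question.

Chat history:
{chat_history}

Follow-up question: {question}
Standalone question:`
);

// Step 1: condense the chat history and the new question into a
// standalone question.
const condense = condensePrompt.pipe(llm).pipe(new StringOutputParser());
const standalone = await condense.invoke({
  chat_history: "Human: What is LangChain?\nAI: A framework for LLM apps.",
  question: "Does it support loading PDFs?",
});

// Step 2: retrieve documents with the standalone question, then use both
// to generate the final answer.
const retrieved = await vectorStore.similaritySearch(standalone, 4);
const reply = await llm.invoke(
  `Context:\n${retrieved.map((d) => d.pageContent).join("\n\n")}\n\n` +
    `Question: ${standalone}`
);
console.log(reply.content);
```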
Technical Terms
A few technical terms associated with LangChain:
- Embedding: A vector representation of text that captures its semantic meaning.
- Runnable: LangChain's common interface for components such as models, prompts, and parsers, which lets you chain them so they run in a pre-defined order (see the sketch after this list).
- Streaming: Allows you to receive the output chunk by chunk as it is generated.
- Batch: Allows you to process multiple inputs efficiently in a batch or parallelized manner.
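The short sketch below, reusing the llm from the earlier examples, shows these ideas together: a chained runnable that can be invoked, streamed, or batched. The prompt text and the example terms are illustrative:

```typescript
import { PromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// A chain is itself a runnable: prompt -> model -> parser, in order.
const chain = PromptTemplate.fromTemplate("Define {term} in one sentence.")
  .pipe(llm)
  .pipe(new StringOutputParser());

// invoke: run the chain once on a single input.
console.log(await chain.invoke({ term: "embedding" }));

// stream: receive the output chunk by chunk as it is generated.
for await (const chunk of chain.stream({ term: "vector store" })) {
  process.stdout.write(chunk);
}

// batch: process several inputs in a parallelized manner.
console.log(await chain.batch([{ term: "runnable" }, { term: "RAG" }]));
```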
Sources
LangChain Documentation: https://js.langchain.com/docs/get_started
LangChain YouTube Channel: https://youtu.be/AKsfHK_4tf4?si=ZEIlHWBp4Qfco3O-
For more:
Chatting with a PDF using LangChain - 1
Chatting with a PDF using LangChain - 2