In the field of large language model (LLM) application development, LangChain Expression Language (LCEL) and AgentExecutor have been valuable tools for developers. However, as application scenarios become more complex, the limitations of these tools are becoming increasingly apparent. This article delves into the shortcomings of LCEL and AgentExecutor and introduces a new solution.
Limitations of LCEL Chain Expressions
LangChain Expression Language (LCEL) provides a convenient way to create chain applications, linking components such as knowledge bases, LLMs, prompts, tool calls, and output parsers into a directed acyclic graph. Although LCEL greatly reduces the difficulty of developing LLM applications, it still has some obvious limitations when dealing with complex, dynamic conversational flows:
- Linear Processes: LCEL chains are usually linear, executing steps in a predefined order. This linear structure limits the ability to perform dynamic routing and conditional branching in conversations, making it difficult to handle complex conversational scenarios.
- Complex State Management: In multi-turn conversations, state management with LCEL becomes cumbersome. Each chain call requires manually passing and updating the state, which both increases code complexity and invites errors (a sketch of this bookkeeping follows the list).
- Non-intuitive Tool Integration: Although LCEL chains can call external tools, integrating and coordinating the use of multiple tools within the chain's internal structure is not intuitive, especially when dynamically selecting tools based on conversational context.
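To make the second point concrete, here is a minimal sketch of the manual bookkeeping a multi-turn LCEL chain requires. The model name and prompt are placeholders, and the `ask` helper is hypothetical, invented purely for illustration:

```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("history"),
    ("human", "{input}"),
])
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

history = InMemoryChatMessageHistory()

def ask(text: str) -> str:
    # The caller, not the chain, is responsible for threading the
    # conversation state through every single turn.
    answer = chain.invoke({"history": history.messages, "input": text})
    history.add_user_message(text)
    history.add_ai_message(answer)
    return answer
```

The chain itself stays stateless; every turn, the surrounding code must read, pass, and write the history by hand.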
For example, when constructing a chain that decomposes a question into sub-questions and answers them in parallel, the complexity of LCEL expressions becomes apparent:
```python
from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_openai import ChatOpenAI

# decomposition_prompt, sub_question_prompt, prompt, retriever, and
# format_qa_pairs are assumed to be defined elsewhere in the application.
# llm_output_str is assumed to be a model piped into a string parser:
llm_output_str = ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

# Decomposition chain: split the question into sub-questions, one per line.
decomposition_chain = (
    {"question": RunnablePassthrough()}
    | decomposition_prompt
    | ChatOpenAI(model="gpt-4o-mini", temperature=0)
    | StrOutputParser()
    | (lambda x: x.strip().split("\n"))
)

# Sub-question answer generation chain: retrieve context for one
# sub-question, then answer it.
sub_question_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | sub_question_prompt
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

# Assembly chain: fan out over the sub-questions with .map(), then
# format the question/answer pairs into the final prompt.
chain = (
    {"question": RunnablePassthrough(), "questions": decomposition_chain}
    | {
        "question": itemgetter("question"),
        "questions": itemgetter("questions"),
        "answers": itemgetter("questions") | sub_question_chain.map(),
    }
    | RunnableLambda(format_qa_pairs)
    | prompt
    | llm_output_str
)

# Usage (hypothetical question): chain.invoke("What is an LLM agent?")
```
This example shows that as soon as the composition nests even slightly deeper, constructing LCEL chains becomes quite involved: the developer must track dictionary keys and data shapes by hand at every step.
Limitations of AgentExecutor
The emergence of AgentExecutor addresses some of LCEL's shortcomings by allowing agents to dynamically select tools and actions based on input. However, AgentExecutor has notable limitations of its own:
- Complexity: Configuring and using AgentExecutor is relatively involved, especially when handling complex conversational flows and multi-turn dialogues. It requires manually managing the agent's state and tool calls, which increases development difficulty (see the sketch after this list).
- Limited Dynamic Routing Capability: Although AgentExecutor supports dynamic tool selection, it is still not flexible enough when handling complex conditional branches and dynamic routing. It lacks an intuitive way to define and execute complex conversational flows.
- Lack of State Persistence: AgentExecutor lacks a built-in state persistence mechanism for long-running conversations. Each time a conversation restarts, it must start from scratch, unable to resume the previous conversation state.
- Over-encapsulation: AgentExecutor requires the wrapped agent to meet specific requirements, such as fixed input variables, fixed prompt formats, and fixed parsers, which makes it difficult to extend or customize.
- Black-box Uncontrollability: When building complex agents, there is no way to modify the order of tool calls or insert human interaction while the executor is running.
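For a sense of the fixed scaffolding involved, here is a minimal tool-calling setup; the `get_weather` tool is a stub invented for this sketch:

```python
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Look up the current weather for a city."""
    return f"Sunny in {city}"  # stubbed result for illustration

# The prompt must expose a placeholder named exactly "agent_scratchpad",
# one example of the fixed-format requirements noted above.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
agent = create_tool_calling_agent(llm, [get_weather], prompt)
executor = AgentExecutor(agent=agent, tools=[get_weather], verbose=True)

result = executor.invoke({"input": "What's the weather in Paris?"})
```

Even in this toy case, the loop between model and tools is hidden inside AgentExecutor: there is no seam for reordering tool calls, pausing for human approval, or persisting state between runs.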
Introducing New Concepts: Graphs and State Machines
Faced with the limitations of LCEL and AgentExecutor, we need a more flexible and powerful framework to build complex agent applications. Before introducing this new framework, let's first understand the concepts of "graphs" and "state machines" through a simple example.
Imagine the scenario of a "baby care state graph":
- State: The baby's behavioral state, including hunger level, sleep state, body temperature, etc.
- Events: Actions like soothing to sleep, feeding, changing diapers.
- Nodes: Mother (decision-maker), grandmother/grandfather (checker), father (executor).
We can construct this scenario as a state graph (a toy code sketch follows the list below). In this state graph:
- The mother decides what event needs to be executed.
- The elders judge whether the mother's decision is reasonable.
- The father executes the specific event.
- After executing a specific event, the baby's state is updated, and the mother checks the state again to continue making decisions.
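To make the analogy concrete, here is a toy state machine for this scenario in plain Python. It is purely illustrative and not any particular library's API:

```python
from typing import TypedDict

class BabyState(TypedDict):
    hungry: bool
    asleep: bool

def mother(state: BabyState) -> str:
    """Decision node: choose the next event from the current state."""
    if state["hungry"]:
        return "feed"
    if not state["asleep"]:
        return "soothe_to_sleep"
    return "done"

def grandparents(event: str, state: BabyState) -> bool:
    """Checker node: approve or veto the decision."""
    return event != "feed" or state["hungry"]  # only feed a hungry baby

def father(event: str, state: BabyState) -> BabyState:
    """Executor node: apply the event and return the updated state."""
    if event == "feed":
        return {**state, "hungry": False}
    if event == "soothe_to_sleep":
        return {**state, "asleep": True}
    return state

state: BabyState = {"hungry": True, "asleep": False}
while (event := mother(state)) != "done":
    if grandparents(event, state):
        state = father(event, state)
print(state)  # {'hungry': False, 'asleep': True}
```

Nodes decide, check, and act; edges carry the approved events; the loop runs until the shared state says there is nothing left to do.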
This "baby care state graph" is essentially a simplified graph structure and state machine model. Applying this concept to LLM/Agent application development, we can obtain a more flexible and powerful framework.
Conclusion
By analyzing the limitations of LCEL and AgentExecutor, we can see that building complex LLM applications calls for a more flexible framework, and the concepts of graph structures and state machines point the way. In the next article, we will introduce LangGraph, a new framework built on these concepts, explain how it addresses the issues raised here, and show how to use it to build more powerful LLM applications.