
foxgem


Code Explanation: "AI-Powered Hedge Fund"

Disclaimer: this is a report generated with my tool: https://github.com/DTeam-Top/tsw-cli. See it as an experiment, not formal research. 😄


Summary

This repository simulates an AI-powered hedge fund, where multiple AI agents, each embodying the investment strategy of a famous investor, analyze financial data and make trading decisions. The system incorporates tools for backtesting these strategies. It is important to note that this is a proof-of-concept and is intended for educational and research purposes only.

Modules

  • src/agents: Contains the definitions for each investment agent (e.g., Warren Buffett, Cathie Wood) and their specific logic for analyzing stocks and generating trading signals.
  • src/tools: Defines tools used by the agents, primarily API calls for fetching financial data.
  • src/llm: Manages the configuration and instantiation of Large Language Models (LLMs) from different providers (OpenAI, Groq, etc.).
  • src/graph: Defines the structure of the agent workflow.
  • src/utils: Provides utility functions for tasks such as displaying information, managing progress, and interacting with LLMs.
  • src/data: Includes data models (using Pydantic) and a caching mechanism for API responses.
  • src/backtester.py: Implements the backtesting functionality to evaluate the performance of the AI hedge fund over historical data.
  • src/main.py: Serves as the main entry point for running the AI hedge fund.

Code Structure

Section 1: Agent Definitions (src/agents)

This section defines the core logic of the AI hedge fund: the investment agents. Each agent is responsible for analyzing stocks based on a specific investment philosophy.

  • Key Classes/Files:

    • Each file (e.g., warren_buffett.py, cathie_wood.py) defines a single agent.
    • Each agent function takes the current AgentState as input and returns an updated AgentState.
    • Agents use the tools.api module to fetch financial data.
    • Agents use LLMs from llm.models to generate investment signals based on their analysis.
    • Agents output a trading signal (bullish, bearish, or neutral) with a confidence level and reasoning.
  • Workflow:

  1. Data Fetching: Agents retrieve financial data using functions from src/tools/api.py. This includes financial metrics, line items, insider trades, and company news.
  2. Analysis: The agents apply their specific investment strategies to the fetched data. This often involves calculating financial ratios, analyzing trends, and assessing market sentiment.
  3. Reasoning (LLM): Many agents utilize LLMs to synthesize their analysis and generate a well-reasoned investment signal. The prompt engineering within these agents is crucial for guiding the LLM's decision-making process.
  4. Signal Generation: Each agent outputs a trading signal (bullish, bearish, or neutral), along with a confidence level and a textual explanation of their reasoning.
  • Key Functions:

    • warren_buffett_agent(state: AgentState): Example of an agent function.
    • generate_buffett_output(...): Example of a function that uses an LLM to generate investment decisions.
    • analyze_fundamentals(...): Example of a function that analyzes company fundamentals against Buffett's criteria.
  • Usage of LLMs:

    • The agents use LLMs to provide human-like reasoning for their investment decisions. The quality of the LLM's output depends heavily on the prompts provided to it, which are designed to guide the LLM in emulating the investment style of the specific investor.
    • The file src/utils/llm.py contains the call_llm function, which simplifies the process of calling an LLM, handling errors, and parsing the LLM's output into a Pydantic model. This function is used in multiple agents.
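To make the agent pattern concrete, here is a minimal, self-contained sketch of the fetch–analyze–signal flow described above. The AgentState fields, the scoring thresholds, and the direct score-to-signal mapping are illustrative assumptions; the real repo passes LangGraph state and delegates the reasoning step to an LLM.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Toy stand-in for the repo's LangGraph state."""
    data: dict = field(default_factory=dict)
    signals: dict = field(default_factory=dict)

def analyze_fundamentals(metrics: dict) -> int:
    """Hypothetical Buffett-style check: reward high ROE and low leverage."""
    score = 0
    if metrics.get("return_on_equity", 0) > 0.15:
        score += 1
    if metrics.get("debt_to_equity", 1) < 0.5:
        score += 1
    return score

def warren_buffett_agent(state: AgentState) -> AgentState:
    metrics = state.data.get("financial_metrics", {})
    score = analyze_fundamentals(metrics)
    # In the real repo an LLM synthesizes the reasoning; here the
    # signal is derived directly from the fundamentals score.
    signal = "bullish" if score == 2 else "neutral" if score == 1 else "bearish"
    state.signals["warren_buffett"] = {
        "signal": signal,
        "confidence": 50 * score,
        "reasoning": f"fundamentals score {score}/2",
    }
    return state

state = AgentState(data={"financial_metrics": {"return_on_equity": 0.22,
                                               "debt_to_equity": 0.3}})
state = warren_buffett_agent(state)
print(state.signals["warren_buffett"]["signal"])  # bullish
```

The key design point survives the simplification: every agent has the same state-in, state-out signature, which is what lets the workflow compose them freely.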

Section 2: Data Handling (src/data) and API Calls (src/tools)

This section focuses on how the system retrieves and manages financial data.

  • Key Classes/Files:

    • src/data/models.py: Defines Pydantic models for representing financial data (e.g., Price, FinancialMetrics, CompanyNews).
    • src/data/cache.py: Implements an in-memory cache to store API responses and reduce the number of external API calls.
    • src/tools/api.py: Provides functions for fetching financial data from the FinancialDatasets API.
  • Workflow:

  1. API Calls: The src/tools/api.py module makes calls to the FinancialDatasets API to retrieve financial data. It constructs the appropriate URLs and handles API authentication using an API key stored in environment variables.
  2. Data Modeling: The API responses are parsed using Pydantic models defined in src/data/models.py. This ensures data consistency and type safety.
  3. Caching: The src/data/cache.py module provides an in-memory cache to store API responses. Before making an API call, the system checks the cache for the requested data. If the data is found in the cache and is still valid, it is used directly, avoiding the need for an external API call.
  • Key Functions:

    • get_prices(ticker: str, start_date: str, end_date: str): Fetches price data for a given ticker within a specified date range.
    • get_financial_metrics(ticker: str, end_date: str, period: str = "ttm", limit: int = 10): Retrieves financial metrics for a given ticker.
    • get_insider_trades(ticker: str, end_date: str, start_date: str | None = None, limit: int = 1000): Retrieves insider trading data for a given ticker.
    • get_company_news(ticker: str, end_date: str, start_date: str | None = None, limit: int = 1000): Retrieves company news articles for a given ticker.
  • External API Calls:

    • FinancialDatasets API: Used to retrieve financial data, including stock prices, financial metrics, insider trades, and company news.
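The check-cache-then-fetch flow in step 3 can be sketched in a few lines. The TTL, key scheme, and function name below are assumptions for illustration, not the actual API of src/data/cache.py:

```python
import time

_cache: dict[str, tuple[float, object]] = {}
TTL_SECONDS = 300  # assumed freshness window

def cached_fetch(key: str, fetch_fn):
    """Return cached data if fresh; otherwise call fetch_fn and store it."""
    now = time.time()
    if key in _cache:
        stored_at, value = _cache[key]
        if now - stored_at < TTL_SECONDS:
            return value          # cache hit: skip the external API call
    value = fetch_fn()            # cache miss: hit the API
    _cache[key] = (now, value)
    return value

calls = []
def fake_api():
    """Stand-in for a FinancialDatasets API request."""
    calls.append(1)
    return {"ticker": "AAPL", "price": 190.0}

cached_fetch("prices:AAPL:2024-01-01:2024-06-30", fake_api)
cached_fetch("prices:AAPL:2024-01-01:2024-06-30", fake_api)
print(len(calls))  # 1 — the second lookup was served from the cache
```

Keying the cache on ticker plus date range (as assumed here) is what makes repeated backtest runs over the same window cheap.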

Section 3: LLM Management (src/llm)

This section handles the configuration and instantiation of Large Language Models (LLMs) from different providers.

  • Key Classes/Files:

    • src/llm/models.py: Defines the available LLMs, their providers, and utility functions for interacting with them.
  • Workflow:

  1. Model Definition: The src/llm/models.py file defines the available LLMs and their providers using the LLMModel class.
  2. Model Instantiation: The get_model(model_name: str, model_provider: ModelProvider) function instantiates an LLM from the specified provider using the appropriate API key stored in environment variables.
  3. LLM Interaction: The call_llm function in src/utils/llm.py simplifies the process of calling an LLM, handling errors, and parsing the LLM's output into a Pydantic model.
  • Key Functions:
    • get_model(model_name: str, model_provider: ModelProvider): Instantiates an LLM from the specified provider.
    • call_llm(prompt: Any, model_name: str, model_provider: str, pydantic_model: Type[T], agent_name: Optional[str] = None, max_retries: int = 3, default_factory = None): Simplifies the process of calling an LLM, handling errors, and parsing the LLM's output into a Pydantic model.
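The retry-and-parse behavior attributed to call_llm can be sketched as below. The real function validates against a Pydantic model and talks to a live provider; this illustrative version substitutes a dataclass and a fake model callable, and the error types retried on are assumptions:

```python
import json
from dataclasses import dataclass

@dataclass
class TradingSignal:
    """Stand-in for the Pydantic output model."""
    signal: str
    confidence: float

def call_llm(prompt: str, model, output_cls, max_retries: int = 3):
    """Call the model, parse its output into output_cls, retry on failure."""
    last_error = None
    for _ in range(max_retries):
        try:
            raw = model(prompt)                   # would be a provider API call
            return output_cls(**json.loads(raw))  # parse and validate output
        except (json.JSONDecodeError, TypeError) as e:
            last_error = e                        # malformed output: retry
    raise RuntimeError(f"LLM failed after {max_retries} tries: {last_error}")

# Fake model that fails once, then returns valid JSON.
responses = iter(["not json", '{"signal": "bullish", "confidence": 85.0}'])
result = call_llm("Analyze AAPL", lambda p: next(responses), TradingSignal)
print(result.signal)  # bullish
```

Centralizing retries and structured parsing in one helper is what keeps the individual agents free of error-handling boilerplate.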

Section 4: Workflow Management (src/graph)

This section defines the structure of the agent workflow using LangGraph.

  • Key Classes/Files:

    • src/graph/state.py: Defines the AgentState class, which represents the state of the agent workflow.
    • src/main.py: Creates the LangGraph workflow.
  • Workflow:

  1. State Definition: The AgentState class in src/graph/state.py defines the data that is passed between agents in the workflow. This includes the list of messages, the data dictionary, and metadata.
  2. Workflow Creation: The create_workflow(selected_analysts=None) function in src/main.py creates the LangGraph workflow. This involves adding nodes for each selected agent, as well as nodes for risk management and portfolio management.
  3. Workflow Execution: The run_hedge_fund function in src/main.py executes the LangGraph workflow. This involves invoking the compiled LangGraph app with the initial state and then iterating over the resulting messages.
  • Key Functions:
    • create_workflow(selected_analysts=None): Creates the LangGraph workflow.
    • run_hedge_fund(...): Executes the LangGraph workflow.
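The wiring described in steps 1–3 can be approximated without LangGraph itself: nodes are functions that take and return a shared state, and the "graph" below is reduced to a linear pipeline. Node names and the 10% position cap are invented for the example; the real create_workflow builds and compiles a langgraph StateGraph:

```python
def create_workflow(selected_analysts=None):
    """Build a toy pipeline of analyst, risk, and portfolio nodes."""
    def buffett(state):
        state.setdefault("signals", {})["buffett"] = "bullish"
        return state

    def risk_management(state):
        state["max_position"] = 0.1  # assumed cap: 10% per position
        return state

    def portfolio_management(state):
        bullish = "bullish" in state.get("signals", {}).values()
        state["decision"] = "buy" if bullish else "hold"
        return state

    analysts = {"buffett": buffett}
    nodes = [analysts[a] for a in (selected_analysts or ["buffett"])]
    nodes += [risk_management, portfolio_management]  # always appended

    def run(state):
        for node in nodes:       # LangGraph would follow compiled edges here
            state = node(state)
        return state
    return run

app = create_workflow(["buffett"])
final = app({"tickers": ["AAPL"]})
print(final["decision"])  # buy
```

The selected_analysts parameter mirrors how the repo lets users pick which agents participate while the risk and portfolio nodes are always present.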

Section 5: Backtesting (src/backtester.py) and Main Execution (src/main.py)

This section covers the backtesting functionality and the main entry point for running the AI hedge fund.

  • Key Classes/Files:

    • src/backtester.py: Implements the backtesting functionality.
    • src/main.py: Serves as the main entry point for running the AI hedge fund.
  • Workflow:

  1. Backtesting: The Backtester class in src/backtester.py simulates the execution of the AI hedge fund over historical data. It iterates over a date range, fetches historical data, executes trades based on the agent's decisions, and tracks the portfolio's performance.
  2. Main Execution: The src/main.py file parses command-line arguments, prompts the user for input (e.g., tickers, analysts, model), creates the LangGraph workflow, and executes the AI hedge fund.
  • Key Functions:
    • Backtester.run_backtest(): Executes the backtesting simulation.
    • run_hedge_fund(...): Executes the AI hedge fund.
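The backtest loop in step 1 reduces to: iterate over dates, obtain a decision, execute it against that day's price, and mark the portfolio to market. The synthetic prices and the buy-once "agent" below are made up for illustration; the real Backtester invokes the full agent workflow per day:

```python
def run_backtest(prices: list[float], cash: float = 10_000.0):
    """Simulate trading over a price series and return the equity curve."""
    shares = 0.0
    history = []
    for day, price in enumerate(prices):
        decision = "buy" if day == 0 else "hold"  # stand-in for agent output
        if decision == "buy" and cash >= price:
            qty = cash // price        # buy as many whole shares as cash allows
            shares += qty
            cash -= qty * price
        history.append(cash + shares * price)  # mark-to-market each day
    return history

curve = run_backtest([100.0, 105.0, 110.0])
print(curve)  # [10000.0, 10500.0, 11000.0]
```

Tracking the equity curve day by day, rather than only the final value, is what makes metrics like drawdown computable from a backtest.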

DB Schema

There is no explicit database schema defined, as the system uses the FinancialDatasets API for data retrieval and an in-memory cache for temporary storage.

External API Calls

  • FinancialDatasets API: This API is used to retrieve financial data, including:
    • Stock prices (/prices/)
    • Financial metrics (/financial-metrics/)
    • Insider trades (/insider-trades/)
    • Company news (/news/)
    • Search for line items (/financials/search/line-items)

Insights

  • Modular Design: The codebase is well-organized into modules, each with a specific responsibility. This makes the code easier to understand, maintain, and extend.
  • Agent-Based Architecture: The use of independent agents, each with its own investment strategy, allows for a flexible and extensible system. New agents can be easily added to the system to incorporate different investment philosophies.
  • LLM Integration: The integration of LLMs into the agents' decision-making process is a key feature of this project. The LLMs provide human-like reasoning and explanations for the agents' investment decisions.
  • Caching: The in-memory cache helps reduce the number of external API calls and improve the system's performance.
  • Backtesting: The backtesting functionality allows for evaluating the performance of the AI hedge fund over historical data. This is crucial for validating the effectiveness of the investment strategies.
  • Progress Tracking: The src/utils/progress.py module provides a mechanism for tracking the progress of the agents, providing valuable insights into the system's operation.

Design Experiences

  • Agent Customization: The system allows users to select which agents to include in the workflow. This enables users to customize the AI hedge fund to their specific investment preferences.
  • Flexibility: The modular design and agent-based architecture make the system highly flexible and extensible. New agents, data sources, and trading strategies can be easily added to the system.

Creativity

  • Emulating Famous Investors: The concept of creating AI agents that emulate the investment styles of famous investors is a novel and engaging approach to exploring the use of AI in finance.
  • LangGraph: The use of LangGraph to orchestrate the agent workflow provides a structured and flexible way to manage the interactions between agents.

GitHub Repo: AI-Powered Hedge Fund


Report generated by TSW-X
Advanced Research Systems Division
Date: 2025-03-11 01:49:55.145783
