Disclaimer: this is a report generated with my tool: https://github.com/DTeam-Top/tsw-cli. See it as an experiment, not formal research. 😄
Summary
This repository implements the Minions protocol, enabling collaboration between small, on-device language models (LLMs) and larger, more powerful cloud-based LLMs. By leveraging the strengths of both local and remote models, the system aims to reduce cloud costs while maintaining or improving the quality of generated outputs. The repository provides a demonstration of this protocol through a Streamlit application and example code.
Modules
- `minions`: Core logic for the Minion and Minions protocols, including job preparation, execution, and output transformation.
- `minions.clients`: Clients for interacting with various local (e.g., Ollama, Tokasaurus, MLX) and remote (e.g., OpenAI, Anthropic, Together, Groq, OpenRouter, Perplexity) LLM providers.
- `minions.utils`: Utility functions for tasks like PII extraction.
- `minions.prompts`: Prompts used by the local and remote LLMs.
- `minions.examples`: Example tasks and data for different domains (health, code, finance, etc.).
- `mcp`: Tools for integration with the Model Context Protocol.
- `app.py`: Streamlit application demonstrating the Minion/Minions protocols.
Code Structure
Section 1: Entry Points and Core Logic (`run_minion_cli.py`, `app.py`, `minions`)
This section contains the entry points for running the Minion/Minions protocols and the core logic that orchestrates the interaction between local and remote LLMs.
- `run_minion_cli.py`: A command-line interface for running the Minion/Minions protocols. It handles argument parsing, context loading, client initialization, and protocol execution.
- `app.py`: A Streamlit application that provides a user-friendly interface for the Minion/Minions protocols. It lets users specify a task, upload context documents, select local and remote LLMs, and view the results. It registers a `message_callback` function to display messages from the local and remote models in the chat interface.
- `minions` directory: Contains the core logic for the Minion and Minions protocols:
  - `minion.py`: Implements the Minion protocol, where a local LLM refines the prompt and a remote LLM generates the final answer.
  - `minions.py`: Implements the Minions protocol, where a local LLM processes chunks of the document and a remote LLM aggregates the results into the final answer.
  - `minions_mcp.py`: Implements the Minions protocol with Model Context Protocol (MCP) integration, allowing the local LLM to access external tools and data sources.

The `Minion` and `Minions` classes both implement the `__call__` method, which is the entry point for executing the respective protocols. This method orchestrates the interaction between the local and remote LLMs, passing prompts and data between them and handling any necessary pre- and post-processing steps, as in the sketch below.
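Here is a minimal sketch of driving the Minion protocol from Python. The import paths and class names come from the sections above; the model names, the `max_rounds` parameter, and the `final_answer` output key are assumptions for illustration.

```python
# Minimal sketch: pair a local Ollama model with a remote OpenAI model.
# Parameter names (task, context, max_rounds) and the "final_answer" key
# are assumptions; check the repository for the exact signatures.
from minions.clients.ollama import OllamaClient
from minions.clients.openai import OpenAIClient
from minions.minion import Minion

local_client = OllamaClient(model_name="llama3.2")   # small, on-device model
remote_client = OpenAIClient(model_name="gpt-4o")    # large, cloud model

minion = Minion(local_client, remote_client)

with open("minions/examples/health/sample.txt") as f:
    context = f.read()

# __call__ runs the local/remote exchange and returns the final result.
output = minion(
    task="Summarize the key findings in this document.",
    context=[context],
    max_rounds=2,
)
print(output["final_answer"])
```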
Section 2: LLM Clients (`minions/clients`)
This section contains the clients for interacting with various local and remote LLM providers.
- Each client (e.g., `OllamaClient`, `OpenAIClient`, `AnthropicClient`, `TogetherClient`, `GroqClient`, `PerplexityAIClient`, `OpenRouterClient`, `MLXLMClient`) provides a `chat` method for sending messages to the corresponding LLM and receiving responses.
- The `chat` method typically takes a list of message dictionaries as input, where each dictionary represents a turn in the conversation and contains the role (e.g., "user", "system", "assistant") and content of the message.
- The `OllamaClient` is used for interacting with local LLMs served by Ollama. It supports specifying a structured output schema, which allows the LLM to generate responses in a predefined format.
- The `OpenAIClient` is used for interacting with remote LLMs served by OpenAI. It supports parameters such as model name, temperature, and max tokens.
- The `OpenRouterClient` inherits from `OpenAIClient`, allowing it to interact with various LLMs through a unified API, thus reducing code duplication.
Every client's `chat` method returns a tuple of `(List[str], Usage)`: the list of response strings plus token-usage information. The `Usage` dataclass records the number of prompt tokens and completion tokens consumed by the API call, as sketched below.
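A self-contained sketch of that contract follows; the field names on `Usage` follow the description above but are assumptions, and `ExampleClient` is a hypothetical stand-in rather than one of the repository's clients.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Usage:
    # Field names assumed from the description above.
    prompt_tokens: int = 0
    completion_tokens: int = 0


class ExampleClient:
    """Hypothetical stand-in showing the shape every client shares."""

    def chat(self, messages: List[dict]) -> Tuple[List[str], Usage]:
        # A real client forwards `messages` to its provider's API here.
        responses = ["(model response)"]
        return responses, Usage(prompt_tokens=42, completion_tokens=7)


client = ExampleClient()
responses, usage = client.chat([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is the Minions protocol?"},
])
print(responses[0], usage.prompt_tokens, usage.completion_tokens)
```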
Section 3: Utility Functions (`minions/utils`)
This section contains utility functions for tasks such as PII extraction.
- `pii_extraction.py`: Provides a `PIIExtractor` class for extracting personally identifiable information (PII) from text. It uses regular expressions and the spaCy NLP library to identify various types of PII, such as names, emails, phone numbers, and addresses. The `extract_pii` method returns a dictionary with PII types as keys and lists of found instances as values.
The `PIIExtractor` class is used in the Minion protocol to remove PII from the context before sending it to the remote LLM, protecting user privacy.
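A short usage sketch, assuming the module path shown above; the PII-type keys in the output depend on the extractor's patterns and are illustrative here.

```python
from minions.utils.pii_extraction import PIIExtractor

extractor = PIIExtractor()
text = "Contact Jane Doe at jane.doe@example.com or 555-867-5309."

# extract_pii returns {pii_type: [matched strings]}, per the description above.
pii = extractor.extract_pii(text)
for pii_type, instances in pii.items():
    print(f"{pii_type}: {instances}")
```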
Section 4: Prompts (`minions/prompts`)
This section contains the prompts used by the local and remote LLMs.
- Each file (e.g., `minion.py`, `minions.py`, `minions_mcp.py`) defines a set of prompts for different roles in the protocol, such as:
  - `SUPERVISOR_INITIAL_PROMPT`: The initial prompt sent to the remote LLM to start the conversation.
  - `SUPERVISOR_CONVERSATION_PROMPT`: The prompt used by the remote LLM to analyze the response from the local LLM and decide whether to provide a final answer or request additional information.
  - `SUPERVISOR_FINAL_PROMPT`: The prompt used by the remote LLM to generate the final answer.
  - `WORKER_SYSTEM_PROMPT`: The system prompt given to the local LLM to set its role and context.
- The prompts are designed to guide the LLMs in performing their respective tasks and to ensure that the overall protocol is executed correctly (see the sketch below).
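A hypothetical sketch of the pattern: the constant name comes from the list above, but the prompt wording is invented for illustration.

```python
# Invented wording; the repository's actual prompt text differs.
WORKER_SYSTEM_PROMPT = """\
You are a local assistant. You will receive a task and one chunk of a larger
document. Answer using only the information in your chunk.

Task: {task}
Chunk: {chunk}
"""

# Each protocol fills its templates at call time:
rendered = WORKER_SYSTEM_PROMPT.format(
    task="List every medication mentioned.",
    chunk="The patient was prescribed 10 mg lisinopril daily...",
)
print(rendered)
```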
Section 5: Examples (`minions/examples`)
This section contains example tasks and data for different domains, such as health, code, finance, and novel writing.
- Each directory (e.g., `health`, `code`, `finance`) contains a `task.json` file with the question and expected answer, and a `sample.txt` file with the context data (see the loading sketch below).
- The examples demonstrate the capabilities of the Minion/Minions protocols and provide a starting point for developing new applications.
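A minimal loading sketch, assuming only the layout described above; the `question` field name inside `task.json` is an assumption.

```python
import json
from pathlib import Path

example_dir = Path("minions/examples/health")

task = json.loads((example_dir / "task.json").read_text())   # question + answer
context = (example_dir / "sample.txt").read_text()           # context document

print(task["question"])   # field name assumed for illustration
print(context[:200])      # preview of the context
```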
Section 6: Model Context Protocol (MCP) Integration (`minions/minions_mcp.py`, `mcp.json`)
This section implements the Minions protocol with Model Context Protocol (MCP) integration, allowing the local LLM to access external tools and data sources.
- `minions_mcp.py`: Defines the `SyncMinionsMCP` class, which inherits from the `Minions` class and integrates with MCP:
  - `SyncMinionsMCP` uses the `SyncMCPClient` class to communicate with the MCP server.
  - The `SyncMCPToolExecutor` class provides a synchronous interface for executing MCP tools.
  - The `prepare_jobs` method in `SyncMinionsMCP` is modified to allow the local LLM to use MCP tools to gather information and prepare tasks.
- `mcp.json`: Contains the configuration for the MCP server, including the command used to start the server and the arguments passed to it (an illustrative example follows below).
The MCP integration allows the Minions protocol to access a wider range of information and capabilities, such as file system access, web search, and database queries.
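For reference, a hypothetical `mcp.json` following the common MCP server-configuration convention (a named server plus the command and arguments used to launch it); the repository's actual file may differ.

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
    }
  }
}
```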
DB Schema
No database schema is explicitly defined in the provided code.
External API Calls
- OpenAI API: Used by the `OpenAIClient` to interact with OpenAI's language models.
- Anthropic API: Used by the `AnthropicClient` to interact with Anthropic's language models.
- Together API: Used by the `TogetherClient` to interact with Together AI's language models.
- Groq API: Used by the `GroqClient` to interact with Groq's language models.
- Perplexity AI API: Used by the `PerplexityAIClient` to interact with Perplexity AI's language models.
- OpenRouter API: Used by the `OpenRouterClient` to interact with various LLMs through a unified API.
- Ollama API: Used by the `OllamaClient` to interact with local LLMs served by Ollama.
Insights
- The Minions protocol provides a flexible and cost-effective way to leverage the strengths of both local and remote LLMs. By offloading tasks such as prompt refinement and context processing to a local LLM, the system can reduce the cost of using expensive cloud-based LLMs.
- The use of structured output schemas allows the system to ensure that the LLMs generate responses in a predefined format, which makes it easier to process and integrate the results.
- The Model Context Protocol (MCP) integration extends the capabilities of the Minions protocol by allowing the local LLM to access external tools and data sources.
- The code is well-organized and modular, making it easy to extend and adapt to new LLM providers and tasks.
- The Streamlit application provides a user-friendly interface for experimenting with the Minions protocol and exploring its capabilities.
Design strengths of the codebase:
- The code is highly modular and extensible.
- The use of Pydantic models for data validation and serialization helps to ensure data integrity and consistency.
- The separation of concerns between the core protocol logic and the LLM clients makes it easy to add support for new LLM providers.
Creative aspects of the codebase:
- The Minions protocol itself is a creative solution to the problem of reducing the cost of using large language models.
- The use of a local LLM to refine prompts and process context is a novel way to leverage the strengths of both local and remote LLMs.
GitHub repository: Minions
Report generated by TSW-X
Advanced Research Systems Division
Date: 2025-03-05 21:13:16