
AutoGen and MCP: Building Powerful Multi-Agent Systems

Introduction

In today's rapidly changing artificial intelligence landscape, applications of Large Language Models (LLMs) have evolved from simple conversational systems into complex agent-based architectures. Microsoft's AutoGen framework, a powerful tool for building multi-agent systems, is at the forefront of this shift. This article delves into a key capability of the framework, its support for the Model Context Protocol (MCP), and explores how MCP can be leveraged to build powerful multi-agent systems.

Introduction to AutoGen

AutoGen is an open-source framework developed by Microsoft, designed to simplify the process of building multi-agent systems based on large language models. It provides a flexible set of tools and APIs that enable developers to create networks of agents that can collaborate with each other, with each agent performing specific tasks or playing specific roles.

The core advantages of AutoGen include:

  1. Multi-agent collaboration: Supports complex interactions and collaboration between multiple agents
  2. Tool utilization capabilities: Agents can use various tools to extend their capabilities
  3. Flexible conversation flow: Supports customized conversation flows and control logic
  4. Scalability: Easy integration of new models and tools

What is MCP (Model Context Protocol)?

The Model Context Protocol (MCP) is an open standard designed to unify how AI models interact with external tools and services. In the AutoGen framework, MCP serves as a bridge connecting agents with external tools.

The core philosophy of MCP is to provide a standardized protocol that allows AI models (such as large language models) to interact consistently with various external services and tools. These tools can be local command-line utilities, remote API services, or even other AI systems.

Key Features of MCP

  1. Standardized interfaces: Provides unified tool invocation and response formats
  2. Multiple communication methods: Supports standard input/output (STDIO) and Server-Sent Events (SSE) communication
  3. Tool discovery mechanism: Allows dynamic discovery and use of available tools
  4. Session management: Supports maintaining session state for tool calls
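Under the hood, MCP is built on JSON-RPC 2.0: a client discovers a server's tools with the `tools/list` method and invokes one with `tools/call`. As a rough illustration of the standardized interface (the tool name and arguments below are hypothetical), the messages on the wire look like this:

```python
import json

# MCP is built on JSON-RPC 2.0. A client first discovers the tools a
# server exposes ("tools/list"), then invokes one of them ("tools/call").
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search_issues",           # hypothetical tool name
        "arguments": {"query": "Marvin"},  # tool-specific arguments
    },
}

# Adapters in AutoGen serialize and transport these messages, so agent
# code never has to construct them by hand.
print(json.dumps(call_request))
```

In practice you never write these messages yourself; the point is that every MCP server answers the same two methods, which is what makes tool discovery and invocation uniform across servers.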

MCP Implementation in AutoGen

In the AutoGen framework, MCP support is provided through the autogen_ext.tools.mcp module. This module offers various components that make it easy for developers to integrate MCP-compatible tools into AutoGen agents.

Core Components

  1. McpWorkbench: Wraps an MCP server and provides an interface to list and call tools provided by the server
  2. StdioMcpToolAdapter: Allows interaction with MCP tools via standard input/output
  3. SseMcpToolAdapter: Allows interaction with MCP tools that support Server-Sent Events (SSE) over HTTP
  4. McpSessionActor: Manages sessions with MCP servers

Configuration Parameters

MCP tool adapters require specific server parameters to establish connections:

  1. StdioServerParams: Parameters for connecting to an MCP server via standard input/output

    • command: The command to execute
    • args: Command arguments
    • env: Environment variables
    • read_timeout_seconds: Read timeout duration
  2. SseServerParams: Parameters for connecting to an MCP server via HTTP/SSE

    • url: Server URL
    • headers: HTTP headers
    • timeout: Connection timeout
    • sse_read_timeout: SSE read timeout
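Assuming the autogen-ext package is installed, constructing these parameter objects looks roughly like this (the command, URL, token, and environment variable are placeholders):

```python
from autogen_ext.tools.mcp import SseServerParams, StdioServerParams

# A local MCP server launched as a subprocess and spoken to over STDIO.
stdio_params = StdioServerParams(
    command="uvx",                 # the command to execute
    args=["mcp-server-fetch"],     # command arguments
    env={"EXAMPLE_VAR": "value"},  # environment variables (hypothetical)
    read_timeout_seconds=30,       # read timeout duration
)

# A remote MCP server reached over HTTP with Server-Sent Events.
sse_params = SseServerParams(
    url="https://example.com/mcp/sse",            # placeholder server URL
    headers={"Authorization": "Bearer <token>"},  # placeholder HTTP header
    timeout=10,                                   # connection timeout
    sse_read_timeout=300,                         # SSE read timeout
)
```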

Real-world Case: Building a Multi-source Information Retrieval System

Let's explore a practical case that demonstrates how to use AutoGen and MCP to build a system capable of retrieving information from multiple sources (GitHub, Jira, and Confluence).

System Architecture

The system consists of three main components:

  1. Search Agent: Responsible for retrieving relevant information from multiple sources
  2. Summary Agent: Responsible for processing and summarizing the retrieved information
  3. User Proxy: Represents the user in interactions with other agents

The system uses MCP tools to connect to GitHub and Atlassian (Jira and Confluence) services, enabling agents to access information on these platforms.

Code Implementation

python
from typing import Sequence
import os
import asyncio
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from autogen_agentchat.messages import BaseAgentEvent, BaseChatMessage
from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console


def get_model_client() -> AzureOpenAIChatCompletionClient:
    return AzureOpenAIChatCompletionClient(
        azure_deployment=os.getenv("AZURE_OPENAI_MODEL_DEPLOYMENT"),
        api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        model="gpt-4o"
    )

search_agent_prompt = """
Please search for as much related information as possible from Jira, Confluence, and GitHub using the tools provided to you,
based on the user's question. Return the information you have found without any processing or comments.
If no information is found, reply with "Sorry, I couldn't find any relevant information on this issue."
"""

summary_agent_prompt = """
You are an AI assistant. Please help answer the user's question according to Search_Agent's response.
"""

async def main() -> None:
    github_server_params = StdioServerParams(
            command="docker",
            args=[
                "run",
                "-i",
                "--rm",
                "-e",
                "GITHUB_PERSONAL_ACCESS_TOKEN",
                "-e",
                "GH_HOST",
                "ghcr.io/github/github-mcp-server"
            ],
            env={
                "GITHUB_PERSONAL_ACCESS_TOKEN": os.getenv("GITHUB_PERSONAL_ACCESS_TOKEN"),
                "GH_HOST": os.getenv("GH_HOST")
            }
    )
    github_tools = await mcp_server_tools(github_server_params)
    
    atl_server_params = StdioServerParams(
        command="uv",
        args=[
            'run',
            'mcp-atlassian',
            '-v',
            '--jira-url',
            os.getenv("JIRA_HOST"),
            '--jira-personal-token',
            os.getenv("JIRA_PERSONAL_TOKEN"),
            '--confluence-url',
            os.getenv("CONFLUENCE_HOST"),
            '--confluence-personal-token',
            os.getenv("CONFLUENCE_PERSONAL_TOKEN"),
        ],
    )

    atl_tools = await mcp_server_tools(atl_server_params)

    # Create an agent that can use the tools
    search_agent = AssistantAgent(
        name="search_agent",
        model_client=get_model_client(),
        tools=github_tools + atl_tools,
        system_message=search_agent_prompt,
    )

    summary_agent = AssistantAgent(
        name="summary_agent",
        model_client=get_model_client(),
        system_message=summary_agent_prompt,
    )
        
    user_proxy = UserProxyAgent("user", input_func=input)

    # Create the termination condition which will end the conversation when the user says "Exit".
    termination = TextMentionTermination("Exit")

    def selector_func(messages: Sequence[BaseAgentEvent | BaseChatMessage]) -> str | None:
        if messages[-1].source == "user":
            return search_agent.name
        elif messages[-1].source == search_agent.name:
            return summary_agent.name
        elif messages[-1].source == summary_agent.name:
            return user_proxy.name
        return user_proxy.name

    team = SelectorGroupChat(
        [search_agent, summary_agent, user_proxy],
        model_client=get_model_client(),
        termination_condition=termination,
        selector_func=selector_func,
        allow_repeated_speaker=False,  # Do not let an agent speak multiple turns in a row.
    )

    task = "what is Marvin?"

    await Console(team.run_stream(task=task))


if __name__ == "__main__":
    asyncio.run(main())

Code Analysis

  1. MCP Server Configuration:

    • Using StdioServerParams to configure GitHub and Atlassian MCP servers
    • Passing authentication information and server addresses through environment variables
  2. Tool Acquisition:

    • Using the mcp_server_tools function to obtain available tools from MCP servers
    • Combining GitHub and Atlassian tools into a single tool list
  3. Agent Creation:

    • Creating a search agent and assigning all MCP tools to it
    • Creating a summary agent responsible for processing search results
    • Creating a user proxy to handle user input
  4. Conversation Flow Control:

    • Using selector_func to define the interaction sequence between agents
    • Implementing a simple workflow: user question → search agent retrieval → summary agent summarization → return to user
  5. Termination Condition:

    • Using TextMentionTermination to define conversation termination conditions

Advanced MCP Application Scenarios

Beyond the example above, MCP can be applied to various advanced scenarios:

1. File System Operations

Using a file system MCP server, agents can perform file creation, reading, writing, and other operations:

python
from pathlib import Path

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools

# Set up file system MCP server parameters
desktop = str(Path.home() / "Desktop")
server_params = StdioServerParams(
    command="npx.cmd",  # use "npx" on macOS/Linux
    args=["-y", "@modelcontextprotocol/server-filesystem", desktop]
)

# Get all available tools
tools = await mcp_server_tools(server_params)

# Create an agent that can use these tools
agent = AssistantAgent(
    name="file_manager",
    model_client=OpenAIChatCompletionClient(model="gpt-4"),
    tools=tools,
)

2. Web Content Retrieval

Using a fetch MCP server, agents can retrieve and process web content:

python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools

# Get the fetch tool
fetch_mcp_server = StdioServerParams(command="uvx", args=["mcp-server-fetch"])
tools = await mcp_server_tools(fetch_mcp_server)

# Create an agent that can use the fetch tool
agent = AssistantAgent(
    name="fetcher",
    model_client=OpenAIChatCompletionClient(model="gpt-4o"),
    tools=tools,
    reflect_on_tool_use=True,  # have the agent reflect on tool output before replying
)

3. Web Browser Automation

Using a Playwright MCP server, agents can control web browsers to perform complex interactions:

python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import (
    StdioServerParams,
    create_mcp_server_session,
    mcp_server_tools,
)

model_client = OpenAIChatCompletionClient(model="gpt-4o")

params = StdioServerParams(
    command="npx",
    args=["@playwright/mcp@latest"],
    read_timeout_seconds=60,
)

# Keep a single session open so the browser retains state across tool calls
async with create_mcp_server_session(params) as session:
    await session.initialize()
    tools = await mcp_server_tools(server_params=params, session=session)

    agent = AssistantAgent(
        name="Assistant",
        model_client=model_client,
        tools=tools,
    )

Advantages and Limitations of MCP

Advantages

  1. Standardized Interface: Provides a unified way to call tools, simplifying the integration process
  2. Diverse Tool Support: Supports various types of tools, from local command-line tools to remote API services
  3. Session Management: Supports maintaining session state for tool calls, suitable for stateful tools (like browsers)
  4. Extensibility: Easy to add new tools and services

Limitations

  1. External Service Dependency: Requires support from external MCP servers
  2. Configuration Complexity: Configuration of some tools can be relatively complex
  3. Performance Overhead: Inter-process or network communication may introduce additional latency
  4. Security Considerations: Need to carefully handle tool permissions and authentication information

Conclusion

The MCP module in AutoGen provides crucial support for building powerful multi-agent systems. Through standardized tool interfaces, developers can easily integrate various external tools and services into agent systems, greatly expanding the range of agent capabilities.

From simple file operations to complex web browser automation, MCP enables agents to interact more richly with the real world. This capability is essential for building truly useful AI applications, as it allows AI systems not only to understand user intentions but also to take concrete actions to fulfill these intentions.

As AutoGen and MCP continue to evolve, we can expect to see more innovative multi-agent applications emerge, providing more intelligent and efficient services to users across various domains.
