AutoGen and MCP: Building Powerful Multi-Agent Systems
Introduction
In today's rapidly evolving artificial intelligence landscape, applications of Large Language Models (LLMs) have evolved from simple conversational systems to complex agent-based architectures. Microsoft's AutoGen framework, a powerful tool for building multi-agent systems, is driving innovation in this field. This article delves into a key capability of the AutoGen framework—its support for the Model Context Protocol (MCP)—and explores how it can be leveraged to build powerful multi-agent systems.
Introduction to AutoGen
AutoGen is an open-source framework developed by Microsoft, designed to simplify the process of building multi-agent systems based on large language models. It provides a flexible set of tools and APIs that enable developers to create networks of agents that can collaborate with each other, with each agent performing specific tasks or playing specific roles.
The core advantages of AutoGen include:
- Multi-agent collaboration: Supports complex interactions and collaboration between multiple agents
- Tool utilization capabilities: Agents can use various tools to extend their capabilities
- Flexible conversation flow: Supports customized conversation flows and control logic
- Scalability: Easy integration of new models and tools
What is MCP (Model Context Protocol)?
The Model Context Protocol (MCP) is an open standard, introduced by Anthropic, designed to unify how AI models interact with external tools and services. In the AutoGen framework, MCP serves as the bridge connecting agents with external tools.
The core philosophy of MCP is to provide a standardized protocol that allows AI models (such as large language models) to interact consistently with various external services and tools. These tools can be local command-line utilities, remote API services, or even other AI systems.
Key Features of MCP
- Standardized interfaces: Provides unified tool invocation and response formats
- Multiple communication methods: Supports standard input/output (STDIO) and Server-Sent Events (SSE) communication
- Tool discovery mechanism: Allows dynamic discovery and use of available tools
- Session management: Supports maintaining session state for tool calls
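Under the hood, these features ride on JSON-RPC 2.0 messages exchanged over the chosen transport. As a rough illustration (the tool name and arguments below are hypothetical; the envelope follows the MCP specification), a `tools/call` request has this shape:

```python
import json

# Illustrative MCP "tools/call" request as a JSON-RPC 2.0 message.
# The tool name "search_issues" and its arguments are made up for
# demonstration; only the envelope fields come from the MCP spec.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_issues",
        "arguments": {"query": "Marvin"},
    },
}

# Serialize as it would travel over STDIO or SSE.
wire = json.dumps(request)
```

Adapters such as those in AutoGen hide this envelope entirely; agents see ordinary tool schemas and results.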
MCP Implementation in AutoGen
In the AutoGen framework, MCP support is provided through the autogen_ext.tools.mcp module. This module offers various components that make it easy for developers to integrate MCP-compatible tools into AutoGen agents.
Core Components
- McpWorkbench: Wraps an MCP server and provides an interface to list and call tools provided by the server
- StdioMcpToolAdapter: Allows interaction with MCP tools via standard input/output
- SseMcpToolAdapter: Allows interaction with MCP tools that support Server-Sent Events (SSE) over HTTP
- McpSessionActor: Manages sessions with MCP servers
Configuration Parameters
MCP tool adapters require specific server parameters to establish connections:
StdioServerParams: Parameters for connecting to an MCP server via standard input/output
- command: The command to execute
- args: Command arguments
- env: Environment variables
- read_timeout_seconds: Read timeout duration

SseServerParams: Parameters for connecting to an MCP server via HTTP/SSE
- url: Server URL
- headers: HTTP headers
- timeout: Connection timeout
- sse_read_timeout: SSE read timeout
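For illustration, the SSE parameters might be kept as a plain configuration dict (e.g. loaded from a JSON file) before constructing SseServerParams. The URL and token below are placeholders, not real endpoints:

```python
# Hypothetical configuration fragment mirroring the SseServerParams fields
# listed above. All values are placeholders.
sse_config = {
    "url": "https://mcp.example.com/sse",        # Server URL
    "headers": {"Authorization": "Bearer <token>"},  # HTTP headers
    "timeout": 5,                                 # Connection timeout (seconds)
    "sse_read_timeout": 300,                      # SSE read timeout (seconds)
}
```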
Real-world Case: Building a Multi-source Information Retrieval System
Let's explore a practical case that demonstrates how to use AutoGen and MCP to build a system capable of retrieving information from multiple sources (GitHub, Jira, and Confluence).
System Architecture
The system consists of three main components:
- Search Agent: Responsible for retrieving relevant information from multiple sources
- Summary Agent: Responsible for processing and summarizing the retrieved information
- User Proxy: Represents the user in interactions with other agents
The system uses MCP tools to connect to GitHub and Atlassian (Jira and Confluence) services, enabling agents to access information on these platforms.
Code Implementation
from typing import Sequence
import os
import asyncio

from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from autogen_agentchat.messages import BaseAgentEvent, BaseChatMessage
from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console


def get_model_client() -> AzureOpenAIChatCompletionClient:
    return AzureOpenAIChatCompletionClient(
        azure_deployment=os.getenv("AZURE_OPENAI_MODEL_DEPLOYMENT"),
        api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        model="gpt-4o",
    )


search_agent_prompt = """
Please search for as much related information as possible from Jira, Confluence and GitHub using the tools provided to you,
based on the user's question. Return the information you have found without any processing or comments.
If no information is found, reply with "Sorry, I couldn't find any relevant information on this issue."
"""

summary_agent_prompt = """
You are an AI assistant. Please help answer the user's question according to search_agent's response.
"""


async def main() -> None:
    github_server_params = StdioServerParams(
        command="docker",
        args=[
            "run",
            "-i",
            "--rm",
            "-e",
            "GITHUB_PERSONAL_ACCESS_TOKEN",
            "-e",
            "GH_HOST",
            "ghcr.io/github/github-mcp-server",
        ],
        env={
            "GITHUB_PERSONAL_ACCESS_TOKEN": os.getenv("GITHUB_PERSONAL_ACCESS_TOKEN"),
            "GH_HOST": os.getenv("GH_HOST"),
        },
    )
    github_tools = await mcp_server_tools(github_server_params)

    atl_server_params = StdioServerParams(
        command="uv",
        args=[
            "run",
            "mcp-atlassian",
            "-v",
            "--jira-url",
            os.getenv("JIRA_HOST"),
            "--jira-personal-token",
            os.getenv("JIRA_PERSONAL_TOKEN"),
            "--confluence-url",
            os.getenv("CONFLUENCE_HOST"),
            "--confluence-personal-token",
            os.getenv("CONFLUENCE_PERSONAL_TOKEN"),
        ],
    )
    atl_tools = await mcp_server_tools(atl_server_params)

    # Create an agent that can use the tools
    search_agent = AssistantAgent(
        name="search_agent",
        model_client=get_model_client(),
        tools=github_tools + atl_tools,
        system_message=search_agent_prompt,
    )
    summary_agent = AssistantAgent(
        name="summary_agent",
        model_client=get_model_client(),
        system_message=summary_agent_prompt,
    )
    user_proxy = UserProxyAgent("user", input_func=input)

    # Create the termination condition which will end the conversation when the user says "Exit".
    termination = TextMentionTermination("Exit")

    def selector_func(messages: Sequence[BaseAgentEvent | BaseChatMessage]) -> str | None:
        if messages[-1].source == "user":
            return search_agent.name
        elif messages[-1].source == search_agent.name:
            return summary_agent.name
        elif messages[-1].source == summary_agent.name:
            return user_proxy.name
        return user_proxy.name

    team = SelectorGroupChat(
        [search_agent, summary_agent, user_proxy],
        model_client=get_model_client(),
        termination_condition=termination,
        selector_func=selector_func,
        allow_repeated_speaker=False,  # Do not allow the same agent to speak multiple turns in a row.
    )

    task = "what is Marvin?"
    await Console(team.run_stream(task=task))


if __name__ == "__main__":
    asyncio.run(main())
Code Analysis

MCP Server Configuration:
- Using StdioServerParams to configure the GitHub and Atlassian MCP servers
- Passing authentication information and server addresses through environment variables

Tool Acquisition:
- Using the mcp_server_tools function to obtain the available tools from each MCP server
- Combining the GitHub and Atlassian tools into a single tool list

Agent Creation:
- Creating a search agent and assigning all MCP tools to it
- Creating a summary agent responsible for processing search results
- Creating a user proxy to handle user input

Conversation Flow Control:
- Using selector_func to define the interaction sequence between agents
- Implementing a simple workflow: user question → search agent retrieval → summary agent summarization → return to user

Termination Condition:
- Using TextMentionTermination to end the conversation when the user says "Exit"
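The routing implemented by selector_func can be exercised in isolation with plain Python. Msg below is a hypothetical stand-in for AutoGen's message objects; only the source field matters to the selector logic:

```python
from dataclasses import dataclass


# Hypothetical stand-in for an AutoGen chat message: the selector only
# inspects the `source` field of the most recent message.
@dataclass
class Msg:
    source: str


def selector(messages, search="search_agent", summary="summary_agent", user="user"):
    """Same round-trip routing as selector_func in the article."""
    last = messages[-1].source
    if last == user:
        return search
    if last == search:
        return summary
    if last == summary:
        return user
    return user


# Simulate one full turn: user question -> search -> summary -> back to user.
history = [Msg("user")]
order = []
for _ in range(3):
    speaker = selector(history)
    order.append(speaker)
    history.append(Msg(speaker))
# order is ["search_agent", "summary_agent", "user"]
```

This makes the control flow easy to unit-test before wiring it into SelectorGroupChat.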
Advanced MCP Application Scenarios
Beyond the example above, MCP can be applied to various advanced scenarios:
1. File System Operations
Using a file system MCP server, agents can perform file creation, reading, writing, and other operations:
from pathlib import Path

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools

# Set up file system MCP server parameters
# ("npx.cmd" is the Windows launcher; use "npx" on macOS/Linux)
desktop = str(Path.home() / "Desktop")
server_params = StdioServerParams(
    command="npx.cmd",
    args=["-y", "@modelcontextprotocol/server-filesystem", desktop],
)

# Get all available tools (run inside an async function)
tools = await mcp_server_tools(server_params)

# Create an agent that can use these tools
agent = AssistantAgent(
    name="file_manager",
    model_client=OpenAIChatCompletionClient(model="gpt-4"),
    tools=tools,
)
2. Web Content Retrieval
Using a fetch MCP server, agents can retrieve and process web content:
# Get the fetch tool
fetch_mcp_server = StdioServerParams(command="uvx", args=["mcp-server-fetch"])
tools = await mcp_server_tools(fetch_mcp_server)

# Create an agent that can use the fetch tool
agent = AssistantAgent(
    name="fetcher",
    model_client=OpenAIChatCompletionClient(model="gpt-4o"),
    tools=tools,
    reflect_on_tool_use=True,
)
3. Web Browser Automation
Using a Playwright MCP server, agents can control web browsers to perform complex interactions:
from autogen_ext.tools.mcp import (
    StdioServerParams,
    create_mcp_server_session,
    mcp_server_tools,
)

params = StdioServerParams(
    command="npx",
    args=["@playwright/mcp@latest"],
    read_timeout_seconds=60,
)

# A shared session keeps browser state alive across tool calls.
# model_client is assumed to be defined as in the earlier examples.
async with create_mcp_server_session(params) as session:
    await session.initialize()
    tools = await mcp_server_tools(server_params=params, session=session)
    agent = AssistantAgent(
        name="Assistant",
        model_client=model_client,
        tools=tools,
    )
Advantages and Limitations of MCP
Advantages
- Standardized Interface: Provides a unified way to call tools, simplifying the integration process
- Diverse Tool Support: Supports various types of tools, from local command-line tools to remote API services
- Session Management: Supports maintaining session state for tool calls, suitable for stateful tools (like browsers)
- Extensibility: Easy to add new tools and services
Limitations
- External Service Dependency: Requires support from external MCP servers
- Configuration Complexity: Configuration of some tools can be relatively complex
- Performance Overhead: Inter-process or network communication may introduce additional latency
- Security Considerations: Need to carefully handle tool permissions and authentication information
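On the security point, one mitigation sketch (the variable name below is illustrative) is to read credentials from the environment and fail fast when they are missing, rather than hardcoding them or silently passing None to an MCP server:

```python
import os


def require_env(name: str) -> str:
    """Return the value of a required environment variable, failing fast if unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo with an illustrative variable name:
os.environ.setdefault("DEMO_MCP_TOKEN", "example-token")
token = require_env("DEMO_MCP_TOKEN")
```

Failing at startup is usually preferable to an MCP server rejecting requests mid-conversation with an opaque authentication error.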
Conclusion
The MCP module in AutoGen provides crucial support for building powerful multi-agent systems. Through standardized tool interfaces, developers can easily integrate various external tools and services into agent systems, greatly expanding the range of agent capabilities.
From simple file operations to complex web browser automation, MCP enables agents to interact more richly with the real world. This capability is essential for building truly useful AI applications, as it allows AI systems not only to understand user intentions but also to take concrete actions to fulfill these intentions.
As AutoGen and MCP continue to evolve, we can expect to see more innovative multi-agent applications emerge, providing more intelligent and efficient services to users across various domains.