Work in progress

INTERNET OF AGENTS: WEAVING A WEB OF HETEROGENEOUS AGENTS FOR COLLABORATIVE INTELLIGENCE

Weize Chen∗, Ziming You, Ran Li, Yitong Guan, Chen Qian, Chenyang Zhao, Cheng Yang, Ruobing Xie, Zhiyuan Liu, Maosong Sun

Tsinghua University, Peking University, Beijing University of Posts and Telecommunications, Tencent

chenwz21@mails.tsinghua.edu.cn, rl759@nau.edu, {zimingyou, 2101210206}@stu.pku.edu.cn, liuzy@tsinghua.edu.cn

ABSTRACT

The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to single-device setups. Furthermore, these frameworks often rely on hard-coded communication pipelines, limiting their adaptability to dynamic task requirements. Inspired by the concept of the Internet, we propose the Internet of Agents (IoA), a novel framework that addresses these limitations by providing a flexible and scalable platform for LLM-based multi-agent collaboration. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control. Through extensive experiments on general assistant tasks, embodied AI tasks, and retrieval-augmented generation benchmarks, we demonstrate that IoA consistently outperforms state-of-the-art baselines, showcasing its ability to facilitate effective collaboration among heterogeneous agents. IoA represents a step towards linking diverse agents in an Internet-like environment, where agents can seamlessly collaborate to achieve greater intelligence and capabilities. Our codebase has been released at https://github.com/OpenBMB/IoA.

1 INTRODUCTION

The Internet has revolutionized the way people collaborate and share knowledge, connecting individuals with diverse skills and backgrounds from all around the world. This global network has enabled the creation of remarkable collaborative projects, such as Wikipedia and the development of the Linux operating system, which would have been impossible for any single person to achieve. The Internet has greatly facilitated collaboration among people, making the impossible possible and pushing the boundaries of human achievement.

The success of the Internet in enabling human collaboration raises an intriguing question: can we create a similar platform to facilitate collaboration among autonomous agents? With the rapid advancements in LLMs (OpenAI, 2023; Reid et al., 2024), we now have autonomous agents capable of achieving near-human performance on a wide range of tasks. These LLM-based agents have demonstrated the ability to break down complex tasks into executable steps, leverage various tools, and learn from feedback and experience (Qin et al., 2023; Wang et al., 2023c; Shinn et al., 2023; Qian et al., 2023b). As the capabilities of these agents continue to grow, and with an increasing number of third-party agents with diverse skills consistently emerging (Chase, 2022; Team, 2023; Significant Gravitas, 2023; Open Interpreter, 2023), it is crucial to explore how we can effectively and efficiently orchestrate their collaboration, just as the Internet has done for humans.

∗ Equal Contribution. ✉ Corresponding author.

Wikipedia: https://www.wikipedia.org/
Linux: https://www.linux.org/

arXiv:2407.07061v1 [cs.CL] 9 Jul 2024

To address this challenge, we propose the concept of the Internet of Agents (IoA), a general framework for agent communication and collaboration inspired by the Internet. IoA aims to address three fundamental limitations of existing multi-agent frameworks (Chen et al., 2023; Wu et al., 2023; Hong et al., 2023; Qian et al., 2023a): (1) Ecosystem Isolation: Most frameworks only consider agents defined within their own ecosystems, potentially blocking the integration of various third-party agents and limiting the diversity of agent capabilities and the platform's generality; (2) Single-Device Simulation: Nearly all multi-agent frameworks simulate multi-agent systems on a single device, which differs significantly from real-world scenarios where agents could be distributed across multiple devices located in different places; (3) Rigid Communication and Coordination: The communication process, agent grouping, and state transitions are mostly hard-coded, whereas in real life, humans decide on teammates based on the task at hand and dynamically switch between discussion and task assignment or execution.

To overcome these limitations, we propose an agent integration protocol that enables different third-party agents running on different devices to be seamlessly integrated into the framework and collaborate effectively. Additionally, we introduce an instant-messaging-app-like framework that facilitates agent discovery and dynamic teaming. By autonomously searching for potential agents capable of handling the tasks at hand, agents can dynamically decide to form different teams and communicate within various group chats. Inspired by Speech Act Theory (Searle, 1969) and its application in conventional multi-agent systems (Finin et al., 1994; Labrou et al., 1999), within each group chat we abstract out several conversation states and provide a flexible and general finite-state machine mechanism that allows agents to autonomously decide the state of the conversation, facilitating discussion and sub-task execution.

We demonstrate the effectiveness of IoA through extensive experiments and comparisons with state-of-the-art autonomous agents. By integrating AutoGPT (Significant Gravitas, 2023) and Open Interpreter (Open Interpreter, 2023), we show that IoA achieves a 66% to 76% win rate in open-domain task evaluations when compared with these agents individually. Furthermore, with only a few basic ReAct agents integrated, IoA outperforms previous works on the GAIA benchmark (Mialon et al., 2023). In the retrieval-augmented generation (RAG) question-answering domain, our framework substantially surpasses existing methods, with a GPT-3.5-based implementation achieving performance close to or even exceeding GPT-4, and effectively surpassing previous multi-agent frameworks.

The impressive performance of IoA across various domains highlights the potential of this paradigm for autonomous agents. As smaller LLMs continue to advance (Mesnard et al., 2024; Hu et al., 2024; Abdin et al., 2024), running agents on personal computers or even mobile devices is becoming increasingly feasible. This trend opens up new opportunities for deploying multi-agent systems in real-world scenarios, where agents can be distributed across multiple devices and collaborate to solve complex problems. We believe that by further exploring and refining the IoA paradigm, more sophisticated and adaptable multi-agent systems can be developed, ultimately pushing the boundaries of what autonomous agents can achieve in problem-solving and decision-making.

2 FRAMEWORK DESIGN AND KEY MECHANISMS OF IOA

In this section, we present a comprehensive overview of IoA, detailing its architecture and key mechanisms. We will explore how these components work together to enable effective collaboration among autonomous agents, facilitating dynamic team formation, structured communication, and efficient task execution.

2.1 OVERVIEW OF IOA

Figure 1: Illustration of the conceptual layered architecture of IoA. (Figure content, summarized. Client side: Interaction Layer with the Team Formation Block and Communication Block; Data Layer with the Agent Contact Block, Group Info Block, and Task Management Block; Foundation Layer with the Agent Integration Block for custom and third-party agents, the Data Infra Block, and the Network Infra Block using WebSocket. Server side: Interaction Layer with Agent Query, Group Setup, and Message Routing; Data Layer with the Agent Registry Block and Session Management Block; Foundation Layer with the Data Infra Block, Network Infra Block, and Security Block. Example entries shown in the figure include a group record with group_chat_id, goal, team_members, and chat_records; a task record with sub_goal, task_status, assigner, assignee, and is_trigger_set; and a protocol-compliant agent message with "content" and "sender" fields. The running example is the goal "Calculate sqrt(9!)", for which a search with the keywords {calculation, code interpreter} is issued and AutoGPT and CalcAgent talk in a group chat named "Math Masters".)

IoA is designed as an instant-messaging-app-like platform that enables seamless communication and collaboration among diverse autonomous agents. Inspired by the concept of the Internet, IoA addresses three fundamental challenges in multi-agent systems (Chen et al., 2023; Wu et al., 2023; Qian et al., 2023a):

1. Distributed agent collaboration: Unlike traditional frameworks that simulate multi-agent systems on a single device, IoA supports agents distributed across multiple devices and locations. (Sections 2.2 and 2.3.1)

2. Dynamic and adaptive communication: IoA implements mechanisms for autonomous team formation and conversation flow control, allowing agents to adapt their collaboration strategies based on task requirements and ongoing progress. (Sections 2.3.2 to 2.3.4)

3. Integration of heterogeneous agents: IoA provides a flexible protocol for integrating various third-party agents, expanding the diversity of agent capabilities within the system. (Section 2.4)

At its core, IoA consists of two main components: the server and the client. The server acts as a central hub, managing agent registration, discovery, and message routing. It enables agents with varying capabilities to find each other and initiate communication. The client, on the other hand, serves as a wrapper for individual agents, providing them with the necessary communication functionalities and adapting them to the specified protocol. IoA employs a layered architecture (Bass et al., 1999) for both the server and client components, comprising three layers:

• Interaction Layer: Facilitates team formation and agent communication.

• Data Layer: Manages information related to agents, group chats, and tasks.

• Foundation Layer: Provides essential infrastructure for agent integration, data management, and network communication.

These layers work together to facilitate agent collaboration through the network. The following subsections walk through IoA's architecture and design.

2.2 ARCHITECTURE OF IOA

The layered architecture of IoA is designed to support scalable, flexible, and efficient multi-agent collaboration. This architecture enables a clear separation of concerns and facilitates the integration of diverse agents and functionalities (Fig. 1).

2.2.1 SERVER ARCHITECTURE

The server acts as the central hub of IoA, facilitating agent discovery, group formation, and message routing. Its architecture consists of three layers:

Interaction Layer: At the top level, the Interaction Layer manages high-level interactions between agents and the system. It encompasses the Agent Query Block for enabling agents to search for other agents based on specific characteristics, the Group Setup Block for facilitating the creation and management of group chats, and the Message Routing Block for ensuring efficient and accurate routing of messages between agents and group chats.
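As a rough illustration of how group setup and message routing might fit together, the sketch below keeps group chats in memory and fans each message out to every member except its sender. The names `RouterSketch`, `setup_group`, and `route` are hypothetical and are not taken from the IoA codebase; an actual server would deliver messages over network sessions rather than in-process inboxes. The message fields `sender` and `content` follow the protocol-compliant message shown in Figure 1.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GroupChat:
    group_id: int
    name: str
    members: list[str]


@dataclass
class RouterSketch:
    """Hypothetical in-memory stand-in for group setup and message routing."""
    groups: dict[int, GroupChat] = field(default_factory=dict)
    inboxes: dict[str, list[dict]] = field(default_factory=lambda: defaultdict(list))

    def setup_group(self, name: str, members: list[str]) -> int:
        group_id = len(self.groups)  # sequential ids, an arbitrary choice
        self.groups[group_id] = GroupChat(group_id, name, list(members))
        return group_id

    def route(self, group_id: int, message: dict) -> list[str]:
        """Deliver a message to every group member except its sender."""
        group = self.groups[group_id]
        if message["sender"] not in group.members:
            raise ValueError("sender is not a member of this group chat")
        recipients = [m for m in group.members if m != message["sender"]]
        for name in recipients:
            self.inboxes[name].append(message)
        return recipients


router = RouterSketch()
gid = router.setup_group("Math Masters", ["AutoGPT", "CalcAgent"])
delivered = router.route(gid, {"sender": "CalcAgent", "content": "Sure! [...]"})
```

Rejecting messages from non-members is a design choice made for this sketch only; the text does not say how the server treats them.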

Data Layer: Serving as the information backbone, the Data Layer handles the storage and management of critical system information. The Agent Registry Block maintains a comprehensive database of registered agents, including their capabilities and current status, similar to service discovery in distributed systems (Meshkova et al., 2008; Netflix). Meanwhile, the Session Management Block manages active connections and ensures continuous communication between the server and connected clients.

Foundation Layer: Underpinning the entire system, the Foundation Layer provides the essential infrastructure for the server's operations. It encompasses the Data Infrastructure Block for handling data persistence and retrieval, the Network Infrastructure Block for managing network communications, and the Security Block for implementing authentication, authorization, and other security measures to maintain system integrity.

2.2.2 CLIENT ARCHITECTURE

The client component of IoA serves as a wrapper for individual agents, providing them with the necessary interfaces to communicate within the system. Its architecture mirrors that of the server with three layers:

Interaction Layer: At the forefront of agent operations, the Interaction Layer manages the agent's interactions within the system. The Team Formation Block implements the logic for identifying suitable collaborators and forming teams for the task at hand, similar to coalition formation in conventional multi-agent research (Rahwan et al., 2009). Complementing this, the Communication Block manages the agent's participation in group chats and handles message processing.

Data Layer: Functioning as the agent's memory, the Data Layer maintains local data relevant to the agent's operations. It includes the Agent Contact Block for storing information about other agents the current agent has interacted with, the Group Info Block for maintaining details about ongoing group chats and collaborations, and the Task Management Block for tracking the status and progress of tasks assigned to the agent.
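The three client-side data blocks could be modeled as plain records, as in the sketch below. The field names (`group_chat_id`, `goal`, `team_members`, `chat_records`, `sub_goal`, `task_status`, `assigner`, `assignee`, `is_trigger_set`) follow the example entries in Figure 1; the class names, the types, the `tasks_for` helper, and the concrete assigner and assignee values are assumptions made for illustration and may differ from the IoA codebase.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContact:          # one entry of the Agent Contact Block
    name: str
    tools: list[str]
    desc: str


@dataclass
class GroupInfo:             # one entry of the Group Info Block
    group_chat_id: int
    goal: str
    team_members: list[str] = field(default_factory=list)
    chat_records: list[dict] = field(default_factory=list)


@dataclass
class TaskRecord:            # one entry of the Task Management Block
    sub_goal: str
    assigner: str
    assignee: str
    task_status: str = "ongoing"
    is_trigger_set: bool = False


@dataclass
class ClientDataLayer:
    contacts: dict[str, AgentContact] = field(default_factory=dict)
    groups: dict[int, GroupInfo] = field(default_factory=dict)
    tasks: list[TaskRecord] = field(default_factory=list)

    def tasks_for(self, assignee: str) -> list[TaskRecord]:
        """Hypothetical helper: all tasks assigned to one agent."""
        return [t for t in self.tasks if t.assignee == assignee]


store = ClientDataLayer()
store.groups[0] = GroupInfo(0, "Calculate sqrt(9!)", ["AutoGPT", "CalcAgent"])
# The assigner/assignee pair below is illustrative; Figure 1 elides both values.
store.tasks.append(TaskRecord("Calculate 9!", assigner="AutoGPT", assignee="CalcAgent"))
pending = store.tasks_for("CalcAgent")
```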

Foundation Layer: Forming the base of the client architecture, the Foundation Layer provides the basic functionalities for the client's operations. The Agent Integration Block defines the protocols and interfaces for integrating third-party agents into the IoA ecosystem. Alongside this, the Data Infrastructure Block handles local data storage and retrieval, while the Network Infrastructure Block manages network communications with the server.

This layered architecture enables IoA to support a wide range of agent types and collaboration scenarios. By providing a clear separation of concerns and well-defined interfaces between layers, the architecture facilitates the integration of diverse agents and allows for future extensibility. Furthermore, this design supports the key mechanisms of IoA, such as autonomous team formation and conversation flow control, which we will explore in detail in the following subsections.

2.3 KEY MECHANISMS

The effectiveness of IoA relies on several key mechanisms that enable seamless collaboration among diverse agents. These mechanisms work in concert to facilitate agent integration, team formation, task allocation, and structured communication. We detail these critical components in this section.

2.3.1 AGENT REGISTRATION AND DISCOVERY

To enable collaboration among distributed agents with heterogeneous architectures, tools, and environments, we propose the agent registration and discovery mechanism. This mechanism forms the foundation for collaborative interactions within IoA, enabling the integration of diverse agents into the system and facilitating their discovery on the online server by other agents for potential collaboration through the network.

Agent Registration: When a new agent joins the IoA, its client wrapper undergoes a registration process with the server. During registration, the agent should provide a comprehensive description of its capabilities, skills, and areas of expertise. This description, denoted as $d_i$ for an agent $c_i$, is stored in the Agent Registry Block of the server's Data Layer. Formally, we represent the set of all registered agents as $C = \{c_1, c_2, \ldots, c_n\}$, where each $c_i$ is associated with its description $d_i$.
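A minimal sketch of the registration step, assuming an in-memory registry keyed by agent name: each record pairs an agent with its free-text capability description. `AgentRegistry`, `register`, and `all_agents` are hypothetical names, and the two descriptions are made up for the example; the actual Agent Registry Block may store more (for instance, agent status) and persist it through the Data Infrastructure Block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    name: str          # identifies the registered agent
    description: str   # free-text capability description


class AgentRegistry:
    """Hypothetical in-memory version of the server's Agent Registry Block."""

    def __init__(self) -> None:
        self._records: dict[str, AgentRecord] = {}

    def register(self, name: str, description: str) -> AgentRecord:
        if not description.strip():
            raise ValueError("a capability description is required")
        record = AgentRecord(name, description)
        self._records[name] = record  # re-registering replaces the old description
        return record

    def all_agents(self) -> list[AgentRecord]:
        """The set of registered agents, in registration order."""
        return list(self._records.values())


registry = AgentRegistry()
registry.register("CalcAgent", "Uses a calculator tool for arithmetic.")
registry.register("AutoGPT", "General-purpose agent with browser and file system tools.")
names = [record.name for record in registry.all_agents()]
```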

Agent Discovery: The agent discovery function leverages the information stored in the Agent Registry from the online server to enable agents to find suitable collaborators for specific tasks. When an agent needs to form a team or seek assistance, it can use the search_client tool provided by the server's Agent Query Block. This tool allows an agent to search for other agents based on desired characteristics or capabilities. Formally, the agent discovery process can be described as follows: Let $L_i = [l_1, l_2, \ldots, l_k]$ be a list of desired characteristics generated by an agent seeking collaborators. The search_client function can be represented as: $\text{search\_client} : L_i \to \mathcal{P}(C)$, where $\mathcal{P}(C)$ denotes the power set of $C$. The function returns a subset of clients $C_i \subseteq C$ whose descriptions match the desired characteristics in $L_i$. The matching process between $L_i$ and $d_j$ can be implemented with various semantic matching techniques (Robertson & Zaragoza, 2009; Karpukhin et al., 2020). It ensures that agents with relevant capabilities can be discovered even if their descriptions do not exactly match the search criteria.
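To make the shape of the discovery call concrete, the sketch below ranks registered descriptions by how many query tokens they share with the desired characteristics. This plain token overlap is a deliberately simple stand-in, not the matcher IoA uses: it is neither the probabilistic ranking nor the dense retrieval cited above, and it cannot find agents whose descriptions use different wording. The signature, the `top_k` parameter, and the three descriptions are assumptions for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_client(characteristics: list[str],
                  descriptions: dict[str, str],
                  top_k: int = 2) -> list[str]:
    """Return up to top_k agent names whose description overlaps the query."""
    query = set().union(*(tokenize(c) for c in characteristics))
    scored = []
    for name, desc in descriptions.items():
        overlap = len(query & tokenize(desc))
        if overlap > 0:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


# Made-up descriptions, used only to exercise the function.
descriptions = {
    "CalcAgent": "performs calculation with a calculator",
    "AutoGPT": "browses the web and manages files",
    "Open Interpreter": "runs code in a code interpreter",
}
matches = search_client(["calculation", "code interpreter"], descriptions)
print(matches)  # ['Open Interpreter', 'CalcAgent']
```

Swapping the overlap score for a semantic similarity score would keep the same interface while allowing inexact matches, which is the property the text asks of the matching step.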

2.3.2 AUTONOMOUS NESTED TEAM FORMATION

The autonomous nested team formation mechanism enables dynamic and flexible combinations of appropriate agents. This mechanism allows agents to form teams adaptively based on task requirements and to create nested sub-teams for complex, multi-faceted tasks.

Team Formation Process: When a client $c_i \in C$ is assigned a task $t$, it initiates the team formation process. The client has access to two essential tools provided by the server: search_client and launch_group_chat. The LLM in the client is prompted to decide which tool to call based on the task and the current set of discovered clients. If more collaborators are needed, it calls search_client with appropriate characteristics. Once suitable collaborators are found, it calls launch_group_chat to initiate a new group chat $g \in G$, where $G$ is the space of all group chats.
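The decision loop described above can be sketched as follows, assuming the LLM is abstracted as a `decide` callable that returns the next tool and its arguments. Everything here is hypothetical scaffolding: the function `form_team`, the `Decision` tuple format, the `max_steps` cap, and the stubbed tools are not taken from the IoA codebase, and a real client would prompt an LLM instead of calling `stub_decide`.

```python
from typing import Callable

# A decision is ("search_client", characteristics) or ("launch_group_chat", members).
Decision = tuple[str, list[str]]


def form_team(task: str,
              decide: Callable[[str, list[str]], Decision],
              search_client: Callable[[list[str]], list[str]],
              launch_group_chat: Callable[[list[str]], int],
              max_steps: int = 5) -> int:
    """Alternate between searching and launching until a group chat exists."""
    discovered: list[str] = []
    for _ in range(max_steps):
        tool, args = decide(task, discovered)
        if tool == "search_client":
            for name in search_client(args):
                if name not in discovered:
                    discovered.append(name)
        elif tool == "launch_group_chat":
            return launch_group_chat(args)
        else:
            raise ValueError(f"unknown tool: {tool}")
    raise RuntimeError("no group chat launched within max_steps")


def stub_decide(task: str, discovered: list[str]) -> Decision:
    """Stand-in for the LLM: search once, then launch with whoever was found."""
    if not discovered:
        return ("search_client", ["calculation"])
    return ("launch_group_chat", list(discovered))


launched: list[list[str]] = []


def stub_launch(members: list[str]) -> int:
    launched.append(members)
    return len(launched) - 1  # id of the new group chat


group_id = form_team("Calculate sqrt(9!)", stub_decide,
                     search_client=lambda chars: ["CalcAgent"],
                     launch_group_chat=stub_launch)
```

The step cap is a safeguard added for the sketch so that a model that keeps searching cannot loop forever; the text does not state whether IoA bounds the loop.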

Nested Team Structure: The nested team formation allows for a hierarchical structure of teams and sub-teams. Let $g_0 \in G$ be the initial group chat for task $t$. During the execution of $t$, if a client $c_i$ is assigned a sub-task $t_j$ (the task assignment mechanism will be introduced in Section 2.3.4), and it identifies that $t_j$ requires additional expertise, $c_i$ is allowed to search for appropriate agents again and initiate a new sub-group chat $g_j \in G$. This process can continue recursively for the new sub-tasks assigned in $g_j$, forming a tree-like structure of group chats. Formally, we can define a function $h : G \to \mathcal{P}(G)$ that maps a group chat to its set of sub-group chats. The nested structure can be represented as: $h(g_0) = \{g_1, g_2, \ldots, g_k\}$, $h(g_1) = \{g_{11}, g_{12}, \ldots, g_{1l}\}$, and so on.
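The mapping from a group chat to its sub-group chats is naturally a tree, which the sketch below stores as a parent-to-children dictionary. `GroupTree`, `spawn`, and `all_groups` are hypothetical names, and the string identifiers such as "g0" and "g11" are chosen here only to label the nodes.

```python
from collections import defaultdict
from typing import Optional


class GroupTree:
    """Hypothetical container for the mapping h from a group chat to its sub-group chats."""

    def __init__(self, root: str) -> None:
        self.root = root
        self._children: dict[str, list[str]] = defaultdict(list)

    def spawn(self, parent: str, child: str) -> None:
        """Record that `parent` launched the sub-group chat `child`."""
        self._children[parent].append(child)

    def h(self, group: str) -> set[str]:
        """Direct sub-group chats of `group` (empty for a leaf)."""
        return set(self._children.get(group, []))

    def all_groups(self, group: Optional[str] = None) -> list[str]:
        """Depth-first listing of a group chat and all of its descendants."""
        group = self.root if group is None else group
        result = [group]
        for child in self._children.get(group, []):
            result.extend(self.all_groups(child))
        return result


tree = GroupTree("g0")
for child in ("g1", "g2", "g3"):
    tree.spawn("g0", child)
for child in ("g11", "g12"):
    tree.spawn("g1", child)
```

Here `tree.h("g0")` gives the three first-level sub-group chats and `tree.all_groups()` enumerates the whole tree, which is the set that a per-task channel count would sum over.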

Figure 2: An example of the nested team formation mechanism. The process is simplified for clarity. (Figure content, summarized. The top-level group chat has the overall goal "Create a Comprehensive Market Analysis Report for iPhone 15" with agents GoogleAgent and ReportWritingAgent. Its sub-tasks are Market Research and Data Collection (assignee: GoogleAgent, nested team formed), Data Analysis and Visualization (assignee: ReportWritingAgent, nested team formed), and Report Writing (assignee: ReportWritingAgent, no nested team). The nested group chat for Market Research and Data Collection has agents GoogleAgent and MarketAPIAgent and two sub-tasks, Competitor Analysis (GoogleAgent) and Customer Analysis (MarketAPIAgent), neither of which forms a further nested team.)

Communication Complexity: The nested team formation mechanism helps reduce communication complexity in large agent teams. Assuming fully connected communication within each group, the number of communication channels (connected edges) in a single group with $|g|$ members is $c_{\text{full}} = \frac{|g|(|g|-1)}{2}$. However, by decomposing a task into sub-tasks and allocating them to sub-group chats, we can reduce the total number of communication channels. Let $S(g)$ denote the set of all sub-groups (including $g$ itself) formed for a task initially assigned to group $g$. The total number of communication channels can then be expressed as: $c_{\text{nested}} = \sum_{g_i \in S(g)} \frac{|g_i|(|g_i|-1)}{2} \le c_{\text{full}}$.
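A small worked example of the two channel counts, under an assumed team layout that is not taken from the paper: nine agents either share one fully connected group chat, or are organized as a root group of three whose members each lead a sub-group of three (themselves plus two further agents, so 3 + 3 × 2 = 9 distinct agents). The comparison against a single flat group containing all nine agents is this example's reading of the full-connectivity baseline.

```python
def channels(size: int) -> int:
    """Edges in a fully connected group of `size` members: size * (size - 1) / 2."""
    return size * (size - 1) // 2


def nested_channels(group_sizes: list[int]) -> int:
    """Sum of per-group channel counts over every group chat in the nested structure."""
    return sum(channels(n) for n in group_sizes)


flat = channels(9)                       # one group chat with all 9 agents
nested = nested_channels([3, 3, 3, 3])   # root of 3 plus three sub-groups of 3
print(flat, nested)  # 36 12
```

In this layout the nested structure needs 12 channels instead of 36, because each agent only talks to the members of the group chats it belongs to.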

Fig. 2 illustrates an example of the nested team formation process. In this example, the initial group chat $g_0$ spawns three sub-group chats $g_1$, $g_2$, and $g_3$ for specific sub-tasks during the discussion. $g_1$ further creates two sub-group chats $g_{11}$ and $g_{12}$ for a more specialized sub-task.

2.3.3 AUTONOMOUS CONVERSATION FLOW CONTROL

Effective communication is crucial for successful collaboration among autonomous agents. Inspired by Speech Act Theory (Austin, 1975; Searle, 1969) and its applications in multi-agent systems (Finin
