Solving Continuous-State POMDPs via Density Projection
Enlu Zhou, Member, IEEE, Michael C. Fu, Fellow, IEEE, and Steven I. Marcus, Fellow, IEEE
Abstract—Research on numerical solution methods for partially observable Markov decision processes (POMDPs) has primarily focused on finite-state models, and these algorithms do not generally extend to continuous-state POMDPs, due to the infinite dimensionality of the belief space. In this paper, we develop a computationally viable and theoretically sound method for solving continuous-state POMDPs by effectively reducing the dimensionality of the belief space via density projection. The density projection technique is also incorporated into particle filtering to provide a filtering scheme for online decision making. We provide an error bound between the value function induced by the policy obtained by our method and the true value function of the POMDP, and also an error bound between projection particle filtering and exact filtering. Finally, we illustrate the effectiveness of our method through an inventory control problem.
Index Terms—Belief state, decision making, density projection, partially observable Markov decision processes (POMDPs), particle filtering, value function.
I. INTRODUCTION
PARTIALLY observable Markov decision processes (POMDPs) model sequential decision making under uncertainty with partially observed state information. At each stage or period, an action is taken based on a partial observation of the current state along with the history of observations and actions, and the state transitions probabilistically. The objective is to minimize (or maximize) a cost (or reward) function, where costs (or rewards) are accrued in each stage. Clearly, POMDPs suffer from the same curse of dimensionality as fully observable MDPs, so efficient numerical solution of problems with large state spaces is a challenging research area.
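To make this interaction concrete, the following sketch (ours, not from the paper; policy, step, observe, and cost are hypothetical model-specific functions) simulates one trajectory of a POMDP, in which each action may depend only on the history of observations and actions:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_pomdp(x0, policy, step, observe, cost, horizon):
    """Simulate one POMDP trajectory and return the accrued cost."""
    x, history, total = x0, [], 0.0
    for _ in range(horizon):
        z = observe(x, rng)    # partial observation of the hidden state
        history.append(z)
        a = policy(history)    # action based on observation/action history only
        total += cost(x, a)    # stage cost accrues on the hidden state
        x = step(x, a, rng)    # probabilistic state transition
        history.append(a)
    return total
```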
A POMDP can be converted to a continuous-state Markov decision process (MDP) by introducing the notion of the belief state [6], which is the conditional distribution of the current
state given the history of observations and actions. For a finite-state POMDP, the belief space is finite dimensional (i.e., a probability simplex), whereas for a continuous-state POMDP, the belief space is an infinite-dimensional space of continuous probability distributions. This difference suggests that simple generalizations of many of the finite-state algorithms to continuous-state models are not appropriate or applicable. For example, discretization of the continuous state space may result in a finite-state POMDP of dimension either too large to solve computationally or not sufficiently precise. As another example, many algorithms for solving finite-state POMDPs (see [17] for a survey) are based on discretization of the finite-dimensional probability simplex; however, it is usually not feasible to discretize an infinite-dimensional space of probability distributions. Throughout the paper, when we use the word "dimension" or "dimensional," we refer to the dimension of the belief space/state.
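As a concrete illustration of the belief-state recursion (our sketch, not taken from the paper), suppose the state space has been discretized into n cells, so a belief is a vector in the n-dimensional probability simplex; T and O below are hypothetical arrays holding transition probabilities and observation likelihoods. For a genuinely continuous state space no such finite-dimensional vector exists, which is precisely the difficulty discussed above.

```python
import numpy as np

def belief_update(b, a, z, T, O):
    """Bayes update b' = tau(b, a, z) on a discretized state space.

    b: length-n belief vector; T[a][i, j] = P(s'=j | s=i, a);
    O[a][j, z] = P(z | s'=j, a).
    """
    predicted = T[a].T @ b                 # prediction step: P(s' | b, a)
    unnormalized = O[a][:, z] * predicted  # correction step: weight by likelihood
    return unnormalized / unnormalized.sum()
```

For example, with T = {0: np.array([[0.9, 0.1], [0.2, 0.8]])} and O = {0: np.array([[0.7, 0.3], [0.4, 0.6]])}, the call belief_update(np.array([0.5, 0.5]), 0, 1, T, O) returns the posterior belief after taking action 0 and observing z = 1.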
Despite the abundance of algorithms for finite-state POMDPs, the aforementioned difficulty has motivated some researchers to look for efficient algorithms for continuous-state POMDPs [8], [9], [10], [11], [23], [24], [25], [28], [31]. Assuming discrete observation and action spaces, Porta et al. [24] showed that the optimal finite-horizon value function is defined by a finite set of "α-functions" and modeled all functions of interest by Gaussian mixtures. In later work [25], they extended their result and method to continuous observation and action spaces using sampling strategies. However, the number of Gaussian mixture components used to represent belief states and α-functions grows exponentially in the number of value iteration steps. Thrun [31] addressed continuous-state POMDPs using particle filtering to simulate the propagation of belief states, representing each belief state by a finite set of samples. The number of samples determines the dimension of the belief space, and this dimension may need to be very high in order to approximate the belief states closely (see the sketch below).
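For illustration, here is a minimal bootstrap particle filter step of the kind underlying such sample-based belief representations (our sketch; propagate and likelihood are placeholders for model-specific transition sampling and observation densities). We also include a simple moment-matching projection onto a Gaussian, in the spirit of the density projection idea developed in this paper, though the projection particle filter analyzed later is defined more generally.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, a, z, propagate, likelihood):
    """One bootstrap step: the resampled particle set represents the new belief."""
    particles = propagate(particles, a, rng)  # sample the state transition
    w = likelihood(z, particles, a)           # observation likelihoods per particle
    w = w / w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]                     # resample in proportion to the weights

def project_to_gaussian(particles):
    """Moment-matching projection of a particle belief onto a Gaussian family."""
    return particles.mean(axis=0), np.cov(particles, rowvar=False)
```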
Brunskill et al. [10] used weighted sums of Gaussians to approximate the belief states and value functions in a class of switching state models. Roy [28] and Brooks et al. [8] used sufficient statistics to reduce the dimension of the belief space, an approach often referred to as belief compression in the artificial intelligence literature. Roy [28] proposed an augmented MDP (AMDP), characterizing belief states by the maximum likelihood state and the entropy, which are usually not sufficient statistics except for a linear Gaussian model. As shown by Roy himself, the algorithm fails in a simple robot navigation problem, since these two statistics cannot distinguish between a unimodal distribution and a bimodal distribution. Brooks et al. [8] proposed a parametric POMDP, representing the belief state as a Gaussian distribution so as to convert the POMDP to a problem of computing the value