arXiv:2111.01647v1 [econ.TH] 2 Nov 2021
INFORMATION SPILLOVER IN MULTIPLE ZERO-SUM GAMES
LUCAS PAHL
Abstract. This paper considers an infinitely repeated three-player Bayesian game with lack of information on two sides, in which an informed player plays two zero-sum games simultaneously at each stage against two uninformed players. This is a generalization of the Aumann et al. [1] two-player zero-sum one-sided incomplete information model. Under a correlated prior, the informed player faces the problem of how to optimally disclose information among two uninformed players in order to maximize his long-term average payoffs. Our objective is to understand the adverse effects of “information spillover” from one game to the other in the equilibrium payoff set of the informed player. We provide conditions under which the informed player can fully overcome such adverse effects and characterize equilibrium payoffs. In a second result, we show how the effects of information spillover on the equilibrium payoff set of the informed player might be severe.
1. Introduction
In their seminal work, Aumann et al. [1] analyzed an undiscounted infinitely repeated two-player zero-sum game with lack of information on one side: one player (the informed) knows the stage game being played whereas the other (the uninformed) does not know and cannot observe payoffs, only actions. They showed that this game has a value and constructed optimal strategies for the players. Matters are more complicated if the informed player were to play against more than one uninformed player, as would be the case of a military power (e.g., the USA) negotiating with two different countries (e.g., Russia and Iran).¹ By observing what the informed player plays against some other uninformed player, an uninformed player can make inferences about the game he plays against the informed player. As a consequence, it may not be optimal for the informed player to play his unilaterally optimal strategy against some of the uninformed players. Put differently, the information spillover among the games played between the informed player and the uninformed players adds layers of complexity to the analysis.
We consider a three-player undiscounted infinitely repeated game in which one of the players is informed of the two zero-sum stage games that he plays against each of the other two (uninformed) players. Each uninformed player only knows the prior probability distribution over the finite set of pairs of zero-sum finite-action stage games, and during the play of the game observes the profiles of actions (but not the payoffs). The informed player collects the sum of payoffs from the two component games.

Date: October, 2021. This paper subsumes a previous paper titled “Information Spillover in Bayesian Repeated Games”. I am grateful to Paulo Barelli and Hari Govindan for their guidance and encouragement. I would like to thank Sven Rady, Rida Laraki, Tristan Tomala, Heng Liu and Mathijs Janssen for comments and suggestions.

¹ The USA may want to conceal from Russia the exact size of its arsenal, and at the same time may want to reveal it to Iran to leverage its bargaining position. More examples in this line can be found in Aumann et al. [1].
In the absence of information spillover, for instance when neither of the uninformed players can
observe the actions played in the other zero-sum game, our three-player game has a single expected
payoff, namely, the sum of values of each of the two-player component games. However, when all
players are able to observe the actions played across each zero-sum game, the information spillover
kicks in and it is in principle unclear whether the informed player can attain the sum of values in
equilibrium. This sum of values can actually be shown to be an upper bound on the equilibrium
payoffs of the informed player in our three-player game.
Our first main result provides a condition under which the informed player can attain this upper bound in equilibrium, even in the presence of information spillover. More precisely, we show the
informed player can attain anything as an equilibrium payoff from his individually rational payoff
to the above mentioned upper bound, thus characterizing the set of equilibrium payoffs in the
three-player game. In particular, this result implies that the three-player model we analyse might
have a continuum of equilibrium payoffs, even though it is a zero-sum model.
The method used to obtain this first result is also of interest to the model of two-player games studied by Aumann et al. [1]. Under a sufficient condition on the stage payoffs, we show that optimal strategies different from those constructed by Aumann et al. exist. The strategy of the informed player, in particular, does not involve any signaling on the path of play, even when the standard optimal strategy constructed by Aumann et al. necessarily does.
In a second result we provide a necessary condition for equilibria paying the upper bound to the informed player. We explore two consequences of this result. First, we show that a natural class of equilibria which involve signaling on the equilibrium path never pays the upper bound to the informed player. Second, we present an example showing that the effects of information spillover might be very severe, in the sense that the informed player is not able to attain the upper bound in equilibrium.
1.1. Related Literature. To the best of our knowledge, the model analyzed here is new. Although the model we analyse is zero-sum, the results and the techniques presented remain closer to the non-zero-sum literature, especially to Hart [7] and Sorin [12]. We highlight here a few additional papers on discounted and undiscounted Bayesian repeated games with perfect monitoring that have technical and thematic similarities to this project. A significant part of the literature on undiscounted Bayesian repeated games with perfect monitoring analyses models under the assumption of “known own payoffs” (see Forges [4]). This is a reasonable assumption in several applications and allows for equilibrium-payoff characterizations which are especially tractable (see Shalev [10]).
Under this assumption, Forges and Solomon [5] provide a simple characterization of Nash equilibrium payoffs in undiscounted Bayesian repeated games of two players.² This characterization is used to show that in a class of public good games, uniform equilibria might not exist. More closely related to our paper in terms of the information environment is Forges et al. [6]. In this paper, among other results, cooperative solutions of one-shot Bayesian games with two players and exactly one informed player are related to noncooperative solutions of two-player repeated Bayesian games with exactly one informed player. More specifically, under the assumption of existence of uniform punishment strategies for the uninformed player, the joint-plan equilibrium payoffs of the repeated game equal the set of cooperative solutions of the one-shot Bayesian game. This folk theorem is not, however, an equilibrium-payoff characterization, since it is known from Hart [7] that joint plans cannot account for the whole of equilibrium payoffs in general.
One additional reference in the discounted dynamic games literature is Huangfu and Liu [8]. Although considering a significantly different model from ours, this paper motivated our work by considering information spillovers between different markets with asymmetric information. In that paper, a seller holds private information about the quality of goods he sells in two different markets, and buyers learn about the seller's private information from observing past trading outcomes not only in the market in which they directly participate, but also from observing the outcomes of the other market. The authors show, under certain assumptions on the correlation of qualities between goods in different markets, that information spillover can mitigate adverse selection.
1.2. Organization. The remainder of the paper is organized as follows. The model is presented in Section 2. Section 3 presents our first main result. Section 4 presents our second main result and Section 5 concludes. The proofs of technical results are left to the Appendix. Additional results can be found in a Supplemental Appendix.³
2. Model
2.1. Notation. Given a finite set K, ∆(K) is the set of probability distributions over K; the interior of X will be denoted by int(X), its boundary by ∂(X), and its convex closure by co(X). For p ∈ ∆(K_A × K_B), p_A (resp. p_B) denotes its marginal on K_A (resp. K_B), and supp(p) its support. We denote a product distribution on K_A × K_B by p_A ⊗ p_B, and use ∆(K_A) ⊗ ∆(K_B) to denote the set of all such distributions.
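As a small illustration of the notation (the joint prior below is invented for illustration, not taken from the paper), the marginals p_A and p_B and the product distribution p_A ⊗ p_B can be computed as follows; the prior is correlated exactly when p differs from the product of its marginals:

```python
# Illustrative sketch of the notation: marginals and product distribution.
# The joint prior p below is invented (|K_A| = |K_B| = 2); only the
# bookkeeping mirrors the definitions in the text.

# p[(k_A, k_B)] = probability of the pair of states (k_A, k_B).
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

K_A = {0, 1}
K_B = {0, 1}

# Marginals p_A on K_A and p_B on K_B.
p_A = {kA: sum(p[(kA, kB)] for kB in K_B) for kA in K_A}
p_B = {kB: sum(p[(kA, kB)] for kA in K_A) for kB in K_B}

# Product distribution p_A ⊗ p_B on K_A × K_B.
prod = {(kA, kB): p_A[kA] * p_B[kB] for kA in K_A for kB in K_B}

# This prior is correlated: p(0,0) = 0.4 while p_A(0)·p_B(0) = 0.25.
correlated = any(abs(p[s] - prod[s]) > 1e-12 for s in p)
print(p_A, p_B, correlated)
```

A correlated prior such as this one is precisely the case in which the informed player's disclosure problem across the two games becomes non-trivial.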
A 3-player infinitely repeated zero-sum game with lack of information on two sides, denoted G(p_0), is given by the following data:
• Three players, namely player 1 (the informed player), player 2 and player 3 (the uninformed
players).
• Finite sets I_i, J_i, K_i, i = A, B, with I_A × I_B (resp. J_A and J_B) being the set of actions of player 1 (resp. players 2 and 3), and K_A × K_B being the set of states.
• p_0 ∈ ∆(K_A × K_B) is the prior.
• For each k_A ∈ K_A and k_B ∈ K_B, A^{k_A} and B^{k_B} are |I_A| × |J_A| and |I_B| × |J_B| payoff matrices, respectively.

² An additional assumption needed for the characterization is the existence of “uniform punishment strategies” for the players in the Bayesian stage game, that is, strategies that allow a player to be punished by holding his payoffs at his ex-post individually rational level.
³ https://pahllucas.wixsite.com/lucaspahl.
The play of the infinitely repeated game is as follows:
• At stage 0, a state (k_A, k_B) ∈ K_A × K_B is drawn according to the distribution p_0 and only player 1 knows the draw.
• At each stage t = 1, 2, ..., the players independently choose an action in their own set of actions: when player 1 chooses (i^t_A, i^t_B) ∈ I_A × I_B and players 2 and 3 choose j^t_A ∈ J_A and j^t_B ∈ J_B, respectively, the stage payoff to player 1 is then A^{k_A}_{i^t_A, j^t_A} + B^{k_B}_{i^t_B, j^t_B}; to player 2, −A^{k_A}_{i^t_A, j^t_A}; and to player 3, −B^{k_B}_{i^t_B, j^t_B}. Monitoring is perfect, i.e., all players observe all past action profiles before starting stage t + 1.
Players are assumed to have perfect recall and the whole description of the game is common knowledge. A behavior strategy for player 1 is an element σ = (σ^{(k_A,k_B)})_{(k_A,k_B) ∈ K_A × K_B}, where for each (k_A, k_B) ∈ K_A × K_B, σ^{(k_A,k_B)} = (σ^{(k_A,k_B)}_t)_{t≥1} and σ^{(k_A,k_B)}_t is a mapping from the Cartesian product H_t := (I_A × J_A × I_B × J_B)^{t−1} (with H_0 := {∅}) to ∆(I_A × I_B), giving the lottery on actions played by player 1 at stage t when the state is (k_A, k_B). Because players 2 and 3 do not know the state, a behavior strategy for player 2 (resp. player 3) is an element τ_A = (τ_{A,t})_{t≥1} (resp. τ_B = (τ_{B,t})_{t≥1}), where τ_{A,t} (resp. τ_{B,t}) is a mapping from H_t := (I_A × J_A × I_B × J_B)^{t−1} to ∆(J_A) (resp. ∆(J_B)), giving the lottery on actions to be played by player 2 (resp. player 3) at stage t. The set of behavior strategies of player 1 is denoted by Σ; for player 2, it is denoted by T_A, and for player 3, it is denoted by T_B.
We now define our equilibrium notion for the model. A behavior strategy profile (σ, τ_A, τ_B) induces, for every state (k_A, k_B) and stage T > 0, a probability distribution on H_{T+1}. Also, (σ, τ_A, τ_B) and p_0 induce a probability distribution over K_A × K_B × H_{T+1}. We can thus define the average expected payoffs (with κ being a random variable taking values in K_A × K_B distributed according to p_0, κ_A the random variable obtained from projecting κ on K_A, and κ_B the random variable obtained from projecting κ on K_B):
α^{k_A,k_B}_T = α^{k_A,k_B}_T(σ, τ_A, τ_B) := E_{σ^{(k_A,k_B)}, τ_A, τ_B}[(1/T) Σ_{t=1}^{T} (A^{k_A}_{i^t_A, j^t_A} + B^{k_B}_{i^t_B, j^t_B})],

β^A_T(σ, τ_A, τ_B) := E_{σ, τ_A, τ_B, p_0}[(1/T) Σ_{t=1}^{T} (−A^{κ_A}_{i^t_A, j^t_A})],

β^B_T(σ, τ_A, τ_B) := E_{σ, τ_A, τ_B, p_0}[(1/T) Σ_{t=1}^{T} (−B^{κ_B}_{i^t_B, j^t_B})].
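To make the payoff bookkeeping concrete, the following sketch computes the T-stage averages for a fixed realized state and fixed action paths. The 2×2 matrices and action sequences are invented for illustration (they are not from the paper), and the expectations are suppressed: only the stage-payoff accounting mirrors the definitions above.

```python
# Illustrative sketch: T-stage average payoffs in the three-player game,
# for one fixed state (k_A, k_B) and one realized play path.
# Matrices and actions are made up; only the accounting follows the text.

# Payoff matrices A^{k_A} and B^{k_B} for the realized state.
A = [[1, 0],
     [0, 1]]   # component game against player 2
B = [[2, 0],
     [0, 0]]   # component game against player 3

T = 4
# Action paths: player 1 plays the pairs (i_A, i_B); players 2 and 3 play j_A, j_B.
i_A = [0, 1, 0, 1]; i_B = [0, 0, 1, 1]
j_A = [0, 1, 1, 0]; j_B = [0, 1, 0, 1]

# Stage payoffs: player 1 collects A[i_A][j_A] + B[i_B][j_B];
# players 2 and 3 receive the componentwise negatives (zero-sum).
stage_A = [A[i_A[t]][j_A[t]] for t in range(T)]
stage_B = [B[i_B[t]][j_B[t]] for t in range(T)]

alpha_T = sum(a + b for a, b in zip(stage_A, stage_B)) / T  # player 1
beta_A_T = -sum(stage_A) / T                                # player 2
beta_B_T = -sum(stage_B) / T                                # player 3

print(alpha_T, beta_A_T, beta_B_T)  # prints: 1.0 -0.5 -0.5
```

Note that in the definitions above, α^{k_A,k_B}_T averages over the strategy randomness for a fixed state, while β^A_T and β^B_T average over p_0 as well; the sketch fixes both the state and the realized path.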
Equilibrium Concept. A profile (σ, τ_A, τ_B) is a uniform equilibrium of G(p_0) when:
(1) For each (k_A, k_B) ∈ supp(p_0),⁴ (α^{k_A,k_B}_T(σ, τ_A, τ_B))_{T≥1} converges as T goes to infinity to some α^{k_A,k_B}(σ, τ_A, τ_B), (β^A_T(σ, τ_A, τ_B))_{T≥1} converges to some β^A(σ, τ_A, τ_B), and (β^B_T(σ, τ_A, τ_B))_{T≥1} converges to some β^B(σ, τ_A, τ_B).
(2) For each ε > 0, there exists a positive integer T_0 such that for all T ≥ T_0, (σ, τ_A, τ_B) is an ε-Nash equilibrium in the finitely repeated game with T stages, i.e.,
(a) For each (k_A, k_B) ∈ supp(p_0) and σ′ ∈ Σ, α^{k_A,k_B}_T(σ′, τ_A, τ_B) ≤ α^{k_A,k_B}_T(σ, τ_A, τ_B) + ε;
(b) For each τ′_A ∈ T_A, β^A_T(σ, τ′_A, τ_B) ≤ β^A_T(σ, τ_A, τ_B) + ε;
(c) For each τ′_B ∈ T_B, β^B_T(σ, τ_A, τ′_B) ≤ β^B_T(σ, τ_A, τ_B) + ε.
Uniform equilibrium is a standard equilibrium concept for the analysis of undiscounted repeated games. It contains a strong requirement, namely (2), which posits that the profile (σ, τ_A, τ_B) must generate an ε-equilibrium in every “long” (T ≥ T_0) but finitely (T < ∞) repeated version of our model.⁵
If (σ, τ_A, τ_B) is an equilibrium in G(p_0), the associated vector

(α(σ, τ_A, τ_B), β^A(σ, τ_A, τ_B), β^B(σ, τ_A, τ_B)),

where α(σ, τ_A, τ_B) := (α^{k_A,k_B}(σ, τ_A, τ_B))_{(k_A,k_B) ∈ supp(p_0)}, is the vector of payoffs of (σ, τ_A, τ_B). Also, α(σ, τ_A, τ_B) · p_0 (where · is the standard scalar product in Euclidean space) is the ex-ante equilibrium payoff of the informed player.
Our analysis of the equilibrium payoff set of the game G(p_0) in the next section will rely on certain properties of each of the two-player, infinitely repeated zero-sum games that the informed player plays against each uninformed player. For this reason we now recall some of the main results in Aumann et al. [1], which is the original reference for this two-player model. Let K be the finite set of states and M = (M^k)_{k∈K} a collection of zero-sum payoff matrices, where M^k ∈ R^{I×J} for each k ∈ K. Denote by G_M(p) the infinitely repeated, two-player, zero-sum game with lack of information on one side with prior p ∈ ∆(K) and undiscounted payoffs (see Sorin [13], Chapter 3, for a detailed description of this model, or Aumann et al. [1]). Let M(p) = Σ_{k∈K} p^k M^k and define v_M(p) = min_{t∈∆(J)} max_{s∈∆(I)} s M(p) t = max_{s∈∆(I)} min_{t∈∆(J)} s M(p) t, where s is a row vector
⁴ Remark 3.5 in Section 3 shows that assuming the prior p_0 ∈ int(∆(K_A × K_B)) (as is customary in the literature) is not without loss of generality for the results in this paper. This is why we present the definition requiring convergence of (α^{k_A,k_B}_T)_{T≥1}, (k_A, k_B) ∈ supp(p_0). The same reasoning applies to condition (2).
⁵ One notable aspect of this equilibrium notion is that uniform equilibria in our model are approximate Nash equilibria in the discounted version of our model: if (σ, τ_A, τ_B) is a uniform equilibrium, then (σ, τ_A, τ_B) is an ε-Nash equilibrium of the discounted versions of our model for a sufficiently high discount factor. See Theorem 13.32 in Maschler et al. [9].
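The value v_M(p) of the average game M(p) can be sketched numerically. The example below uses the classical two-state matrices from Aumann et al. [1] (M⁰ = [[1,0],[0,0]], M¹ = [[0,0],[0,1]]); the closed-form solver covers only 2×2 zero-sum games and is an illustration, not the paper's method:

```python
# Numerical sketch of v_M(p) for the classical two-state example of
# Aumann et al. [1]. The 2x2 closed-form solver is illustrative only.

def value_2x2(M):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = M
    # Pure saddle point: maximin equals minimax over pure actions.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return maximin
    # Otherwise both players mix; standard 2x2 formula.
    return (a * d - b * c) / (a + d - b - c)

def v_M(p, matrices):
    """Value of the average game M(p) = sum_k p^k M^k (2x2 case)."""
    K = len(matrices)
    Mp = [[sum(p[k] * matrices[k][i][j] for k in range(K)) for j in range(2)]
          for i in range(2)]
    return value_2x2(Mp)

matrices = [[[1, 0], [0, 0]],   # M^0
            [[0, 0], [0, 1]]]   # M^1

print(v_M([0.5, 0.5], matrices))  # → 0.25
print(v_M([1.0, 0.0], matrices))  # → 0.0  (state known, M^0 has value 0)
```

For this example v_M(p) = p(1 − p), the strictly concave function familiar from the Aumann–Maschler theory: it is 0 when the state is known and maximal at p = 1/2.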