arXiv:2111.01647v1 [econ.TH] 2 Nov 2021
INFORMATION SPILLOVER IN MULTIPLE ZERO-SUM GAMES
LUCAS PAHL
Abstract. This paper considers an infinitely repeated three-player Bayesian game with lack of information on two sides, in which an informed player plays two zero-sum games simultaneously at each stage against two uninformed players. This is a generalization of the Aumann et al. [1] two-player zero-sum one-sided incomplete information model. Under a correlated prior, the informed player faces the problem of how to optimally disclose information among two uninformed players in order to maximize his long-term average payoffs. Our objective is to understand the adverse effects of “information spillover” from one game to the other in the equilibrium payoff set of the informed player. We provide conditions under which the informed player can fully overcome such adverse effects and characterize equilibrium payoffs. In a second result, we show how the effects of information spillover on the equilibrium payoff set of the informed player might be severe.
1. Introduction
In their seminal work, Aumann et al. [1] analyzed an undiscounted infinitely repeated two-player zero-sum game with lack of information on one side: one player (the informed) knows the stage game being played whereas the other (the uninformed) does not know and cannot observe payoffs, only actions. They showed that this game has a value and constructed optimal strategies for the players. Matters are more complicated if the informed player were to play against more than one uninformed player, as would be the case of a military power (e.g., the USA) negotiating with two different countries (e.g., Russia and Iran).¹ By observing what the informed player plays against some other uninformed player, an uninformed player can make inferences about the game he plays against the informed player. As a consequence, it may not be optimal for the informed player to play his unilaterally optimal strategy against some of the uninformed players. Put differently, the information spillover among the games played between the informed player and the uninformed players adds layers of complexity to the analysis.
We consider a three-player undiscounted infinitely repeated game in which one of the players is informed of the two zero-sum stage games that he plays against each of the other two (uninformed) players. Each uninformed player only knows the prior probability distribution over the finite set of pairs of zero-sum finite-action stage games, and during the play of the game observes the profiles of actions (but not the payoffs). The informed player collects the sum of payoffs from the two component games.

Date: October, 2021. This paper subsumes a previous paper titled “Information Spillover in Bayesian Repeated Games”. I am grateful to Paulo Barelli and Hari Govindan for their guidance and encouragement. I would like to thank Sven Rady, Rida Laraki, Tristan Tomala, Heng Liu and Mathijs Janssen for comments and suggestions.

¹ The USA may want to conceal from Russia the exact size of its arsenal, and at the same time may want to reveal it to Iran to leverage its bargaining position. More examples in this line can be found in Aumann et al. [1].
In the absence of information spillover, for instance when neither of the uninformed players can
observe the actions played in the other zero-sum game, our three-player game has a single expected
payoff, namely, the sum of values of each of the two-player component games. However, when all
players are able to observe the actions played across each zero-sum game, the information spillover
kicks in and it is in principle unclear whether the informed player can attain the sum of values in
equilibrium. This sum of values can actually be shown to be an upper bound on the equilibrium
payoffs of the informed player in our three-player game.
Our first main result provides a condition under which the informed player can attain this upper bound in equilibrium, even in the presence of information spillover. More precisely, we show the
informed player can attain anything as an equilibrium payoff from his individually rational payoff
to the above mentioned upper bound, thus characterizing the set of equilibrium payoffs in the
three-player game. In particular, this result implies that the three-player model we analyse might
have a continuum of equilibrium payoffs, even though it is a zero-sum model.
The method used to obtain this first result is also of interest to the model of two-player games studied by Aumann et al. [1]. Under a sufficient condition on the stage payoffs, we show that optimal strategies different from those constructed by Aumann et al. exist. The strategy of the informed player, in particular, does not involve any signaling on the path of play, even when the standard optimal strategy constructed by Aumann et al. necessarily does.
In a second result we provide a necessary condition for equilibria paying the upper bound to the informed player. We explore two consequences of this result. First, we show that a natural class of equilibria which involve signaling on the equilibrium path never pays the upper bound to the informed player. Second, we present an example showing that the effects of information spillover might be very severe, in the sense that the informed player is not able to attain the upper bound in equilibrium.
1.1. Related Literature. To the best of our knowledge, the model analyzed here is new. Although the model we analyse is zero-sum, the results and the techniques presented remain closer to the non-zero-sum literature, especially to Hart [7] and Sorin [12]. We highlight here a few additional papers on discounted and undiscounted Bayesian repeated games with perfect monitoring that have technical and thematic similarities to this project. A significant part of the literature on undiscounted Bayesian repeated games with perfect monitoring analyses models under the assumption of “known own payoffs” (see Forges [4]). This is a reasonable assumption in several applications and allows for equilibrium-payoff characterizations which are especially tractable (see Shalev [10]).
Under this assumption, Forges and Solomon [5] provide a simple characterization of Nash equilibrium payoffs in undiscounted Bayesian repeated games of two players.² This characterization is used to show that in a class of public good games, uniform equilibria might not exist. More closely related to our paper in terms of the information environment is Forges et al. [6]. In this paper, among other results, cooperative solutions of one-shot Bayesian games with two players and exactly one informed player are related to noncooperative solutions of two-player repeated Bayesian games with exactly one informed player. More specifically, under the assumption of existence of uniform punishment strategies for the uninformed player, the joint-plan equilibrium payoffs of the repeated game equal the set of cooperative solutions of the one-shot Bayesian game. This folk theorem is not, however, an equilibrium-payoff characterization, since it is known from Hart [7] that joint plans cannot account for the whole of equilibrium payoffs in general.
One additional reference in the discounted dynamic games literature is Huangfu and Liu [8]. Although considering a significantly different model from ours, this paper motivated our work by considering information spillovers between different markets with asymmetric information. In that paper, a seller holds private information about the quality of goods he sells in two different markets, and buyers learn about the seller's private information from observing past trading outcomes not only in the market in which they directly participate, but also from observing the outcomes of the other market. The authors show, under certain assumptions on the correlation of qualities between goods in different markets, that information spillover can mitigate adverse selection.
1.2. Organization. The remainder of the paper is organized as follows. The model is presented in Section 2. Section 3 presents our first main result. Section 4 presents our second main result and Section 5 concludes. The proofs of technical results are left to the Appendix. Additional results can be found in a Supplemental Appendix.³
2. Model
2.1. Notation. Given a finite set K, ∆(K) is the set of probability distributions over K; the interior of X will be denoted by int(X), its boundary by ∂(X), and its convex closure by co(X). For p ∈ ∆(K_A × K_B), p_A (resp. p_B) denotes its marginal on K_A (resp. K_B), and supp(p) its support. We denote a product distribution on K_A × K_B by p_A ⊗ p_B, and use ∆(K_A) ⊗ ∆(K_B) to denote the set of all such distributions.
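As a small illustration of the notation (the joint prior below is invented for illustration, not taken from the paper), the marginals p_A and p_B and the product distribution p_A ⊗ p_B can be computed as follows; the prior is correlated exactly when p differs from the product of its marginals:

```python
# Illustrative sketch of the notation: marginals and product distribution.
# The joint prior p below is invented (|K_A| = |K_B| = 2); only the
# bookkeeping mirrors the definitions in the text.

# p[(k_A, k_B)] = probability of the pair of states (k_A, k_B).
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

K_A = {0, 1}
K_B = {0, 1}

# Marginals p_A on K_A and p_B on K_B.
p_A = {kA: sum(p[(kA, kB)] for kB in K_B) for kA in K_A}
p_B = {kB: sum(p[(kA, kB)] for kA in K_A) for kB in K_B}

# Product distribution p_A ⊗ p_B on K_A × K_B.
prod = {(kA, kB): p_A[kA] * p_B[kB] for kA in K_A for kB in K_B}

# This prior is correlated: p(0,0) = 0.4 while p_A(0)·p_B(0) = 0.25.
correlated = any(abs(p[s] - prod[s]) > 1e-12 for s in p)
print(p_A, p_B, correlated)
```

A correlated prior such as this one is precisely the case in which the informed player's disclosure problem across the two games becomes non-trivial.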
A 3-player infinitely repeated zero-sum game with lack of information on two sides, denoted G(p_0), is given by the following data:
• Three players, namely player 1 (the informed player), player 2 and player 3 (the uninformed
players).
• Finite sets I_i, J_i, K_i, i = A, B, with I_A × I_B (resp. J_A and J_B) being the set of actions of player 1 (resp. players 2 and 3), and K_A × K_B being the set of states.
• p_0 ∈ ∆(K_A × K_B) is the prior.
• For each k_A ∈ K_A and k_B ∈ K_B, A^{k_A} and B^{k_B} are |I_A| × |J_A| and |I_B| × |J_B| payoff matrices, respectively.

² An additional assumption needed for the characterization is the existence of “uniform punishment strategies” for the players in the Bayesian stage game, that is, strategies that allow a player to be punished by holding his payoffs at his ex-post individually rational level.
³ https://pahllucas.wixsite.com/lucaspahl.
The play of the infinitely repeated game is as follows:
• At stage 0, a state (k_A, k_B) ∈ K_A × K_B is drawn according to the distribution p_0 and only player 1 knows the draw.
• At each stage t = 1, 2, ..., the players independently choose an action in their own set of actions: when player 1 chooses (i^t_A, i^t_B) ∈ I_A × I_B and players 2 and 3 choose j^t_A ∈ J_A and j^t_B ∈ J_B, respectively, the stage payoff to player 1 is then A^{k_A}_{i^t_A, j^t_A} + B^{k_B}_{i^t_B, j^t_B}; to player 2, −A^{k_A}_{i^t_A, j^t_A}; and to player 3, −B^{k_B}_{i^t_B, j^t_B}. Monitoring is perfect, i.e., all players observe all past action profiles before starting stage t + 1.
Players are assumed to have perfect recall and the whole description of the game is common knowledge. A behavior strategy for player 1 is an element σ = (σ^{(k_A,k_B)})_{(k_A,k_B) ∈ K_A × K_B}, where for each (k_A, k_B) ∈ K_A × K_B, σ^{(k_A,k_B)} = (σ^{(k_A,k_B)}_t)_{t≥1} and σ^{(k_A,k_B)}_t is a mapping from the Cartesian product H_t := (I_A × J_A × I_B × J_B)^{t−1} (with H_0 := {∅}) to ∆(I_A × I_B), giving the lottery on actions played by player 1 at stage t when the state is (k_A, k_B). Because players 2 and 3 do not know the state, a behavior strategy for player 2 (resp. player 3) is an element τ_A = (τ_{A,t})_{t≥1} (resp. τ_B = (τ_{B,t})_{t≥1}), where τ_{A,t} (resp. τ_{B,t}) is a mapping from H_t := (I_A × J_A × I_B × J_B)^{t−1} to ∆(J_A) (resp. ∆(J_B)), giving the lottery on actions to be played by player 2 (resp. player 3) at stage t. The set of behavior strategies of player 1 is denoted by Σ; for player 2, it is denoted by T_A, and for player 3, it is denoted by T_B.
We now define our equilibrium notion for the model. A behavior strategy profile (σ, τ_A, τ_B) induces, for every state (k_A, k_B) and stage T > 0, a probability distribution on H_{T+1}. Also, (σ, τ_A, τ_B) and p_0 induce a probability distribution over K_A × K_B × H_{T+1}. We can thus define the average expected payoffs (with κ being a random variable taking values in K_A × K_B distributed according to p_0, κ_A the random variable obtained from projecting κ on K_A, and κ_B the random variable obtained from projecting κ on K_B):
α^{k_A,k_B}_T = α^{k_A,k_B}_T(σ, τ_A, τ_B) := E_{σ^{(k_A,k_B)}, τ_A, τ_B}[(1/T) Σ_{t=1}^{T} (A^{k_A}_{i^t_A, j^t_A} + B^{k_B}_{i^t_B, j^t_B})],

β^A_T(σ, τ_A, τ_B) := E_{σ, τ_A, τ_B, p_0}[(1/T) Σ_{t=1}^{T} (−A^{κ_A}_{i^t_A, j^t_A})],

β^B_T(σ, τ_A, τ_B) := E_{σ, τ_A, τ_B, p_0}[(1/T) Σ_{t=1}^{T} (−B^{κ_B}_{i^t_B, j^t_B})].
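To make the payoff bookkeeping concrete, the following sketch computes the T-stage averages for a fixed realized state and fixed action paths. The 2×2 matrices and action sequences are invented for illustration (they are not from the paper), and the expectations are suppressed: only the stage-payoff accounting mirrors the definitions above.

```python
# Illustrative sketch: T-stage average payoffs in the three-player game,
# for one fixed state (k_A, k_B) and one realized play path.
# Matrices and actions are made up; only the accounting follows the text.

# Payoff matrices A^{k_A} and B^{k_B} for the realized state.
A = [[1, 0],
     [0, 1]]   # component game against player 2
B = [[2, 0],
     [0, 0]]   # component game against player 3

T = 4
# Action paths: player 1 plays the pairs (i_A, i_B); players 2 and 3 play j_A, j_B.
i_A = [0, 1, 0, 1]; i_B = [0, 0, 1, 1]
j_A = [0, 1, 1, 0]; j_B = [0, 1, 0, 1]

# Stage payoffs: player 1 collects A[i_A][j_A] + B[i_B][j_B];
# players 2 and 3 receive the componentwise negatives (zero-sum).
stage_A = [A[i_A[t]][j_A[t]] for t in range(T)]
stage_B = [B[i_B[t]][j_B[t]] for t in range(T)]

alpha_T = sum(a + b for a, b in zip(stage_A, stage_B)) / T  # player 1
beta_A_T = -sum(stage_A) / T                                # player 2
beta_B_T = -sum(stage_B) / T                                # player 3

print(alpha_T, beta_A_T, beta_B_T)  # prints: 1.0 -0.5 -0.5
```

Note that in the definitions above, α^{k_A,k_B}_T averages over the strategy randomness for a fixed state, while β^A_T and β^B_T average over p_0 as well; the sketch fixes both the state and the realized path.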
Equilibrium Concept. A profile (σ, τ_A, τ_B) is a uniform equilibrium of G(p_0) when:
(1) For each (k_A, k_B) ∈ supp(p_0),⁴ (α^{k_A,k_B}_T(σ, τ_A, τ_B))_{T≥1} converges as T goes to infinity to some α^{k_A,k_B}(σ, τ_A, τ_B), (β^A_T(σ, τ_A, τ_B))_{T≥1} converges to some β^A(σ, τ_A, τ_B), and (β^B_T(σ, τ_A, τ_B))_{T≥1} converges to some β^B(σ, τ_A, τ_B).
(2) For each ε > 0, there exists a positive integer T_0 such that for all T ≥ T_0, (σ, τ_A, τ_B) is an ε-Nash equilibrium in the finitely repeated game with T stages, i.e.,
(a) For each (k_A, k_B) ∈ supp(p_0) and σ′ ∈ Σ, α^{k_A,k_B}_T(σ′, τ_A, τ_B) ≤ α^{k_A,k_B}_T(σ, τ_A, τ_B) + ε;
(b) For each τ′_A ∈ T_A, β^A_T(σ, τ′_A, τ_B) ≤ β^A_T(σ, τ_A, τ_B) + ε;
(c) For each τ′_B ∈ T_B, β^B_T(σ, τ_A, τ′_B) ≤ β^B_T(σ, τ_A, τ_B) + ε.
Uniform equilibrium is a standard equilibrium concept for the analysis of undiscounted repeated games. It contains a strong requirement, namely (2), which posits that the profile (σ, τ_A, τ_B) must generate an ε-equilibrium in every “long” (T ≥ T_0) but finitely (T < ∞) repeated version of our model.⁵
If (σ, τ_A, τ_B) is an equilibrium in G(p_0), the associated vector

(α(σ, τ_A, τ_B), β^A(σ, τ_A, τ_B), β^B(σ, τ_A, τ_B)),

where α(σ, τ_A, τ_B) := (α^{k_A,k_B}(σ, τ_A, τ_B))_{(k_A,k_B) ∈ supp(p_0)}, is the vector of payoffs of (σ, τ_A, τ_B). Also, α(σ, τ_A, τ_B) · p_0 (where · is the standard scalar product in Euclidean space) is the ex-ante equilibrium payoff of the informed player.
Our analysis of the equilibrium payoff set of the game G(p_0) in the next section will rely on certain properties of each of the two-player, infinitely repeated zero-sum games that the informed player plays against each uninformed player. For this reason we now recall some of the main results in Aumann et al. [1], which is the original reference for this two-player model. Let K be the finite set of states and M = (M^k)_{k∈K} a collection of zero-sum payoff matrices, where M^k ∈ R^{I×J} for each k ∈ K. Denote by G_M(p) the infinitely repeated, two-player, zero-sum game with lack of information on one side with prior p ∈ ∆(K) and undiscounted payoffs (see Sorin [13], Chapter 3, for a detailed description of this model, or Aumann et al. [1]). Let M(p) = Σ_{k∈K} p^k M^k and define v_M(p) = min_{t∈∆(J)} max_{s∈∆(I)} s M(p) t = max_{s∈∆(I)} min_{t∈∆(J)} s M(p) t, where s is a row vector
⁴ Remark 3.5 in Section 3 shows that assuming the prior p_0 ∈ int(∆(K_A × K_B)) (as is customary in the literature) is not without loss of generality for the results in this paper. This is why we present the definition requiring convergence of (α^{k_A,k_B}_T)_{T≥1}, (k_A, k_B) ∈ supp(p_0). The same reasoning applies to condition (2).
⁵ One notable aspect of this equilibrium notion is that uniform equilibria in our model are approximate Nash equilibria in the discounted version of our model: if (σ, τ_A, τ_B) is a uniform equilibrium, then (σ, τ_A, τ_B) is an ε-Nash equilibrium of the discounted versions of our model for a sufficiently high discount factor. See Theorem 13.32 in Maschler et al. [9].
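The value v_M(p) of the average game M(p) can be sketched numerically. The example below uses the classical two-state matrices from Aumann et al. [1] (M⁰ = [[1,0],[0,0]], M¹ = [[0,0],[0,1]]); the closed-form solver covers only 2×2 zero-sum games and is an illustration, not the paper's method:

```python
# Numerical sketch of v_M(p) for the classical two-state example of
# Aumann et al. [1]. The 2x2 closed-form solver is illustrative only.

def value_2x2(M):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = M
    # Pure saddle point: maximin equals minimax over pure actions.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return maximin
    # Otherwise both players mix; standard 2x2 formula.
    return (a * d - b * c) / (a + d - b - c)

def v_M(p, matrices):
    """Value of the average game M(p) = sum_k p^k M^k (2x2 case)."""
    K = len(matrices)
    Mp = [[sum(p[k] * matrices[k][i][j] for k in range(K)) for j in range(2)]
          for i in range(2)]
    return value_2x2(Mp)

matrices = [[[1, 0], [0, 0]],   # M^0
            [[0, 0], [0, 1]]]   # M^1

print(v_M([0.5, 0.5], matrices))  # → 0.25
print(v_M([1.0, 0.0], matrices))  # → 0.0  (state known, M^0 has value 0)
```

For this example v_M(p) = p(1 − p), the strictly concave function familiar from the Aumann–Maschler theory: it is 0 when the state is known and maximal at p = 1/2.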