DA and RSD—and we are not aware of any work that has attempted to map out the tradeoff more
generally.
Inspired by the recent development of deep learning for optimal auction design [
10
,
8
,
28
,
23
],
we initiate the study of multi-player neural networks to model two-sided matching. We show how
to use machine learning to characterize the frontier curve of the tradeoff between stability and
strategy-proofness. The main challenges of applying neural networks to two-sided matching come
from handling the ordinal preference inputs, and in identifying suitable differentiable surrogates for
approximate strategy-proofness and stability.
For randomized matching mechanisms, the strongest SP concept is that of ordinal strategy-
proofness. This aligns incentives with truthful reporting whatever the utility function of an agent.
Ordinal SP is equivalent to a property of first-order stochastic dominance (FOSD) [
11
], which means
that agents have a weakly higher chance of getting their top-ranked choices when they report their
preferences truthfully. Our metric for SP quantifies the degree to which FOSD is violated. For this,
we adopt an adversarial approach, seeking to augment the training data with suitable defeating
mis-reports. We also provide a metric to quantify the degree to which stability is violated. We show
that a loss function built from these quantities can be trained through SGD, and illustrate the use of
the framework to identify the stability-strategyproofness frontier. We run simulations to validate the
efficiency of our approach, demonstrating for different preference distributions that our approach can
strike much better trade-offs between stability and strategyproofness than the convex combination
of DA and RSD.
Related work.
This work lies at the intersection of two-sided matching [
27
] and the role of
machine learning within economics [
6
]. The matching mechanisms learned by our neural networks are
randomized, approximately strategy-proof, and approximately stable. Budish et al. [
7
] and Mennle
and Seuken [
19
,
20
] discuss different notions of approximate strategy-proofness in the context of
matching and allocation. In this work, we focus on ordinal SP and its analog of FOSD [
11
]. This is
a strong and widely-used SP concept in the presence of ordinal preferences.
Classic results of Dubins and Freedman [
9
] and Roth [
24
] show that it is impossible to achieve both
stability and strategy-proofness in two-sided matching simultaneously—although strategy-proofness
for one side of the market is achieved by the Deferred Acceptance (DA) mechanism.
3
Thus market
designers have looked at mechanisms that relax one or both of these conditions. The Random serial
dictatorship (RSD) mechanism [
1
] is SP but typically fails to produce stable outcomes (and indeed, it
may even fail to be individually rational).
4
On the other hand, Roth et al. [
26
] study the polytope of
stable matchings and provide a wide class of stable matchings beyond the well-known DA outcome.
The stable improvement cycles mechanism of Erdil and Ergin [
12
], meanwhile, achieves as much
efficiency as possible on top of stability, but fails to be SP even for one side of the market. Finally, a
series of results have shown that DA becomes SP for both sides of the market in certain large-market
limit contexts; these results typically also require additional, structural assumptions on market
participants’ preferences (see, e.g., [16, 17, 18]).
This work belongs to the emerging literature on machine learning for economic design. Narasimhan
et al. [
21
] utilize support vector machines to search for good mechanisms among the weighted polytope
mechanisms. Recently, many papers [
10
,
13
,
15
,
8
,
28
,
23
] apply deep neural networks to optimal
auction design and facility location problems. In this work, and inspired by the neural network
3
Alcalde and Barberà [
4
] also showed the impossibility of individually rational, Pareto efficient, and SP allocation
rules. Alva and Manjunath [5] extended this result to randomized matching contexts.
4
The Top Trading Cycles (TTC) mechanism is likewise SP—but it effectively treats the market as one-sided,
producing outcomes that do not reflect the other side’s preferences.
2