(a) User interface (b) Change of opponents (c) Change of targets
Figure 1: Game user interface (UI) and the change of opponents and targets in one match of Honor of Kings.
(a) On the main screen, there are four sub-parts: a mini-map on the top-left, a dashboard that records
kill/death/assist (KDA) counts on the top-right, a movement controller on the bottom-left, and skill
controller buttons on the bottom-right. (b) The environment changes with different opponent heroes.
(c) The action space changes with different target heroes.
actions of different heroes against different opponent heroes. This makes MOBA 1v1 games, which
focus on hero control [29], a perfect testbed for evaluating the generality of models across different tasks.
Existing benchmark environments for RL generality mainly focus on relatively narrow tasks
for a single agent. For example, MetaWorld [30] and RLBench [10] present benchmarks of simulated
manipulation tasks in a shared, table-top environment with a simulated arm, where the goal is to train
the arm controller to complete tasks such as opening a door or fetching balls. Because the agent’s
action space remains that of an arm, it is hard to tell how well the learned RL policy generalizes to
more diverse tasks, such as controlling simulated legs.
In this paper, we provide Honor of Kings Arena, a MOBA 1v1 game environment authorized by
the original game, Honor of Kings¹. Honor of Kings was reported to be one of the
world’s most popular and highest-grossing games of all time, as well as the most downloaded app
worldwide. As of November 2020, the game was reported to have over 100 million daily active
players [21]. There are two camps in MOBA 1v1, each with one agent, and each agent controls a
hero character. As shown in Figure 1(a), an Honor of Kings player uses the bottom-left steer button to
control the movements of a hero and uses the bottom-right set of buttons to control the hero’s skills.
To win a game, agents must plan, attack, defend, and chain skill combos while taking the
opponent into account in a partially observable environment.
Specifically, the Honor of Kings Arena imposes the following challenges regarding generalization:
• Generalization across opponents. When controlling one target hero, its opponent hero varies
across different matches. There are over 20 possible opponent heroes in Honor of Kings Arena (in the
original game there are over 100 heroes), each influencing the game environment differently.
If we keep the same target hero and vary the opponent hero as in Figure 1(b), Honor of Kings Arena
can be treated as an environment similar to MetaWorld [30]: both provide a variety of tasks
for the same agent with the same action space.
• Generalization across targets. The generality challenge of RL takes on another dimension
in the competitive setting. In a MOBA game such as Honor of Kings or DOTA,
players also need to master different target heroes. Playing a different MOBA hero is like playing a
different game, since heroes differ in their attacking and healing skills and the action controls
can change completely from hero to hero, as shown in Figure 1(c). With over 20 heroes to control,
Honor of Kings Arena calls for robust and generalized modeling in RL.
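The two generalization settings above can be read as two ways of splitting the set of (target, opponent) matchups into training and held-out tasks. The following is a minimal sketch of that reading, not the environment's actual API: the hero identifiers, the roster size of 20, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical hero identifiers; the real roster is defined by the environment.
HEROES = [f"hero_{i}" for i in range(20)]

def matchups(targets, opponents):
    """Return every (target, opponent) pair; each pair is treated as one task."""
    return list(product(targets, opponents))

def opponent_generalization_split(target, train_opponents):
    """Fix the controlled hero and hold out the opponents not seen in training."""
    unseen = [h for h in HEROES if h not in train_opponents]
    return matchups([target], train_opponents), matchups([target], unseen)

def target_generalization_split(opponent, train_targets):
    """Fix the opponent and hold out the controlled heroes not seen in training."""
    unseen = [h for h in HEROES if h not in train_targets]
    return matchups(train_targets, [opponent]), matchups(unseen, [opponent])

# Example: train hero_0 against five opponents, test it against the rest.
train, test = opponent_generalization_split("hero_0", HEROES[:5])
```

In the first split the action space stays fixed while the environment changes; in the second the action space itself changes between training and held-out tasks.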
Contributions: As we will show in this paper, the above-mentioned challenges are not well solved
by existing RL methods under Honor of Kings Arena. In summary, our contributions are as follows:
• We provide the Honor of Kings Arena, a highly optimized game engine that simulates the popular
MOBA game Honor of Kings. It supports 20 heroes in the competitive mode.
• We introduce simple and standardized APIs to make RL in Honor of Kings straightforward: the
complex observations and actions are defined in terms of low-resolution grids of features, and
configurable rewards are provided that combine factors such as the score from the game engine.
• We evaluate RL algorithms under Honor of Kings Arena, providing an extensive set of benchmarking
results for future comparison.
• We pose the generality challenges in competitive RL settings and present preliminary experiments
showing that existing RL methods do not cope well with them in Honor of Kings Arena.
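As an illustration of what a configurable reward combining several factors might look like, the sketch below computes a weighted sum over per-step factor values. This is an assumption about the general shape of such a reward, not the API of Honor of Kings Arena: the function name, the factor names other than the game score, and the weights are all hypothetical.

```python
from typing import Mapping

def combined_reward(factors: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of per-step reward factors.

    Factors that have no entry in `weights` contribute nothing.
    """
    return sum(weights.get(name, 0.0) * value for name, value in factors.items())

# Hypothetical factor names and weights, for illustration only.
weights = {"score": 1.0, "hp_change": 0.5}
step_factors = {"score": 2.0, "hp_change": -1.0, "gold": 10.0}
reward = combined_reward(step_factors, weights)
```

Changing the entries of `weights` reconfigures the reward without touching the code that extracts the factors.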
¹ https://en.wikipedia.org/wiki/Honor_of_Kings