[Free] deep-go: a Go AI demo program based on deep learning (built on the ConvNetJS library)

15 files in total (js: 6, png: 3, css: 2)

Tags: deep learning, neural networks

Rating: 3 stars · Points required: 0 · 145 views · Uploaded 2017-01-29 20:20:27 · 2.23MB · 7Z


deep-go_20170125.7z (15 files)

deep-go
  paper
    Training Deep Convolutional Neural Networks to Play Go.pdf 304KB
  deep-go
    playgo.css 1KB
    man_vs_machine.js 10KB
    convnet-min.js 33KB
    font-awesome-4.4.0
      css
        font-awesome.min.css 26KB
      fonts
        fontawesome-webfont.woff2 63KB
    medium
      white.png 6KB
      shinkaya.jpg 72KB
      shadow.png 3KB
      board.js 2KB
      black.png 6KB
    Play Go Against a DCNN.html 5KB
    net.js 46.11MB
    jgoboard-3.4.2.js 64KB
    spin.min.js 4KB

Training Deep Convolutional Neural Networks to Play Go

Christopher Clark CHRISC@ALLENAI.ORG

Allen Institute for Artificial Intelligence∗, 2157 N Northlake Way Suite 110, Seattle, WA 98103, USA

Amos Storkey A.STORKEY@ED.AC.UK

School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH9 1DG, United Kingdom

Abstract

Mastering the game of Go has remained a long-standing challenge to the field of AI. Modern computer Go programs rely on processing millions of possible future positions to play well, but intuitively a stronger and more ‘humanlike’ way to play the game would be to rely on pattern recognition rather than brute force computation. Following this sentiment, we train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players. To solve this problem we introduce a number of novel techniques, including a method of tying weights in the network to ‘hard code’ symmetries that are expected to exist in the target function, and demonstrate in an ablation study they considerably improve performance. Our final networks are able to achieve move prediction accuracies of 41.1% and 44.4% on two different Go datasets, surpassing previous state of the art on this task by significant margins. Additionally, while previous move prediction systems have not yielded strong Go playing programs, we show that the networks trained in this work acquired high levels of skill. Our convolutional neural networks can consistently defeat the well known Go program GNU Go and win some games against state of the art Go playing program Fuego while using a fraction of the play time.

1. Introduction

Go is an ancient, deeply strategic board game that is notable for being one of the few board games where human experts are still comfortably ahead of computer programs in terms of skill. Predicting the moves made by expert players is an interesting and challenging machine learning task, and has immediate applications to computer Go. In this section we provide a brief overview of Go, previous work, and the motivation for our deep learning based approach.

∗ Work completed at the University of Edinburgh.

Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).

1.1. The Game of Go

Figure 1. Capturing pieces in Go. Here white’s stones in the upper left are connected to each other through adjacency so they form a single group (left panel). When black places a stone on the indicated grid point (middle panel) that group is surrounded, meaning there are no longer any empty grid points adjacent to it, so the entire group is removed from the board (right panel).

Figure 2. Example of positions from a game of Go after 50 moves have passed (left) and after 200 moves have passed (right). In the right panel it can be seen that white is gaining control of territory in the center and top of the board, while black is gaining influence over the left and right edges.

We give a very brief introduction to the rules of Go and refer the reader to (Bozulich, 1992) or (Müller, 2002) for a more comprehensive account. Go has a number of different rulesets that subtly differ as to when moves are illegal and how the game is scored; here we focus on generalities that are common to all rulesets.

Go is a two-player game that is usually played on a 19x19 grid based board. The board typically starts empty. One player plays as white and one as black. Black starts by placing a black stone on a grid point. White then places a white stone on an unoccupied grid point, and play continues in this manner. Players can opt to pass instead of placing a stone, in which case their turn is skipped and their opponent may make a second move. Stones cannot be moved once they are placed; however, a player can capture a group of their opponent’s stones by surrounding that group with their own stones. In this case the surrounded group is removed from the board as shown in Figure 1. Broadly speaking, the objective of Go is to capture as many of the grid points on the board as possible by either occupying them or surrounding them with stones. The game is played until both players pass their turn, in which case the players come to an agreement about which player has control over which grid points and the game is scored.
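The capture rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the board is a dictionary from (row, column) to "B" or "W", the helper names `neighbors`, `group_and_liberties` and `place_stone` are hypothetical, and legality checks (suicide, ko) are left out.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int]
Board = Dict[Point, str]  # occupied points map to "B" or "W"; empty points are absent


def neighbors(point: Point, size: int) -> List[Point]:
    """Return the orthogonally adjacent grid points that lie on the board."""
    row, col = point
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < size and 0 <= c < size]


def group_and_liberties(board: Board, start: Point, size: int):
    """Flood-fill from `start`; return (stones in the group, adjacent empty points)."""
    color = board[start]
    group, liberties, frontier = {start}, set(), [start]
    while frontier:
        for nb in neighbors(frontier.pop(), size):
            if nb not in board:
                liberties.add(nb)
            elif board[nb] == color and nb not in group:
                group.add(nb)
                frontier.append(nb)
    return group, liberties


def place_stone(board: Board, point: Point, color: str, size: int = 19) -> Set[Point]:
    """Place a stone and remove opposing groups left with no adjacent empty point.

    Returns the captured points. Legality (suicide, ko) is NOT checked here.
    """
    board[point] = color
    captured: Set[Point] = set()
    for nb in neighbors(point, size):
        if nb in board and board[nb] != color and nb not in captured:
            group, liberties = group_and_liberties(board, nb, size)
            if not liberties:
                captured |= group
    for stone in captured:
        del board[stone]
    return captured
```

A group is found by following adjacency, exactly as in the caption of Figure 1, and it is removed once the set of adjacent empty points becomes empty.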

Through the capturing mechanic it is possible to create infinite ‘loops’ in play as players repeatedly capture and replace the same pieces. Go rulesets include rules to prevent this from occurring. The simplest version of this rule is called the simple-ko rule, which states that players cannot play moves that would recreate the position that existed on their previous turn. Most Go rulesets contain stronger versions of this rule called super-ko rules, which prevent players from recreating any previously seen position. Figure 2 shows some example board positions from a game of Go.
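The difference between the two repetition rules can be made concrete with a small sketch. It assumes a position is stored as a hashable snapshot and that `history[-1]` is the current position with the opponent having just moved; the function names are hypothetical, and the super-ko variant shown is the purely positional one.

```python
from typing import FrozenSet, List, Tuple

# A position is a hashable snapshot: a frozenset of (row, col, color) triples.
Position = FrozenSet[Tuple[int, int, str]]


def violates_simple_ko(history: List[Position], candidate: Position) -> bool:
    """True if `candidate` recreates the position left by the mover's previous move.

    Assumption: history[-1] is the current position (the opponent has just
    moved), so history[-2] is the position right after the mover's own
    previous move.
    """
    return len(history) >= 2 and candidate == history[-2]


def violates_super_ko(history: List[Position], candidate: Position) -> bool:
    """True if `candidate` recreates any earlier position (positional super-ko)."""
    return candidate in set(history)
```

Simple-ko only needs the position from one turn back, whereas super-ko needs the whole game history, which is one reason it is more expensive to check.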

State of the art computer Go programs such as Fuego (Enzenberger et al., 2010), Pachi (Baudiš & Gailly, 2012), or CrazyStone, can achieve the skill level of very strong amateur players, but are still behind professional level play. The difficulty computers have in this domain relative to other board games, such as chess, is often attributed to two things. First, in Go there are a very large number of possible moves. Players have 19 × 19 = 361 possible starting moves. As the board fills up the number of possible moves decreases, but can be expected to remain in the hundreds until late in the game. This is in contrast to chess where the number of possible moves might stay around fifty. Second, good heuristics for assessing which player is winning in Go have not been found. Counting the number of stones each player has is a poor indicator of who is winning, and it has proven to be difficult to build effective heuristics for estimating which player has the stronger position.

Current state of the art Go playing programs use Monte Carlo Tree Search (MCTS) algorithms. MCTS algorithms evaluate positions in Go using simulated ‘playouts’ where the game is played to completion from the current position assuming both players move randomly or follow some computationally cheap best move heuristic. Many playouts are carried out and it is then assumed good positions are ones where the program was the winner in the majority of them. See (Browne et al., 2012) for a recent survey of MCTS algorithms and (Rimmel et al., 2010) for a survey of some modern Go playing systems.

CrazyStone: http://remi.coulom.free.fr/CrazyStone/
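The playout idea can be illustrated on a much smaller game. The sketch below scores each legal move in tic-tac-toe by the fraction of uniformly random playouts that the mover goes on to win. It is flat Monte Carlo evaluation rather than a full tree search, tic-tac-toe stands in for Go only to keep the example self-contained, and all names are hypothetical.

```python
import random
from typing import Dict, List, Optional

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(cells: List[str]) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if cells[a] != "." and cells[a] == cells[b] == cells[c]:
            return cells[a]
    return None


def random_playout(cells: List[str], to_move: str, rng: random.Random) -> Optional[str]:
    """Play uniformly random moves to the end of the game; None means a draw."""
    cells = cells[:]  # work on a copy so the caller's position is untouched
    while winner(cells) is None and "." in cells:
        move = rng.choice([i for i, v in enumerate(cells) if v == "."])
        cells[move] = to_move
        to_move = "O" if to_move == "X" else "X"
    return winner(cells)


def evaluate_moves(cells: List[str], player: str,
                   playouts: int = 200, seed: int = 0) -> Dict[int, float]:
    """Estimate each legal move's win rate for `player` from random playouts."""
    rng = random.Random(seed)
    opponent = "O" if player == "X" else "X"
    win_rates = {}
    for move in (i for i, v in enumerate(cells) if v == "."):
        after = cells[:]
        after[move] = player
        wins = sum(random_playout(after, opponent, rng) == player
                   for _ in range(playouts))
        win_rates[move] = wins / playouts
    return win_rates
```

For example, `evaluate_moves(list("XX.OO...."), "X")` gives the immediately winning move at index 2 a win rate of 1.0, because every playout from that position is already decided.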

1.2. Move Prediction in Go

Human Go experts rely heavily on pattern recognition when playing Go. Expert players can gain strong intuitions about what parts of the board will fall under whose control and what are the best moves to consider at a glance, and without needing to mentally simulate possible future positions. This is in contrast to typical computer Go algorithms, which simulate thousands of possible future positions and make minimal use of pattern recognition. This gives us reason to think that developing pattern recognition algorithms for Go might be the missing element needed to close the performance gap between computers and humans. In particular for Go, pattern recognition systems could provide ways to combat the high branching factor by making it possible to prune out many of the possible moves in the current position.

Outside of playing Go, move prediction is an interesting machine learning task in its own right. We expect the target function to be highly complex, since it is fair to assume human experts think in complex, non-linear ways when they choose moves. We also expect the target function to be non-smooth because a minor change to a position in Go (such as adding or removing a single stone) could be expected to dramatically alter which moves are likely to be played next. These properties make this learning task very challenging; however, it has been argued that acquiring an ability to learn complex, non-smooth functions is of particular importance when it comes to solving AI (Bengio, 2009). These properties have also motivated us to attempt a deep learning approach, as it has been argued that deep learning is well suited to learning complex, non-smooth functions (Bengio, 2009; Bengio & LeCun, 2007). Move prediction for Go also provides an opportunity to test the abilities of deep learning on a domain that has a close connection to AI.

1.3. Previous Work

Previous work in move prediction for Go typically made use of feature construction or shallow neural networks. The former approach involves characterizing each legal move by a large number of features. These features include many ‘shape’ features, which take on a value of 1 if the stones around the move in question exactly match a predefined configuration of stones and 0 otherwise. Stone configurations can be as small as the nearest two or three squares and as large as the entire board. Very large numbers of stone configurations can be harvested by finding commonly occurring stone configurations in the training data. Shape features can be augmented with hand crafted features, such as distance of the move in question to the edge of the board, whether making the move will capture or lose stones, its distance to previous moves, etc. Finally a model is trained to rank the legal moves based on their features. Work following this approach includes (Stern et al., 2006; Araki et al., 2007; Wistuba et al., 2012; Wistuba & Schmidt-Thieme, 2013; Coulom, 2007). Depending on the complexity of the model used, researchers have seen accuracies of 30-41% on move prediction for high-ranked amateur Go players.
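A binary shape feature and one hand crafted feature of the kind described above might look as follows. This is a sketch under assumptions: the template format (offsets from the candidate move mapped to "B", "W", "." for empty, or "#" for off the board) and the function names are hypothetical and are not taken from any of the cited systems.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]


def shape_feature(board: Dict[Point, str], move: Point,
                  template: Dict[Point, str], size: int = 19) -> int:
    """Return 1 if the stones around `move` exactly match `template`, else 0.

    `template` maps (row offset, col offset) from the move to "B", "W",
    "." (empty) or "#" (off the board). The format is an assumption.
    """
    for (d_row, d_col), expected in template.items():
        row, col = move[0] + d_row, move[1] + d_col
        if 0 <= row < size and 0 <= col < size:
            actual = board.get((row, col), ".")
        else:
            actual = "#"
        if actual != expected:
            return 0
    return 1


def edge_distance(move: Point, size: int = 19) -> int:
    """Hand crafted feature: number of lines between the move and the nearest edge."""
    row, col = move
    return min(row, col, size - 1 - row, size - 1 - col)
```

In a feature-based system, many such 0/1 indicators and scalar features would be concatenated into one vector per legal move and passed to the ranking model.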

Several researchers have made use of neural networks for move prediction. Van der Werf et al. trained a neural network to predict expert moves using hand constructed features, preprocessing techniques to reduce the dimensionality of the data, and a two layer neural network (Van Der Werf et al., 2003). Our work builds upon work done by Sutskever and Nair, where two layer convolutional networks were trained for move prediction (Sutskever & Nair, 2008). They achieved an accuracy of 34% when predicting the moves made by professional Go players using a network that took both the current board position and the previous moves as input. An ensemble of networks reached 37% accuracy.

Our work will differ in several important ways. We use much deeper networks and propose several novel position encoding schemes and network designs that improve performance. We found that the most important one is a strategy of tying weights within the network to ‘hard code’ particular symmetries that are expected to exist in good move prediction functions. We also do not use the previously made moves as input. There are two motivations for choosing not to do so. First, classifiers trained using previous moves as input might come to rely on heuristics like ‘move near the area where previous moves were made’ instead of learning to evaluate positions based on the current stone configuration. While this might improve accuracy, our fundamental motivation is to see whether the classifier can capture Go knowledge, and the ability to borrow knowledge from experts by looking at their past moves cheapens this objective. Second, it is likely to reduce performance when it comes to playing as a stand-alone Go player. During play both an opponent and the network are liable to make much worse moves than would be made by Go experts, therefore coming to rely on the assumption that the previous moves were made by experts can be expected to yield poor results. This potential problem was also noted by (Araki et al., 2007). Our work is also the first to perform an evaluation across two datasets, providing an opportunity to compare how classifiers trained on these datasets differ in terms of Go playing abilities and move prediction accuracy.
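The text above does not say how the weights are tied, so the following is only one possible illustration, not the authors' method. It assumes a square convolution kernel and forces it to be unchanged by the eight rotations and reflections of the board by averaging the kernel over those transforms, so that symmetric positions in the kernel end up sharing a single value. All function names are hypothetical.

```python
from typing import List

Kernel = List[List[float]]


def rotate90(kernel: Kernel) -> Kernel:
    """Rotate a square kernel 90 degrees clockwise."""
    return [list(row) for row in zip(*kernel[::-1])]


def mirror(kernel: Kernel) -> Kernel:
    """Reflect a kernel left to right."""
    return [row[::-1] for row in kernel]


def dihedral_transforms(kernel: Kernel) -> List[Kernel]:
    """Return the 8 rotations and reflections of a square kernel."""
    transforms, current = [], kernel
    for _ in range(4):
        transforms.extend([current, mirror(current)])
        current = rotate90(current)
    return transforms


def tie_weights(kernel: Kernel) -> Kernel:
    """Average a kernel over all 8 symmetries so symmetric entries share one value."""
    variants = dihedral_transforms(kernel)
    n = len(kernel)
    return [[sum(v[r][c] for v in variants) / 8 for c in range(n)]
            for r in range(n)]
```

After tying, a 3x3 kernel has only three free values (corner, edge, center) instead of nine, which is the sense in which a symmetry is 'hard coded' rather than learned from data.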

Several of the works mentioned above analyze or comment on the strength of their move prediction program as a stand-alone Go player. In (Van Der Werf et al., 2003) researchers found that their neural network was consistently defeated by GNU Go and conclude their ‘...belief was confirmed that the local move predictor in itself is not sufficient to play a strong game.’ Work by (Araki et al., 2007) also reports that their move predictor was beaten by GNU Go. Stern et al. report that other Go players estimated their move predictor as having a ranking of 10-15 kyu, but do not report its win rates against another computer Go opponent (Stern et al., 2006). Neither (Coulom, 2007) nor (Wistuba & Schmidt-Thieme, 2013) gives formal results, but both suggest that their systems did not make strong stand-alone Go playing programs. In general past approaches to move prediction have not resulted in Go programs with much skill.

Subsequent to the public release of this work, Maddison et al. also released independent work exploring DCNNs for move prediction (Maddison et al., 2014). In that work they explore larger networks and different board representations. However, they also use previous moves as input to their classifier, which we avoid. They show initial explorations of using convolutional networks as part of Monte-Carlo Tree Search.

The work presented here is based on the work done in (Clark, 2014).

2. Approach

2.1. Data Representation

As done by (Sutskever & Nair, 2008), the networks trained here take as input a representation of the current position and output a probability distribution over all grid points of the Go board, which is interpreted as a probability distribution over the possible places an expert player could place a stone. During testing, probabilities assigned to grid points that would be illegal moves, either because they are occupied by a stone or because of the simple-ko rule, are set to zero and the remaining outputs are renormalized. We follow (Sutskever & Nair, 2008) by encoding the current position in two 19x19 binary matrices. The first matrix has ones indicating the location of the stones of the player who is about to play; the second 19x19 matrix has ones marking where the opponent’s stones are. We depart from (Sutskever & Nair, 2008) by additionally encoding the presence of a ‘simple-ko constraint’, if one is present, in a third 19x19 matrix. Here simple-ko constraints refer to grid points that the current player is not allowed to place a stone on due to the simple-ko rule. In our dataset of professional games only 2.4% of moves were made with a simple-ko constraint present. However, simple-ko constraints are often featured in Go tactics, so we hypothesize they are still important to include as input. We elect not to encode move constraints beyond the ones created by the simple-ko rule, meaning constraints stemming from super-ko rules, because they are rare, harder to detect, ruleset-dependent, and less prominent in Go tactics. Thus the input has three channels and a height and width of 19. Again following (Sutskever & Nair, 2008), as well as other work that has found this to be a useful feature such as (Wistuba & Schmidt-Thieme, 2013), we tried encoding the board into 7 channels where
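The three-channel encoding and the test-time masking step described in this section can be sketched in plain Python. The board format (a dictionary from (row, column) to "B" or "W"), the `ko_point` argument and both function names are assumptions made for illustration; only the three planes and the zero-then-renormalize step come from the text. The 7-channel variant is not shown.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[int, int]
SIZE = 19


def encode_position(board: Dict[Point, str], to_move: str,
                    ko_point: Optional[Point] = None) -> List[List[List[int]]]:
    """Encode a position as three SIZE x SIZE binary planes.

    Plane 0: stones of the player about to move; plane 1: opponent stones;
    plane 2: the point forbidden by the simple-ko rule, if any.
    """
    planes = [[[0] * SIZE for _ in range(SIZE)] for _ in range(3)]
    for (row, col), color in board.items():
        planes[0 if color == to_move else 1][row][col] = 1
    if ko_point is not None:
        planes[2][ko_point[0]][ko_point[1]] = 1
    return planes


def mask_illegal(probs: List[List[float]],
                 planes: List[List[List[int]]]) -> List[List[float]]:
    """Zero the probability of occupied or ko-forbidden points, then renormalize."""
    masked = [[0.0 if (planes[0][r][c] or planes[1][r][c] or planes[2][r][c])
               else probs[r][c]
               for c in range(SIZE)] for r in range(SIZE)]
    total = sum(map(sum, masked))
    if total == 0:
        raise ValueError("no legal move has non-zero probability")
    return [[p / total for p in row] for row in masked]
```

Because the planes are defined relative to the player about to move, the same network can be used for either color without a separate input saying whose turn it is.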

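Comments

qq_38067408 (2018-03-18): It's okay and it works, but it's tiring to read.

lichunmao689148 (2017-03-29): The code has been compiled, and it's incomplete.

benchild126 (2017-09-18): Nice resource; I'll use it as a reference for study.

a2791897673 (2017-10-20): Not bad, except that the browser freezes for a long time while loading the neural network data.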
