A program built with Python + Pygame + PaddlePaddle that lets you visually train a Go AI at the click of a button (program, code, and documentation)

94 files in total

png: 38

py: 32

md: 13


paddlepaddle

python

pygame

artificial intelligence

68 views · uploaded 2023-12-01 22:12:32 · 36.25MB ZIP


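机巧围棋(CleverGo)基于Python+Pygame+PaddlePaddle打造一款点击按钮就能可视化地训练围棋人工智能的程序。.zip (94 files; CleverGo, a program built with Python + Pygame + PaddlePaddle for visually training a Go AI at the click of a button)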

code

pictures

7_4.png 34KB

训练初始界面.png 59KB

7_3.png 49KB

2_5.png 388KB

2_1.png 318KB

9_1.png 30KB

6_2.png 26KB

1_1.png 63KB

2_4.png 375KB

8_3.png 239KB

6_1.png 29KB

4_1.png 21KB

2_7.png 472KB

4_2.png 7KB

2_8.gif 462KB

7_1.png 63KB

7_2.png 96KB

6_3.png 30KB

2_6.png 455KB

对弈.png 98KB

8_2.png 92KB

启动界面.png 58KB

3_1.png 188KB

2_2.png 355KB

4_3.png 150KB

2_3.png 379KB

训练过程.png 149KB

2_9.png 372KB

2_10.png 320KB

8_1.png 84KB

assets

pictures

B-19.png 17KB

W-9.png 22KB

B.png 25KB

W-19.png 17KB

W-13.png 19KB

B-9.png 23KB

B-13.png 19KB

W.png 25KB

audios

Button.wav 10KB

Stone.wav 86KB

fonts

msyh.ttc 18.74MB

msyhl.ttc 11.58MB

msyhbd.ttc 16.05MB

player.py 7KB

mcts.py 9KB

LICENSE 1KB

game_engine.py 36KB

trainer.py 7KB

play_game.py 590B

pgutils

pgcontrols

__init__.py 190B

button.py 6KB

ctbase.py 671B

text.py 2KB

position.py 492B

pgtools

__init__.py 172B

toolbase.py 493B

information_display.py 4KB

manager.py 2KB

policy_value_net.py 3KB

docs

阿尔法狗与机巧围棋的网络结构.md 8KB

深度强化学习基础.md 20KB

深度学习框架(PaddlePaddle)使用教程.md 6KB

机巧围棋(CleverGo)技术原理文档.md 3KB

游戏开发引擎(Pygame)核心方法.md 26KB

机巧围棋(CleverGo)开发计划文档.md 4KB

训练策略网络和价值网络.md 17KB

围棋基本知识.md 14KB

机巧围棋(CleverGo)项目总览及介绍.md 5KB

围棋程序逻辑.md 26KB

蒙特卡洛树搜索(MCTS).md 27KB

go_engine.py 10KB

GymGo

gym_go

__init__.py 193B

gogame.py 15KB

tests

test_invalid_moves.py 7KB

test_batch_fns.py 1022B

efficiency.py 3KB

test_basics.py 10KB

test_valid_moves.py 8KB

state_utils.py 11KB

envs

__init__.py 93B

go_extrahard_env.py 99B

go_env.py 8KB

govars.py 117B

rendering.py 4KB

screenshots

human_ui.png 329KB

setup.py 134B

.gitignore 61B

demo.py 631B

README.md 4KB

requirements.txt 121B

models

alpha_go.pdparams 1.28MB

.gitignore 50B

test.py 636B

README.md 3KB

# About

An environment for the board game Go. It is implemented using OpenAI's Gym API and is optimized for speed so that ML models can be trained efficiently.

# Installation

```bash
# In the root directory
pip install -e .
```

# API

### Basic example

```bash
# In the root directory
python3 demo.py
```

![alt text](screenshots/human_ui.png)

### Coding example

```python
import gym

# Create a 7x7 board with no komi and win/loss/tie rewards
go_env = gym.make('gym_go:go-v0', size=7, komi=0, reward_method='real')
go_env.reset()  # start from an empty board before stepping

first_action = (2, 5)   # (row, column): black moves first
second_action = (5, 2)  # (row, column): white's reply
state, reward, done, info = go_env.step(first_action)
go_env.render('terminal')
```

```
    0   1   2   3   4   5   6
  -----------------------------
0 |   |   |   |   |   |   |   |
  -----------------------------
1 |   |   |   |   |   |   |   |
  -----------------------------
2 |   |   |   |   |   | B |   |
  -----------------------------
3 |   |   |   |   |   |   |   |
  -----------------------------
4 |   |   |   |   |   |   |   |
  -----------------------------
5 |   |   |   |   |   |   |   |
  -----------------------------
6 |   |   |   |   |   |   |   |
  -----------------------------
Turn: WHITE, Last Turn Passed: False, Game Over: False
Black Area: 49, White Area: 0, Reward: 0
```

```python
state, reward, done, info = go_env.step(second_action)
go_env.render('terminal')
```

```
    0   1   2   3   4   5   6
  -----------------------------
0 |   |   |   |   |   |   |   |
  -----------------------------
1 |   |   |   |   |   |   |   |
  -----------------------------
2 |   |   |   |   |   | B |   |
  -----------------------------
3 |   |   |   |   |   |   |   |
  -----------------------------
4 |   |   |   |   |   |   |   |
  -----------------------------
5 |   |   | W |   |   |   |   |
  -----------------------------
6 |   |   |   |   |   |   |   |
  -----------------------------
Turn: BLACK, Last Turn Passed: False, Game Over: False
Black Area: 1, White Area: 1, Reward: 0
```

### High level API

[GoEnv](gym_go/envs/go_env.py) defines the Gym environment for Go. It contains the highest level API for basic Go usage.

### Low level API

[GoGame](gym_go/gogame.py) is the set of low-level functions that defines all the game logic of Go. `GoEnv`'s high level API is built on `GoGame`. These functions are intended for more detailed, fine-grained control of the game.

# Scoring

We use Tromp-Taylor scoring, a simple form of area scoring, to determine the winner. A player's _area_ is the number of empty points that the player's pieces surround plus the number of the player's pieces on the board. The _winner_ is the player with the larger area (the game is tied if both players have equal area). There is also support for `komi`, a constant score offset that balances the advantage black gains by moving first. By default `komi` is set to 0.

# Game ending

A game ends when both players pass consecutively.

# Reward methods

Rewards are given from _black_'s perspective.

* **Real**:
  * If the game has ended:
    * `-1` - White won
    * `0` - Game is tied
    * `1` - Black won
  * `0` - Otherwise
* **Heuristic**: If the game is ongoing, the reward is `black area - white area`. If black won, the reward is `BOARD_SIZE**2`. If white won, the reward is `-BOARD_SIZE**2`. If tied, the reward is `0`.

# State

The `state` object returned by the environment's `reset` and `step` functions is a `6 x BOARD_SIZE x BOARD_SIZE` numpy array. All values in the array are either `0` or `1`.

* **First and second channel:** represent the black and white pieces respectively.
* **Third channel:** Indicator layer for whose turn it is
* **Fourth channel:** Invalid moves (including ko-protection) for the next action
* **Fifth channel:** Indicator layer for whether the previous move was a pass
* **Sixth channel:** Indicator layer for whether the game is over

# Action

The `step` function takes the action to execute, which can be in either of the following forms:

* a tuple/list of 2 integers representing the row and column, or `None` for passing
* a single integer representing the action in 1d space (e.g. on a 7x7 board, 9 would be (1,2) and 49 would be a pass)
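
The relationship between the two action forms can be sketched in plain Python. This is a minimal illustration that assumes row-major indexing (which matches the 7x7 example above, where 9 maps to (1,2) and 49 is a pass). The helper names `action_1d_to_2d` and `action_2d_to_1d` are hypothetical and are not part of `gym_go`.

```python
from typing import Optional, Tuple


def action_1d_to_2d(action: int, board_size: int) -> Optional[Tuple[int, int]]:
    """Hypothetical helper: flat index -> (row, col), or None for a pass.

    Assumes row-major order, with index board_size**2 reserved for passing.
    """
    pass_index = board_size ** 2
    if not 0 <= action <= pass_index:
        raise ValueError(f"action {action} is out of range for size {board_size}")
    if action == pass_index:
        return None  # the last index means "pass"
    return divmod(action, board_size)  # (row, col)


def action_2d_to_1d(action: Optional[Tuple[int, int]], board_size: int) -> int:
    """Hypothetical helper: (row, col) or None -> flat index."""
    if action is None:
        return board_size ** 2  # pass
    row, col = action
    if not (0 <= row < board_size and 0 <= col < board_size):
        raise ValueError(f"{action} is off the {board_size}x{board_size} board")
    return row * board_size + col


print(action_1d_to_2d(9, 7))       # (1, 2)
print(action_1d_to_2d(49, 7))      # None
print(action_2d_to_1d((2, 5), 7))  # 19
```

A flat index is convenient when a policy network outputs one probability per action (`BOARD_SIZE**2 + 1` entries including the pass), while the tuple form is easier to read when scripting moves by hand.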
