Python source code for solving a maze problem with Q-Learning (with project report and demo video).zip - CSDN Library

27 files in total: 8 pyc, 8 py, 5 xml


Q-Learning

Reinforcement learning

Uploaded 2023-12-28 14:28:46, 2.52MB ZIP, 128 views


Contents of 基于Q-Learning解决迷宫问题python源码(含项目报告及演示视频).zip (27 files):

项目说明.md 899B

QLearning

Saved_QTable

maze11_1.npy 17KB

maze10_3.npy 16KB

.idea

workspace.xml 12KB

misc.xml 185B

inspectionProfiles

profiles_settings.xml 174B

QLearning.iml 317B

modules.xml 270B

encodings.xml 277B

train_qtable.py 6KB

MAP

maze.py 7KB

maze_map.py 3KB

__pycache__

maze.cpython-37.pyc 5KB

maze_map.cpython-37.pyc 2KB

GUI

gui.py 1KB

gui_basic.py 7KB

draw.py 5KB

gui_userSelfDefine.py 10KB

__pycache__

gui_userSelfDefine.cpython-37.pyc 6KB

draw_ui.cpython-37.pyc 4KB

gui.cpython-37.pyc 1KB

gui_basic.cpython-37.pyc 5KB

draw.cpython-37.pyc 4KB

draw_ui.py 6KB

__pycache__

train_qtable.cpython-37.pyc 4KB

项目报告.pdf 593KB

项目展示视频.mp4 2.06MB

《智能驾驶技术》软英 1702 20175188 马洪升

Reinforcement Learning: Q-Learning

1 Project Introduction

1.1 Background

Reinforcement learning is a machine learning method in which an agent interacts with an environment and receives feedback in the form of rewards and punishments. The agent must make a series of decisions, moving through a sequence of states, to reach a preset goal. A classic example is training a rat (the agent) to find the shortest path to a cake in a maze (the environment). The agent balances exploration of new actions against exploitation of past experience. It may fail over and over again, but after a long period of trial and error it can finally find a solution to the problem.

The accumulated reward is maximized when the agent consistently chooses the best available action in each state over the long run. In short, an algorithm with feedback rewards can be used to induce an intelligent agent to achieve a goal by continuously acquiring rewards.
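The excerpt does not show the update rule the project uses, so the following is only a minimal sketch of standard tabular Q-learning with epsilon-greedy action selection. The function names `choose_action` and `q_update`, and the default values of `alpha` and `gamma`, are assumptions made for illustration and are not taken from `train_qtable.py`.

```python
import random

def choose_action(q_row, epsilon, rng=random):
    """Epsilon-greedy selection over one row of the Q-table.

    With probability epsilon pick a random action (exploration),
    otherwise pick the first action with the highest Q-value (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))

def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, done=False):
    """Standard tabular Q-learning update, applied in place.

    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    # Terminal transitions have no future value to bootstrap from.
    target = reward if done else reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])

# Usage: two states, four actions each, all Q-values start at zero.
q_table = [[0.0] * 4 for _ in range(2)]
action = choose_action(q_table[0], epsilon=0.1)
q_update(q_table, state=0, action=action, reward=-0.04, next_state=1)
```

Setting `epsilon` to 0 makes the agent purely greedy, while a value of 1 makes it explore at random; training schedules commonly start high and decay it.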

In addition, the agent may have to endure many penalties (negative rewards) on the way to the goal. For example, the mouse in the maze receives a small penalty for every legal move, because we want it to take the shortest possible route to the target cell; without that penalty, wandering around the maze at will would cost it nothing. The shortest path to the cake can still be long and convoluted, so the agent (the mouse) may have to endure many penalties before it finally reaches the goal and collects the delayed reward.
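The effect of the per-step penalty can be sketched as follows. The report excerpt gives no numeric reward values, so `STEP_PENALTY` and `GOAL_REWARD` below are assumed numbers chosen only to illustrate why a shorter path earns a higher total return.

```python
STEP_PENALTY = -0.04   # assumed: small cost charged for every legal move
GOAL_REWARD = 1.0      # assumed: delayed reward paid on reaching the target

def reward(reached_goal):
    """Reward for one legal move."""
    return GOAL_REWARD if reached_goal else STEP_PENALTY

def episode_return(path_length):
    """Undiscounted return of a path whose last move reaches the goal."""
    return sum(reward(i == path_length - 1) for i in range(path_length))

# A 5-move path beats a 9-move path because it pays fewer step penalties.
short_path, long_path = episode_return(5), episode_return(9)
print(short_path > long_path)
```

With a step penalty of zero both paths would return the same value, and the agent would have no incentive to prefer the shorter one.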

1.2 Maze Problem

The maze problem is a long-standing subject of data structure and algorithm research. The well-known Dijkstra shortest-path algorithm is still one of the most practical methods for solving it. Because the maze problem is so intuitive, however, it is also well suited to demonstrating and testing reinforcement learning techniques.
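For comparison with the learning-based approach, here is a small sketch of Dijkstra's algorithm on a grid maze where every move costs 1. The string-based grid encoding (`#` for a wall) and the function name `dijkstra_grid` are assumptions for illustration; the project's own map format in `maze_map.py` may differ.

```python
import heapq

def dijkstra_grid(grid, start, goal):
    """Return the number of moves on a shortest path, or None if unreachable.

    grid is a list of equal-length strings; '#' is a wall, anything else is free.
    start and goal are (row, col) tuples.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > dist[(r, c)]:
            continue  # stale heap entry, a shorter route was already found
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                nd = d + 1
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return None

grid = ["..#",
        ".#.",
        "..."]
print(dijkstra_grid(grid, (0, 0), (2, 2)))
```

Unlike Q-learning, this requires the full map in advance, which is why it serves as a baseline rather than a replacement for the trial-and-error setting described here.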


An algorithm with Feedback Rewards can be used to induce an intelligent agent

to achieve a goal by continuously acquiring rewards.

1.3 My Project

The maze used in this project is shown in the figure: the stars are cars, the green squares are grass, and the red circles are signal lights. Cars are not allowed to move onto grass or into walls, and they are not allowed to move while a light is red. The period with which the red lights flash can be defined by the user.
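One way a user-defined light period could be modelled is sketched below. The excerpt does not specify the schedule, so this assumes the light is red for the first `red_steps` time steps of every `period` steps and that only a car standing on a light square has to wait; both the rule and the names `light_is_red` and `can_move` are hypothetical.

```python
def light_is_red(step, period, red_steps):
    """Assumed schedule: red for the first red_steps of every period steps."""
    return step % period < red_steps

def can_move(step, on_light_square, period=4, red_steps=2):
    """A car on a light square must wait while the light is red (assumption)."""
    return not (on_light_square and light_is_red(step, period, red_steps))

# With period=4 and red_steps=2 the light is red at steps 0, 1, 4, 5, ...
print([light_is_red(t, 4, 2) for t in range(6)])
```

Because the light depends on the time step, a Q-table keyed only by position cannot distinguish red from green; including `step % period` in the state is one possible way to keep the problem Markov.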

Users can also define their own maze, the number of training runs, and other settings.


2 Project Modelling

2.1 MDP

The Markov Decision Process (MDP) framework consists of an environment and an intelligent agent acting in that environment.

In our example, the environment is a classic square maze with five types of squares:

➢ Wall

➢ Blank space

➢ Target square (where the Exit is)

➢ Light

➢ Grass

Our agent is a car that may only move onto blank squares, with the sole purpose of finding the exit. It has four actions: up, down, left, and right.
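The environment described above can be sketched as a small state-transition function. The character codes for the five square types, the `Cell` enum, and the `step` function are assumptions for illustration and are not taken from `maze.py`. The sketch also assumes that target and light squares can be entered, while walls and grass block the move and leave the car where it is.

```python
from enum import Enum

class Cell(Enum):
    """The five square types; the character codes are assumed."""
    WALL = "#"
    BLANK = "."
    TARGET = "E"
    LIGHT = "L"
    GRASS = "G"

# The four actions as (row, col) offsets.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(maze, state, action):
    """Return the next (row, col); stay in place if the move is not allowed."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    inside = 0 <= r < len(maze) and 0 <= c < len(maze[0])
    if not inside or Cell(maze[r][c]) in (Cell.WALL, Cell.GRASS):
        return state
    return (r, c)

maze = ["#####",
        "#.GE#",
        "#.L.#",
        "#####"]
print(step(maze, (1, 1), "right"))  # blocked by grass, the car stays put
print(step(maze, (1, 1), "down"))   # moves onto a blank square
```

The state here is just the car's position; a fuller model would also carry whatever else affects the outcome of a move, such as the current phase of the signal lights.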

