# Temporal-Difference Learning Demos in MATLAB
This package contains MATLAB code that demonstrates selected examples of *temporal-difference learning* methods in *prediction problems* and in *reinforcement learning*.
To begin:
* Run `DemoGUI.m`
* Start with the set of predefined demos: select one and press *Go*
* Modify demos: select one of the predefined demos, and modify the options
Feel free to distribute or use the package, especially for educational purposes. Personally, I learned a great deal from the cliff-walking example.
The repository for the package is hosted on [GitHub](https://github.com/sinairv/Temporal-Difference-Learning).
## Why temporal difference learning is important
A quotation from *R. S. Sutton* and *A. G. Barto*, from their book *Reinforcement Learning: An Introduction* ([here](http://www.cs.ualberta.ca/~sutton/book/ebook/node60.html)):
> If one had to identify one idea as central and novel to reinforcement learning, it would undoubtedly be temporal-difference (TD) learning.
Many basic reinforcement learning algorithms, such as *Q-Learning* and *SARSA*, are in essence *temporal-difference learning methods*.
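To make the link between these two algorithms and temporal-difference learning concrete, here is a minimal sketch of a single update step for each. It is not taken from the package's source files; the table `Q`, the transition `(s, a, r, sNext, aNext)`, and the values of `alpha` and `gamma` are all made up for illustration. Both rules move `Q(s, a)` toward a TD target; they differ only in how the next state's value is estimated.

```matlab
% One-step SARSA and Q-learning updates on a toy Q-table.
% All numbers below are illustrative assumptions, not package defaults.
alpha = 0.1;                 % step size (assumed)
gamma = 0.9;                 % discount factor (assumed)

Q = zeros(2, 2);             % rows: states, columns: actions
Q(2, :) = [1.0, 5.0];        % current estimates in the next state

s = 1; a = 1;                % state and action just taken
r = -1;                      % reward received
sNext = 2;                   % state reached
aNext = 1;                   % action the agent actually picks next
                             % (e.g. an exploratory, non-greedy choice)

% SARSA (on-policy): bootstrap from the action actually taken next.
targetSarsa = r + gamma * Q(sNext, aNext);
qSarsa = Q(s, a) + alpha * (targetSarsa - Q(s, a));

% Q-learning (off-policy): bootstrap from the greedy action in sNext.
targetQ = r + gamma * max(Q(sNext, :));
qQLearning = Q(s, a) + alpha * (targetQ - Q(s, a));

fprintf('SARSA update:      %.4f\n', qSarsa);       % -0.0100
fprintf('Q-learning update: %.4f\n', qQLearning);   %  0.3500
```

With the same transition, the two rules produce different estimates as soon as the next action is not the greedy one, which is the on-policy versus off-policy distinction.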
## Demos
* *Prediction random walk*: see how precisely we can predict the probability of visiting nodes.
* *RL random walk*: see how the RL-generated random walk policy converges to the computed probabilities.
* *Simple grid world (with and without king moves)*: see how the RL-generated policy helps the agent find the goal over time (*king moves* means moving along the four main directions and the diagonals, i.e., the way the king moves in chess).
* *Windy grid world*: the wind pushes the agent off the course its actions aim for. See how RL solves this problem.
* *Cliff walking*: the agent should reach its destination while avoiding the cliffs. A truly instructive example, which shows the differences between *on-policy* and *off-policy* learning algorithms.
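As a rough illustration of the prediction problem behind the first demo in the list above, the sketch below runs tabular TD(0) on a small random walk. It is not the package's `PredictionRandomWalk.m`. The setup is an assumption modeled on the random walk in references [1] and [2]: five non-terminal states, a start in the middle, equal chance of stepping left or right, and a reward of 1 only when the walk exits on the right. The step size and episode count are likewise assumed.

```matlab
% TD(0) prediction on a 5-state random walk (illustrative sketch).
nStates   = 5;       % non-terminal states 1..5; terminals are 0 and 6
alpha     = 0.02;    % step size (assumed)
gamma     = 1;       % undiscounted episodic task
nEpisodes = 5000;    % number of training episodes (assumed)

V = 0.5 * ones(1, nStates);          % initial value estimates

for ep = 1:nEpisodes
    s = 3;                           % each episode starts in the middle
    while s >= 1 && s <= nStates
        % Step left or right with equal probability.
        if rand < 0.5
            sNext = s - 1;
        else
            sNext = s + 1;
        end
        r = double(sNext == nStates + 1);   % reward 1 only at the right exit
        if sNext >= 1 && sNext <= nStates
            vNext = V(sNext);
        else
            vNext = 0;                      % terminal states have value 0
        end
        % TD(0) update: move V(s) toward the one-step bootstrapped target.
        V(s) = V(s) + alpha * (r + gamma * vNext - V(s));
        s = sNext;
    end
end

% Under these assumptions the exact values are 1/6, 2/6, ..., 5/6:
% the probability of exiting on the right from each state.
trueV = (1:nStates) / (nStates + 1);
disp([V; trueV]);    % first row: TD estimates, second row: exact values
```

Because the step size is constant, the estimates keep fluctuating around the exact values instead of settling on them; a smaller `alpha` reduces the fluctuation at the cost of slower learning.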
## References
[1] Sutton, R. S., "Learning to predict by the methods of temporal differences," *Machine Learning*, Vol. 3, pp. 9-44, 1988 (available [online](http://webdocs.cs.ualberta.ca/~sutton/papers/sutton-88.pdf))
[2] Sutton, R. S. and Barto, A. G., "Reinforcement learning: An introduction," 1998 (available [online](http://webdocs.cs.ualberta.ca/~sutton/book/ebook/the-book.html))
[3] Kaelbling, L. P., Littman, M. L., and Moore, A. W., "Reinforcement learning: A survey," *Journal of Artificial Intelligence Research*, Vol. 4, pp. 237-285, 1996 (available [online](http://www.jair.org/media/301/live-301-1562-jair.pdf))
## Contact
Copyright (c) 2011 Sina Iravanian - licensed under MIT.
Homepage: [sinairv.github.io](https://sinairv.github.io)
GitHub: [github.com/sinairv](https://github.com/sinairv)
Twitter: [@sinairv](http://www.twitter.com/sinairv)
## Screenshots
Prediction random walk demo:
![Prediction random walk demo](http://sinairv.github.io/temporal-difference-learning/images/PrdRandomWalk.png)
RL random walk demo:
![RL random walk demo](http://sinairv.github.io/temporal-difference-learning/images/RLRandomWalk.png)
Simple grid-world demo:
![Simple grid-world demo](http://sinairv.github.io/temporal-difference-learning/images/GridWorlds.png)
## Package contents
`Temporal-Difference-Learning-master.zip` (21 files):
* `ReadMe.md` (3 KB)
* `src/`
    * `DemoGUI.m` (40 KB)
    * `GridWorldSARSA.m` (5 KB)
    * `DemoGUI.fig` (9 KB)
    * `WindyGridWorldQLearning.m` (5 KB)
    * `FindColBaseCenter.m` (535 B)
    * `DrawActionOnCell.m` (1 KB)
    * `CliffWalkingQLearning.m` (5 KB)
    * `DrawTextOnCell.m` (493 B)
    * `WindyGridWorldSARSA.m` (5 KB)
    * `GenerateRandomWalkSequence.m` (888 B)
    * `PredictionRandomWalk.m` (3 KB)
    * `DrawWindyEpisodeState.m` (798 B)
    * `DrawEpisodeState.m` (614 B)
    * `DrawCliffEpisodeState.m` (721 B)
    * `PredictionRandomWalkAlphaEffect.m` (2 KB)
    * `RLRandomWalk.m` (2 KB)
    * `FindCellCenter.m` (445 B)
    * `GridWorldQLearning.m` (5 KB)
    * `CliffWalkingSARSA.m` (5 KB)
    * `DrawGrid.m` (937 B)