<!-- udacimak v1.2.1 -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Backpropagation</title>
<link rel="stylesheet" href="../assets/css/bootstrap.min.css">
<link rel="stylesheet" href="../assets/css/plyr.css">
<link rel="stylesheet" href="../assets/css/katex.min.css">
<link rel="stylesheet" href="../assets/css/jquery.mCustomScrollbar.min.css">
<link rel="stylesheet" href="../assets/css/styles.css">
<link rel="shortcut icon" type="image/png" href="../assets/img/udacimak.png" />
</head>
<body>
<div class="wrapper">
<nav id="sidebar">
<div class="sidebar-header">
<h3>MiniFlow</h3>
</div>
<ul class="sidebar-list list-unstyled CTAs">
<li>
<a href="../index.html" class="article">Back to Home</a>
</li>
</ul>
<ul class="sidebar-list list-unstyled components">
<li class="">
<a href="01. Introduction to MiniFlow.html">01. Introduction to MiniFlow</a>
</li>
<li class="">
<a href="02. Introduction.html">02. Introduction</a>
</li>
<li class="">
<a href="03. Graphs.html">03. Graphs</a>
</li>
<li class="">
<a href="04. MiniFlow Architecture.html">04. MiniFlow Architecture</a>
</li>
<li class="">
<a href="05. Forward Propagation.html">05. Forward Propagation</a>
</li>
<li class="">
<a href="06. Forward Propagation Solution.html">06. Forward Propagation Solution</a>
</li>
<li class="">
<a href="07. Learning and Loss.html">07. Learning and Loss</a>
</li>
<li class="">
<a href="08. Linear Transform.html">08. Linear Transform</a>
</li>
<li class="">
<a href="09. Sigmoid Function.html">09. Sigmoid Function</a>
</li>
<li class="">
<a href="10. Cost.html">10. Cost</a>
</li>
<li class="">
<a href="11. Gradient Descent Part 1.html">11. Gradient Descent Part 1</a>
</li>
<li class="">
<a href="12. Gradient Descent Part 2.html">12. Gradient Descent Part 2</a>
</li>
<li class="">
<a href="13. Backpropagation.html">13. Backpropagation</a>
</li>
<li class="">
<a href="14. Stochastic Gradient Descent.html">14. Stochastic Gradient Descent</a>
</li>
<li class="">
<a href="15. SGD Solution.html">15. SGD Solution</a>
</li>
<li class="">
<a href="16. Under the Hood Part 1.html">16. Under the Hood Part 1</a>
</li>
<li class="">
<a href="17. Under the Hood Part 2.html">17. Under the Hood Part 2</a>
</li>
<li class="">
<a href="18. Outro.html">18. Outro</a>
</li>
</ul>
<ul class="sidebar-list list-unstyled CTAs">
<li>
<a href="../index.html" class="article">Back to Home</a>
</li>
</ul>
</nav>
<div id="content">
<header class="container-fluild header">
<div class="container">
<div class="row">
<div class="col-12">
<div class="align-items-middle">
<button type="button" id="sidebarCollapse" class="btn btn-toggle-sidebar">
<div></div>
<div></div>
<div></div>
</button>
<h1 style="display: inline-block">13. Backpropagation</h1>
</div>
</div>
</div>
</div>
</header>
<main class="container">
<div class="row">
<div class="col-12">
<div class="ud-atom">
<h3></h3>
<div>
<h3 id="gradient-descent-solution">Gradient Descent Solution</h3>
<pre><code class="python language-python">def gradient_descent_update(x, gradx, learning_rate):
"""
Performs a gradient descent update.
"""
x = x - learning_rate * gradx
# Return the new value for x
return x</code></pre>
<p>We adjust the old <code>x</code> by pushing it in the <em>direction</em> of <code>gradx</code> with the <em>force</em> <code>learning_rate</code>, that is, by subtracting <code>learning_rate * gradx</code>. Remember that the gradient points in the direction of <strong>steepest ascent</strong>, so subtracting <code>learning_rate * gradx</code> from <code>x</code> turns the step into <strong>steepest descent</strong>. You can convince yourself of this by replacing the subtraction with an addition and watching <code>x</code> climb instead.</p>
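<p>To see the update in action, here is a quick, illustrative loop (not part of the lesson's code) that repeatedly applies <code>gradient_descent_update</code> to minimize <span class="mathquill ud-math">f(x) = x^2</span>, whose gradient is <span class="mathquill ud-math">2x</span>:</p>
<pre><code class="python language-python"># Minimize f(x) = x^2, whose gradient is 2x, starting from x = 4.
x = 4.0
learning_rate = 0.1

for _ in range(25):
    gradx = 2 * x  # gradient of f at the current x
    x = gradient_descent_update(x, gradx, learning_rate)

print(x)  # ~0.015, close to the minimum at x = 0
# Changing the subtraction in the update to an addition
# makes x grow without bound: steepest ascent instead.</code></pre>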
</div>
</div>
<div class="divider"></div><div class="ud-atom">
<h3></h3>
<div>
<h3 id="the-gradient--backpropagation">The Gradient & Backpropagation</h3>
<p>As promised, we'll now discuss the gradient in more depth. Specifically we'll focus on the following insight:</p>
<blockquote>
<p><em>In order to figure out how we should alter a parameter to minimize the cost, we must first find out what effect that parameter has on the cost.</em> </p>
</blockquote>
<p>That makes sense. After all, we can't just blindly change parameter values and hope to get meaningful results. The gradient takes into account the effect each parameter has on the cost, so that's how we find the direction of steepest ascent.</p>
<p>How do we determine the effect a parameter has on the cost? The technique for doing so is famously known as <strong>backpropagation</strong> or <strong>reverse-mode differentiation</strong>. Those names might sound intimidating, but behind it all, backpropagation is just a clever application of the <strong>chain rule</strong>. Before we get into the chain rule, let's revisit plain old derivatives.</p>
<h4 id="derivatives">Derivatives</h4>
<p>In calculus, the derivative tells us how something changes with respect to something else. Or, put differently, how <em>sensitive</em> something is to something else. </p>
<p>Let's take the function <span class="mathquill ud-math">f(x) = x^2</span> as an example. In this case, the derivative of <span class="mathquill ud-math">f(x)</span> is <span class="mathquill ud-math">2x</span>. Another way to state this is, "the derivative of <span class="mathquill ud-math">f(x)</span> with respect to <span class="mathquill ud-math">x</span> is <span class="mathquill ud-math">2x</span>". </p>
<p>Using the derivative, we can say <em>how much</em> a change in <span class="mathquill ud-math">x</span> affects <span class="mathquill ud-math">f(x)</span>. For example, when <span class="mathquill ud-math">x</span> is 4, the derivative is 8 (<span class="mathquill ud-math">2x = 2*4 = 8</span>). This means that if <span class="mathquill ud-math">x</span> is increased or decreased by 1 unit, then <span class="mathquill ud-math">f(x)</span> will increase or decrease by 8. </p>
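<p>A quick numerical check (illustrative, not lesson code) makes this concrete: for a small step <code>h</code>, the change in <span class="mathquill ud-math">f(x)</span> divided by <code>h</code> should be close to the derivative.</p>
<pre><code class="python language-python"># f(x) = x^2 has derivative 2x, so at x = 4 the slope is 8.
def f(x):
    return x**2

x = 4.0
h = 0.001  # a small step

# For a small step h, the change in f is roughly (derivative * h).
print((f(x + h) - f(x)) / h)  # ~8.001, matching 2*4 = 8</code></pre>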
</div>
</div>
<div class="divider"></div><div class="ud-atom">
<h3></h3>
<div>
<figure class="figure">
<img src="img/23.png" alt="f(x) and the tangent line of f(x) when x = 4." class="img img-fluid">
<figcaption class="figure-caption">
<p>f(x) and the tangent line of f(x) when x = 4.</p>
</figcaption>
</figure>
</div>
</div>
<div class="divider"></div><div class="ud-atom">
<h3></h3>
<div>
<p>Notice that <span class="mathquill ud-math">f(4) = 16</span> and <span class="mathquill ud-math">f(5) = 25</span>. 25 - 16 = 9, which isn't the same as 8.</p>
<p>But we just calculated that increasing <span class="mathquill ud-math">x</span> by 1 unit would change <span class="mathquill ud-math">f(x)</span> by 8. What happened?</p>
<p>The answer is that the slope (or derivative) itself changes as <span class="mathquill ud-math">x</span> changes. If we calculate the derivative when <span class="mathquill ud-math">x</span> is 4.5, it's now 9, which matches the difference between <span class="mathquill ud-math">f(4) = 16</span> and <span class="mathquill ud-math">f(5) = 25</span>.</p>
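<p>A couple of lines of Python (again, just illustrative) confirm this:</p>
<pre><code class="python language-python">def f(x):
    return x**2

# Average rate of change of f between x = 4 and x = 5.
secant_slope = (f(5) - f(4)) / (5 - 4)  # (25 - 16) / 1 = 9.0

# The derivative 2x evaluated at the midpoint x = 4.5.
midpoint_derivative = 2 * 4.5  # 9.0

print(secant_slope, midpoint_derivative)  # 9.0 9.0</code></pre>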
<h4 id="chain-rule">Chain Rule</h4>
<p>Let's return to neural networks and the original goal of figuring out what effect a parameter has on the cost.</p>
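<p>As a rough preview (a sketch with made-up example functions, not the lesson's code): for a composition <span class="mathquill ud-math">f(g(x))</span>, the chain rule says the derivative is <span class="mathquill ud-math">f'(g(x)) * g'(x)</span>. The sensitivities multiply as a change flows through each function in turn, which is exactly what backpropagation exploits:</p>
<pre><code class="python language-python"># Chain rule: if h(x) = f(g(x)), then h'(x) = f'(g(x)) * g'(x).
def g(x):
    return 3 * x  # g'(x) = 3

def f(y):
    return y**2  # f'(y) = 2y

def h(x):
    return f(g(x))  # h(x) = (3x)^2 = 9x^2, so h'(x) = 18x

x = 4.0
analytic = 2 * g(x) * 3  # f'(g(x)) * g'(x) = 2*(3*4)*3 = 72

eps = 1e-6
numeric = (h(x + eps) - h(x - eps)) / (2 * eps)  # central difference

print(analytic, round(numeric, 2))  # 72.0 72.0</code></pre>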
</div>
</div>
</div>
</div>
</main>
</div>
</div>
</body>
</html>