<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Wolfe conditions &#8211; 编码无悔 /  Intent &amp; Focused</title>
	<atom:link href="https://www.codelast.com/tag/wolfe-conditions/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.codelast.com</link>
	<description>最优化之路</description>
	<lastBuildDate>Mon, 27 Apr 2020 17:29:26 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>[Original] Explaining the Armijo-Goldstein and Wolfe-Powell Criteria of Inexact Line Search in Plain Language</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b%e7%94%a8%e4%ba%ba%e8%af%9d%e8%a7%a3%e9%87%8a%e4%b8%8d%e7%b2%be%e7%a1%ae%e7%ba%bf%e6%90%9c%e7%b4%a2%e4%b8%ad%e7%9a%84armijo-goldstein%e5%87%86%e5%88%99%e5%8f%8awo/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b%e7%94%a8%e4%ba%ba%e8%af%9d%e8%a7%a3%e9%87%8a%e4%b8%8d%e7%b2%be%e7%a1%ae%e7%ba%bf%e6%90%9c%e7%b4%a2%e4%b8%ad%e7%9a%84armijo-goldstein%e5%87%86%e5%88%99%e5%8f%8awo/#comments</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Sun, 27 Oct 2013 14:01:02 +0000</pubDate>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[原创]]></category>
		<category><![CDATA[Armijo-Goldstein准则]]></category>
		<category><![CDATA[line search]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[Wolfe conditions]]></category>
		<category><![CDATA[Wolfe-Powell准则]]></category>
		<category><![CDATA[最优化]]></category>
		<guid isPermaLink="false">http://www.codelast.com/?p=7320</guid>

					<description><![CDATA[<p>
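Line search (also called one-dimensional search) is a basic building block of optimization algorithms. It comes in two kinds: exact line search and inexact line search.<br />
In this article I want to explain, in &#8220;plain language&#8221;, the two main criteria used in inexact line search: the Armijo-Goldstein criterion and the Wolfe-Powell criterion.<br />
I put it that way because none of the optimization books or materials I have read explains these two criteria in a way a beginner can follow. They either go on at length and leave you to puzzle over a pile of formulas, or they are so terse that they skip the explanation and leap to the conclusion in a single sentence.<br />
Every time I read one of these books I have the same reaction: can't you write in plain language?<br />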
<span id="more-7320"></span><br />
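Below I try to describe the two criteria in everyday language.</p>
<div>
	<span style="background-color:#00ff00;">【1】</span>Why follow these criteria</div>
<div>
	Because the line search is inexact, we need some assurance that the algorithm still converges (that is, still finds a minimum). Over time people discovered and proved certain rules, and when you follow them the algorithm is very likely to converge. So, to make the algorithm converge, we follow these criteria. If you would rather not follow these well-established criteria and want to design an algorithm around your own instead, then congratulations: if you can prove that your approach works, your name may appear in the textbooks some years from now.</div>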
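<div>
	<span style="background-color:#00ff00;">【2】</span>The Armijo-Goldstein criterion</div>
<div>
	This criterion goes back to work by Armijo and Goldstein in the 1960s (I have not looked into who the two of them were). Some sources speak of the &#8220;Armijo rule&#8221; instead. That name usually refers only to the first of the two inequalities below (the sufficient-decrease condition), while the second one is associated with Goldstein. Either way, nobody who made an important contribution should be written out of the story, right?</div>
<p><span style="color:#0000ff;">The Armijo-Goldstein criterion rests on two ideas: ① the objective function value should decrease sufficiently; ② the line-search step size &#945; should not be too small.</span></p>
<div>
	The intent behind both ideas is clear. The goal of an optimization problem is to find a minimum, so making the objective function value &#8220;decrease&#8221; is what we are working toward, and ① is there to guarantee exactly that.</div>
<div>
	② is similar: if the step size &#945; is too small, the search barely moves away from where it started, which wastes time and effort.</div>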
<p>With these two guiding ideas in mind, let's look at the mathematical form of the Armijo-Goldstein criterion:<br />
<a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/2011/05/Armijo-Goldstein_1.png" style="width: 460px; height: 45px;" /></a><br />
<a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/2011/05/Armijo-Goldstein_2.png" style="width: 510px; height: 42px;" /></a></p>
<p>where <span style="color:#ff0000;"> <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_5087877fc30cbeb7449a630043199764.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="0 < \rho < \frac{1}{2}" /></span><script type='math/tex'>0 < \rho < \frac{1}{2}</script> </span><br />
<span style="background-color:#dda0dd;">(1)</span>Why require <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7630a6c9cd1940bc102b357691038a4e.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho \in (0,\frac{1}{2})" /></span><script type='math/tex'>\rho \in (0,\frac{1}{2})</script> ? It can be shown that without this condition the <span style="color:#0000ff;">superlinear convergence</span> of the algorithm would be affected (<a href="http://www.codelast.com/?page_id=963" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">see item 4 at this link for the definition</span></a>). In an age where speed is everything, how could we live without superlinear convergence! (Just kidding.)<br />
For the full proof, see the book 《最优化理论与方法》 (Optimization Theory and Methods) by Yuan Yaxiang. I have not read the proof closely, and I think beginners can safely skip it.<br />
<span style="background-color:#dda0dd;">(2)</span>The Taylor expansion of the left-hand side of the first inequality is:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3182322817f538586aa55cc34d65e082.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + {\alpha _k}{d_k}) = f({x_k}) + {\alpha _k}{g_k}^T{d_k} + o({\alpha _k})" /></span><script type='math/tex'>f({x_k} + {\alpha _k}{d_k}) = f({x_k}) + {\alpha _k}{g_k}^T{d_k} + o({\alpha _k})</script> <br />
Dropping the higher-order term leaves: <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9db59ffe507886c32015a962bbaf8944.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}{g_k}^T{d_k}" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}{g_k}^T{d_k}</script> <br />
The right-hand side of the first inequality differs from this only by the coefficient <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_f7f177957cf064a93e9811df8fe65ed1.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho " /></span><script type='math/tex'>\rho </script> on the second term.<br />
We already know that <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_426f053da911fa0c66d34d53cd38934f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}^T{d_k} < 0" /></span><script type='math/tex'>{g_k}^T{d_k} < 0</script> (this is what makes <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> a descent direction; strictly speaking it is a sufficient condition, not a necessary and sufficient one), and that <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7630a6c9cd1940bc102b357691038a4e.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho \in (0,\frac{1}{2})" /></span><script type='math/tex'>\rho \in (0,\frac{1}{2})</script> . So for any positive step size the right-hand side of inequality 1 is still a number smaller than <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3378621fdb4518ce7e503d48af416e07.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k})" /></span><script type='math/tex'>f({x_k})</script> , that is:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c0aff944a2606a5a9e9c7ef88994429d.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k} < f({x_k})" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k} < f({x_k})</script> <br />
In other words, any step that satisfies inequality 1 makes the function value decrease (and decrease is the goal of optimization).<br />
&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b%e7%94%a8%e4%ba%ba%e8%af%9d%e8%a7%a3%e9%87%8a%e4%b8%8d%e7%b2%be%e7%a1%ae%e7%ba%bf%e6%90%9c%e7%b4%a2%e4%b8%ad%e7%9a%84armijo-goldstein%e5%87%86%e5%88%99%e5%8f%8awo/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<p>
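Line search (also called one-dimensional search) is a basic building block of optimization algorithms. It comes in two kinds: exact line search and inexact line search.<br />
In this article I want to explain, in &ldquo;plain language&rdquo;, the two main criteria used in inexact line search: the Armijo-Goldstein criterion and the Wolfe-Powell criterion.<br />
I put it that way because none of the optimization books or materials I have read explains these two criteria in a way a beginner can follow. They either go on at length and leave you to puzzle over a pile of formulas, or they are so terse that they skip the explanation and leap to the conclusion in a single sentence.<br />
Every time I read one of these books I have the same reaction: can't you write in plain language?<br />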
<span id="more-7320"></span><br />
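Below I try to describe the two criteria in everyday language.</p>
<div>
	<span style="background-color:#00ff00;">【1】</span>Why follow these criteria</div>
<div>
	Because the line search is inexact, we need some assurance that the algorithm still converges (that is, still finds a minimum). Over time people discovered and proved certain rules, and when you follow them the algorithm is very likely to converge. So, to make the algorithm converge, we follow these criteria. If you would rather not follow these well-established criteria and want to design an algorithm around your own instead, then congratulations: if you can prove that your approach works, your name may appear in the textbooks some years from now.</div>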
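<div>
	<span style="background-color:#00ff00;">【2】</span>The Armijo-Goldstein criterion</div>
<div>
	This criterion goes back to work by Armijo and Goldstein in the 1960s (I have not looked into who the two of them were). Some sources speak of the &ldquo;Armijo rule&rdquo; instead. That name usually refers only to the first of the two inequalities below (the sufficient-decrease condition), while the second one is associated with Goldstein. Either way, nobody who made an important contribution should be written out of the story, right?</div>
<p><span style="color:#0000ff;">The Armijo-Goldstein criterion rests on two ideas: ① the objective function value should decrease sufficiently; ② the line-search step size &alpha; should not be too small.</span></p>
<div>
	The intent behind both ideas is clear. The goal of an optimization problem is to find a minimum, so making the objective function value &ldquo;decrease&rdquo; is what we are working toward, and ① is there to guarantee exactly that.</div>
<div>
	② is similar: if the step size &alpha; is too small, the search barely moves away from where it started, which wastes time and effort.</div>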
<p>With these two guiding ideas in mind, let's look at the mathematical form of the Armijo-Goldstein criterion:<br />
<a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/2011/05/Armijo-Goldstein_1.png" style="width: 460px; height: 45px;" /></a><br />
<a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/2011/05/Armijo-Goldstein_2.png" style="width: 510px; height: 42px;" /></a></p>
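<p>The two formula images above carry no alternative text. Assuming they show the standard textbook form of the criterion, which matches the terms discussed in the rest of this article, the two inequalities (inequality 1 and inequality 2) read as follows. Here <script type='math/tex'>{g_k}</script> is taken to be the gradient of <script type='math/tex'>f</script> at <script type='math/tex'>{x_k}</script> , which is an assumption consistent with the descent-direction argument used later:<br />
<script type='math/tex; mode=display'>f({x_k} + {\alpha _k}{d_k}) \le f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k}</script><br />
<script type='math/tex; mode=display'>f({x_k} + {\alpha _k}{d_k}) \ge f({x_k}) + {\alpha _k}(1 - \rho ){g_k}^T{d_k}</script></p>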
<p>where <span style="color:#ff0000;"> <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_5087877fc30cbeb7449a630043199764.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="0 < \rho < \frac{1}{2}" /></span><script type='math/tex'>0 < \rho < \frac{1}{2}</script> </span><br />
<span style="background-color:#dda0dd;">(1)</span>Why require <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7630a6c9cd1940bc102b357691038a4e.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho \in (0,\frac{1}{2})" /></span><script type='math/tex'>\rho \in (0,\frac{1}{2})</script> ? It can be shown that without this condition the <span style="color:#0000ff;">superlinear convergence</span> of the algorithm would be affected (<a href="http://www.codelast.com/?page_id=963" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">see item 4 at this link for the definition</span></a>). In an age where speed is everything, how could we live without superlinear convergence! (Just kidding.)<br />
For the full proof, see the book 《最优化理论与方法》 (Optimization Theory and Methods) by Yuan Yaxiang. I have not read the proof closely, and I think beginners can safely skip it.<br />
<span style="background-color:#dda0dd;">(2)</span>The Taylor expansion of the left-hand side of the first inequality is:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3182322817f538586aa55cc34d65e082.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + {\alpha _k}{d_k}) = f({x_k}) + {\alpha _k}{g_k}^T{d_k} + o({\alpha _k})" /></span><script type='math/tex'>f({x_k} + {\alpha _k}{d_k}) = f({x_k}) + {\alpha _k}{g_k}^T{d_k} + o({\alpha _k})</script> <br />
Dropping the higher-order term leaves: <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9db59ffe507886c32015a962bbaf8944.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}{g_k}^T{d_k}" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}{g_k}^T{d_k}</script> <br />
The right-hand side of the first inequality differs from this only by the coefficient <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_f7f177957cf064a93e9811df8fe65ed1.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho " /></span><script type='math/tex'>\rho </script> on the second term.<br />
We already know that <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_426f053da911fa0c66d34d53cd38934f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}^T{d_k} < 0" /></span><script type='math/tex'>{g_k}^T{d_k} < 0</script> (this is what makes <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> a descent direction; strictly speaking it is a sufficient condition, not a necessary and sufficient one), and that <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7630a6c9cd1940bc102b357691038a4e.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho \in (0,\frac{1}{2})" /></span><script type='math/tex'>\rho \in (0,\frac{1}{2})</script> . So for any positive step size the right-hand side of inequality 1 is still a number smaller than <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3378621fdb4518ce7e503d48af416e07.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k})" /></span><script type='math/tex'>f({x_k})</script> , that is:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c0aff944a2606a5a9e9c7ef88994429d.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k} < f({x_k})" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k} < f({x_k})</script> <br />
In other words, any step that satisfies inequality 1 makes the function value decrease (and decrease is the goal of optimization).<br />
<span style="background-color:#dda0dd;">(3)</span>Since <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7630a6c9cd1940bc102b357691038a4e.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\rho \in (0,\frac{1}{2})" /></span><script type='math/tex'>\rho \in (0,\frac{1}{2})</script> and <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_426f053da911fa0c66d34d53cd38934f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}^T{d_k} < 0" /></span><script type='math/tex'>{g_k}^T{d_k} < 0</script> ( <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> is a descent direction), the right-hand side of inequality 2 is smaller than the right-hand side of inequality 1, that is:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c8153267cd0d03c8afa2d9fe51d6546c.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\alpha _k}(1 - \rho ){g_k}^T{d_k} < {\alpha _k}\rho {g_k}^T{d_k} < 0" /></span><script type='math/tex'>{\alpha _k}(1 - \rho ){g_k}^T{d_k} < {\alpha _k}\rho {g_k}^T{d_k} < 0</script> <br />
For a very small step size <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> , the function value is close to the first-order Taylor value from (2), and that value lies below the right-hand side of inequality 2 (because 1 - &rho; is less than 1 and the slope term is negative). So inequality 2 fails for sufficiently small steps, and requiring it guarantees that <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> cannot be too small.<br />
<span style="background-color:#dda0dd;">(4)</span>I also want to bring out a figure that many books use to illustrate the Armijo-Goldstein criterion (hand-drawn by me):</p>
<div style="text-align: center;">
	<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_two_rules_in_line_search_1.jpg" style="width: 550px; height: 318px;" /></div>
<p>The horizontal axis is <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> and the vertical axis is <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_8fa14cdd754f91cc6554c9e71929cce7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f" /></span><script type='math/tex'>f</script> . The figure shows how the objective function value changes as <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> varies while <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_6df14963c8774de478b239e157bd1c14.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_k},{d_k}" /></span><script type='math/tex'>{x_k},{d_k}</script> are held constant.<br />
We can treat <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_6df14963c8774de478b239e157bd1c14.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_k},{d_k}" /></span><script type='math/tex'>{x_k},{d_k}</script> as constants because, in a line search, once we are at a given point <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_550187f469eda08b9e5b55143f19c4ce.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_k}" /></span><script type='math/tex'>{x_k}</script> and the search direction <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> has been chosen, all that remains is to find a suitable step size <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> .<br />
With <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9dd4e461268c8034f5c8564e155c67a6.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="x" /></span><script type='math/tex'>x</script> fixed and <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> as the variable, <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9c1a98b7b1878eabcf4d855982a2c81a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f(x + \alpha d)" /></span><script type='math/tex'>f(x + \alpha d)</script> may be a nonlinear function (for example when the objective function is <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_990d8846cdb8bfba65b16bf6252a7dd6.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="y = {x^2}" /></span><script type='math/tex'>y = {x^2}</script> ), which is why the figure shows a curve.<br />
The label <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_19d48ed66ebabc7b91d1cb1c97d28ad3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + \alpha {d_k})" /></span><script type='math/tex'>f({x_k} + \alpha {d_k})</script> in the upper right does not denote the value at one particular point. It indicates that the curve is the graph of the function with <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_bccfc7022dfb945174d9bcebad2297bb.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha " /></span><script type='math/tex'>\alpha </script> as the variable and <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_6df14963c8774de478b239e157bd1c14.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_k},{d_k}" /></span><script type='math/tex'>{x_k},{d_k}</script> as constants.<br />
When <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c4c417553b680cf203765de254be0350.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha = 0" /></span><script type='math/tex'>\alpha = 0</script> , the function value is <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3378621fdb4518ce7e503d48af416e07.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k})" /></span><script type='math/tex'>f({x_k})</script> , shown at the upper left of the figure. The horizontal dashed line is the baseline at function value <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3378621fdb4518ce7e503d48af416e07.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k})" /></span><script type='math/tex'>f({x_k})</script> , for comparison with the other function values.<br />
The line <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_45b7e561542aab7e2c705136b385555a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k}" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k}</script> lies below <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3378621fdb4518ce7e503d48af416e07.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k})" /></span><script type='math/tex'>f({x_k})</script> (as analyzed earlier, because <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_426f053da911fa0c66d34d53cd38934f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}^T{d_k} < 0" /></span><script type='math/tex'>{g_k}^T{d_k} < 0</script> ), and the line <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_d40b12c499976a6d7b94c6dfd79a56a0.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}(1 - \rho ){g_k}^T{d_k}" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}(1 - \rho ){g_k}^T{d_k}</script> in turn lies below <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_45b7e561542aab7e2c705136b385555a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k}" /></span><script type='math/tex'>f({x_k}) + {\alpha _k}\rho {g_k}^T{d_k}</script> (also analyzed earlier). The curve has to stay between these two lines, so the Armijo-Goldstein criterion may take the interval bc as the acceptable interval in which to look for the minimum. Clearly, the interval bc can leave the minimum out (here the minimum lies in the interval ed).<br />
So, to solve this problem, the Wolfe-Powell criterion came into being.<br />
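Before moving on, the Armijo-Goldstein acceptance test from section 【2】 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-dimensional objective f(x) = x^3 - 3x, the starting point x_k = 0, the direction d_k = 1, the value &rho; = 0.4 and the function name armijo_goldstein are all made up for this example and do not come from the books mentioned in this article. Along this direction the minimum is at &alpha; = 1.<br />

```python
# Hypothetical toy problem (an assumption for illustration only):
# f(x) = x^3 - 3x, current point x_k = 0, search direction d_k = 1.

def f(x):
    return x**3 - 3.0 * x


def grad(x):
    return 3.0 * x**2 - 3.0


def armijo_goldstein(alpha, x_k=0.0, d_k=1.0, rho=0.4):
    """Return (inequality 1 holds, inequality 2 holds) for step size alpha."""
    slope0 = grad(x_k) * d_k  # g_k^T d_k, negative for a descent direction
    f_new = f(x_k + alpha * d_k)
    # Inequality 1: the function value decreases sufficiently.
    sufficient_decrease = f_new <= f(x_k) + rho * alpha * slope0
    # Inequality 2: the step size is not too small.
    step_not_too_small = f_new >= f(x_k) + (1.0 - rho) * alpha * slope0
    return sufficient_decrease, step_not_too_small


for alpha in (0.1, 1.0, 1.2):
    print(alpha, armijo_goldstein(alpha))
```

With these assumed values, &alpha; = 0.1 fails inequality 2 (the step is too small), &alpha; = 1.2 passes both inequalities, and the minimizer &alpha; = 1 itself fails inequality 2, which is the kind of exclusion described above.<br />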
<span style="background-color:#00ff00;">【3】</span>The Wolfe-Powell criterion<br />
In some books you will see the term &ldquo;<a href="http://en.wikipedia.org/wiki/Wolfe_conditions" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">Wolfe conditions</span></a>&rdquo;, which should be the same thing as the Wolfe-Powell criterion. Poor Powell, a giant of the field, gets left out yet again...<br />
The Wolfe-Powell criterion also consists of two expressions. The first is identical to inequality 1 of the Armijo-Goldstein criterion; the second is:<br />
<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_two_rules_in_line_search_2.jpg" style="width: 550px; height: 39px;" /><br />
This expression is no longer about function values; it is about gradients.<br />
Its geometric meaning is: <span style="color:#0000ff;">the slope of the tangent at an acceptable point is &ge; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9d43cb8bbcb702e9d5943de477f099e2.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\sigma " /></span><script type='math/tex'>\sigma </script> times the initial slope</span>.<br />
The figure above already marks the line <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1fb3bf723ae9a9c61eef13e70e1c5b73.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\sigma g_k^T{d_k}" /></span><script type='math/tex'>\sigma g_k^T{d_k}</script> (the tangent at point <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e1671797c52e15f763380b45e841ec32.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="e" /></span><script type='math/tex'>e</script> ). The tangent at the initial point (the point where <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c4c417553b680cf203765de254be0350.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\alpha = 0" /></span><script type='math/tex'>\alpha = 0</script> ) is &ldquo;steeper&rdquo; than the tangent at <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e1671797c52e15f763380b45e841ec32.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="e" /></span><script type='math/tex'>e</script> : because <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_eb0004237e8dc9db81ca87fd2cc7f6e9.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\sigma \in (\rho ,1)" /></span><script type='math/tex'>\sigma \in (\rho ,1)</script> , the tangent at <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e1671797c52e15f763380b45e841ec32.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="e" /></span><script type='math/tex'>e</script> is &ldquo;not as steep&rdquo;. I hope this very informal, not fully rigorous way of putting it helps you understand.<br />
As a result, the minimum is now included in the acceptable interval (the interval to the right of point <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e1671797c52e15f763380b45e841ec32.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="e" /></span><script type='math/tex'>e</script> ).<br />
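A minimal Python sketch of the Wolfe-Powell test may make this concrete. It assumes the second expression has its standard form, namely that the slope along the search direction at the new point is at least &sigma; times the slope at the starting point. The objective f(x) = x^3 - 3x, the point x_k = 0, the direction d_k = 1, the values &rho; = 0.4 and &sigma; = 0.5, and the function name wolfe_powell are all hypothetical choices for this example. Along this direction the minimum is at &alpha; = 1.<br />

```python
# Hypothetical toy problem (an assumption for illustration only):
# f(x) = x^3 - 3x, current point x_k = 0, search direction d_k = 1.

def f(x):
    return x**3 - 3.0 * x


def grad(x):
    return 3.0 * x**2 - 3.0


def wolfe_powell(alpha, x_k=0.0, d_k=1.0, rho=0.4, sigma=0.5):
    """Return (sufficient decrease holds, slope condition holds)."""
    slope0 = grad(x_k) * d_k  # initial slope g_k^T d_k (negative)
    x_new = x_k + alpha * d_k
    # First expression: same as inequality 1 of Armijo-Goldstein.
    sufficient_decrease = f(x_new) <= f(x_k) + rho * alpha * slope0
    # Second expression: slope at the new point >= sigma * initial slope.
    slope_condition = grad(x_new) * d_k >= sigma * slope0
    return sufficient_decrease, slope_condition


for alpha in (0.1, 1.0, 1.5):
    print(alpha, wolfe_powell(alpha))
```

With these assumed values, &alpha; = 0.1 is rejected by the slope condition (the curve is still almost as steep as at the start), &alpha; = 1.5 is rejected by the sufficient-decrease condition, and the minimizer &alpha; = 1 passes both.<br />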
The Wolfe-Powell criterion does not end here! In some books, expression (3) is replaced by a so-called &ldquo;stronger condition&rdquo;:<br />
<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_two_rules_in_line_search_3.jpg" style="width: 470px; height: 48px;" /><br />
Compared with (3), this expression puts an absolute value on the left-hand side and flips the sign on the right-hand side (since <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_ead119a63ddef55ab91efbf5514e3609.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="g_k^T{d_k} < 0" /></span><script type='math/tex'>g_k^T{d_k} < 0</script> , we have <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_ec6871914afe07c0858dc674aafe2ccf.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="- \sigma g_k^T{d_k} > 0" /></span><script type='math/tex'>- \sigma g_k^T{d_k} > 0</script> ). In the standard form of this condition the inequality also turns around, so that the magnitude of the slope at the new point is bounded from above.<br />
As a result, the acceptable interval is restricted to <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_af2fdeb93b85e5b0a49cb557cf999ebc.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="[b,d]" /></span><script type='math/tex'>[b,d]</script> , as shown in the figure:</p>
<div style="text-align: center;">
	<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_two_rules_in_line_search_4.jpg" style="width: 317px; height: 211px;" /></div>
<p>The red lines in the figure vividly show the minimum being &ldquo;hemmed in&rdquo; from both sides.<br />
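The difference between the ordinary and the stronger slope condition can be sketched in Python as well. The sketch assumes the stronger condition has its standard form (the absolute value of the new slope is at most -&sigma; times the initial slope). The objective f(x) = x^3 - 3x with gradient 3x^2 - 3, the point x_k = 0, the direction d_k = 1, the value &sigma; = 0.5 and the function name slope_conditions are hypothetical choices for this example.<br />

```python
# Hypothetical toy problem (an assumption for illustration only):
# f(x) = x^3 - 3x, so the gradient is 3x^2 - 3; x_k = 0, d_k = 1.

def grad(x):
    return 3.0 * x**2 - 3.0


def slope_conditions(alpha, x_k=0.0, d_k=1.0, sigma=0.5):
    """Return (ordinary slope condition holds, stronger condition holds)."""
    slope0 = grad(x_k) * d_k  # initial slope g_k^T d_k (negative)
    slope_new = grad(x_k + alpha * d_k) * d_k  # slope at the trial point
    # Ordinary condition: bounds the new slope from below only.
    ordinary = slope_new >= sigma * slope0
    # Stronger condition: bounds the magnitude of the new slope.
    stronger = abs(slope_new) <= -sigma * slope0
    return ordinary, stronger


for alpha in (0.1, 1.0, 1.3):
    print(alpha, slope_conditions(alpha))
```

With these assumed values, &alpha; = 1.3 passes the ordinary condition but fails the stronger one, because the curve is already climbing steeply there. The stronger condition cuts off such points on the far side of the minimum, while &alpha; = 1 passes both and &alpha; = 0.1 fails both.<br />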
<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
Reprints must credit the source: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b%e7%94%a8%e4%ba%ba%e8%af%9d%e8%a7%a3%e9%87%8a%e4%b8%8d%e7%b2%be%e7%a1%ae%e7%ba%bf%e6%90%9c%e7%b4%a2%e4%b8%ad%e7%9a%84armijo-goldstein%e5%87%86%e5%88%99%e5%8f%8awo/feed/</wfw:commentRss>
			<slash:comments>26</slash:comments>
		
		
			</item>
	</channel>
</rss>
