<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>收敛性 &#8211; 编码无悔 /  Intent &amp; Focused</title>
	<atom:link href="https://www.codelast.com/tag/%E6%94%B6%E6%95%9B%E6%80%A7/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.codelast.com</link>
	<description>最优化之路</description>
	<lastBuildDate>Mon, 27 Apr 2020 17:31:28 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>[原创]使用一维搜索(line search)的算法的收敛性</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b%e4%bd%bf%e7%94%a8%e4%b8%80%e7%bb%b4%e6%90%9c%e7%b4%a2line-search%e7%9a%84%e7%ae%97%e6%b3%95%e7%9a%84%e6%94%b6%e6%95%9b%e6%80%a7/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b%e4%bd%bf%e7%94%a8%e4%b8%80%e7%bb%b4%e6%90%9c%e7%b4%a2line-search%e7%9a%84%e7%ae%97%e6%b3%95%e7%9a%84%e6%94%b6%e6%95%9b%e6%80%a7/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Tue, 29 Oct 2013 15:24:10 +0000</pubDate>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[原创]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[一维搜索]]></category>
		<category><![CDATA[收敛性]]></category>
		<category><![CDATA[最优化]]></category>
		<guid isPermaLink="false">http://www.codelast.com/?p=7514</guid>

					<description><![CDATA[<p>
在最优化领域中，有一类使用一维搜索（line search）的算法，例如<a href="http://www.codelast.com/?p=2573" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">牛顿法</span></a>等。这类算法采用的是 <span style="color:#0000ff;">确定搜索方向&#8594;进行一维搜索&#8594;调整搜索方向&#8594;进行一维搜索</span> 的迭代过程来求解。那么，这类算法应该满足什么条件的时候才能收敛？本文将略为讨论一下。请务必看清本文的标题：不是讨论line search的收敛性，而是讨论使用line search的算法的收敛性。<br />
<span id="more-7514"></span><br />
<span style="background-color:#00ff00;">【1】</span>搜索方向条件<br />
搜索方向 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> 满足什么条件时算法才能收敛？谈到这个问题，首先就要定义搜索方向&#8212;&#8212;要有一个&#8220;参照物&#8221;，要不然何来方向之说呢？<br />
用 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> 与负梯度 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_be9976a20363f7c49bb370084b76dca7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt=" - {g_k}" /></span><script type='math/tex'> - {g_k}</script> 的夹角 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 来衡量搜索方向。我们先给出结论： <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 应满足：<br />
<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_convergence_property_1.png" style="width: 280px; height: 73px;" /><br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
为了说明这个式子是怎么来的，需要先说明两个向量（ <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_8e50450420bb2e5ebad5002d615f93d5.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}, - {g_k}" /></span><script type='math/tex'>{d_k}, - {g_k}</script> 都是向量）夹角的余弦怎么计算：<br />
<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_convergence_property_2.png" style="width: 200px; height: 86px;" /><br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
分子 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e973dc08d5acdbeb2d0d8ad588def4a4.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{ - {g_k}^T{d_k}}" /></span><script type='math/tex'>{ - {g_k}^T{d_k}}</script> 是两个向量的点积（数量积），分母 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7cc5882e84d1847284e8a3c2665cb850.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\left\&#124; {{g_k}} \right\&#124;\left\&#124; {{d_k}} \right\&#124;}" /></span><script type='math/tex'>{\left\&#124; {{g_k}} \right\&#124;\left\&#124; {{d_k}} \right\&#124;}</script> 是两个向量的范数之积，分母&#62;0。<br />
由上面 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 的取值范围可知 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_909edc7edc1f6163ea05ff3c6b1c2e1a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\cos {\theta _k} \in (0,1)" /></span><script type='math/tex'>\cos {\theta _k} \in (0,1)</script> ，即 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_dbbc28fa63f4b257a2b4f4896194a152.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\cos {\theta _k}/> 0" /</span><script type='math/tex'>\cos {\theta _k} 0</script> ，因此 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_426f053da911fa0c66d34d53cd38934f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}^T{d_k} < 0" /></span><script type='math/tex'>{g_k}^T{d_k} < 0</script> <br />
所以，根据泰勒展开式（忽略掉高阶无穷小部分）：<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1e756c37a3b14c544250c482126a4c78.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + \alpha {d_k}) = f({x_k}) + \alpha {g_k}^T{d_k} + o(\alpha )" /></span><script type='math/tex'>f({x_k} + \alpha {d_k}) = f({x_k}) + \alpha {g_k}^T{d_k} + o(\alpha )</script> <br />
我们可知， <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a3fa56198f5bcf005c1237de03ecbd26.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + \alpha {d_k}) < f({x_k})" /></span><script type='math/tex'>f({x_k} + \alpha {d_k}) < f({x_k})</script> ，即<span style="color:#0000ff;">函数值是下降的</span>&#8212;&#8212;下降正是最优化的目标。<br />
所以你现在明白为什么 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 要满足上面的条件了。<br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b%e4%bd%bf%e7%94%a8%e4%b8%80%e7%bb%b4%e6%90%9c%e7%b4%a2line-search%e7%9a%84%e7%ae%97%e6%b3%95%e7%9a%84%e6%94%b6%e6%95%9b%e6%80%a7/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<p>
在最优化领域中，有一类使用一维搜索（line search）的算法，例如<a href="http://www.codelast.com/?p=2573" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">牛顿法</span></a>等。这类算法采用的是 <span style="color:#0000ff;">确定搜索方向&rarr;进行一维搜索&rarr;调整搜索方向&rarr;进行一维搜索</span> 的迭代过程来求解。那么，这类算法应该满足什么条件的时候才能收敛？本文将略为讨论一下。请务必看清本文的标题：不是讨论line search的收敛性，而是讨论使用line search的算法的收敛性。<br />
<span id="more-7514"></span><br />
<span style="background-color:#00ff00;">【1】</span>搜索方向条件<br />
搜索方向 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> 满足什么条件时算法才能收敛？谈到这个问题，首先就要定义搜索方向&mdash;&mdash;要有一个&ldquo;参照物&rdquo;，要不然何来方向之说呢？<br />
用 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> 与负梯度 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_be9976a20363f7c49bb370084b76dca7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt=" - {g_k}" /></span><script type='math/tex'> - {g_k}</script> 的夹角 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 来衡量搜索方向。我们先给出结论： <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 应满足：<br />
<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_convergence_property_1.png" style="width: 280px; height: 73px;" /><br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
为了说明这个式子是怎么来的，需要先说明两个向量（ <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_8e50450420bb2e5ebad5002d615f93d5.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}, - {g_k}" /></span><script type='math/tex'>{d_k}, - {g_k}</script> 都是向量）夹角的余弦怎么计算：<br />
<img decoding="async" alt="" src="http://www.codelast.com/wp-content/uploads/ckfinder/images/optimization_convergence_property_2.png" style="width: 200px; height: 86px;" /><br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
分子 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e973dc08d5acdbeb2d0d8ad588def4a4.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{ - {g_k}^T{d_k}}" /></span><script type='math/tex'>{ - {g_k}^T{d_k}}</script> 是两个向量的点积（数量积），分母 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_7cc5882e84d1847284e8a3c2665cb850.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\left\| {{g_k}} \right\|\left\| {{d_k}} \right\|}" /></span><script type='math/tex'>{\left\| {{g_k}} \right\|\left\| {{d_k}} \right\|}</script> 是两个向量的范数之积，分母&gt;0。<br />
由上面 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 的取值范围可知 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_909edc7edc1f6163ea05ff3c6b1c2e1a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\cos {\theta _k} \in (0,1)" /></span><script type='math/tex'>\cos {\theta _k} \in (0,1)</script> ，即 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_dbbc28fa63f4b257a2b4f4896194a152.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\cos {\theta _k} > 0" /></span><script type='math/tex'>\cos {\theta _k} > 0</script> ，因此 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_426f053da911fa0c66d34d53cd38934f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}^T{d_k} < 0" /></span><script type='math/tex'>{g_k}^T{d_k} < 0</script> <br />
所以，根据泰勒展开式（忽略掉高阶无穷小部分）：<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1e756c37a3b14c544250c482126a4c78.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + \alpha {d_k}) = f({x_k}) + \alpha {g_k}^T{d_k} + o(\alpha )" /></span><script type='math/tex'>f({x_k} + \alpha {d_k}) = f({x_k}) + \alpha {g_k}^T{d_k} + o(\alpha )</script> <br />
我们可知， <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a3fa56198f5bcf005c1237de03ecbd26.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k} + \alpha {d_k}) < f({x_k})" /></span><script type='math/tex'>f({x_k} + \alpha {d_k}) < f({x_k})</script> ，即<span style="color:#0000ff;">函数值是下降的</span>&mdash;&mdash;下降正是最优化的目标。<br />
所以你现在明白为什么 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3131592ca5d71947c49e5566de089e24.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\theta _k}" /></span><script type='math/tex'>{\theta _k}</script> 要满足上面的条件了。<br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
<span style="background-color:#00ff00;">【2】</span>两个关于收敛性的重要理论<br />
这两个理论非常重要，作个比喻，如果你要自己设计一个使用line search技术的算法，并且要保证它能收敛的话，那么，你可能就要让你的算法符合这两个理论的要求。<br />
其中一个理论描述了使用精确line search技术的算法的收敛性，另一个描述了使用不精确line search技术的算法的收敛性。<br />
<span style="background-color:#dda0dd;">①</span>适用于使用精确line search技术的算法<br />
设最优化算法产生的点序列为 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_f9ea7f736dbc8e9f8a7c5b31c8270c71.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\{ {x_k}\} ,\{ f({x_k})\} " /></span><script type='math/tex'>\{ {x_k}\} ,\{ f({x_k})\} </script> ，对任意 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_ac7f25d9bbac3e3961c83fe88775b02d.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_0} \in {R^n}" /></span><script type='math/tex'>{x_0} \in {R^n}</script> ，目标函数的梯度 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e84fec1e074026d6fa8e3155482c35c3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="g(x)" /></span><script type='math/tex'>g(x)</script> 在水平集 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_5eb706eb1fa35b8609f02111950895ec.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="L = \{ x \in {R^n}:f(x) \le f({x_0})\} " /></span><script type='math/tex'>L = \{ x \in {R^n}:f(x) \le f({x_0})\} </script> 上<span style="color:#ff0000;">一致连续</span>，若line search的步长 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_501cb8ba16bc463c7329e28f3ec226a7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\alpha _k}" /></span><script type='math/tex'>{\alpha _k}</script> 满足精确搜索条件 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e87da303f6aa2fc8194483479096cdb0.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\alpha _k} = \arg \mathop {\min }\limits_{\alpha > 0} f({x_k} + \alpha {d_k})" /></span><script type='math/tex'>{\alpha _k} = \arg \mathop {\min }\limits_{\alpha > 0} f({x_k} + \alpha {d_k})</script> ，搜索方向 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> 与 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_be9976a20363f7c49bb370084b76dca7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt=" - {g_k}" /></span><script type='math/tex'> - {g_k}</script> 的夹角满足前面所说的搜索方向条件，那么，必然会发生下面3种情况中的一种：<br />
<span style="color:#0000ff;">（1）</span>存在某个有限的 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_8ce4b16b22b58894aa86c421e8759df3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="k" /></span><script type='math/tex'>k</script> ，使得 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_081b7d4d8eb6434a34715da9096a7110.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k} = 0" /></span><script type='math/tex'>{g_k} = 0</script> <br />
<span style="color:#0000ff;">（2）</span> <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a6f8668d7f88cb2891cdebfdc4d05bd8.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) \to - \infty " /></span><script type='math/tex'>f({x_k}) \to - \infty </script> <br />
<span style="color:#0000ff;">（3）</span><span style="color:#ff0000;"> <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a1cf4b8722569ad1ba6ff89c33fde2d7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k} \to 0" /></span><script type='math/tex'>{g_k} \to 0</script> </span><br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
其中，（3）是最常见的情况，（1）和（2）很少出现&mdash;&mdash;书上是这么说的，至于为什么，我不知道。<br />
（3）中的 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a1cf4b8722569ad1ba6ff89c33fde2d7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k} \to 0" /></span><script type='math/tex'>{g_k} \to 0</script> 又是个什么概念呢？大家可以想像一下二维平面上的寻优过程，一个图像类似于抛物线的函数，当搜索点逐渐向极小值点逼近时，其梯度 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1cd5597a080292208723039cfd7bfd41.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k}" /></span><script type='math/tex'>{g_k}</script> 正是趋于0的。</p>
<p>另外，上面出现了&ldquo;<a href="http://zh.wikipedia.org/zh/%E4%B8%80%E8%87%B4%E8%BF%9E%E7%BB%AD" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">一致连续</span></a>&rdquo;的概念，我不太了解，这里摘录Wiki的部分内容：<br />
<span style="color:#800000;">一致连续性描述定义在一定度量空间上的函数的性质。与连续性刻画函数在局部的性质不同，一致连续刻画的是函数的整体性质。一致连续是比连续更苛刻的条件。一个函数在某度量空间上一致连续，则其在此度量空间上必然连续，但反之未必成立。直观上，一致连续可以理解为，当自变量 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9dd4e461268c8034f5c8564e155c67a6.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="x" /></span><script type='math/tex'>x</script> 在足够小的范围内变动时，函数值 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_415290769594460e2e485922904f345d.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="y" /></span><script type='math/tex'>y</script> 的变动也会被限制在足够小的范围内。</span><br />
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="http://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">http://www.codelast.com/</span></a><br />
<span style="background-color:#dda0dd;">②</span>适用于使用不精确line search技术的算法<br />
设最优化算法产生的点序列为 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_f9ea7f736dbc8e9f8a7c5b31c8270c71.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\{ {x_k}\} ,\{ f({x_k})\} " /></span><script type='math/tex'>\{ {x_k}\} ,\{ f({x_k})\} </script> ，对任意 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_ac7f25d9bbac3e3961c83fe88775b02d.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_0} \in {R^n}" /></span><script type='math/tex'>{x_0} \in {R^n}</script> ，目标函数的梯度 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_e84fec1e074026d6fa8e3155482c35c3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="g(x)" /></span><script type='math/tex'>g(x)</script> 在 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c9c10d5a8789190da478ba4da8fa4cdc.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{R^n}" /></span><script type='math/tex'>{R^n}</script> 上<span style="color:#ff0000;">Lipschitz连续</span>，若line search的步长 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_501cb8ba16bc463c7329e28f3ec226a7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{\alpha _k}" /></span><script type='math/tex'>{\alpha _k}</script> 满足<a href="http://www.codelast.com/?p=7320" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">Wolfe-Powell准则</span></a>，搜索方向 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_66eea6bfeea7fcb327d435f627a2390b.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{d_k}" /></span><script type='math/tex'>{d_k}</script> 与 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_be9976a20363f7c49bb370084b76dca7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt=" - {g_k}" /></span><script type='math/tex'> - {g_k}</script> 的夹角满足前面所说的搜索方向条件，那么，必然会发生下面3种情况中的一种：</p>
<div>
	<span style="color:#0000ff;">（1）</span>存在某个有限的 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_8ce4b16b22b58894aa86c421e8759df3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="k" /></span><script type='math/tex'>k</script> ，使得 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_081b7d4d8eb6434a34715da9096a7110.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k} = 0" /></span><script type='math/tex'>{g_k} = 0</script> </div>
<div>
	<span style="color:#0000ff;">（2）</span> <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a6f8668d7f88cb2891cdebfdc4d05bd8.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_k}) \to - \infty " /></span><script type='math/tex'>f({x_k}) \to - \infty </script> </div>
<div>
	<span style="color:#0000ff;">（3）</span><span style="color:#ff0000;"> <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a1cf4b8722569ad1ba6ff89c33fde2d7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{g_k} \to 0" /></span><script type='math/tex'>{g_k} \to 0</script> </span><br />
	和上面一样，书上说，（3）是最常见的情况，（1）和（2）很少出现。对（3）的含义的解释，还是请看上面。</p>
<p>	这里又出现了一个新名词：<a href="http://zh.wikipedia.org/zh/%E5%88%A9%E6%99%AE%E5%B8%8C%E8%8C%A8%E9%80%A3%E7%BA%8C" target="_blank" rel="noopener noreferrer"><span style="background-color:#ffa07a;">Lipschitz（利普希茨）连续</span></a>。很抱歉，这个我还是不懂（数学不好的人泪奔）。但是从Wiki的解释，我们仍可以看出个大概来：<br />
	<span style="color:#800000;">符合利普希茨条件的函数一致连续，也连续。直觉上，利普希茨连续函数限制了函数改变的速度。</span></p>
<p>	我感觉，利普希茨连续是比&ldquo;一致连续&rdquo;更强的条件。我从《数学分析（上）第四章》里看到一个结论：由函数 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_50bbd36e1fd2333108437a2ca378be62.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f(x)" /></span><script type='math/tex'>f(x)</script> 在区间 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_dd7536794b63bf90eccfd37f9b147d7f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="I" /></span><script type='math/tex'>I</script> 上Lipschitz连续可得： <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_50bbd36e1fd2333108437a2ca378be62.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f(x)" /></span><script type='math/tex'>f(x)</script> 在 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_dd7536794b63bf90eccfd37f9b147d7f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="I" /></span><script type='math/tex'>I</script> 上一致连续。</p>
<p>	有人会说，为什么不精确的一维搜索需要一个&ldquo;更强&rdquo;的连续条件啊？我猜是不是由于它是不精确的，所以满足的条件就需要强一些才能达到收敛？当然，这只是直观猜测，谁来给补充一下吧。<br />
	<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
	<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;版权声明&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
	转载需注明出处：<u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
	感谢关注我的微信公众号（微信扫一扫）：</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
		<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="width: 200px; height: 200px;" /></p>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b%e4%bd%bf%e7%94%a8%e4%b8%80%e7%bb%b4%e6%90%9c%e7%b4%a2line-search%e7%9a%84%e7%ae%97%e6%b3%95%e7%9a%84%e6%94%b6%e6%95%9b%e6%80%a7/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
