<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>underflow &#8211; 编码无悔 /  Intent &amp; Focused</title>
	<atom:link href="https://www.codelast.com/tag/underflow/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.codelast.com</link>
	<description>The Road to Optimization</description>
	<lastBuildDate>Mon, 27 Apr 2020 16:48:43 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>[Original] How to Prevent Overflow and Underflow in the softmax Function</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%a6%82%e4%bd%95%e9%98%b2%e6%ad%a2softmax%e5%87%bd%e6%95%b0%e4%b8%8a%e6%ba%a2%e5%87%baoverflow%e5%92%8c%e4%b8%8b%e6%ba%a2%e5%87%baunderflow/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%a6%82%e4%bd%95%e9%98%b2%e6%ad%a2softmax%e5%87%bd%e6%95%b0%e4%b8%8a%e6%ba%a2%e5%87%baoverflow%e5%92%8c%e4%b8%8b%e6%ba%a2%e5%87%baunderflow/#comments</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Sat, 11 Mar 2017 17:22:44 +0000</pubDate>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[原创]]></category>
		<category><![CDATA[overflow]]></category>
		<category><![CDATA[softmax]]></category>
		<category><![CDATA[underflow]]></category>
		<category><![CDATA[上溢出]]></category>
		<category><![CDATA[下溢出]]></category>
		<guid isPermaLink="false">http://www.codelast.com/?p=9211</guid>

					<description><![CDATA[<p>Chapter 4, "Numerical Computation", of <a href="http://www.deeplearningbook.org/" rel="noopener noreferrer" target="_blank">Deep Learning</a> (Ian Goodfellow &#38; Yoshua Bengio &#38; Aaron Courville) discusses how overflow and underflow affect numerical computation, using the softmax and log softmax functions as examples. Here I summarize that discussion in more detail.<br />
<span id="more-9211"></span><br />
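<span style="background-color:#00ff00;">『1』</span>What are <span style="color: rgb(0, 0, 255);">underflow</span> and <span style="color: rgb(0, 0, 255);">overflow</span>?<br />
A computer stores real numbers in binary with a finite number of bits, so most values are only approximations. When a value is too close to zero to be represented, it is rounded to 0; this is underflow. Any later operation that relies on the value being nonzero (for example, dividing by it) then goes wrong.<br />
Conversely, when the magnitude of a value is too large to be represented, the result is overflow.</p>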
<p><span style="background-color:#00ff00;">『2』</span>What is the softmax function?<br />
The softmax function is defined as:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1506d61211b9a5466d9421954776fde3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f{(x)_i} = \frac{{{e^{{x_i}}}}}{{\sum\limits_{j = 1}^n {{e^{{x_j}}}} }},i = 1,2,...,n" /></span><script type='math/tex'>f{(x)_i} = \frac{{{e^{{x_i}}}}}{{\sum\limits_{j = 1}^n {{e^{{x_j}}}} }},i = 1,2,...,n</script> <br />
The formula alone does not make the meaning very clear, so I borrow a figure from <a href="https://www.zhihu.com/question/23765351" rel="noopener noreferrer" target="_blank"><span style="background-color:#ffa07a;">Zhihu</span></a> to illustrate it (thanks to the original author):</p>
<p><a href="http://www.codelast.com" rel="noopener noreferrer" target="_blank"><img decoding="async" alt="softmax function" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/softmax.jpg" style="width: 600px; height: 336px;" /></a></p>
<p>The figure shows very clearly what the softmax function does; a picture is worth a thousand words.<br />
<span style="color: rgb(255, 255, 255);">Article source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
<span style="background-color:#00ff00;">『2』</span>Problems when computing softmax<br />
Computing softmax usually causes no trouble. For example, when every x<sub>i</sub> in the softmax expression equals the same &#8220;moderately sized&#8221; value c, that is, in the figure above,  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9386b124bb9a0ea80843f97b974c387f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{z_1} = {z_2} = {z_3} = c" /></span><script type='math/tex'>{z_1} = {z_2} = {z_3} = c</script> , the computed outputs are  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_4719b045cf7e43b9e9c4ec40b1b0cfda.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{y_1} = {y_2} = {y_3} = \frac{1}{3}" /></span><script type='math/tex'>{y_1} = {y_2} = {y_3} = \frac{1}{3}</script> .<br />
In some situations, however, the computation breaks down:</p>
<ul>
<li>
		c is extremely large, so computing the numerator  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_0b1e10e461ff30c53b0f45ab4ba9eb00.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{e^c}" /></span><script type='math/tex'>{e^c}</script>  overflows</li>
<li>
		c is negative and&#160; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_789cb215a4afd57b77d138cb09d2a0f7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\left&#124; c \right&#124;" /></span><script type='math/tex'>\left&#124; c \right&#124;</script>  is very large, so the denominator is a tiny positive number that may be rounded to 0, causing underflow</li>
</ul>
<p>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%a6%82%e4%bd%95%e9%98%b2%e6%ad%a2softmax%e5%87%bd%e6%95%b0%e4%b8%8a%e6%ba%a2%e5%87%baoverflow%e5%92%8c%e4%b8%8b%e6%ba%a2%e5%87%baunderflow/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<p>Chapter 4, "Numerical Computation", of <a href="http://www.deeplearningbook.org/" rel="noopener noreferrer" target="_blank">Deep Learning</a> (Ian Goodfellow &amp; Yoshua Bengio &amp; Aaron Courville) discusses how overflow and underflow affect numerical computation, using the softmax and log softmax functions as examples. Here I summarize that discussion in more detail.<br />
<span id="more-9211"></span><br />
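<span style="background-color:#00ff00;">『1』</span>What are <span style="color: rgb(0, 0, 255);">underflow</span> and <span style="color: rgb(0, 0, 255);">overflow</span>?<br />
A computer stores real numbers in binary with a finite number of bits, so most values are only approximations. When a value is too close to zero to be represented, it is rounded to 0; this is underflow. Any later operation that relies on the value being nonzero (for example, dividing by it) then goes wrong.<br />
Conversely, when the magnitude of a value is too large to be represented, the result is overflow.</p>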
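<p>Both effects are easy to reproduce. The following is a minimal sketch, assuming NumPy and 64-bit floats (neither is specified in the text above); the inputs -1000 and 1000 are arbitrary values chosen only because they lie far outside the range that a 64-bit float can exponentiate.</p>

```python
import numpy as np

# Assumption: IEEE 754 double precision (np.float64), whose positive range
# runs from roughly 5e-324 up to roughly 1.8e308.
with np.errstate(over="ignore", under="ignore", divide="ignore"):
    tiny = np.exp(np.float64(-1000.0))  # true value is positive, but it is rounded to 0.0
    huge = np.exp(np.float64(1000.0))   # true value is finite, but it overflows to inf
    ratio = np.float64(1.0) / tiny      # dividing by the underflowed value yields inf

print(tiny, huge, ratio)
```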
<p><span style="background-color:#00ff00;">『2』</span>What is the softmax function?<br />
The softmax function is defined as:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1506d61211b9a5466d9421954776fde3.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f{(x)_i} = \frac{{{e^{{x_i}}}}}{{\sum\limits_{j = 1}^n {{e^{{x_j}}}} }},i = 1,2,...,n" /></span><script type='math/tex'>f{(x)_i} = \frac{{{e^{{x_i}}}}}{{\sum\limits_{j = 1}^n {{e^{{x_j}}}} }},i = 1,2,...,n</script> <br />
The formula alone does not make the meaning very clear, so I borrow a figure from <a href="https://www.zhihu.com/question/23765351" rel="noopener noreferrer" target="_blank"><span style="background-color:#ffa07a;">Zhihu</span></a> to illustrate it (thanks to the original author):</p>
<p><a href="http://www.codelast.com" rel="noopener noreferrer" target="_blank"><img decoding="async" alt="softmax function" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/softmax.jpg" style="width: 600px; height: 336px;" /></a></p>
<p>The figure shows very clearly what the softmax function does; a picture is worth a thousand words.<br />
<span style="color: rgb(255, 255, 255);">Article source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
<span style="background-color:#00ff00;">『2』</span>Problems when computing softmax<br />
Computing softmax usually causes no trouble. For example, when every x<sub>i</sub> in the softmax expression equals the same &ldquo;moderately sized&rdquo; value c, that is, in the figure above,  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9386b124bb9a0ea80843f97b974c387f.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{z_1} = {z_2} = {z_3} = c" /></span><script type='math/tex'>{z_1} = {z_2} = {z_3} = c</script> , the computed outputs are  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_4719b045cf7e43b9e9c4ec40b1b0cfda.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{y_1} = {y_2} = {y_3} = \frac{1}{3}" /></span><script type='math/tex'>{y_1} = {y_2} = {y_3} = \frac{1}{3}</script> .<br />
In some situations, however, the computation breaks down:</p>
<ul>
<li>
		c is extremely large, so computing the numerator  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_0b1e10e461ff30c53b0f45ab4ba9eb00.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{e^c}" /></span><script type='math/tex'>{e^c}</script>  overflows</li>
<li>
		c is negative and&nbsp; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_789cb215a4afd57b77d138cb09d2a0f7.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\left| c \right|" /></span><script type='math/tex'>\left| c \right|</script>  is very large, so the denominator is a tiny positive number that may be rounded to 0, causing underflow</li>
</ul>
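<p>The two failure cases in the list can be sketched as follows. This is an illustration only, assuming NumPy and 64-bit floats; the helper name naive_softmax and the test values 2, 1000 and -1000 are hypothetical choices that do not come from the original article.</p>

```python
import numpy as np

def naive_softmax(x):
    """Softmax computed directly from the definition (hypothetical helper, not numerically safe)."""
    e = np.exp(np.asarray(x, dtype=np.float64))
    return e / e.sum()

with np.errstate(all="ignore"):  # silence the overflow and invalid-value warnings
    ok = naive_softmax([2.0, 2.0, 2.0])                      # moderate c: every entry is 1/3
    too_big = naive_softmax([1000.0, 1000.0, 1000.0])        # e^c overflows: inf / inf gives nan
    too_small = naive_softmax([-1000.0, -1000.0, -1000.0])   # everything underflows: 0 / 0 gives nan

print(ok, too_big, too_small)
```

<p>In both bad cases the mathematically correct answer is still 1/3 for every entry; only the floating-point evaluation fails.</p>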
<p><span style="background-color:#00ff00;">『3』</span>How to fix it<br />
So how do we avoid these problems? A single trick handles both at once:<br />
Let&nbsp; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_0d5a61c338abd8aff4445d1c25f572fc.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="M = \max ({x_i}),i = 1,2, \cdots ,n" /></span><script type='math/tex'>M = \max ({x_i}),i = 1,2, \cdots ,n</script> , that is, let M be the largest of all the  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9fc055e2c2e0857258028ea14586b4b2.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{x_i}" /></span><script type='math/tex'>{x_i}</script> . Then, instead of computing  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_182c3aad17e24f621a53493762cbab83.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f{(x)_i}" /></span><script type='math/tex'>f{(x)_i}</script> , we compute&nbsp; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_93907f78f842e2d2d8d9c0fce17d1c7c.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({x_i} - M)" /></span><script type='math/tex'>f({x_i} - M)</script> , i.e. we subtract M from every input before applying softmax. This removes both the overflow and the underflow problem, and the result is mathematically identical to  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_182c3aad17e24f621a53493762cbab83.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f{(x)_i}" /></span><script type='math/tex'>f{(x)_i}</script> .<br />
A concrete example, using the figure above again: originally we would compute&nbsp; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_97f45a0c17c2a4a6cb57514984dda5b6.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="f({z_2})" /></span><script type='math/tex'>f({z_2})</script>  the &ldquo;ordinary&rdquo; way:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_3ae3a1b3649531949433c41bb96344fc.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\frac{{{e^{{z_2}}}}}{{{e^{{z_1}}} + {e^{{z_2}}} + {e^{{z_3}}}}} = \frac{{{e^1}}}{{{e^3} + {e^1} + {e^{ - 3}}}} = \frac{{2.7}}{{20 + 2.7 + 0.05}} \approx 0.12" /></span><script type='math/tex'>\frac{{{e^{{z_2}}}}}{{{e^{{z_1}}} + {e^{{z_2}}} + {e^{{z_3}}}}} = \frac{{{e^1}}}{{{e^3} + {e^1} + {e^{ - 3}}}} = \frac{{2.7}}{{20 + 2.7 + 0.05}} \approx 0.12</script> <br />
Now we compute instead:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_9c2d838566cf61705acd0d094d09ca81.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\frac{{{e^{({z_2} - M)}}}}{{{e^{({z_1} - M)}} + {e^{({z_2} - M)}} + {e^{({z_3} - M)}}}} = \frac{{{e^{(1 - 3)}}}}{{{e^{(3 - 3)}} + {e^{(1 - 3)}} + {e^{( - 3 - 3)}}}} \approx 0.12" /></span><script type='math/tex'>\frac{{{e^{({z_2} - M)}}}}{{{e^{({z_1} - M)}} + {e^{({z_2} - M)}} + {e^{({z_3} - M)}}}} = \frac{{{e^{(1 - 3)}}}}{{{e^{(3 - 3)}} + {e^{(1 - 3)}} + {e^{( - 3 - 3)}}}} \approx 0.12</script> <br />
where  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_a3769b09e6be65ab8971d2c1331435af.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="M = 3" /></span><script type='math/tex'>M = 3</script>  is the largest of&nbsp; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_353cc7216665e636da2353e624e435b6.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="{z_1},{z_2},{z_3}" /></span><script type='math/tex'>{z_1},{z_2},{z_3}</script> .<br />
The result is unchanged.<br />
Why does this work? A little algebra reveals the &ldquo;secret&rdquo;:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_c29567cfa01a4957042af2d8056e59f2.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\frac{{{e^{{z_2}}}}}{{{e^{{z_1}}} + {e^{{z_2}}} + {e^{{z_3}}}}} = \frac{{\frac{{{e^{{z_2}}}}}{{{e^M}}}}}{{\frac{{{e^{{z_1}}} + {e^{{z_2}}} + {e^{{z_3}}}}}{{{e^M}}}}} = \frac{{\frac{{{e^{{z_2}}}}}{{{e^M}}}}}{{\frac{{{e^{{z_1}}}}}{{{e^M}}} + \frac{{{e^{{z_2}}}}}{{{e^M}}} + \frac{{{e^{{z_3}}}}}{{{e^M}}}}} = \frac{{{e^{\left( {{z_2} - M} \right)}}}}{{{e^{\left( {{z_1} - M} \right)}} + {e^{\left( {{z_2} - M} \right)}} + {e^{\left( {{z_3} - M} \right)}}}}" /></span><script type='math/tex'>\frac{{{e^{{z_2}}}}}{{{e^{{z_1}}} + {e^{{z_2}}} + {e^{{z_3}}}}} = \frac{{\frac{{{e^{{z_2}}}}}{{{e^M}}}}}{{\frac{{{e^{{z_1}}} + {e^{{z_2}}} + {e^{{z_3}}}}}{{{e^M}}}}} = \frac{{\frac{{{e^{{z_2}}}}}{{{e^M}}}}}{{\frac{{{e^{{z_1}}}}}{{{e^M}}} + \frac{{{e^{{z_2}}}}}{{{e^M}}} + \frac{{{e^{{z_3}}}}}{{{e^M}}}}} = \frac{{{e^{\left( {{z_2} - M} \right)}}}}{{{e^{\left( {{z_1} - M} \right)}} + {e^{\left( {{z_2} - M} \right)}} + {e^{\left( {{z_3} - M} \right)}}}}</script> <br />
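After this transformation, every x<sub>i</sub> has M subtracted from it, so the largest exponent of e is 0 and overflow cannot occur. At the same time, the denominator contains at least one term equal to 1, so it cannot underflow (be rounded to 0).<br />
So there is nothing sophisticated about this trick.<br />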
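</p>
<p>The max-subtraction trick can be sketched in a few lines. This is a minimal illustration, assuming NumPy and 64-bit floats; the helper name stable_softmax is hypothetical, and the inputs 1000 and -1000 are arbitrary extreme values. The first call reuses the values 3, 1 and -3 from the worked example.</p>

```python
import numpy as np

def stable_softmax(x):
    """Softmax with the maximum subtracted first (hypothetical helper name)."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.max()   # the largest exponent is now 0, so exp cannot overflow
    e = np.exp(shifted)     # the entry at the maximum is exactly 1
    return e / e.sum()      # the denominator is at least 1, so it cannot round to 0

print(stable_softmax([3.0, 1.0, -3.0]))             # middle entry is about 0.12
print(stable_softmax([1000.0, 1000.0, 1000.0]))     # every entry is 1/3 rather than nan
print(stable_softmax([-1000.0, -1000.0, -1000.0]))  # every entry is 1/3 rather than nan
```
<p>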
<span style="background-color:#00ff00;">『4』</span>A further problem<br />
The case looks closed, but one problem remains: the numerator of softmax can still underflow. When some x<sub>i</sub> is far smaller than M, e<sup>x<sub>i</sub> - M</sup> may be rounded to 0, so the softmax value itself evaluates to 0. If we then take the log of that result, i.e. compute log softmax, we are in effect computing&nbsp; <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_f28430e76ce080a889f0bdd64591d94a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\log (0)" /></span><script type='math/tex'>\log (0)</script>  and get  <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_8ce04232c352cfa94e22f923f71fa1dc.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt=" - \infty " /></span><script type='math/tex'> - \infty </script> . This is wrong: the true value is finite, and the infinity is purely an artifact of rounding error.<br />
Is there a way to fix this as well?<br />
Yes: compute the log softmax value with a strategy similar to the one above:<br />
 <span class='MathJax_Preview'><img src='https://www.codelast.com/wp-content/plugins/latex/cache/tex_1ae2bb47902237d025410897d8644e12.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="\log [f({x_i})] = \log \left( {\frac{{{e^{{x_i}}}}}{{{e^{{x_1}}} + {e^{{x_2}}} + \cdots {e^{{x_n}}}}}} \right) = \log \left( {\frac{{\frac{{{e^{{x_i}}}}}{{{e^M}}}}}{{\frac{{{e^{{x_1}}}}}{{{e^M}}} + \frac{{{e^{{x_2}}}}}{{{e^M}}} + \cdots \frac{{{e^{{x_n}}}}}{{{e^M}}}}}} \right) = \log \left( {\frac{{{e^{\left( {{x_i} - M} \right)}}}}{{\sum\limits_j^n {{e^{\left( {{x_j} - M} \right)}}} }}} \right) = \log \left( {{e^{\left( {{x_i} - M} \right)}}} \right) - \log \left( {\sum\limits_j^n {{e^{\left( {{x_j} - M} \right)}}} } \right) = \left( {{x_i} - M} \right) - \log \left( {\sum\limits_j^n {{e^{\left( {{x_j} - M} \right)}}} } \right)" /></span><script type='math/tex'>\log [f({x_i})] = \log \left( {\frac{{{e^{{x_i}}}}}{{{e^{{x_1}}} + {e^{{x_2}}} + \cdots {e^{{x_n}}}}}} \right) = \log \left( {\frac{{\frac{{{e^{{x_i}}}}}{{{e^M}}}}}{{\frac{{{e^{{x_1}}}}}{{{e^M}}} + \frac{{{e^{{x_2}}}}}{{{e^M}}} + \cdots \frac{{{e^{{x_n}}}}}{{{e^M}}}}}} \right) = \log \left( {\frac{{{e^{\left( {{x_i} - M} \right)}}}}{{\sum\limits_j^n {{e^{\left( {{x_j} - M} \right)}}} }}} \right) = \log \left( {{e^{\left( {{x_i} - M} \right)}}} \right) - \log \left( {\sum\limits_j^n {{e^{\left( {{x_j} - M} \right)}}} } \right) = \left( {{x_i} - M} \right) - \log \left( {\sum\limits_j^n {{e^{\left( {{x_j} - M} \right)}}} } \right)</script> <br />
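As you can see, the source of underflow has been eliminated from the final expression: at least one term in the sum equals 1, so the argument of the log cannot underflow, and the log(0) disaster never happens.<br />
Many numerical computing libraries use this kind of technique to stay numerically stable.<br />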
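</p>
<p>The final expression of the derivation can be compared with the two-step approach (softmax first, then log). This is only a sketch, assuming NumPy and 64-bit floats; the helper name stable_log_softmax and the input [0, -1000] are hypothetical and chosen so that the smaller numerator underflows.</p>

```python
import numpy as np

def stable_log_softmax(x):
    """log softmax as (x_i - M) - log(sum_j exp(x_j - M)); hypothetical helper name."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.max()
    # The sum contains at least one term equal to 1, so the log argument is at least 1.
    return shifted - np.log(np.exp(shifted).sum())

x = np.array([0.0, -1000.0])

# Two-step approach: even with the maximum subtracted, exp(-1000) underflows to 0,
# so the second probability is 0 and its log is -inf.
with np.errstate(divide="ignore"):
    e = np.exp(x - x.max())
    two_step = np.log(e / e.sum())

direct = stable_log_softmax(x)
print(two_step)  # second entry is -inf
print(direct)    # second entry is -1000.0
```
<p>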
<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
Reproduction must credit the source: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%a6%82%e4%bd%95%e9%98%b2%e6%ad%a2softmax%e5%87%bd%e6%95%b0%e4%b8%8a%e6%ba%a2%e5%87%baoverflow%e5%92%8c%e4%b8%8b%e6%ba%a2%e5%87%baunderflow/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
	</channel>
</rss>
