<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>too many open files &#8211; 编码无悔 /  Intent &amp; Focused</title>
	<atom:link href="https://www.codelast.com/tag/too-many-open-files/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.codelast.com</link>
	<description>The Road to Optimization</description>
	<lastBuildDate>Mon, 27 Apr 2020 17:51:22 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>[Original] Fixing errors on Linux caused by a program holding too many file descriptors</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3linux%e7%b3%bb%e7%bb%9f%e4%b8%8a%e7%94%b1%e4%ba%8e%e7%a8%8b%e5%ba%8f%e5%8d%a0%e7%94%a8%e7%9a%84%e6%96%87%e4%bb%b6%e6%8f%8f%e8%bf%b0%e7%ac%a6file-descriptor/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3linux%e7%b3%bb%e7%bb%9f%e4%b8%8a%e7%94%b1%e4%ba%8e%e7%a8%8b%e5%ba%8f%e5%8d%a0%e7%94%a8%e7%9a%84%e6%96%87%e4%bb%b6%e6%8f%8f%e8%bf%b0%e7%ac%a6file-descriptor/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Mon, 28 Aug 2017 17:06:30 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Original]]></category>
		<category><![CDATA[file descriptor]]></category>
		<category><![CDATA[too many open files]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=9824</guid>

					<description><![CDATA[<div>
	&#160;</div>
<div>
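	A few days ago I noticed that a Java program on one of our servers was misbehaving: it ran extremely &#8220;slowly&#8221;. A closer look at its log showed exceptions reporting &#8220;too many open files&#8221;. Reading or writing a file and opening a network connection both consume a file descriptor (fd), so I suspected that some program on the server was not releasing its resources and had hit the configured limit, which caused the errors.<br />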
	<span id="more-9824"></span><br />
	<span style="background-color:#00ff00;">『1』</span>Check the open files limit<br />
	The following command prints the limit that applies to the current shell and the processes it starts:
<pre style="margin-top: 0px; margin-bottom: 0px; font-stretch: normal; font-size: 0.9333em; line-height: 1.5em; font-family: Consolas, &#34;Lucida Console&#34;, &#34;DejaVu Sans Mono&#34;, Monaco, &#34;Courier New&#34;, monospace; background: rgb(0, 34, 64); color: rgb(255, 255, 255);">
<span style="color: rgb(255, 176, 84);">ulimit</span> -n</pre>
<p>	<span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
	<span style="background-color:#00ff00;">『2』</span>Find the processes holding the most fds<br />
	Run the following command as root to print the number of fds held by each process, in descending order:</p></div>
<div>
<pre style="margin-top: 0px; margin-bottom: 0px; font-stretch: normal; font-size: 0.9333em; line-height: 1.5em; font-family: Consolas, &#34;Lucida Console&#34;, &#34;DejaVu Sans Mono&#34;, Monaco, &#34;Courier New&#34;, monospace; background: rgb(0, 34, 64); color: rgb(255, 255, 255);">
lsof -n <span style="color: rgb(255, 157, 0);">&#124;</span> awk <span style="color: rgb(58, 217, 0);">&#39;</span>{print $2}<span style="color: rgb(58, 217, 0);">&#39;</span> <span style="color: rgb(255, 157, 0);">&#124;</span> sort <span style="color: rgb(255, 157, 0);">&#124;</span> uniq -c <span style="color: rgb(255, 157, 0);">&#124;</span> sort -nr <span style="color: rgb(255, 157, 0);">&#124;</span> more</pre>
<p>	Part of the output looks like this:</p></div>
<div>
<blockquote>
<div>
			&#160; 12520 16485</div>
<div>
			&#160; &#160; 125 7054</div>
<div>
			&#160; &#160; &#160;69 20120</div>
<div>
			&#160; &#160; &#160;69 15291</div>
<div>
			&#160; &#160; &#160;65 20113</div>
<div>
			&#160; &#160; &#160;65 15284</div>
<div>
			&#160; &#160; &#160;57 19774</div>
<div>
			......</div></blockquote></div>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3linux%e7%b3%bb%e7%bb%9f%e4%b8%8a%e7%94%b1%e4%ba%8e%e7%a8%8b%e5%ba%8f%e5%8d%a0%e7%94%a8%e7%9a%84%e6%96%87%e4%bb%b6%e6%8f%8f%e8%bf%b0%e7%ac%a6file-descriptor/" class="read-more">Read More </a>]]></description>
										<content:encoded><![CDATA[<div>
	&nbsp;</div>
<div>
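	A few days ago I noticed that a Java program on one of our servers was misbehaving: it ran extremely &ldquo;slowly&rdquo;. A closer look at its log showed exceptions reporting &ldquo;too many open files&rdquo;. Reading or writing a file and opening a network connection both consume a file descriptor (fd), so I suspected that some program on the server was not releasing its resources and had hit the configured limit, which caused the errors.<br />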
	<span id="more-9824"></span><br />
	<span style="background-color:#00ff00;">『1』</span>Check the open files limit<br />
	The following command prints the limit that applies to the current shell and the processes it starts:</p>
<pre style="margin-top: 0px; margin-bottom: 0px; font-stretch: normal; font-size: 0.9333em; line-height: 1.5em; font-family: Consolas, &quot;Lucida Console&quot;, &quot;DejaVu Sans Mono&quot;, Monaco, &quot;Courier New&quot;, monospace; background: rgb(0, 34, 64); color: rgb(255, 255, 255);">
<span style="color: rgb(255, 176, 84);">ulimit</span> -n</pre>
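<div>
	The limit reported by ulimit -n belongs to the shell it runs in, which can differ from the limit a long-running service was started with. As a minimal sketch, assuming a Linux system with /proc mounted, the helper below (nofile_limits is a made-up name, not a standard tool) reads the soft and hard limits of one specific process:</div>

```shell
# nofile_limits PID: print the soft and hard "Max open files" limits of a
# running process. The function name is illustrative only.
nofile_limits() {
    # /proc/PID/limits has one row per limit; in the "Max open files" row,
    # fields 4 and 5 are the soft and hard values.
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Example: limits of the current shell.
nofile_limits "$$"
```

<div>
	For the process examined later in this post, the call would be nofile_limits 16485.</div>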
<p>	<span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" target="_blank" rel="noopener noreferrer"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
	<span style="background-color:#00ff00;">『2』</span>Find the processes holding the most fds<br />
	Run the following command as root to print the number of fds held by each process, in descending order:</div>
<div>
<pre style="margin-top: 0px; margin-bottom: 0px; font-stretch: normal; font-size: 0.9333em; line-height: 1.5em; font-family: Consolas, &quot;Lucida Console&quot;, &quot;DejaVu Sans Mono&quot;, Monaco, &quot;Courier New&quot;, monospace; background: rgb(0, 34, 64); color: rgb(255, 255, 255);">
lsof -n <span style="color: rgb(255, 157, 0);">|</span> awk <span style="color: rgb(58, 217, 0);">&#39;</span>{print $2}<span style="color: rgb(58, 217, 0);">&#39;</span> <span style="color: rgb(255, 157, 0);">|</span> sort <span style="color: rgb(255, 157, 0);">|</span> uniq -c <span style="color: rgb(255, 157, 0);">|</span> sort -nr <span style="color: rgb(255, 157, 0);">|</span> more</pre>
<p>	Part of the output looks like this:</p></div>
<div>
<blockquote>
<div>
			&nbsp; 12520 16485</div>
<div>
			&nbsp; &nbsp; 125 7054</div>
<div>
			&nbsp; &nbsp; &nbsp;69 20120</div>
<div>
			&nbsp; &nbsp; &nbsp;69 15291</div>
<div>
			&nbsp; &nbsp; &nbsp;65 20113</div>
<div>
			&nbsp; &nbsp; &nbsp;65 15284</div>
<div>
			&nbsp; &nbsp; &nbsp;57 19774</div>
<div>
			......</div>
</blockquote>
</div>
<div>
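	The first column is the number of fds held and the second is the process pid. The processes at the top of the list, which hold the most fds, are clearly the prime suspects.</div>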
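<div>
	lsof also lists entries that are not file descriptors, such as the working directory, the program text and memory-mapped files, so its per-process counts are approximate. As a rough alternative, assuming Linux with /proc mounted, the sketch below counts the entries in /proc/PID/fd directly. The function name fd_counts is made up for this example:</div>

```shell
# fd_counts [N]: print "COUNT PID COMMAND" for the N processes (default 10)
# with the most open fds. Run as root to see every process; otherwise only
# processes whose fd directory is readable are counted.
fd_counts() {
    for dir in /proc/[0-9]*; do
        # Skip processes we may not inspect or that have already exited.
        [ -r "$dir/fd" ] || continue
        pid=${dir#/proc/}
        count=$(ls -A "$dir/fd" 2>/dev/null | wc -l)
        comm=$(cat "$dir/comm" 2>/dev/null) || continue
        printf '%s %s %s\n' "$count" "$pid" "$comm"
    done | sort -nr | head -n "${1:-10}"
}

# Example: the five largest fd holders.
fd_counts 5
```

<div>
	Reading /proc also avoids the cost of a full lsof run, which can be slow on a busy server.</div>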
<div>
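	<span style="background-color:#00ff00;">『3』</span>Find out what the suspect program is actually doing<br />
	Start with the first process in the list; if it turns out to be the culprit, there is no need to check the rest.</div>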
<div>
<pre style="margin-top: 0px; margin-bottom: 0px; font-stretch: normal; font-size: 0.9333em; line-height: 1.5em; font-family: Consolas, &quot;Lucida Console&quot;, &quot;DejaVu Sans Mono&quot;, Monaco, &quot;Courier New&quot;, monospace; background: rgb(0, 34, 64); color: rgb(255, 255, 255);">
ls -l /proc/16485/fd</pre>
</div>
<div>
	Part of the output looks like this:</p>
<blockquote>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9992 -&gt; socket:[547491750]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9993 -&gt; socket:[547491752]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9994 -&gt; socket:[547491753]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9995 -&gt; socket:[547491754]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9996 -&gt; socket:[547491755]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9997 -&gt; socket:[547491756]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9998 -&gt; socket:[547491757]</div>
<div>
			lrwx------ 1 root root 64 Aug 24 11:51 9999 -&gt; socket:[547491758]</div>
</blockquote>
</div>
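<div>
	With thousands of entries, a per-type summary is easier to read than the raw listing. The sketch below groups the symlink targets in /proc/PID/fd by kind. It assumes Linux with /proc mounted, and fd_types is a made-up name for this example:</div>

```shell
# fd_types PID: count the open fds of a process by kind of target.
fd_types() {
    for fd in /proc/"$1"/fd/*; do
        # Each entry is a symlink such as "socket:[547491750]" or a file path.
        target=$(readlink "$fd") || continue
        case $target in
            socket:*)     echo socket ;;
            pipe:*)       echo pipe ;;
            anon_inode:*) echo anon_inode ;;
            *)            echo file ;;
        esac
    done | sort | uniq -c | sort -nr
}

# Example: summarize the fds of the current shell.
fd_types "$$"
```

<div>
	A process that leaks connections would show a socket count far larger than every other kind.</div>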
<div>
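	The listing from /proc/16485/fd is in fact very long; only a small part is pasted here to save space. The entries are sockets, but the listing does not show at a glance what they are connected to or what the program is doing with them, so we can try another approach:</div>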
<div>
<pre style="margin-top: 0px; margin-bottom: 0px; font-stretch: normal; font-size: 0.9333em; line-height: 1.5em; font-family: Consolas, &quot;Lucida Console&quot;, &quot;DejaVu Sans Mono&quot;, Monaco, &quot;Courier New&quot;, monospace; background: rgb(0, 34, 64); color: rgb(255, 255, 255);">
lsof -p 16485</pre>
</div>
<div>
	Again, only a small part of the output is shown:</p>
<div>
<blockquote>
<div>
				java &nbsp; &nbsp;16485 root *500u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501909 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:57700-&gt;test1.abc.com:50010 (CLOSE_WAIT)</div>
<div>
				java &nbsp; &nbsp;16485 root *501u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501910 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:targus-getdata-&gt;test2.abc.com:50010 (CLOSE_WAIT)</div>
<div>
				java &nbsp; &nbsp;16485 root *502u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501911 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:59671-&gt;test3.abc.com:50010 (CLOSE_WAIT)</div>
<div>
				java &nbsp; &nbsp;16485 root *503u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501939 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:55784-&gt;test4.abc.com:50010 (CLOSE_WAIT)</div>
<div>
				java &nbsp; &nbsp;16485 root *504u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501942 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:netsupport-&gt;test5.abc.com:50010 (CLOSE_WAIT)</div>
<div>
				java &nbsp; &nbsp;16485 root *505u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501995 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:58486-&gt;test6.abc.com:50010 (CLOSE_WAIT)</div>
<div>
				java &nbsp; &nbsp;16485 root *506u &nbsp;IPv4 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;547501996 &nbsp; &nbsp; &nbsp; &nbsp;0t0 &nbsp; &nbsp; &nbsp; TCP abc.abc.com:38031-&gt;test7.abc.com:50010 (CLOSE_WAIT)</div>
</blockquote></div>
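<div>
	Output like this can be condensed by grouping the connections by TCP state and remote port. The following is only a sketch: tcp_summary is a made-up name, and it assumes the usual lsof column layout in which the last three fields are TCP, the connection name and the state in parentheses:</div>

```shell
# tcp_summary: read `lsof -p PID` output on stdin and print
# "COUNT STATE REMOTE_PORT" for each group of TCP connections, largest first.
tcp_summary() {
    awk '$(NF-2) == "TCP" && $(NF-1) ~ /->/ && $NF ~ /^\(.*\)$/ {
        state = $NF
        gsub(/[()]/, "", state)          # "(CLOSE_WAIT)" becomes "CLOSE_WAIT"
        n = split($(NF-1), parts, ":")   # name is local:port->remote:port
        print state, parts[n]            # the last piece is the remote port
    }' | sort | uniq -c | sort -nr
}

# Usage (requires lsof; -n and -P keep addresses and ports numeric):
#   lsof -n -P -p 16485 | tcp_summary
```

<div>
	Passing -P matters here because lsof otherwise replaces some port numbers with service names, as with targus-getdata and netsupport in the listing above.</div>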
<div>
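		Every connection is in the CLOSE_WAIT state, which means the remote side has closed the connection but the local program has not yet closed its socket. The test*.abc.com hosts here are servers in a Hadoop cluster, so the cause was immediately clear: my program reads and writes HDFS files, and it was very likely failing to close those resources afterwards, so the number of fds it held kept growing.<br />
		I checked the code and, sure enough, it used a BufferedReader to read HDFS files frequently and never called close() on it when done. After fixing that I tested again, and the problem was solved!</p>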
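<div>
	One way to confirm a fix like this is to sample the fd count of the process over time: a leaking process grows steadily, while a healthy one levels off. The sketch below assumes Linux with /proc mounted, and watch_fd_count is a made-up name:</div>

```shell
# watch_fd_count PID [SAMPLES] [INTERVAL]: print "TIME COUNT" once per
# interval (in seconds) so that growth in the fd count becomes visible.
watch_fd_count() {
    pid=$1
    samples=${2:-5}
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Stop early if the process has exited.
        [ -d "/proc/$pid/fd" ] || { echo "process $pid is gone" >&2; return 1; }
        printf '%s %s\n' "$(date +%T)" "$(ls -A "/proc/$pid/fd" | wc -l)"
        i=$((i + 1))
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example: three samples of the current shell, one second apart.
watch_fd_count "$$" 3 1
```

<div>
	For a long-running service, a larger interval such as 60 seconds gives a clearer trend.</div>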
<p>		<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
		Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
		Thanks for following my WeChat official account (scan the QR code with WeChat):</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
			<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="width: 200px; height: 200px;" /></p>
</p></div>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3linux%e7%b3%bb%e7%bb%9f%e4%b8%8a%e7%94%b1%e4%ba%8e%e7%a8%8b%e5%ba%8f%e5%8d%a0%e7%94%a8%e7%9a%84%e6%96%87%e4%bb%b6%e6%8f%8f%e8%bf%b0%e7%ac%a6file-descriptor/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
