<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Distributed Cache &#8211; 编码无悔 /  Intent &amp; Focused</title>
	<atom:link href="https://www.codelast.com/tag/distributed-cache/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.codelast.com</link>
	<description>The Road to Optimization</description>
	<lastBuildDate>Tue, 28 Jul 2020 03:27:51 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>[Original] Using the Distributed Cache in Hadoop 2.6.x</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-hadoop-2-6-x-%e4%b8%8bdistributed-cache%e7%9a%84%e7%94%a8%e6%b3%95/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-hadoop-2-6-x-%e4%b8%8bdistributed-cache%e7%9a%84%e7%94%a8%e6%b3%95/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Tue, 28 Jul 2020 03:27:51 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Original]]></category>
		<category><![CDATA[Distributed Cache]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=12744</guid>

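					<description><![CDATA[<p>A careful write-up of how a Java MapReduce job uses the distributed cache, since until now I have always just copied and pasted this code.</p>
<p><span style="background-color: rgb(255, 255, 0);">✓</span>&#160;Applicable Hadoop version<br />
CDH 5.8.0 (Hadoop 2.6.0)<br />
Other versions are untested, but later releases close to this one should work as well.</p>
<p><span style="background-color:#ffff00;">✓</span>&#160;Preparation: upload the local file to HDFS<br />
To add a file to the distributed cache from Java code, first upload it to HDFS, then use its HDFS path when adding it to the distributed cache.<br />
Suppose the file to add to the distributed cache is file.txt:</p>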
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="javascript language-javascript hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">hadoop&#160;fs&#160;-put&#160;file.txt&#160;/your/hdfs/dir/
</code></pre>
</section>
<p><span id="more-12744"></span><br />
<span style="background-color: rgb(255, 255, 0);">✓</span>&#160;job 配置</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&#160;&#160;&#160;&#160;String&#160;fileName&#160;=&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;file.txt&#34;</span>;
&#160;&#160;&#160;&#160;String&#160;filePathHdfs&#160;=&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;/your/hdfs/dir/file.txt&#34;</span>;

&#160;&#160;&#160;&#160;Job&#160;job&#160;=&#160;Job.getInstance(getConf());
&#160;&#160;&#160;&#160;Configuration&#160;conf&#160;=&#160;job.getConfiguration();
&#160;&#160;&#160;&#160;conf.set(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;fileName&#34;</span>,&#160;fileName);
&#160;&#160;&#160;&#160;conf.set(</code></pre>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-hadoop-2-6-x-%e4%b8%8bdistributed-cache%e7%9a%84%e7%94%a8%e6%b3%95/" class="read-more">Read More </a></section>]]></description>
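										<content:encoded><![CDATA[<p>A careful write-up of how a Java MapReduce job uses the distributed cache, since until now I have always just copied and pasted this code.</p>
<p><span style="background-color: rgb(255, 255, 0);">✓</span>&nbsp;Applicable Hadoop version<br />
CDH 5.8.0 (Hadoop 2.6.0)<br />
Other versions are untested, but later releases close to this one should work as well.</p>
<p><span style="background-color:#ffff00;">✓</span>&nbsp;Preparation: upload the local file to HDFS<br />
To add a file to the distributed cache from Java code, first upload it to HDFS, then use its HDFS path when adding it to the distributed cache.<br />
Suppose the file to add to the distributed cache is file.txt:</p>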
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="javascript language-javascript hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">hadoop&nbsp;fs&nbsp;-put&nbsp;file.txt&nbsp;/your/hdfs/dir/
</code></pre>
</section>
<p><span id="more-12744"></span><br />
<span style="background-color: rgb(255, 255, 0);">✓</span>&nbsp;job 配置</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;fileName&nbsp;=&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;file.txt&quot;</span>;
&nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;filePathHdfs&nbsp;=&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;/your/hdfs/dir/file.txt&quot;</span>;

&nbsp;&nbsp;&nbsp;&nbsp;Job&nbsp;job&nbsp;=&nbsp;Job.getInstance(getConf());
&nbsp;&nbsp;&nbsp;&nbsp;Configuration&nbsp;conf&nbsp;=&nbsp;job.getConfiguration();
&nbsp;&nbsp;&nbsp;&nbsp;conf.set(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;fileName&quot;</span>,&nbsp;fileName);
&nbsp;&nbsp;&nbsp;&nbsp;conf.set(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;filePathHdfs&quot;</span>,&nbsp;filePathHdfs);

&nbsp;&nbsp;&nbsp;&nbsp;job.addCacheFile(<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;URI(String.format(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;%s#%s&quot;</span>,&nbsp;filePathHdfs,&nbsp;fileName)));
</code></pre>
</section>
<p><span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
Two values are stored in conf here: the file name, fileName, and the file's HDFS path, filePathHdfs. In the addCacheFile() call they are joined into the odd-looking form&nbsp;<span style="color:#0000ff;">/your/hdfs/dir/file.txt</span><span style="color:#ff0000;">#</span><span style="color:#0000ff;">file.txt</span>. The part after the # is the name under which the file is exposed to each task, so a mapper or reducer can open the distributed cache file using the file name alone, which is very convenient.</p>
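<p>To see how the two halves of that string are told apart, here is a minimal sketch that uses only the JDK's java.net.URI, with no Hadoop involved. The class name CacheUriDemo and the helper buildCacheUri are made up for illustration, and URI.create() stands in for the new URI(...) constructor used in the job code only so that the sketch avoids a checked exception.</p>

```java
import java.net.URI;

public class CacheUriDemo {
    // Joins the HDFS path and the desired link name in the same "path#name" form
    // that the job configuration passes to addCacheFile().
    static URI buildCacheUri(String filePathHdfs, String fileName) {
        return URI.create(String.format("%s#%s", filePathHdfs, fileName));
    }

    public static void main(String[] args) {
        URI uri = buildCacheUri("/your/hdfs/dir/file.txt", "file.txt");
        // The path is everything before '#'; the fragment is everything after it.
        System.out.println("path     = " + uri.getPath());
        System.out.println("fragment = " + uri.getFragment());
    }
}
```

<p>The path component is what locates the file on HDFS, and the fragment is the short name the task sees; the two do not have to match, although using the same file name for both keeps things easy to follow.</p>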
<p><span style="background-color: rgb(255, 255, 0);">✓</span>&nbsp;在 mapper 或 reducer 的 setup() 方法中读取 distributed cache 文件</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">protected</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">setup</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(Context&nbsp;context)</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&nbsp;IOException,&nbsp;InterruptedException&nbsp;</span>{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Configuration&nbsp;conf&nbsp;=&nbsp;context.getConfiguration();

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;File&nbsp;myFile&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;File(conf.get(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;fileName&quot;</span>));
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;FileInputStream&nbsp;fis&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;FileInputStream(myFile);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BufferedReader&nbsp;reader&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;BufferedReader(<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;InputStreamReader(fis));

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;line;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">while</span>&nbsp;((line&nbsp;=&nbsp;reader.readLine())&nbsp;!=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">null</span>)&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//<span class="hljs-doctag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">TODO:</span>&nbsp;deal&nbsp;with&nbsp;each&nbsp;line</span>
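&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&nbsp;release&nbsp;the&nbsp;file&nbsp;handle&nbsp;once&nbsp;all&nbsp;lines&nbsp;are&nbsp;read</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reader.close();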
&nbsp;&nbsp;&nbsp;&nbsp;}
</code></pre>
</section>
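<p>What goes into the TODO of the read loop above depends entirely on the job. As one possibility, the sketch below loads tab-separated lines into an in-memory lookup table, a common use of a cached file. The tab-separated layout, the class name CacheLineLoader and the methods loadLookup and loadLookupFromString are assumptions made for illustration and are not part of the original job; a StringReader stands in for the cached file so that the sketch runs without Hadoop.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class CacheLineLoader {
    // Reads lines of the form key, TAB, value into a map.
    // Lines without a tab, or with an empty key, are skipped.
    static Map<String, String> loadLookup(BufferedReader reader) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2 && !parts[0].isEmpty()) {
                lookup.put(parts[0], parts[1]);
            }
        }
        return lookup;
    }

    // Convenience wrapper for trying the loader on an in-memory string.
    static Map<String, String> loadLookupFromString(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return loadLookup(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "id1\tAlice\n\nbroken line\nid2\tBob\n";
        Map<String, String> lookup = loadLookupFromString(sample);
        System.out.println(lookup.get("id1"));
        System.out.println(lookup.size());
    }
}
```

<p>In a real mapper or reducer the map would typically be kept in a field, filled once in setup(), and then consulted for every record; try-with-resources also closes the reader even when parsing fails part-way through.</p>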
<p>
See that? In the setup() method above, the file in the distributed cache is opened using nothing but its file name, the value stored under the fileName key in conf. This seemingly &ldquo;magic&rdquo; lookup works because of symbolic links: a link with that name is created in the task's working directory and points to the local copy of the cached file.<br />
The TODO placeholder inside the loop is where you add your own logic for handling each line.<br />
<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
Reposts must credit the source: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
Thanks for following my WeChat official account (scan the QR code with WeChat):</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="width: 200px; height: 200px;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-hadoop-2-6-x-%e4%b8%8bdistributed-cache%e7%9a%84%e7%94%a8%e6%b3%95/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
