<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>综合 &#8211; 编码无悔 /  Intent &amp; Focused</title>
	<atom:link href="https://www.codelast.com/category/misc/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.codelast.com</link>
	<description>最优化之路</description>
	<lastBuildDate>Wed, 25 Mar 2026 09:49:41 +0000</lastBuildDate>
	<language>zh-Hans</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>
	<item>
		<title>Batch-Copying Historical Date Partitions of a Hive Table</title>
		<link>https://www.codelast.com/%e6%89%b9%e9%87%8f%e5%a4%8d%e5%88%b6%e5%8e%86%e5%8f%b2%e6%97%a5%e6%9c%9f%e7%9a%84hive%e8%a1%a8/</link>
					<comments>https://www.codelast.com/%e6%89%b9%e9%87%8f%e5%a4%8d%e5%88%b6%e5%8e%86%e5%8f%b2%e6%97%a5%e6%9c%9f%e7%9a%84hive%e8%a1%a8/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Wed, 25 Mar 2026 09:49:41 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[复制Hive表]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=14221</guid>

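					<description><![CDATA[<p>If a Hive table is partitioned by date and each day's data is small, the fastest way to clone one day's data into several other days is probably to copy the partition directory directly on HDFS and then run a single command to register the copies.<br />
<span id="more-14221"></span><br />
(1) First locate the table's HDFS directory. Suppose we want to use the data for 2026-03-20 to create the data for 2026-03-21:</p>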
<blockquote>
<p>
		hadoop fs -cp /path/to/your/hive/table/hdfs/dir/date=2026-03-20 /path/to/your/hive/table/hdfs/dir/date=2026-03-21</p>
</blockquote>
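<p>(2) Copying the directory alone is not enough: the new data still cannot be queried. In the Hive interactive command line, run the following command to make the copied data take effect:</p>
<blockquote>
<p>
		msck repair table table_name;</p>
</blockquote>
<p>This command repairs the table's metadata.<br />
When partition directories are created directly on HDFS without being registered in the Hive metadata through ALTER TABLE ADD PARTITION, running msck discovers them and adds them to the metadata automatically.&#8230; <a href="https://www.codelast.com/%e6%89%b9%e9%87%8f%e5%a4%8d%e5%88%b6%e5%8e%86%e5%8f%b2%e6%97%a5%e6%9c%9f%e7%9a%84hive%e8%a1%a8/" class="read-more">Read More </a></p>]]></description>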
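										<content:encoded><![CDATA[<p>If a Hive table is partitioned by date and each day's data is small, the fastest way to clone one day's data into several other days is probably to copy the partition directory directly on HDFS and then run a single command to register the copies.<br />
<span id="more-14221"></span><br />
(1) First locate the table's HDFS directory. Suppose we want to use the data for 2026-03-20 to create the data for 2026-03-21:</p>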
<blockquote>
<p>
		hadoop fs -cp /path/to/your/hive/table/hdfs/dir/date=2026-03-20 /path/to/your/hive/table/hdfs/dir/date=2026-03-21</p>
</blockquote>
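<p>(2) Copying the directory alone is not enough: the new data still cannot be queried. In the Hive interactive command line, run the following command to make the copied data take effect:</p>
<blockquote>
<p>
		msck repair table table_name;</p>
</blockquote>
<p>This command repairs the table's metadata.<br />
When partition directories are created directly on HDFS without being registered in the Hive metadata through ALTER TABLE ADD PARTITION, running msck discovers them and adds them to the metadata automatically.</p>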
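<p>To produce several days at once, the copy in step (1) can be repeated in a loop. The sketch below is one possible way to do that, assuming the same placeholder table directory and the date=YYYY-MM-DD partition layout shown in step (1). The function name print_copy_cmds and the target dates are made up for illustration. It only prints the commands, so they can be reviewed before anything is copied.</p>

```shell
#!/usr/bin/env bash
# Placeholder HDFS directory of the table; replace it with the real location.
TABLE_DIR="/path/to/your/hive/table/hdfs/dir"
# The day whose data is used as the template for the other days.
SRC_DATE="2026-03-20"

# Hypothetical helper: print one `hadoop fs -cp` command per target date (dry run).
print_copy_cmds() {
  local target
  for target in "$@"; do
    echo "hadoop fs -cp ${TABLE_DIR}/date=${SRC_DATE} ${TABLE_DIR}/date=${target}"
  done
}

# Example target dates (assumed for illustration).
print_copy_cmds 2026-03-21 2026-03-22 2026-03-23
```

<p>Once the printed commands look right, the output can be piped to a shell to run them. A single msck repair table run in Hive afterwards should then pick up all of the new partition directories.</p>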
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e6%89%b9%e9%87%8f%e5%a4%8d%e5%88%b6%e5%8e%86%e5%8f%b2%e6%97%a5%e6%9c%9f%e7%9a%84hive%e8%a1%a8/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Prettifying git diff Output in the Terminal</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%be%8e%e5%8c%96git-diff%e5%91%bd%e4%bb%a4%e5%9c%a8%e7%bb%88%e7%ab%af%e7%9a%84%e6%98%be%e7%a4%ba%e6%95%88%e6%9e%9c/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%be%8e%e5%8c%96git-diff%e5%91%bd%e4%bb%a4%e5%9c%a8%e7%bb%88%e7%ab%af%e7%9a%84%e6%98%be%e7%a4%ba%e6%95%88%e6%9e%9c/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Mon, 09 Mar 2026 03:20:04 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[git diff]]></category>
		<category><![CDATA[git-delta]]></category>
		<category><![CDATA[左右双屏]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=14209</guid>

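					<description><![CDATA[<p>Environments this article applies to:<br />
macOS, Ubuntu (tested only on 20.04.6 LTS)<br />
When run in a terminal, git diff displays its output as follows:<br />
* Grouped by file: each changed file gets its own section, shown one after another from top to bottom.<br />
* Everything scrolls vertically in a single terminal window; by default there is no side-by-side comparison.<br />
Personally, I find this less intuitive than a side-by-side diff.<br />
So, is there a way to turn the output of git diff into something nicer to read?<br />
<span id="more-14209"></span><br />
On macOS, you can install git-delta and, with a little configuration, make git diff in the terminal look much better.<br />
Here is the final result first:<br />
<img decoding="async" alt="git diff" src="https://www.codelast.com/wp-content/uploads/2026/03/git_diff_style_change.jpg" style="width: 700px; height: 259px;" /><br />
<span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a></p>
<p>How is this achieved? Follow the steps below.<br />
Using macOS as an example, install it with:</p>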
<blockquote>
<p>
		brew install git-delta</p>
</blockquote>
<p>To configure git globally, edit the ~/.gitconfig file and add the following:</p>
<blockquote>
<div>
		[core]</div>
<div>
		&#160; &#160; pager = delta</div>
<div>
		[interactive]</div>
<div>
		&#160; &#160; diffFilter = delta --color-only</div>
<div>
		[delta]</div>
<div>
		&#160; &#160; syntax-theme = Monokai Extended</div>
<div>
		&#160; &#160; line-numbers = true</div>
<div>
		&#160; &#160; side-by-side = true</div>
</blockquote>
<div>
	The configuration options mean the following:
<div class="document">
<div class="section">
<ul>
<li><strong>[core] pager = delta</strong>
<ul>
<li>Sets delta as Git's pager.</li>
<li>Affected commands: git diff, git log -p, and other commands whose output is paged.</li>
<li>Effect: Git hands the output of these commands to delta instead of sending it directly to less, and delta formats it before it is displayed.</li>
</ul>
</li>
<li><strong>[interactive] diffFilter = delta --color-only</strong>
<ul>
<li>Sets the diff filter used by interactive operations such as git add -p.</li>
<li>When Git shows each hunk interactively, it first passes the raw diff through delta --color-only.</li>
<li>--color-only: adds color highlighting only, without changing line numbers or the text structure, so that interactive commands keep working.</li>
</ul>
</li>
<li><strong>[delta] syntax-theme = Monokai Extended</strong>
<ul>
<li>Sets delta's syntax-highlighting theme to Monokai Extended.</li>
<li>Determines the color scheme of the code content (keywords, strings, comments, and so on).</li>
</ul>
</li>
<li><strong>[delta] line-numbers = true</strong>
<ul>
<li>Shows line numbers in delta's output.</li>
<li>The line numbers of the old and new files are usually shown on the left or in a side column, which makes changes easier to locate.</li>
</ul>
</li>
<li><strong>[delta] side-by-side = true</strong>
<ul>
<li>Displays the diff in side-by-side mode.</li>
<li>The old version is usually on the left and the new version on the right, similar to the split diff view of a GitHub PR.</li>
</ul>
</li>
</ul>
</div>
</div>
<p>
	You can run delta --list-syntax-themes to list all built-in themes, then set the one you like in the syntax-theme option.</p>
<p>	<span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a></p></div>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%be%8e%e5%8c%96git-diff%e5%91%bd%e4%bb%a4%e5%9c%a8%e7%bb%88%e7%ab%af%e7%9a%84%e6%98%be%e7%a4%ba%e6%95%88%e6%9e%9c/" class="read-more">Read More </a>]]></description>
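										<content:encoded><![CDATA[<p>Environments this article applies to:<br />
macOS, Ubuntu (tested only on 20.04.6 LTS)<br />
When run in a terminal, git diff displays its output as follows:<br />
* Grouped by file: each changed file gets its own section, shown one after another from top to bottom.<br />
* Everything scrolls vertically in a single terminal window; by default there is no side-by-side comparison.<br />
Personally, I find this less intuitive than a side-by-side diff.<br />
So, is there a way to turn the output of git diff into something nicer to read?<br />
<span id="more-14209"></span><br />
On macOS, you can install git-delta and, with a little configuration, make git diff in the terminal look much better.<br />
Here is the final result first:<br />
<img decoding="async" alt="git diff" src="https://www.codelast.com/wp-content/uploads/2026/03/git_diff_style_change.jpg" style="width: 700px; height: 259px;" /><br />
<span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a></p>
<p>How is this achieved? Follow the steps below.<br />
Using macOS as an example, install it with:</p>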
<blockquote>
<p>
		brew install git-delta</p>
</blockquote>
<p>To configure git globally, edit the ~/.gitconfig file and add the following:</p>
<blockquote>
<div>
		[core]</div>
<div>
		&nbsp; &nbsp; pager = delta</div>
<div>
		[interactive]</div>
<div>
		&nbsp; &nbsp; diffFilter = delta --color-only</div>
<div>
		[delta]</div>
<div>
		&nbsp; &nbsp; syntax-theme = Monokai Extended</div>
<div>
		&nbsp; &nbsp; line-numbers = true</div>
<div>
		&nbsp; &nbsp; side-by-side = true</div>
</blockquote>
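<p>As an alternative to editing ~/.gitconfig by hand, the same settings can be written with git config. The following is a minimal sketch rather than part of the original setup: the function name apply_delta_config and its file argument are hypothetical, introduced only so the settings can be tried on a scratch file before touching the real configuration.</p>

```shell
#!/usr/bin/env bash
# Write the delta-related settings from this article into the given git config file.
# apply_delta_config is a hypothetical helper name, not a git or delta command.
apply_delta_config() {
  local cfg="$1"
  git config --file "$cfg" core.pager delta
  git config --file "$cfg" interactive.diffFilter "delta --color-only"
  git config --file "$cfg" delta.syntax-theme "Monokai Extended"
  git config --file "$cfg" delta.line-numbers true
  git config --file "$cfg" delta.side-by-side true
}

# Try it on a scratch file first and inspect the result.
scratch="$(mktemp)"
apply_delta_config "$scratch"
cat "$scratch"
```

<p>If the scratch output matches the block above, running apply_delta_config "$HOME/.gitconfig" should write the same entries to the global configuration.</p>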
<div>
	The configuration options mean the following:</p>
<div class="document">
<div class="section">
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt; font-size: 12pt; text-align: justify; font-family: 等线;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">●<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; font-weight: bold; letter-spacing: 0pt; vertical-align: baseline;">[core] pager = delta</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">把 Git 的&ldquo;分页器&rdquo;改成 </span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">delta</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">影响的命令：如 </span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">git diff</span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">、</span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">git log -p</span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;"> 等需要分页显示的输出</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">作用：这些命令的输出不再通过 </span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">less</span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">，而是先经过 delta 进行美化后再显示</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt; font-size: 12pt; text-align: justify; font-family: 等线;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">●<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; font-weight: bold; letter-spacing: 0pt; vertical-align: baseline;">[interactive] diffFilter = delta --color-only</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">专门给交互式操作（如 </span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">git add -p</span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">）设置 diff 过滤器</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">Git 在交互式展示每一块 diff 时，先把原始 diff 丢给 </span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">delta --color-only</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">-color-only</span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">：只加颜色高亮，不改行号、不改文本结构，确保交互命令正常工作</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt; font-size: 12pt; text-align: justify; font-family: 等线;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">●<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; font-weight: bold; letter-spacing: 0pt; vertical-align: baseline;">[delta] syntax-theme = Monokai Extended</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">设置 delta 的语法高亮主题为 </span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">Monokai Extended</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">影响代码内容的配色风格（关键字、字符串、注释等的颜色方案）</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt; font-size: 12pt; text-align: justify; font-family: 等线;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">●<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; font-weight: bold; letter-spacing: 0pt; vertical-align: baseline;">[delta] line-numbers = true</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">在 delta 输出中展示行号</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">一般会在左侧或边栏显示老/新文件的行号，方便定位</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt; font-size: 12pt; text-align: justify; font-family: 等线;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">●<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; font-weight: bold; letter-spacing: 0pt; vertical-align: baseline;">[delta] side-by-side = true</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">Display the diff in side-by-side mode</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
<p class="paragraph text-align-type-justify MsoNormal" style="margin: 3pt 0pt 3pt 3.52727em; font-size: 12pt; text-align: justify; font-family: 等线; text-indent: -16.8pt;">
				<span style="font-family: Wingdings;"><span style="font-size: 11pt; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;">○<span style="font-family: &quot;Times New Roman&quot;;">&nbsp;</span></span></span><span data-font-family="微软雅黑" style="font-size: 11pt; font-family: 微软雅黑; letter-spacing: 0pt; vertical-align: baseline;">The left side is usually the old version and the right side the new one, similar to the split diff view of a GitHub PR</span><span lang="EN-US" style="font-size: 11pt; font-family: 微软雅黑; color: rgb(51, 51, 51); letter-spacing: 0pt; vertical-align: baseline;"><o:p></o:p></span></p>
</p></div>
</p></div>
<p>
	You can run&nbsp;delta --list-syntax-themes&nbsp;to list all built-in themes, then set the one you want in the syntax-theme&nbsp;option.</p>
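<p>
	As a rough sketch, the two options above plus a syntax theme could be written into a git config file as follows. The helper name write_delta_config and the theme name Dracula are placeholders of mine, not something from this article; substitute any name that delta --list-syntax-themes prints on your machine.</p>

```shell
# Hypothetical helper: append a [delta] section to the git config file given as $1.
# "Dracula" is only a placeholder theme name; use one that
# `delta --list-syntax-themes` actually lists on your machine.
write_delta_config() {
  cat >> "$1" <<'EOF'
[delta]
    line-numbers = true
    side-by-side = true
    syntax-theme = Dracula
EOF
}
```

<p>
	For example, write_delta_config ~/.gitconfig would append the section to the global git config. If that file already has a [delta] section, editing it by hand is probably cleaner than appending a second one.</p>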
<p>	<span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
	<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
	Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
	Thanks for following my WeChat official account (scan with WeChat):<br />
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
	and my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
		<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
<div>
		&nbsp;</div>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%be%8e%e5%8c%96git-diff%e5%91%bd%e4%bb%a4%e5%9c%a8%e7%bb%88%e7%ab%af%e7%9a%84%e6%98%be%e7%a4%ba%e6%95%88%e6%9e%9c/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Java MapReduce job counter page fails to display (error 500)</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e7%9a%84counter%e9%a1%b5%e9%9d%a2%e6%97%a0%e6%b3%95%e6%98%be%e7%a4%ba%e7%9a%84%e9%97%ae%e9%a2%98error-500/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e7%9a%84counter%e9%a1%b5%e9%9d%a2%e6%97%a0%e6%b3%95%e6%98%be%e7%a4%ba%e7%9a%84%e9%97%ae%e9%a2%98error-500/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Tue, 30 Apr 2024 09:11:34 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[counter]]></category>
		<category><![CDATA[error 500]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[map-reduce]]></category>
		<category><![CDATA[RFC 2616]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=14101</guid>

										<content:encoded><![CDATA[<p>This is not the first time I have run into this problem, but today I finally made up my mind to spend the time writing it up, as a reference for anyone who hits the same issue.<br />
As we know, after a Java M-R job finishes, its job counter information is printed both on the command line and on the job's information web page. On the web page, you can see it by clicking the &quot;counter&quot; link on the job info page.<br />
<span id="more-14101"></span><br />
<img decoding="async" alt="hadoop job info page" src="https://www.codelast.com/wp-content/uploads/2024/04/hadoop_job_info_page.png" style="width: 339px; height: 361px;" /></p>
<p>Normally, opening this page shows the counter data. When things go wrong, however, this is what you see instead:<br />
<img decoding="async" alt="hadoop counter info error" src="https://www.codelast.com/wp-content/uploads/2024/04/hadoop_counter_error.png" style="width: 561px; height: 144px;" /><br />
At the same time, on the shell command line you will find that the job prints no counter information at all.<br />
The error page gives you no useful information about the error, and neither does its &quot;Error Details&quot; section.<br />
Although the counters cannot be displayed, the M-R job itself still runs to completion and writes its output normally.<br />
After testing, the cause in my case turned out to be that the program defined a fairly large number of Hadoop counters.<br />
How many is too many? I don't know. My program had more than 240 counters when the problem appeared; after I cut them roughly in half, to a little over 120, the counter statistics worked again.<br />
If you run into a similar problem, first check whether your job has too many counters.<br />
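As a rough first check, the sketch below counts the distinct string-literal counters in a source tree. It assumes the counters are created with calls of the form getCounter("group", "name"); the function name count_counters is hypothetical, and counters defined through enums or built dynamically are not counted.<br />

```shell
# Hypothetical helper: count distinct getCounter("group", "name") calls found
# under the directory given as $1. Only string-literal counters are detected;
# spaces and tabs are stripped so formatting differences are not counted twice.
count_counters() {
  grep -rhoE 'getCounter\("[^"]*",[[:space:]]*"[^"]*"\)' "$1" | tr -d ' \t' | sort -u | wc -l
}
```

For example, count_counters path/to/your/java/sources prints a single number, which gives a quick idea of whether the job is anywhere near the counts mentioned above.<br />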
</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e7%9a%84counter%e9%a1%b5%e9%9d%a2%e6%97%a0%e6%b3%95%e6%98%be%e7%a4%ba%e7%9a%84%e9%97%ae%e9%a2%98error-500/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Fixing an IntelliJ IDEA crash on startup: error occurred during error reporting (), id 0x6, SIGABRT (0x6) at pc=...</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3intellij-idea%e5%90%af%e5%8a%a8%e5%b4%a9%e6%ba%83%ef%bc%9aerror-occurred-during-error-reporting-id-0x6-sigabrt-0x6-at-pc/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3intellij-idea%e5%90%af%e5%8a%a8%e5%b4%a9%e6%ba%83%ef%bc%9aerror-occurred-during-error-reporting-id-0x6-sigabrt-0x6-at-pc/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Fri, 15 Mar 2024 09:48:13 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[crash]]></category>
		<category><![CDATA[IntelliJ IDEA]]></category>
		<category><![CDATA[SIGABRT]]></category>
		<category><![CDATA[启动]]></category>
		<category><![CDATA[崩溃]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=14081</guid>

										<content:encoded><![CDATA[<div style="text-align: center;">
	<img decoding="async" alt="" src="https://www.codelast.com/wp-content/uploads/2024/03/intellij_idea_logo.jpeg" style="width: 225px; height: 225px;" /></div>
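<p>Sometimes a method that has worked well for ages suddenly stops working, and after days of fiddling with it, it can really drive you crazy.<br />
A few days ago I ran into exactly this kind of nuisance: after manually upgrading IntelliJ IDEA to the latest version on my Ubuntu dev machine, it would no longer start.<br />
It crashed on every launch. Rebooting, deleting IntelliJ IDEA's local caches, and going back to the old version all made no difference (as if some file had been &quot;contaminated&quot; and there was no way back), which was extremely annoying. After several days of trying all sorts of things I finally solved it. My fix is not universal, but if you hit a similar problem it may give you some ideas.<br />
<span id="more-14081"></span><br />
OS: <span style="color:#0000ff;">Ubuntu 20.04.6 LTS</span><br />
JDK: <span style="color:#0000ff;">1.8.0_382</span><br />
Previously installed IntelliJ IDEA version: <span style="color:#b22222;">idea-IC-232.8660.185</span><br />
New IntelliJ IDEA version downloaded from the JetBrains website: <span style="color:#b22222;">idea-IC-233.14808.21</span><br />
I did not upgrade through the IDE's built-in updater. Instead I downloaded the new version's archive, extracted it into an&nbsp;idea-IC-233.14808.21 directory, and ran idea.sh from its bin directory. As is well known, when started this way the new version automatically imports the old version's settings, so as long as nothing goes wrong you can switch over seamlessly without reconfiguring anything.<br />
However, the new version crashed on startup, and the command line ended with this error:</p>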
<blockquote>
<div>
		[error occurred during error reporting (), id 0x6, SIGABRT (0x6) at pc=0x00007fed3c5cf00b]</div>
<div>
		Aborted (core dumped)</div>
</blockquote>
<div>
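	The IDE never reached its main window. A very long error report file, java_error_in_idea_xxx.log, was also generated under /home.<br />
	At first I did not look at this log file. Instead I tried the fixes I found online, one by one:<br />
	1. Rebooting the system<br />
	2. Deleting IntelliJ IDEA's caches<br />
	3. Going back to the old version of IntelliJ IDEA<br />
	4. Following <a href="https://youtrack.jetbrains.com/issue/IDEA-315192/IntelliJ-would-not-open-after-being-closed-once-on-Ubuntu-22.04-LTS.-The-only-solution-is-rebooting." rel="noopener" target="_blank">this</a> similar issue, uninstalling snap and reinstalling it<br />
	None of these helped.<br />
	Out of options, I forced myself to read the crash log&nbsp;java_error_in_idea_xxx.log, and to my surprise the clue was right there.<br />
	Near the top there is this section:</p>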
<div>
<blockquote>
<div>
				# Problematic frame:</div>
<div>
				# C&nbsp; [x86_64-linux-gnu-tree-sitter-cpp.so+0x38ec09]&nbsp; tree_sitter_cpp_external_scanner_deserialize+0x179</div>
</blockquote>
<div>
<div>
				I don't know exactly what it means, but it is labeled the &quot;problematic frame&quot;, which suggests the crash is related to it.</div>
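<div>
				A quick way to pull that section out of a long crash log is sketched below. The function name show_problematic_frame is hypothetical; it only assumes the log contains a line starting with &quot;# Problematic frame:&quot; followed by the frame itself, as in the excerpt above.</div>

```shell
# Hypothetical helper: print the "# Problematic frame:" line and the line
# right after it from the JVM crash log given as $1.
show_problematic_frame() {
  grep -A 1 '^# Problematic frame:' "$1"
}
```

<div>
				For example: show_problematic_frame java_error_in_idea_xxx.log, where xxx stands for whatever appears in your own file name.</div>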
<div>
				Further down the log you will see:</p>
<blockquote>
<div>
						---------------&nbsp; T H R E A D&nbsp; ---------------</div>
<div>
						&nbsp;</div>
<div>
						Current thread (0x00007fec7c02b370):&nbsp; JavaThread &quot;AWT-EventQueue-0&quot; [_thread_in_native, id=352672, stack(0x00007feb95ae5000,0x00007feb95be6000)]</div>
<div>
						&nbsp;</div>
<div>
						Stack: [0x00007feb95ae5000,0x00007feb95be6000],&nbsp; sp=0x00007feb95be0310,&nbsp; free space=1004k</div>
<div>
						Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)</div>
<div>
						C&nbsp; [x86_64-linux-gnu-tree-sitter-cpp.so+0x38ec09]&nbsp; tree_sitter_cpp_external_scanner_deserialize+0x179</div>
<div>
						C&nbsp; [x86_64-linux-gnu-tree-sitter.so+0x30b3e]&nbsp; ts_parser_reset+0x30e</div>
<div>
						C&nbsp; [x86_64-linux-gnu-tree-sitter.so+0x2e329]&nbsp; ts_parser_set_language+0x399</div>
<div>
						C&nbsp; [x86_64-linux-gnu-tree-sitter.so+0xb4875]&nbsp; Java_org_treesitter_TSParser_ts_1parser_1set_1language+0x25</div>
<div>
						j&nbsp; org.treesitter.TSParser.ts_parser_set_language(JJ)Z+0</div>
<div>
						j&nbsp; org.treesitter.TSParser.setLanguage(Lorg/treesitter/TSLanguage;)Z+10</div>
<div>
						j&nbsp; ai.codegeex.plugin.lang.agent.d.a()V+64</div>
<div>
						j&nbsp; ai.codegeex.plugin.lang.agent.d.&lt;init&gt;()V+365</div>
<div>
						j&nbsp; ai.codegeex.plugin.lang.agent.d.a()Lai/codegeex/plugin/lang/agent/d;+10</div>
<div>
						j&nbsp; ai.codegeex.plugin.lang.agent.CodegeexAgentCompletionService.e()V+0</div>
<div>
						j&nbsp; ai.codegeex.plugin.lang.agent.CodegeexAgentCompletionService.&lt;init&gt;()V+266</div>
</blockquote>
<div>
<div>
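						As you can see, the first plugin that shows up in connection with the problematic &quot;x86_64-linux-gnu-tree-sitter-cpp.so+0x38ec09&quot; is &quot;ai.codegeex.plugin&quot;, which corresponds to the CodeGeeX plugin I had installed.</div>
<div>
						So I suspected that removing this plugin would fix the IntelliJ IDEA startup crash.</div>
<div>
						On Ubuntu, plugins are installed under this directory: <span style="color:#0000ff;">~/.local/share/JetBrains/IdeaIC2023.3</span></div>
<div>
						Here IdeaIC2023.3 is the IntelliJ IDEA version; each time you upgrade to a new version, a new directory is created under ~/.local/share/JetBrains.</div>
<div>
						Inside this directory there is a directory named &quot;CodeGeeX&quot;. That is where the CodeGeeX plugin is installed; just delete it.</div>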
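<div>
						If you would rather not delete the directory outright, a more cautious variant is to move it out of the plugins directory so it can be restored later. This is only a sketch: disable_plugin and the backup location are hypothetical names of mine, and it assumes the IDE is not running.</div>

```shell
# Hypothetical helper: move plugin directory $2 out of plugins directory $1
# into backup directory $3 (created if missing), so that it is no longer
# present in the plugins directory but can be moved back later.
disable_plugin() {
  if [ ! -d "$1/$2" ]; then
    echo "plugin directory not found: $1/$2" >&2
    return 1
  fi
  mkdir -p "$3" && mv "$1/$2" "$3/"
}
```

<div>
						For example: disable_plugin ~/.local/share/JetBrains/IdeaIC2023.3 CodeGeeX ~/idea-disabled-plugins</div>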
<div>
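						After that I tried starting IntelliJ IDEA again, and it started normally.<br />
						I still don't know why the CodeGeeX plugin caused this, but if, like me, you really cannot find the reason your IDE is crashing, removing a suspicious plugin may be one way to fix it.<br />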
						</p>
</p></div>
</p></div>
</p></div>
</p></div>
</p></div>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3intellij-idea%e5%90%af%e5%8a%a8%e5%b4%a9%e6%ba%83%ef%bc%9aerror-occurred-during-error-reporting-id-0x6-sigabrt-0x6-at-pc/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Saying goodbye to GitHub Copilot after more than a year as a paying user</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%92%8c%e4%bb%98%e8%b4%b9%e4%bd%bf%e7%94%a8%e4%b8%80%e5%b9%b4%e5%a4%9a%e7%9a%84-github-copilot-%e8%af%b4%e5%86%8d%e8%a7%81/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%92%8c%e4%bb%98%e8%b4%b9%e4%bd%bf%e7%94%a8%e4%b8%80%e5%b9%b4%e5%a4%9a%e7%9a%84-github-copilot-%e8%af%b4%e5%86%8d%e8%a7%81/#comments</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Fri, 01 Mar 2024 19:16:53 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[AI辅助编程]]></category>
		<category><![CDATA[CodeGeeX]]></category>
		<category><![CDATA[GitHub Copilot]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=14064</guid>

										<content:encoded><![CDATA[<div style="text-align: center;">
	<img decoding="async" alt="GitHub CoPilot" src="https://www.codelast.com/wp-content/uploads/2024/03/github_copilot_1.png" style="width: 800px; height: 213px;" /></div>
<div>
	&nbsp;</div>
<div>
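	Yesterday my GitHub Copilot subscription expired. After paying for it for more than a year, I have decided not to renew, and I have a few thoughts to share.<br />
	&nbsp;</div>
<div>
	From the eager anticipation before paying, to gradually getting used to it, to a &quot;calm breakup&quot; when the subscription ended, I finally surrendered to reality and chose the poor man's way of living.<br />
	&nbsp;</div>
<div>
	After all, for a fee of 10 US dollars a month, you can find plenty of reasons to call it worth it or not. It is just that, for me, GitHub Copilot is no longer attractive at $10/month.<br />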
	<span id="more-14064"></span></div>
<div style="text-align: center;">
	<img decoding="async" alt="GitHub CoPilot" src="https://www.codelast.com/wp-content/uploads/2024/03/github_copilot_2.png" style="width: 800px; height: 309px;" /></div>
<div>
<!--more--></div>
<div>
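	GitHub Copilot, the world's first AI coding assistant with first-rate results, officially launched in June 2022. Before that, like many of my peers, I would often get stuck in the loop of &quot;write code &rarr; forget how to write some snippet &rarr; search Google &rarr; copy and paste code from the web and test it &rarr; continue writing code&quot;. Over time, this familiar and repetitive routine builds up a lot of negative energy.<br />
	&nbsp;</div>
<div>
	Then GitHub Copilot appeared. Hyped and promoted by the tech media, and backed by hands-on reports from grassroots enthusiasts, it earned a resounding reputation: awesome!<br />
	&nbsp;</div>
<div>
	So I was tempted. After a one-month trial followed by another month of paid use, GitHub Copilot impressed me enough to convince me that it would save me a huge amount of time in development. So in early 2023 I made up my mind to pay for a full year.<br />
	&nbsp;</div>
<div>
	$10/month is a price many developers would have to think hard about. At the time my account had a discount, so I renewed for a year for a little over 90 US dollars, which is less than 700 RMB a year.<br />
	&nbsp;</div>
<div>
	The accuracy of GitHub Copilot's code completion was impressive. What I enjoyed most was how well it completed Chinese comments. Whether it was a longer comment at the top of a class or a one-line comment written while coding, I felt it could &quot;think what I think and write what I want to write&quot;.<br />
	&nbsp;</div>
<div>
	Of course, there was also one thing I disliked most: its connection to the server stalled from time to time. The servers are overseas, so that is understandable.<br />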
	&nbsp;</div>
<div style="text-align: center;">
	<img decoding="async" alt="alternatives" src="https://www.codelast.com/wp-content/uploads/2024/03/alternative.jpg" style="width: 750px; height: 320px;" /></div>
<div>
	Back in early 2023, if you wanted a free substitute for GitHub Copilot, there were not many choices. In China, aiXcoder and CodeGeeX were two of the better-known ones.</div>
<div style="text-align: center;">
	<img decoding="async" alt="aiXcoder" src="https://www.codelast.com/wp-content/uploads/2024/03/aixcoder.jpg" style="width: 360px; height: 147px;" /></div>
<div>
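	I always felt the design of aiXcoder's first few versions was truly &quot;sick&quot;: it required installing a backend program locally to do inference. Because that is system-dependent, it obviously would not work on many Linux distributions. For example, I once tried to install aiXcoder's local inference software on Ubuntu 16.04, but it failed because of library dependency problems. When I reported the issue in the official QQ group, the official developers merely confirmed the problem and offered no solution. Presumably nobody is going to do development work that has no KPI attached and earns no money!<br />
	&nbsp;</div>
<div>
	So I abandoned aiXcoder without hesitation.</div>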
<div>
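	GitHub Copilot's approach of &quot;one plugin solves everything&quot; and &quot;inference happens in the cloud&quot; largely sidesteps differences between system versions and eliminates the dependency problems.<br />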
	&nbsp;</div>
<div style="text-align: center;">
	<img decoding="async" alt="CodeGeeX" src="https://www.codelast.com/wp-content/uploads/2024/03/codegeex.png" style="width: 727px; height: 153px;" /></div>
<div>
	<br />
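	In early 2023, the Chinese-made CodeGeeX was another shining star in the coding-assistant field. Like GitHub Copilot, it does all inference in the cloud and needs only a plugin to be installed, which is how a normal coding assistant ought to work.<br />
	&nbsp;</div>
<div>
	At the time I compared GitHub Copilot and CodeGeeX in detail on roughly 10 cases. The conclusion was no surprise: GitHub Copilot beat CodeGeeX across the board. This was not a comparison on technical metrics (such as test sets for evaluating code generation quality), but purely my subjective impression of whose output was better.<br />
	&nbsp;</div>
<div>
	So, to be honest, in early 2023, based on my own tests, I would rather spend as much as 700 RMB on GitHub Copilot than use the free CodeGeeX every day, because its code completion was not good enough at the time and its support for some programming languages (Apache Pig, for example) was poor, which would have affected my development work.</div>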
<div>
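	However, over the course of 2023, CodeGeeX made impressive, major improvements. Not only did its code completion quality improve a lot, it also gained a large number of new features, too many to cover in a few words. So today, in 2024, when my GitHub Copilot subscription is due for payment again, I will not renew unless it drops to 10% of its original price (which I know is impossible). I will use the free Chinese-made substitute instead: CodeGeeX.<br />
	&nbsp;</div>
<div>
	Since 2023, besides CodeGeeX's huge progress, a large number of free competitors have appeared on the market, including Codeium (from abroad) and Fitten Code (Chinese-made). They may still lag behind GitHub Copilot, but trust me: unless you are especially picky, they are more than enough for everyday use.</div>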
<div>
	&nbsp;</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%92%8c%e4%bb%98%e8%b4%b9%e4%bd%bf%e7%94%a8%e4%b8%80%e5%b9%b4%e5%a4%9a%e7%9a%84-github-copilot-%e8%af%b4%e5%86%8d%e8%a7%81/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Grouping data by a given field in Apache Pig and keeping the most recent record in each group</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%95%b0%e6%8d%ae%e6%8c%89%e6%8c%87%e5%ae%9a%e5%ad%97%e6%ae%b5%e5%88%86%e7%bb%84%ef%bc%8c%e6%af%8f%e7%bb%84%e5%8f%96%e6%97%b6%e9%97%b4/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%95%b0%e6%8d%ae%e6%8c%89%e6%8c%87%e5%ae%9a%e5%ad%97%e6%ae%b5%e5%88%86%e7%bb%84%ef%bc%8c%e6%af%8f%e7%bb%84%e5%8f%96%e6%97%b6%e9%97%b4/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Wed, 15 Nov 2023 08:15:25 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[apache pig]]></category>
		<category><![CDATA[GROUP]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13967</guid>

					<description><![CDATA[<p>For more Apache Pig tutorials, click <a href="https://www.codelast.com/?p=4550" rel="noopener" target="_blank"><span style="background-color: rgb(255, 160, 122);">here</span></a>.</p>
<p>When processing big data with Apache Pig, a common requirement is to group the input by a given field and output only the most recent record within each group.<br />
<span id="more-13967"></span><br />
For example, take the data file input.txt:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">10&#160;&#160;&#160;&#160;&#160;&#160;a&#160;&#160;&#160;&#160;&#160;&#160;&#160;1,2,3
9&#160;&#160;&#160;&#160;&#160;&#160;&#160;b&#160;&#160;&#160;&#160;&#160;&#160;&#160;1,2
8&#160;&#160;&#160;&#160;&#160;&#160;&#160;a&#160;&#160;&#160;&#160;&#160;&#160;&#160;2,3,4
13&#160;&#160;&#160;&#160;&#160;&#160;a&#160;&#160;&#160;&#160;&#160;&#160;&#160;1,2,3,4
6&#160;&#160;&#160;&#160;&#160;&#160;&#160;b&#160;&#160;&#160;&#160;&#160;&#160;&#160;1
</code></pre>
</section>
<p>The three fields are: <span style="background-color: rgb(255, 255, 255); color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre;">time (timestamp), userId (user ID), userInterest (user interest IDs)<br />
Now, how do we find the most recent </span><span style="color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">userInterest</span><span style="color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);"> for each user?</span><br />
<span style="color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">That is, for user a the latest timestamp is 13, </span><span style="color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">with userInterest 1,2,3,4; for user b the latest timestamp is 9, </span><span style="color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">with userInterest 1,2.</span><br />
<span style="color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">Here is the code:</span></p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="sql language-sql hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">A&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">LOAD</span>&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#39;input.txt&#39;</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">AS</span>&#160;(<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">time</span>:&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">long</span>,&#160;userId:&#160;chararray,&#160;userInterest:&#160;chararray);
A&#160;=&#160;FOREACH&#160;A&#160;GENERATE&#160;time,&#160;userId,&#160;userInterest;
B&#160;=&#160;GROUP&#160;A&#160;BY&#160;userId;
<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">--&#160;For&#160;each&#160;userId,&#160;keep&#160;only&#160;the&#160;most&#160;recent&#160;record</span>
C&#160;=&#160;FOREACH&#160;B&#160;{
&#160;&#160;&#160;&#160;SORTED&#160;=&#160;ORDER&#160;A&#160;BY&#160;time&#160;DESC;
&#160;&#160;&#160;&#160;ONE_RECORD&#160;=&#160;LIMIT&#160;SORTED&#160;1;
&#160;&#160;&#160;&#160;GENERATE&#160;FLATTEN(ONE_RECORD);
};
DUMP&#160;C;
</code></pre>
</section>
<p>
In the nested FOREACH block, ORDER BY first sorts the records within each group by time in descending order, and LIMIT 1 then keeps the first record of that ordering, which is the one with the largest timestamp, i.e. the most recent record.<br />
<span style="font-size: 16px; color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; white-space: pre; background-color: rgb(255, 255, 255);">Output:</span></p>
<blockquote>
<div>
		(13,a,1,2,3,4)</div>
<div>
		(9,b,1,2)</div>
</blockquote>
<p><span style="font-size: 16px; color: rgb(59, 59, 59); font-family: &#34;Droid Sans Mono&#34;, &#34;monospace&#34;, monospace; white-space: pre; background-color: rgb(255, 255, 255);">This matches the result we worked out by hand above.</span></p>
<p><span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%95%b0%e6%8d%ae%e6%8c%89%e6%8c%87%e5%ae%9a%e5%ad%97%e6%ae%b5%e5%88%86%e7%bb%84%ef%bc%8c%e6%af%8f%e7%bb%84%e5%8f%96%e6%97%b6%e9%97%b4/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<p>For more Apache Pig tutorials, click <a href="https://www.codelast.com/?p=4550" rel="noopener" target="_blank"><span style="background-color: rgb(255, 160, 122);">here</span></a>.</p>
<p>When processing big data with Apache Pig, a common requirement is to group the input by a given field and output only the most recent record in each group.<br />
<span id="more-13967"></span><br />
For example, take the data file input.txt:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">10&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1,2,3
9&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;b&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1,2
8&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2,3,4
13&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1,2,3,4
6&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;b&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1
</code></pre>
</section>
<p>The three fields are: <span style="background-color: rgb(255, 255, 255); color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre;">time (timestamp), userId (user ID), userInterest (user interest IDs)<br />
Now, how do we find the most recent </span><span style="color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">userInterest</span><span style="color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);"> for each user?</span><br />
<span style="color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">That is, for user a the latest timestamp is 13, </span><span style="color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">with userInterest 1,2,3,4; for user b the latest timestamp is 9, </span><span style="color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">with userInterest 1,2.</span><br />
<span style="color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; font-size: 16px; white-space: pre; background-color: rgb(255, 255, 255);">Here is the code:</span></p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="sql language-sql hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">A&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">LOAD</span>&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#39;input.txt&#39;</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">AS</span>&nbsp;(<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">time</span>:&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">long</span>,&nbsp;userId:&nbsp;chararray,&nbsp;userInterest:&nbsp;chararray);
A&nbsp;=&nbsp;FOREACH&nbsp;A&nbsp;GENERATE&nbsp;time,&nbsp;userId,&nbsp;userInterest;
B&nbsp;=&nbsp;GROUP&nbsp;A&nbsp;BY&nbsp;userId;
<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">--&nbsp;For&nbsp;each&nbsp;userId,&nbsp;keep&nbsp;only&nbsp;the&nbsp;most&nbsp;recent&nbsp;record</span>
C&nbsp;=&nbsp;FOREACH&nbsp;B&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;SORTED&nbsp;=&nbsp;ORDER&nbsp;A&nbsp;BY&nbsp;time&nbsp;DESC;
&nbsp;&nbsp;&nbsp;&nbsp;ONE_RECORD&nbsp;=&nbsp;LIMIT&nbsp;SORTED&nbsp;1;
&nbsp;&nbsp;&nbsp;&nbsp;GENERATE&nbsp;FLATTEN(ONE_RECORD);
};
DUMP&nbsp;C;
</code></pre>
</section>
<p>
In the nested FOREACH block, ORDER BY first sorts the records within each group by time in descending order, and LIMIT 1 then keeps the first record of that ordering, which is the one with the largest timestamp, i.e. the most recent record.<br />
<span style="font-size: 16px; color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; white-space: pre; background-color: rgb(255, 255, 255);">Output:</span></p>
<blockquote>
<div>
		(13,a,1,2,3,4)</div>
<div>
		(9,b,1,2)</div>
</blockquote>
<p><span style="font-size: 16px; color: rgb(59, 59, 59); font-family: &quot;Droid Sans Mono&quot;, &quot;monospace&quot;, monospace; white-space: pre; background-color: rgb(255, 255, 255);">This matches the result we worked out by hand above.</span></p>
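<p>For readers who want to check the logic outside Pig, the same "latest record per group" selection can be sketched in plain Java. This is only an illustration under stated assumptions: the class <code>LatestPerGroup</code>, its <code>Row</code> type, and the helpers <code>sampleRows</code> and <code>latestByUser</code> are hypothetical names that do not appear in the Pig script, and the sketch keeps a running maximum per userId in a single pass instead of sorting each group and taking LIMIT 1.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LatestPerGroup {
    /** One input row: time (timestamp), userId, userInterest, as in input.txt. */
    static final class Row {
        final long time;
        final String userId;
        final String userInterest;

        Row(long time, String userId, String userInterest) {
            this.time = time;
            this.userId = userId;
            this.userInterest = userInterest;
        }

        @Override
        public String toString() {
            return "(" + time + "," + userId + "," + userInterest + ")";
        }
    }

    /** The five sample rows from input.txt (hypothetical helper for the demo). */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(10, "a", "1,2,3"),
            new Row(9, "b", "1,2"),
            new Row(8, "a", "2,3,4"),
            new Row(13, "a", "1,2,3,4"),
            new Row(6, "b", "1"));
    }

    /** For each userId, keep the row with the largest time (first one wins on ties). */
    static Map<String, Row> latestByUser(List<Row> rows) {
        // TreeMap only so that the demo output is ordered by userId.
        Map<String, Row> latest = new TreeMap<>();
        for (Row row : rows) {
            latest.merge(row.userId, row,
                (current, candidate) -> current.time >= candidate.time ? current : candidate);
        }
        return latest;
    }

    public static void main(String[] args) {
        for (Row row : latestByUser(sampleRows()).values()) {
            System.out.println(row);
        }
        // Prints:
        // (13,a,1,2,3,4)
        // (9,b,1,2)
    }
}
```

<p>On the sample data this yields the same two records as the Pig script. When two rows of one user share the same timestamp, this sketch keeps the first one it sees; which row Pig's LIMIT 1 returns in that case is not something the original post specifies.</p>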
<p><span style="color: rgb(255, 255, 255);">Source: </span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a><br />
<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
Reprints must credit the source: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
Thanks for following my WeChat official account (scan with WeChat):<br />
<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
And my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%95%b0%e6%8d%ae%e6%8c%89%e6%8c%87%e5%ae%9a%e5%ad%97%e6%ae%b5%e5%88%86%e7%bb%84%ef%bc%8c%e6%af%8f%e7%bb%84%e5%8f%96%e6%97%b6%e9%97%b4/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Converting a time string to a timestamp in Apache Pig</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%97%b6%e9%97%b4%e5%ad%97%e7%ac%a6%e4%b8%b2%e8%bd%ac%e6%8d%a2%e6%88%90%e6%97%b6%e9%97%b4%e6%88%b3/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%97%b6%e9%97%b4%e5%ad%97%e7%ac%a6%e4%b8%b2%e8%bd%ac%e6%8d%a2%e6%88%90%e6%97%b6%e9%97%b4%e6%88%b3/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Thu, 12 Oct 2023 09:37:25 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[apache pig]]></category>
		<category><![CDATA[时间字符串]]></category>
		<category><![CDATA[时间戳]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13959</guid>

					<description><![CDATA[<p>For more Apache Pig tutorials, click <a href="https://www.codelast.com/?p=4550" target="_blank" rel="noopener"><span style="background-color:#ffa07a;">here</span></a>.</p>
<p>In Apache Pig, how do you convert a time string in the format <span style="color:#ff0000;">2023-10-11_10:57:56</span> into an integer timestamp?<br />
Let's go straight to the code.<br />
Assume the input data file 1.txt contains one time string per line.<br />
<span id="more-13959"></span></p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="sql language-sql hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">A&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">LOAD</span>&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#39;1.txt&#39;</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">AS</span>&#160;(dt:&#160;chararray);
A&#160;=&#160;FOREACH&#160;A&#160;GENERATE&#160;ToDate(dt,&#160;&#39;yyyy-MM-dd_HH:mm:ss&#39;)&#160;AS&#160;date;
B&#160;=&#160;FOREACH&#160;A&#160;GENERATE&#160;ToUnixTime(date)&#160;AS&#160;ts;
DUMP&#160;B;
</code></pre>
</section>
<p>
The output looks like this:</p>
<blockquote>
<p>
		1696993076</p>
</blockquote>
<p>
As you can see, the resulting timestamp is in seconds.<br />
&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%97%b6%e9%97%b4%e5%ad%97%e7%ac%a6%e4%b8%b2%e8%bd%ac%e6%8d%a2%e6%88%90%e6%97%b6%e9%97%b4%e6%88%b3/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<p>For more Apache Pig tutorials, click <a href="https://www.codelast.com/?p=4550" target="_blank" rel="noopener"><span style="background-color:#ffa07a;">here</span></a>.</p>
<p>In Apache Pig, how do you convert a time string in the format <span style="color:#ff0000;">2023-10-11_10:57:56</span> into an integer timestamp?<br />
Let's go straight to the code.<br />
Assume the input data file 1.txt contains one time string per line.<br />
<span id="more-13959"></span></p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="sql language-sql hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">A&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">LOAD</span>&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#39;1.txt&#39;</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">AS</span>&nbsp;(dt:&nbsp;chararray);
A&nbsp;=&nbsp;FOREACH&nbsp;A&nbsp;GENERATE&nbsp;ToDate(dt,&nbsp;&#39;yyyy-MM-dd_HH:mm:ss&#39;)&nbsp;AS&nbsp;date;
B&nbsp;=&nbsp;FOREACH&nbsp;A&nbsp;GENERATE&nbsp;ToUnixTime(date)&nbsp;AS&nbsp;ts;
DUMP&nbsp;B;
</code></pre>
</section>
<p>
The output looks like this:</p>
<blockquote>
<p>
		1696993076</p>
</blockquote>
<p>
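As you can see, the resulting timestamp is in seconds. Note that ToDate is called here without a time zone argument, so the string is interpreted in Pig's default time zone; the sample value 1696993076 corresponds to reading 2023-10-11_10:57:56 as UTC+8.<br />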
</p>
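<p>As a cross-check outside Pig, the same conversion can be sketched with <code>java.time</code>. This is a minimal sketch, not part of the Pig script: the class <code>TimeStringToEpoch</code> and the method <code>toEpochSeconds</code> are hypothetical names, and the zone <code>Asia/Shanghai</code> (UTC+8) is an assumption chosen because it reproduces the sample value 1696993076 for the string 2023-10-11_10:57:56.</p>

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class TimeStringToEpoch {
    // Same pattern as the one passed to ToDate in the Pig script.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd_HH:mm:ss");

    /** Parses the time string as wall-clock time in the given zone and returns epoch seconds. */
    static long toEpochSeconds(String text, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(text, FORMAT);
        return local.atZone(ZoneId.of(zoneId)).toEpochSecond();
    }

    public static void main(String[] args) {
        // Interpreted as UTC+8 (assumed zone), this matches the sample output above.
        System.out.println(toEpochSeconds("2023-10-11_10:57:56", "Asia/Shanghai")); // 1696993076
        // The same string interpreted as UTC is 8 hours (28800 seconds) later on the epoch scale.
        System.out.println(toEpochSeconds("2023-10-11_10:57:56", "UTC")); // 1697021876
    }
}
```

<p>The two results differ by exactly 28800 seconds, which shows why the time zone used to interpret the string matters when comparing timestamps produced on different machines.</p>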
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e5%9c%a8apache-pig%e4%b8%ad%e6%8a%8a%e6%97%b6%e9%97%b4%e5%ad%97%e7%ac%a6%e4%b8%b2%e8%bd%ac%e6%8d%a2%e6%88%90%e6%97%b6%e9%97%b4%e6%88%b3/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] A problem caused by omitting the @Override annotation on reduce() in a Java map-reduce job</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e4%b8%ad%ef%bc%8creduce%e6%96%b9%e6%b3%95%e6%bc%8f%e5%86%99-override-%e6%b3%a8%e8%a7%a3%e5%bc%95%e8%b5%b7%e7%9a%84%e9%97%ae%e9%a2%98/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e4%b8%ad%ef%bc%8creduce%e6%96%b9%e6%b3%95%e6%bc%8f%e5%86%99-override-%e6%b3%a8%e8%a7%a3%e5%bc%95%e8%b5%b7%e7%9a%84%e9%97%ae%e9%a2%98/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Sun, 06 Aug 2023 12:12:10 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[map-reduce job]]></category>
		<category><![CDATA[类型错误]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13934</guid>

					<description><![CDATA[<p>I have a map-reduce&#160;job written in Java whose mapper emits keys of type Text and values of type NullWritable, so the reducer should be written like this:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">static</span>&#160;<span class="hljs-class" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">class</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">QuerySegmentResultFromKVReducer</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">extends</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">Reducer</span>&#60;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">Text</span>,&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); 
word-wrap: inherit !important; word-break: inherit !important;">NullWritable</span>,&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">NullWritable</span>,&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">NullWritable</span>&#62;&#160;</span>{

&#160;&#160;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&#160;&#160;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">protected</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">setup</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(Reducer.Context&#160;context)</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&#160;IOException,&#160;InterruptedException&#160;</span>{
&#160;&#160;}

&#160;&#160;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&#160;&#160;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">protected</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">cleanup</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(Reducer.Context&#160;context)</span></span></code></pre>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e4%b8%ad%ef%bc%8creduce%e6%96%b9%e6%b3%95%e6%bc%8f%e5%86%99-override-%e6%b3%a8%e8%a7%a3%e5%bc%95%e8%b5%b7%e7%9a%84%e9%97%ae%e9%a2%98/" class="read-more">Read More </a></section>]]></description>
										<content:encoded><![CDATA[<p>I have a map-reduce&nbsp;job written in Java whose mapper emits keys of type Text and values of type NullWritable, so the reducer should be written like this:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">static</span>&nbsp;<span class="hljs-class" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">class</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">QuerySegmentResultFromKVReducer</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">extends</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">Reducer</span>&lt;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">Text</span>,&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); 
word-wrap: inherit !important; word-break: inherit !important;">NullWritable</span>,&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">NullWritable</span>,&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">NullWritable</span>&gt;&nbsp;</span>{

&nbsp;&nbsp;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&nbsp;&nbsp;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">protected</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">setup</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(Reducer.Context&nbsp;context)</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&nbsp;IOException,&nbsp;InterruptedException&nbsp;</span>{
&nbsp;&nbsp;}

&nbsp;&nbsp;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&nbsp;&nbsp;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">protected</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">cleanup</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(Reducer.Context&nbsp;context)</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&nbsp;IOException,&nbsp;InterruptedException&nbsp;</span>{
&nbsp;&nbsp;}

&nbsp;&nbsp;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); overflow-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&nbsp;&nbsp;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">protected</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">reduce</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(Text&nbsp;key,&nbsp;Iterable&lt;NullWritable&gt;&nbsp;values,&nbsp;Context&nbsp;context)</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&nbsp;IOException,&nbsp;InterruptedException&nbsp;</span>{
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//<span class="hljs-doctag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">TODO:</span></span>
&nbsp;&nbsp;}
}
</code></pre>
</section>
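<p>Here the reducer's output key and value types are both NullWritable. They are not the point of this article, so we can ignore them.<br />
<span id="more-13934"></span><br />
If the reduce() method is missing the&nbsp;<span style="color:#ff0000;">@Override</span>&nbsp;annotation and the class is mistakenly declared as&nbsp;Reducer&lt;Text, Text, NullWritable, NullWritable&gt;&nbsp;instead of&nbsp;Reducer&lt;Text, NullWritable, NullWritable, NullWritable&gt;, the code still compiles without errors.<br />
But when you run the job, something strange happens: the reduce logic you wrote at &ldquo;TODO:&rdquo; never executes, and even though your code never calls context.write() to write anything to HDFS, the Hadoop counters still show that the job output exactly as many records as the reducer received as input.<br />
It behaves as if a default Reducer ran and passed its input straight through. That is in fact what happens: with the wrong type parameters, your reduce(Text, Iterable&lt;NullWritable&gt;, Context) is a new overload rather than an override, so the framework calls the inherited Reducer.reduce(), whose default implementation writes each input key/value pair out unchanged.<br />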
So it bears repeating: never leave out the&nbsp;<span style="color: rgb(255, 0, 0);">@Override</span>&nbsp;annotation! With it in place, the IDE flags the mismatched signature and compilation fails.</p>
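<div>
	The @Override annotation is optional. If you delete it the compiler does not complain, because Java allows a method to be overridden without it.</div>
<div>
	&nbsp;</div>
<div>
	Still, it is recommended whenever you override a method of a superclass or implement a method of an interface, for several reasons:</div>
<div>
	➤ Readability: with @Override, other developers can see at a glance that the method overrides one from a superclass or interface, which makes the code easier to understand.</div>
<div>
	➤ Error prevention: if you misspell the name of the method you mean to override, or get its signature wrong, the compiler reports an error and exposes the problem.</div>
<div>
	➤ Robustness: if the method in the superclass or interface later changes, the annotated method fails to compile, reminding you to update the override.</div>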
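<p>The pitfall can be reproduced without Hadoop. The sketch below is only an illustration: BaseReducer, BrokenReducer and OverrideDemo are hypothetical stand-ins (they are not Hadoop classes), with BaseReducer imitating a base class whose default reduce() passes its input through.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OverrideDemo {
  public static void main(String[] args) {
    // The "framework" entry point run() ends up in the inherited reduce().
    List<String> out = new BrokenReducer().run("k", Arrays.asList("a", "b"));
    System.out.println(out);
  }
}

/** Hypothetical stand-in for a framework base class with an identity reduce(). */
class BaseReducer<K, V> {
  protected final List<String> output = new ArrayList<>();

  protected void reduce(K key, Iterable<V> values) {
    for (V value : values) {
      output.add(key + "=" + value); // default behaviour: pass everything through
    }
  }

  // Imitates the framework, which only knows the method declared in the base class.
  final List<String> run(K key, Iterable<V> values) {
    reduce(key, values);
    return output;
  }
}

/** Wrong value type in the extends clause and no @Override: this compiles. */
class BrokenReducer extends BaseReducer<String, String> {
  // Overloads reduce() instead of overriding it, so run() never calls it.
  // Adding @Override here turns the mistake into a compile error.
  protected void reduce(String key, Iterable<Integer> values) {
    output.add("custom logic for " + key);
  }
}
```

<p>Running it shows only the pass-through entries and never the "custom logic" one, which mirrors the symptom described above.</p>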
<p>
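	In this article's example, if reduce() has no @Override annotation and the reducer class is mistakenly declared as extends Reducer&lt;Text, Text, NullWritable, NullWritable&gt;, the IDE sees nothing wrong with the reduce() method, leaving you to believe everything is fine.<br />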
	<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;版权声明&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
	转载需注明出处：<u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
	感谢关注我的微信公众号（微信扫一扫）：<br />
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
	以及我的微信视频号：</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-map-reduce-job%e4%b8%ad%ef%bc%8creduce%e6%96%b9%e6%b3%95%e6%bc%8f%e5%86%99-override-%e6%b3%a8%e8%a7%a3%e5%bc%95%e8%b5%b7%e7%9a%84%e9%97%ae%e9%a2%98/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] One way to fix Map-Reduce job OOM (Java Heap Space) errors: tuning the memory parameters</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3map-reduce-job-oomjava-heap-space%e9%94%99%e8%af%af%e7%9a%84%e4%b8%80%e4%b8%aa%e6%96%b9%e6%b3%95%ef%bc%9a%e8%b0%83%e6%95%b4%e5%86%85%e5%ad%98%e5%8f%82%e6%95%b0/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3map-reduce-job-oomjava-heap-space%e9%94%99%e8%af%af%e7%9a%84%e4%b8%80%e4%b8%aa%e6%96%b9%e6%b3%95%ef%bc%9a%e8%b0%83%e6%95%b4%e5%86%85%e5%ad%98%e5%8f%82%e6%95%b0/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Mon, 19 Jun 2023 05:21:18 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[Java Heap Space]]></category>
		<category><![CDATA[M-R job]]></category>
		<category><![CDATA[pig]]></category>
		<category><![CDATA[调大内存]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13923</guid>

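					<description><![CDATA[<p>When a Java M-R job or a Pig M-R job fails with a Java Heap Space error, the usual fix is to track down the anomaly in the input data and deal with it. For example, your program may GROUP by a key that turns out to have a huge number of records in the input, which can drive the job out of memory (OOM).<br />
That kind of problem depends on the specific data and on the program logic, so it is not covered here.<br />
This article is about the other case: sometimes the problem in the code or the input data is &#8220;not that severe&#8221;, and adjusting the M-R job's memory parameters is enough to fix it.<br />
<span id="more-13923"></span><br />
For a Java M-R job, set the following parameters with -D:</p>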
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="bash language-bash hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&#160;&#160;-D&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;mapreduce.map.memory.mb=8192&#34;</span>&#160;\
&#160;&#160;-D&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;mapreduce.reduce.memory.mb=8192&#34;</span>&#160;\
&#160;&#160;-D&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;mapreduce.map.java.opts=-Xmx6144m&#34;</span>&#160;\
&#160;&#160;-D&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;mapreduce.reduce.java.opts=-Xmx6144m&#34;</span>&#160;\
</code></pre>
</section>
<p>
For an Apache Pig M-R job, add the following statements to the Pig script:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="sql language-sql hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&#160;mapreduce.map.memory.mb&#160;<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">8192</span>;
<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&#160;mapreduce.reduce.memory.mb&#160;<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">8192</span>;
<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&#160;mapreduce.map.java.opts&#160;-Xmx6144m;</code></pre>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3map-reduce-job-oomjava-heap-space%e9%94%99%e8%af%af%e7%9a%84%e4%b8%80%e4%b8%aa%e6%96%b9%e6%b3%95%ef%bc%9a%e8%b0%83%e6%95%b4%e5%86%85%e5%ad%98%e5%8f%82%e6%95%b0/" class="read-more">Read More </a></section>]]></description>
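										<content:encoded><![CDATA[<p>When a Java M-R job or a Pig M-R job fails with a Java Heap Space error, the usual fix is to track down the anomaly in the input data and deal with it. For example, your program may GROUP by a key that turns out to have a huge number of records in the input, which can drive the job out of memory (OOM).<br />
That kind of problem depends on the specific data and on the program logic, so it is not covered here.<br />
This article is about the other case: sometimes the problem in the code or the input data is &ldquo;not that severe&rdquo;, and adjusting the M-R job's memory parameters is enough to fix it.<br />
<span id="more-13923"></span><br />
For a Java M-R job, set the following parameters with -D:</p>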
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="bash language-bash hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&nbsp;&nbsp;-D&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;mapreduce.map.memory.mb=8192&quot;</span>&nbsp;\
&nbsp;&nbsp;-D&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;mapreduce.reduce.memory.mb=8192&quot;</span>&nbsp;\
&nbsp;&nbsp;-D&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;mapreduce.map.java.opts=-Xmx6144m&quot;</span>&nbsp;\
&nbsp;&nbsp;-D&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;mapreduce.reduce.java.opts=-Xmx6144m&quot;</span>&nbsp;\
</code></pre>
</section>
<p>
For an Apache Pig M-R job, add the following statements to the Pig script:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="sql language-sql hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&nbsp;mapreduce.map.memory.mb&nbsp;<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">8192</span>;
<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&nbsp;mapreduce.reduce.memory.mb&nbsp;<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">8192</span>;
<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&nbsp;mapreduce.map.java.opts&nbsp;-Xmx6144m;
<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">SET</span>&nbsp;mapreduce.reduce.java.opts&nbsp;-Xmx6144m;
</code></pre>
</section>
<p>
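The first two parameters (the container memory sizes) need to be tuned to your own Hadoop cluster. Set the third and fourth (the JVM heap sizes) to about 70% to 80% of the first two; in the example above, 6144 MB is 75% of 8192 MB, which leaves headroom for the JVM's non-heap memory.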
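<p>The sizing rule can be sketched as a small helper. This is only an illustration: the class MemoryOpts and its method buildArgs are hypothetical names, the same container size is assumed for map and reduce tasks, and the heap ratio is a parameter you choose (0.75 reproduces the values above).</p>

```java
import java.util.ArrayList;
import java.util.List;

class MemoryOpts {
  public static void main(String[] args) {
    // 8192 MB containers with a 75% heap, as in the example above.
    System.out.println(String.join(" ", buildArgs(8192, 0.75)));
  }

  /** Builds the four -D arguments, sizing the JVM heap as a fraction of the container. */
  static List<String> buildArgs(int containerMb, double heapRatio) {
    if (containerMb <= 0 || heapRatio <= 0.0 || heapRatio >= 1.0) {
      throw new IllegalArgumentException("containerMb must be positive and heapRatio in (0, 1)");
    }
    long heapMb = (long) Math.floor(containerMb * heapRatio);
    String[] tasks = {"map", "reduce"};
    List<String> result = new ArrayList<>();
    for (String task : tasks) {
      result.add("-D");
      result.add("mapreduce." + task + ".memory.mb=" + containerMb);
    }
    for (String task : tasks) {
      result.add("-D");
      result.add("mapreduce." + task + ".java.opts=-Xmx" + heapMb + "m");
    }
    return result;
  }
}
```

<p>Keeping the ratio in one place makes it harder to raise the container size and forget to raise -Xmx with it.</p>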
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e8%a7%a3%e5%86%b3map-reduce-job-oomjava-heap-space%e9%94%99%e8%af%af%e7%9a%84%e4%b8%80%e4%b8%aa%e6%96%b9%e6%b3%95%ef%bc%9a%e8%b0%83%e6%95%b4%e5%86%85%e5%ad%98%e5%8f%82%e6%95%b0/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>How to check the quota of a given Hadoop (HDFS) directory</title>
		<link>https://www.codelast.com/%e5%a6%82%e4%bd%95%e6%9f%a5%e7%9c%8b%e6%8c%87%e5%ae%9a%e7%9a%84hadoophdfs%e7%9b%ae%e5%bd%95%e7%9a%84%e9%85%8d%e9%a2%9d/</link>
					<comments>https://www.codelast.com/%e5%a6%82%e4%bd%95%e6%9f%a5%e7%9c%8b%e6%8c%87%e5%ae%9a%e7%9a%84hadoophdfs%e7%9b%ae%e5%bd%95%e7%9a%84%e9%85%8d%e9%a2%9d/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Thu, 11 May 2023 09:12:49 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[hadoop fs -count -q]]></category>
		<category><![CDATA[Hadoop目录配额]]></category>
		<category><![CDATA[HDFS目录配额]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13908</guid>

					<description><![CDATA[<p style="font-size: 16px; line-height: inherit; margin: 0px 0px 16px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-variant-alternates: inherit; font-stretch: inherit; font-family: system-ui, -apple-system, BlinkMacSystemFont, &#34;Segoe UI&#34;, Roboto, &#34;Helvetica Neue&#34;, Arial, &#34;Noto Sans&#34;, sans-serif, &#34;Apple Color Emoji&#34;, &#34;Segoe UI Emoji&#34;, &#34;Segoe UI Symbol&#34;, &#34;Noto Color Emoji&#34;; font-optical-sizing: inherit; font-kerning: inherit; font-feature-settings: inherit; font-variation-settings: inherit; vertical-align: baseline; color: rgb(17, 24, 39);">
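	An HDFS directory quota is a limit set on a directory tree. HDFS supports two kinds: a name quota, which caps the number of files and directories under the directory, and a space quota, which caps the number of bytes its files may occupy (each block replica counts against it). In short, a quota bounds how much a directory can hold, which helps keep the whole system healthy and performant.</p>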
<p style="font-size: 16px; line-height: inherit; margin: 0px 0px 16px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-variant-alternates: inherit; font-stretch: inherit; font-family: system-ui, -apple-system, BlinkMacSystemFont, &#34;Segoe UI&#34;, Roboto, &#34;Helvetica Neue&#34;, Arial, &#34;Noto Sans&#34;, sans-serif, &#34;Apple Color Emoji&#34;, &#34;Segoe UI Emoji&#34;, &#34;Segoe UI Symbol&#34;, &#34;Noto Color Emoji&#34;; font-optical-sizing: inherit; font-kerning: inherit; font-feature-settings: inherit; font-variation-settings: inherit; vertical-align: baseline; color: rgb(17, 24, 39);">
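	By setting appropriate quotas, an administrator can make sure no directory outgrows its intended capacity and keep cluster resources from being hogged or abused. Monitoring usage against the quotas also helps with tuning performance and reducing the risk of failures.</p>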
<p><span id="more-13908"></span></p>
<p>
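	Use the <span style="color:#ff0000;">hadoop fs -count -q&#160;</span><span style="color:#0000ff;">&#60;your_hdfs_dir&#62;</span> command to view the quota of a given directory. Its output has eight columns, with the following meanings:</p>
<ul>
<li>
		QUOTA: the name quota, i.e. the limit on the total number of files and directories</li>
<li>
		REMAINING_QUOTA: how many more files and directories can still be created under this directory</li>
<li>
		SPACE_QUOTA: the space quota granted to this directory</li>
<li>
		REMAINING_SPACE_QUOTA: the space quota still available to this directory</li>
<li>
		DIR_COUNT: current number of directories</li>
<li>
		FILE_COUNT: current number of files</li>
<li>
		CONTENT_SIZE: current total size of the files, in bytes</li>
<li>
		PATHNAME: the directory path</li>
</ul>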
<p></p>
<p>&#8230; <a href="https://www.codelast.com/%e5%a6%82%e4%bd%95%e6%9f%a5%e7%9c%8b%e6%8c%87%e5%ae%9a%e7%9a%84hadoophdfs%e7%9b%ae%e5%bd%95%e7%9a%84%e9%85%8d%e9%a2%9d/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<p style="font-size: 16px; line-height: inherit; margin: 0px 0px 16px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-variant-alternates: inherit; font-stretch: inherit; font-family: system-ui, -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, &quot;Helvetica Neue&quot;, Arial, &quot;Noto Sans&quot;, sans-serif, &quot;Apple Color Emoji&quot;, &quot;Segoe UI Emoji&quot;, &quot;Segoe UI Symbol&quot;, &quot;Noto Color Emoji&quot;; font-optical-sizing: inherit; font-kerning: inherit; font-feature-settings: inherit; font-variation-settings: inherit; vertical-align: baseline; color: rgb(17, 24, 39);">
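	An HDFS directory quota is a limit set on a directory tree. HDFS supports two kinds: a name quota, which caps the number of files and directories under the directory, and a space quota, which caps the number of bytes its files may occupy (each block replica counts against it). In short, a quota bounds how much a directory can hold, which helps keep the whole system healthy and performant.</p>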
<p style="font-size: 16px; line-height: inherit; margin: 0px 0px 16px; padding: 0px; border: 0px; font-variant-numeric: inherit; font-variant-east-asian: inherit; font-variant-alternates: inherit; font-stretch: inherit; font-family: system-ui, -apple-system, BlinkMacSystemFont, &quot;Segoe UI&quot;, Roboto, &quot;Helvetica Neue&quot;, Arial, &quot;Noto Sans&quot;, sans-serif, &quot;Apple Color Emoji&quot;, &quot;Segoe UI Emoji&quot;, &quot;Segoe UI Symbol&quot;, &quot;Noto Color Emoji&quot;; font-optical-sizing: inherit; font-kerning: inherit; font-feature-settings: inherit; font-variation-settings: inherit; vertical-align: baseline; color: rgb(17, 24, 39);">
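	By setting appropriate quotas, an administrator can make sure no directory outgrows its intended capacity and keep cluster resources from being hogged or abused. Monitoring usage against the quotas also helps with tuning performance and reducing the risk of failures.</p>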
<p><span id="more-13908"></span></p>
<p>
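	Use the <span style="color:#ff0000;">hadoop fs -count -q&nbsp;</span><span style="color:#0000ff;">&lt;your_hdfs_dir&gt;</span> command to view the quota of a given directory. Its output has eight columns, with the following meanings:</p>
<ul>
<li>
		QUOTA: the name quota, i.e. the limit on the total number of files and directories</li>
<li>
		REMAINING_QUOTA: how many more files and directories can still be created under this directory</li>
<li>
		SPACE_QUOTA: the space quota granted to this directory</li>
<li>
		REMAINING_SPACE_QUOTA: the space quota still available to this directory</li>
<li>
		DIR_COUNT: current number of directories</li>
<li>
		FILE_COUNT: current number of files</li>
<li>
		CONTENT_SIZE: current total size of the files, in bytes</li>
<li>
		PATHNAME: the directory path</li>
</ul>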
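<p>To make the column order concrete, here is a minimal sketch that splits one output line into the eight fields listed above. The class QuotaLine and the sample line are hypothetical and not taken from a real cluster. The four quota columns are kept as strings because, as far as I know, the command prints none and inf in them when no quota is set.</p>

```java
class QuotaLine {
  final String quota;
  final String remainingQuota;
  final String spaceQuota;
  final String remainingSpaceQuota;
  final long dirCount;
  final long fileCount;
  final long contentSize;
  final String pathName;

  private QuotaLine(String[] f) {
    quota = f[0];
    remainingQuota = f[1];
    spaceQuota = f[2];
    remainingSpaceQuota = f[3];
    dirCount = Long.parseLong(f[4]);
    fileCount = Long.parseLong(f[5]);
    contentSize = Long.parseLong(f[6]);
    pathName = f[7];
  }

  /** Parses one whitespace-separated line; the limit of 8 keeps spaces inside the path. */
  static QuotaLine parse(String line) {
    String[] fields = line.trim().split("\\s+", 8);
    if (fields.length != 8) {
      throw new IllegalArgumentException("expected 8 columns but got " + fields.length);
    }
    return new QuotaLine(fields);
  }

  public static void main(String[] args) {
    // Hypothetical sample line, made up for illustration.
    String sample = "        1000         958    107374182400     85899345922"
        + "           12           30       7158278826 /user/demo";
    QuotaLine q = parse(sample);
    System.out.println(q.pathName + ": name quota " + q.quota
        + ", remaining " + q.remainingQuota
        + ", files " + q.fileCount + ", dirs " + q.dirCount);
  }
}
```

<p>In the made-up sample, 12 directories plus 30 files account for the 42 names already used out of the quota of 1000.</p>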
<p></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%a6%82%e4%bd%95%e6%9f%a5%e7%9c%8b%e6%8c%87%e5%ae%9a%e7%9a%84hadoophdfs%e7%9b%ae%e5%bd%95%e7%9a%84%e9%85%8d%e9%a2%9d/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Reading a local TFRecord file with Java</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84tfrecord%e6%96%87%e4%bb%b6/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84tfrecord%e6%96%87%e4%bb%b6/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Mon, 24 Apr 2023 18:09:06 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[TensorFlow]]></category>
		<category><![CDATA[TFRecord]]></category>
		<category><![CDATA[本地]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13895</guid>

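					<description><![CDATA[<div>
	TFRecord is a binary data format used by TensorFlow that stores and reads large datasets efficiently. A TFRecord file holds a sequence of records; each record is an opaque byte string, typically a serialized Example (or SequenceExample) protocol buffer.</div>
<div>
	Unlike text files, TFRecord files are binary-encoded, which makes them easier to store and to transfer over the network. TFRecord also lets us split a large dataset into multiple parts that can be read and processed efficiently in parallel.</div>
<div>
	In TensorFlow, TFRecord files are commonly used to store and load a model's training, validation, and test data. Creating a TFRecord file requires some serialization work, but this is easy to do because TensorFlow provides APIs for it.</div>
<p><span id="more-13895"></span><br />
In big-data pipelines, TFRecord files are usually produced by map-reduce&#160;jobs, and the volume is usually large. To verify that the contents are correct we sometimes need to inspect a small sample: for example, take one of the N TFRecord files a map-reduce job produced, parse it locally, and print its contents to check them.<br />
Below is an example Java program that reads a TFRecord file and prints one of its Examples:</p>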
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="javascript language-javascript hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&#160;&#160;&#160;&#160;<span class="hljs-built_in" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">String</span>&#160;localTfRecordFile&#160;=&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;/path/to/your/tfrecord/file&#34;</span>;
&#160;&#160;&#160;&#160;InputStream&#160;inputStream&#160;=&#160;Files.newInputStream(Paths.get(localTfRecordFile));
&#160;&#160;&#160;&#160;DataInput&#160;dataInput&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&#160;DataInputStream(inputStream);
&#160;&#160;&#160;&#160;TFRecordReader&#160;reader&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&#160;TFRecordReader(dataInput,&#160;<span class="hljs-literal" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">true</span>);

&#160;&#160;&#160;&#160;byte[]&#160;recordBytes&#160;=&#160;reader.read();
&#160;&#160;&#160;&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">while</span>&#160;(recordBytes&#160;!=&#160;<span class="hljs-literal" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">null</span>)&#160;{
&#160;&#160;&#160;&#160;&#160;&#160;Example&#160;example&#160;=&#160;Example.parseFrom(recordBytes);
&#160;&#160;&#160;&#160;&#160;&#160;System.out.println(example.toString());
&#160;&#160;&#160;&#160;&#160;&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">break</span>;&#160;&#160;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&#160;只打印一个Example</span>
&#160;&#160;&#160;&#160;}
&#160;&#160;&#160;&#160;inputStream.close();
</code></pre>
</section>
<p>The one thing to watch is this import: import java.nio.file.Paths;<br />
Some more detail:</p>
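<div>
	TFRecord files and Example are two TensorFlow concepts for serializing and storing data, and they are closely related.</div>
<div>
	TFRecord is a binary file format that TensorFlow uses to store large amounts of data efficiently; a file usually consists of many serialized Examples. Example is TensorFlow's standard format for serialized data: it holds a set of named Features, and each Feature holds a list of values (bytes, floats, or 64-bit integers). Data is wrapped in Examples before being written to a TFRecord file, and each Example has to be parsed back out when the file is read.</div>
<div>
	In short, a TFRecord file is like a container, and Example is the format of each element inside it. When using TFRecord, we typically first decide which data to store and how to divide it into Features, wrap it into one or more Examples, and then write those Examples to the TFRecord file.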
</div>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84tfrecord%e6%96%87%e4%bb%b6/" class="read-more">Read More </a>]]></description>
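										<content:encoded><![CDATA[<div>
	TFRecord is a binary data format used by TensorFlow that stores and reads large datasets efficiently. A TFRecord file holds a sequence of records; each record is an opaque byte string, typically a serialized Example (or SequenceExample) protocol buffer.</div>
<div>
	Unlike text files, TFRecord files are binary-encoded, which makes them easier to store and to transfer over the network. TFRecord also lets us split a large dataset into multiple parts that can be read and processed efficiently in parallel.</div>
<div>
	In TensorFlow, TFRecord files are commonly used to store and load a model's training, validation, and test data. Creating a TFRecord file requires some serialization work, but this is easy to do because TensorFlow provides APIs for it.</div>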
<p><span id="more-13895"></span><br />
在大数据处理流程中，TFRecord文件通常是由map-reduce&nbsp;job生成的，数据量通常很大。有时为了验证文件内容正确，我们需要取少量数据来检查，例如，我们可以拿map-reduce job生成的N个TFRecord文件中的一个，在本地解析出来，打印出其中的内容看是否正确。<br />
下面就是一个用JAVA程序读取TFRecord文件并打印出其中一个Example的例子：</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="javascript language-javascript hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-built_in" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">String</span>&nbsp;localTfRecordFile&nbsp;=&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;/path/to/your/tfrecord/file&quot;</span>;
&nbsp;&nbsp;&nbsp;&nbsp;InputStream&nbsp;inputStream&nbsp;=&nbsp;Files.newInputStream(Paths.get(localTfRecordFile));
&nbsp;&nbsp;&nbsp;&nbsp;DataInput&nbsp;dataInput&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;DataInputStream(inputStream);
&nbsp;&nbsp;&nbsp;&nbsp;TFRecordReader&nbsp;reader&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;TFRecordReader(dataInput,&nbsp;<span class="hljs-literal" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">true</span>);

&nbsp;&nbsp;&nbsp;&nbsp;byte[]&nbsp;recordBytes&nbsp;=&nbsp;reader.read();
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">while</span>&nbsp;(recordBytes&nbsp;!=&nbsp;<span class="hljs-literal" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">null</span>)&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Example&nbsp;example&nbsp;=&nbsp;Example.parseFrom(recordBytes);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(example.toString());
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">break</span>;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&nbsp;print&nbsp;only&nbsp;one&nbsp;Example</span>
&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;inputStream.close();
</code></pre>
</section>
<p>The one thing to watch for is the import: import java.nio.file.Paths;<br />
To explain in a bit more detail:</p>
<div>
	TFRecord files and Example are two closely related concepts that TensorFlow uses for serializing and storing data.</div>
<div>
	TFRecord is a binary file format that TensorFlow uses to store large amounts of data efficiently. A file is usually a sequence of serialized Examples. Example is TensorFlow's standard format for serialized data: it holds a set of named Features, and each Feature holds a list of byte strings, floats, or 64-bit integers. When writing data to a TFRecord file, you wrap it in Examples; when reading a TFRecord file, you parse each Example back out.</div>
<div>
	In short, the TFRecord file is the container, and Example is the format of each element inside it. When using TFRecord, we usually first decide which data to store and how to split it into Features, wrap it in one or more Examples, and then write those Examples to the TFRecord file.</p>
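<p>To make the container idea concrete, here is a minimal sketch of the record framing that a TFRecord file uses on disk: an 8-byte little-endian length, a 4-byte masked CRC32C of that length, the payload bytes, and a 4-byte masked CRC32C of the payload. The class name TfRecordFraming and its methods are hypothetical and are not part of any TensorFlow library; the payloads are plain strings standing in for serialized Examples. It needs Java 9 or later for java.util.zip.CRC32C.</p>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32C;

/** Hypothetical helper for illustration; not part of any TensorFlow library. */
final class TfRecordFraming {
    private static final int MASK_DELTA = 0xa282ead8;

    /** TFRecord "masked" CRC32C: rotate the CRC right by 15 bits, then add a constant. */
    static int maskedCrc32c(byte[] data) {
        CRC32C crc = new CRC32C();
        crc.update(data, 0, data.length);
        int c = (int) crc.getValue();
        return ((c >>> 15) | (c << 17)) + MASK_DELTA;
    }

    /** Frames one payload as: length, CRC of length, payload, CRC of payload. */
    static byte[] frame(byte[] payload) {
        byte[] len = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN)
                .putLong(payload.length).array();
        return ByteBuffer.allocate(16 + payload.length).order(ByteOrder.LITTLE_ENDIAN)
                .put(len).putInt(maskedCrc32c(len))
                .put(payload).putInt(maskedCrc32c(payload))
                .array();
    }

    /** Splits the bytes of a whole TFRecord file into payloads, verifying both CRCs. */
    static List<byte[]> readAll(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        List<byte[]> records = new ArrayList<>();
        while (buf.hasRemaining()) {
            if (buf.remaining() < 12) {
                throw new IllegalArgumentException("truncated record header");
            }
            byte[] len = new byte[8];
            buf.get(len);
            if (buf.getInt() != maskedCrc32c(len)) {
                throw new IllegalArgumentException("length CRC mismatch");
            }
            long length = ByteBuffer.wrap(len).order(ByteOrder.LITTLE_ENDIAN).getLong();
            if (length < 0 || length > buf.remaining() - 4) {
                throw new IllegalArgumentException("truncated record");
            }
            byte[] payload = new byte[(int) length];
            buf.get(payload);
            if (buf.getInt() != maskedCrc32c(payload)) {
                throw new IllegalArgumentException("data CRC mismatch");
            }
            records.add(payload);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two string payloads stand in for serialized Example protos.
        byte[] a = frame("first".getBytes(StandardCharsets.UTF_8));
        byte[] b = frame("second".getBytes(StandardCharsets.UTF_8));
        byte[] file = ByteBuffer.allocate(a.length + b.length).put(a).put(b).array();
        for (byte[] record : readAll(file)) {
            System.out.println(new String(record, StandardCharsets.UTF_8));
        }
    }
}
```

<p>In the reader shown earlier, the second argument of new TFRecordReader(dataInput, true) presumably turns on this same CRC verification. For a small local sample file, the bytes for readAll could come from Files.readAllBytes, and each returned payload could then be passed to Example.parseFrom.</p>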
<p>
		<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
		Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
		Thanks for following my WeChat official account (scan the QR code with WeChat):<br />
		<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
		And my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
		<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84tfrecord%e6%96%87%e4%bb%b6/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Reading a local Hadoop sequence file with a Java program</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e7%a8%8b%e5%ba%8f%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84hadoop-sequence-file/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e7%a8%8b%e5%ba%8f%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84hadoop-sequence-file/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Thu, 20 Apr 2023 12:23:18 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Original]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Hadoop sequence file]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Local]]></category>
		<category><![CDATA[Read]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13885</guid>

					<description><![CDATA[<div>
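	SequenceFile is Hadoop's file format for storing binary key-value pairs. It supports different key/value type combinations, such as IntWritable/Text or NullWritable/BytesWritable.</div>
<div>
	Suppose my sequence file has keys of type BooleanWritable and values of type Text. How do I read it?<br />
	<span id="more-13885"></span></div>
<p>Without further ado, here is the code:</p>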
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&#160;&#160;&#160;&#160;Configuration&#160;conf&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&#160;Configuration();
&#160;&#160;&#160;&#160;FileSystem&#160;fs&#160;=&#160;FileSystem.getLocal(conf);
&#160;&#160;&#160;&#160;Path&#160;seqFilePath&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&#160;Path(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;/path/to/your/sequence-file&#34;</span>);

&#160;&#160;&#160;&#160;{
&#160;&#160;&#160;&#160;&#160;&#160;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&#160;print&#160;the&#160;sequence&#160;file's&#160;key&#160;and&#160;value&#160;types</span>
&#160;&#160;&#160;&#160;&#160;&#160;SequenceFile.Reader&#160;reader&#160;=&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&#160;SequenceFile.Reader(fs,&#160;seqFilePath,&#160;conf);
&#160;&#160;&#160;&#160;&#160;&#160;Writable&#160;key&#160;=&#160;(Writable)&#160;reader.getKeyClass().newInstance();
&#160;&#160;&#160;&#160;&#160;&#160;Writable&#160;value&#160;=&#160;(Writable)&#160;reader.getValueClass().newInstance();
&#160;&#160;&#160;&#160;&#160;&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">while</span>&#160;(reader.next(key,&#160;value))&#160;{
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;System.out.println(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;Key&#160;type:&#160;&#34;</span>&#160;+&#160;key.getClass());
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;System.out.println(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;Value&#160;type:&#160;&#34;</span>&#160;+&#160;value.getClass());
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">break</span>;
&#160;&#160;&#160;&#160;&#160;&#160;}
&#160;&#160;&#160;&#160;&#160;&#160;reader.close();
&#160;&#160;&#160;&#160;}

&#160;&#160;&#160;&#160;{
&#160;&#160;&#160;&#160;&#160;&#160;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&#160;print&#160;the&#160;contents&#160;of&#160;the&#160;sequence&#160;file</span>
&#160;&#160;&#160;&#160;&#160;&#160;SequenceFile.Reader&#160;reader&#160;=&#160;</code></pre>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e7%a8%8b%e5%ba%8f%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84hadoop-sequence-file/" class="read-more">Read More </a></section>]]></description>
										<content:encoded><![CDATA[<div>
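	SequenceFile is Hadoop's file format for storing binary key-value pairs. It supports different key/value type combinations, such as IntWritable/Text or NullWritable/BytesWritable.</div>
<div>
	Suppose my sequence file has keys of type BooleanWritable and values of type Text. How do I read it?<br />
	<span id="more-13885"></span></div>
<p>Without further ado, here is the code:</p>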
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;">&nbsp;&nbsp;&nbsp;&nbsp;Configuration&nbsp;conf&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;Configuration();
&nbsp;&nbsp;&nbsp;&nbsp;FileSystem&nbsp;fs&nbsp;=&nbsp;FileSystem.getLocal(conf);
&nbsp;&nbsp;&nbsp;&nbsp;Path&nbsp;seqFilePath&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;Path(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;/path/to/your/sequence-file&quot;</span>);

&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&nbsp;print&nbsp;the&nbsp;sequence&nbsp;file's&nbsp;key&nbsp;and&nbsp;value&nbsp;types</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SequenceFile.Reader&nbsp;reader&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;SequenceFile.Reader(fs,&nbsp;seqFilePath,&nbsp;conf);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Writable&nbsp;key&nbsp;=&nbsp;(Writable)&nbsp;reader.getKeyClass().newInstance();
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Writable&nbsp;value&nbsp;=&nbsp;(Writable)&nbsp;reader.getValueClass().newInstance();
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">while</span>&nbsp;(reader.next(key,&nbsp;value))&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;Key&nbsp;type:&nbsp;&quot;</span>&nbsp;+&nbsp;key.getClass());
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;Value&nbsp;type:&nbsp;&quot;</span>&nbsp;+&nbsp;value.getClass());
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">break</span>;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reader.close();
&nbsp;&nbsp;&nbsp;&nbsp;}

&nbsp;&nbsp;&nbsp;&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&nbsp;print&nbsp;the&nbsp;contents&nbsp;of&nbsp;the&nbsp;sequence&nbsp;file</span>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SequenceFile.Reader&nbsp;reader&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;SequenceFile.Reader(fs,&nbsp;seqFilePath,&nbsp;conf);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BooleanWritable&nbsp;keyObj&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;BooleanWritable();
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Text&nbsp;valueObj&nbsp;=&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">new</span>&nbsp;Text();
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">while</span>&nbsp;(reader.next(keyObj,&nbsp;valueObj))&nbsp;{
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;System.out.println(keyObj.get()&nbsp;+&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;:&nbsp;&quot;</span>&nbsp;+&nbsp;valueObj);
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reader.close();
&nbsp;&nbsp;&nbsp;&nbsp;}
</code></pre>
</section>
<p>
The output looks something like this:</p>
<blockquote>
<div>
		Key type: class org.apache.hadoop.io.BooleanWritable</div>
<div>
		Value type: class org.apache.hadoop.io.Text<br />
		(file contents omitted here)</div>
</blockquote>
<p>The Maven dependencies depend on the Hadoop version you use; here is an example:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="xml language-xml hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependencies</span>&gt;</span>
&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependency</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">groupId</span>&gt;</span>org.apache.hadoop<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">groupId</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">artifactId</span>&gt;</span>hadoop-common<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">artifactId</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">version</span>&gt;</span>3.2.1<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">version</span>&gt;</span>
&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependency</span>&gt;</span>
&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependency</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">groupId</span>&gt;</span>org.apache.hadoop<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">groupId</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">artifactId</span>&gt;</span>hadoop-client<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">artifactId</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">version</span>&gt;</span>3.2.1<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">version</span>&gt;</span>
&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependency</span>&gt;</span>
&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependency</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">groupId</span>&gt;</span>org.apache.hadoop<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">groupId</span>&gt;</span>&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">artifactId</span>&gt;</span>hadoop-hdfs<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">artifactId</span>&gt;</span>
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">version</span>&gt;</span>3.2.1<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">version</span>&gt;</span>
&nbsp;&nbsp;<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependency</span>&gt;</span>
<span class="hljs-tag" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">&lt;/<span class="hljs-name" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">dependencies</span>&gt;</span></code></pre>
</section>
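<p>If all you need is the key and value types, they can also be read straight from the file header without Hadoop on the classpath. The sketch below assumes a version 6 header (the magic bytes SEQ followed by a version byte of 6), in which the key and value class names follow as length-prefixed UTF-8 strings. It only handles class names shorter than 128 bytes, where the length fits in a single byte. The class SeqFileHeaderPeek and its methods are hypothetical names made up for this illustration, and the demo parses a synthetic header rather than a real file.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class SeqFileHeaderPeek {

    /** Returns {keyClassName, valueClassName} parsed from the first bytes of a SequenceFile. */
    static String[] keyValueClasses(byte[] head) {
        ByteBuffer buf = ByteBuffer.wrap(head);
        if (buf.remaining() < 4 || buf.get() != 'S' || buf.get() != 'E' || buf.get() != 'Q') {
            throw new IllegalArgumentException("not a SequenceFile: missing SEQ magic");
        }
        int version = buf.get();
        if (version != 6) {
            // Assumption: older header versions are out of scope for this sketch.
            throw new IllegalArgumentException("unsupported header version: " + version);
        }
        String keyClass = readShortString(buf);
        String valueClass = readShortString(buf);
        return new String[] {keyClass, valueClass};
    }

    /** Reads a string whose length prefix fits in a single byte (0..127). */
    private static String readShortString(ByteBuffer buf) {
        int len = buf.get();
        if (len < 0) {
            throw new IllegalArgumentException("multi-byte length prefix not handled in this sketch");
        }
        byte[] bytes = new byte[len];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Builds a synthetic header prefix so that the demo needs no real file. */
    static byte[] syntheticHeader(String keyClass, String valueClass) {
        byte[] k = keyClass.getBytes(StandardCharsets.UTF_8);
        byte[] v = valueClass.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + 1 + k.length + 1 + v.length)
                .put((byte) 'S').put((byte) 'E').put((byte) 'Q').put((byte) 6)
                .put((byte) k.length).put(k)
                .put((byte) v.length).put(v)
                .array();
    }

    public static void main(String[] args) {
        byte[] head = syntheticHeader(
                "org.apache.hadoop.io.BooleanWritable", "org.apache.hadoop.io.Text");
        String[] kv = keyValueClasses(head);
        System.out.println("Key type: " + kv[0]);
        System.out.println("Value type: " + kv[1]);
    }
}
```

<p>For a real file, pass in the first few hundred bytes, for example from Files.newInputStream(path).readNBytes(512) on Java 11 or later. Reading the records themselves still needs SequenceFile.Reader as shown above.</p>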
<p><span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Follow for updates&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;</p>
<p>
	Thanks for following my WeChat official account (scan the QR code with WeChat):<br />
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
	And my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%94%a8java%e7%a8%8b%e5%ba%8f%e8%af%bb%e5%8f%96%e6%9c%ac%e5%9c%b0%e7%9a%84hadoop-sequence-file/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] Adding retry configuration to an Azkaban job</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e4%b8%baazkaban-job%e6%b7%bb%e5%8a%a0%e9%87%8d%e8%af%95%e9%85%8d%e7%bd%ae/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e4%b8%baazkaban-job%e6%b7%bb%e5%8a%a0%e9%87%8d%e8%af%95%e9%85%8d%e7%bd%ae/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Thu, 30 Mar 2023 11:52:58 +0000</pubDate>
				<category><![CDATA[Original]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Azkaban]]></category>
		<category><![CDATA[retry]]></category>
		<category><![CDATA[Retry]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13873</guid>

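					<description><![CDATA[<p><a href="https://azkaban.github.io/" rel="noopener" target="_blank">Azkaban</a> is an open-source workflow scheduler that helps users automate the scheduling of task pipelines. It provides an easy-to-use web UI for creating, monitoring, and executing workflows, and supports advanced features such as parameterized configuration and module reuse. It also offers an extensible plugin interface so that it can integrate with other systems.<br />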
<span id="more-13873"></span></p>
<div style="text-align: center;">
	<img decoding="async" alt="Azkaban workflow" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/azkaban.png" style="width: 700px; height: 406px;" /></div>
<p>
Adding retry configuration to an Azkaban job improves its stability and reliability. When a job fails, automatic retries raise the success rate and reduce the need for manual intervention. The retry settings let you control key parameters such as the number of retries and the interval between them, so they can be tuned per job.</p>
<p>There are two kinds of retry configuration:<br />
<span style="background-color:#00ff00;">➤</span>&#160;common.retries=3<br />
A retry count shared by multiple jobs. When a directory contains several .job files and you want all of them to use the same retry setting, add a&#160;config.properties file to that directory containing the line&#160;common.retries=3. You can of course change 3 to whatever retry count you want.<br />
<span style="background-color: rgb(0, 255, 0);">➤</span>&#160;retries=3<br />
To add retry configuration to a single job only, add the line&#160;retries=3 directly to that .job file. Again, you can change 3 to whatever retry count you want.<br />
In this case you can also set <span style="color: rgb(17, 24, 39); font-size: 16px; white-space: pre-wrap;">the number of milliseconds the task waits before each retry</span>:</p>
<blockquote>
<p>
		retry.backoff=60000</p>
</blockquote>
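<p>Here, <span style="color: rgb(17, 24, 39); font-family: inherit; font-size: 16px; white-space: pre-wrap;">60000 means 60,000 milliseconds: if the task fails and needs to be retried, it waits 60 seconds before the next attempt. This helps avoid overloading the server and improves the task's chance of succeeding after a failure.</span></p>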
<p>
	<span style="color: rgb(255, 0, 0);">➤➤</span>&#160;Copyright notice&#160;<span style="color: rgb(255, 0, 0);">➤➤</span>&#160;<br />
	Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&#160;<br />
	Thanks for following my WeChat official account (scan with WeChat):<br />
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
	and my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" />&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e4%b8%baazkaban-job%e6%b7%bb%e5%8a%a0%e9%87%8d%e8%af%95%e9%85%8d%e7%bd%ae/" class="read-more">Read More </a></p>]]></description>
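										<content:encoded><![CDATA[<p><a href="https://azkaban.github.io/" rel="noopener" target="_blank">Azkaban</a> is an open-source workflow scheduler that automates running tasks as ordered flows. It provides an easy-to-use web UI for creating, monitoring, and executing workflows, and supports advanced features such as parameterized configuration and module reuse. It also offers an extensible plugin interface for integrating with other systems.<br />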
<span id="more-13873"></span></p>
<div style="text-align: center;">
	<img decoding="async" alt="Azkaban workflow" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/azkaban.png" style="width: 700px; height: 406px;" /></div>
<p>
Adding retry configuration to an Azkaban job improves its stability and reliability. When a job fails, automatic retries raise the success rate and reduce the need for manual intervention. The retry settings let you control key parameters such as the number of retries and the interval between them, so they can be tuned per job.</p>
<p>There are two kinds of retry configuration:<br />
<span style="background-color:#00ff00;">➤</span>&nbsp;common.retries=3<br />
A retry count shared by multiple jobs. When a directory contains several .job files and you want all of them to use the same retry setting, add a&nbsp;config.properties file to that directory containing the line&nbsp;common.retries=3. You can of course change 3 to whatever retry count you want.<br />
<span style="background-color: rgb(0, 255, 0);">➤</span>&nbsp;retries=3<br />
To add retry configuration to a single job only, add the line&nbsp;retries=3 directly to that .job file. Again, you can change 3 to whatever retry count you want.<br />
In this case you can also set <span style="color: rgb(17, 24, 39); font-size: 16px; white-space: pre-wrap;">the number of milliseconds the task waits before each retry</span>:</p>
<blockquote>
<p>
		retry.backoff=60000</p>
</blockquote>
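<p>Here, <span style="color: rgb(17, 24, 39); font-family: inherit; font-size: 16px; white-space: pre-wrap;">60000 means 60,000 milliseconds: if the task fails and needs to be retried, it waits 60 seconds before the next attempt. This helps avoid overloading the server and improves the task's chance of succeeding after a failure.</span></p>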
<p>
	<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
	Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
	Thanks for following my WeChat official account (scan with WeChat):<br />
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
	and my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e4%b8%baazkaban-job%e6%b7%bb%e5%8a%a0%e9%87%8d%e8%af%95%e9%85%8d%e7%bd%ae/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] How to return a response containing Chinese text from a Java sun HttpServer handler</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-sun-httpserver%e5%9c%a8handler%e4%b8%ad%e8%bf%94%e5%9b%9e%e5%90%ab%e4%b8%ad%e6%96%87%e7%9a%84response%e5%ba%94%e8%af%a5%e6%80%8e%e4%b9%88%e5%a4%84%e7%90%86/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-sun-httpserver%e5%9c%a8handler%e4%b8%ad%e8%bf%94%e5%9b%9e%e5%90%ab%e4%b8%ad%e6%96%87%e7%9a%84response%e5%ba%94%e8%af%a5%e6%80%8e%e4%b9%88%e5%a4%84%e7%90%86/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Thu, 23 Mar 2023 09:10:14 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[handler]]></category>
		<category><![CDATA[JAVA sun HttpServer]]></category>
		<category><![CDATA[中文]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13856</guid>

					<description><![CDATA[<div>
	When implementing a web server in Java with the sun HttpServer, the following handler class was used to process requests:</div>
<div>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &#34;Helvetica Neue&#34;, Helvetica, &#34;Hiragino Sans GB&#34;, &#34;Microsoft YaHei&#34;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">static</span>&#160;<span class="hljs-class" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">class</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">MyHandler</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">implements</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">HttpHandler</span>&#160;</span>{
&#160;&#160;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&#160;&#160;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">public</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&#160;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">handle</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(HttpExchange&#160;httpExchange)</span>&#160;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&#160;IOException&#160;</span>{
&#160;&#160;&#160;&#160;String&#160;response&#160;=&#160;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&#34;test&#34;</span>;&#160;&#160;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&#160;return&#160;fixed&#160;content</span>
&#160;&#160;&#160;&#160;httpExchange.sendResponseHeaders(<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">200</span>,&#160;response.length());
&#160;&#160;&#160;&#160;OutputStream&#160;os&#160;=&#160;httpExchange.getResponseBody();
&#160;&#160;&#160;&#160;os.write(response.getBytes());
&#160;&#160;&#160;&#160;os.close();
&#160;&#160;}
}
</code></pre>
</section>
</div>
<div>
<span id="more-13856"></span>	<br />
	If you want the response to contain Chinese text, for example:</div>
<blockquote>
<div>
		String response = &#34;中文&#34;;</div>
</blockquote>
<div>
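	then you will find that requests to this web server return null, which indicates a problem in the handler implementation. So how should the code be changed?<br />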
	<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a></div>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-sun-httpserver%e5%9c%a8handler%e4%b8%ad%e8%bf%94%e5%9b%9e%e5%90%ab%e4%b8%ad%e6%96%87%e7%9a%84response%e5%ba%94%e8%af%a5%e6%80%8e%e4%b9%88%e5%a4%84%e7%90%86/" class="read-more">Read More </a>]]></description>
										<content:encoded><![CDATA[<div>
	When implementing a web server in Java with the sun HttpServer, the following handler class was used to process requests:</div>
<div>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">static</span>&nbsp;<span class="hljs-class" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">class</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">MyHandler</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">implements</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">HttpHandler</span>&nbsp;</span>{
&nbsp;&nbsp;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&nbsp;&nbsp;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">public</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">handle</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(HttpExchange&nbsp;httpExchange)</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&nbsp;IOException&nbsp;</span>{
&nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;response&nbsp;=&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;test&quot;</span>;&nbsp;&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&nbsp;return&nbsp;fixed&nbsp;content</span>
&nbsp;&nbsp;&nbsp;&nbsp;httpExchange.sendResponseHeaders(<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">200</span>,&nbsp;response.length());
&nbsp;&nbsp;&nbsp;&nbsp;OutputStream&nbsp;os&nbsp;=&nbsp;httpExchange.getResponseBody();
&nbsp;&nbsp;&nbsp;&nbsp;os.write(response.getBytes());
&nbsp;&nbsp;&nbsp;&nbsp;os.close();
&nbsp;&nbsp;}
}
</code></pre>
</section>
</div>
<div>
<span id="more-13856"></span>	<br />
	If you want the response to contain Chinese text, for example:</div>
<blockquote>
<div>
		String response = &quot;中文&quot;;</div>
</blockquote>
<div>
	then you will find that requests to this web server return null, which indicates a problem in the handler implementation. So how should the code be changed?<br />
	As follows:</p>
<section class="output_wrapper" id="output_wrapper_id" style="font-size: 16px; color: rgb(62, 62, 62); line-height: 1.6; letter-spacing: 0px; font-family: &quot;Helvetica Neue&quot;, Helvetica, &quot;Hiragino Sans GB&quot;, &quot;Microsoft YaHei&quot;, Arial, sans-serif;">
<pre style="font-size: inherit; color: inherit; line-height: inherit; margin-top: 0px; margin-bottom: 0px; padding: 0px;">
<code class="java language-java hljs" style="margin: 0px 2px; line-height: 18px; font-size: 14px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; color: rgb(169, 183, 198); background: rgb(40, 43, 46); padding: 0.5em; overflow-wrap: normal !important; word-break: normal !important; overflow: auto !important; display: -webkit-box !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">static</span>&nbsp;<span class="hljs-class" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">class</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">MyHandler</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">implements</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">HttpHandler</span>&nbsp;</span>{
&nbsp;&nbsp;<span class="hljs-meta" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(91, 218, 237); word-wrap: inherit !important; word-break: inherit !important;">@Override</span>
&nbsp;&nbsp;<span class="hljs-function" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;"><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">public</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">void</span>&nbsp;<span class="hljs-title" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(165, 218, 45); word-wrap: inherit !important; word-break: inherit !important;">handle</span><span class="hljs-params" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(255, 152, 35); word-wrap: inherit !important; word-break: inherit !important;">(HttpExchange&nbsp;httpExchange)</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; overflow-wrap: inherit !important; word-break: inherit !important;">throws</span>&nbsp;IOException&nbsp;</span>{
&nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;response&nbsp;=&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">&quot;中文&quot;</span>;
&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">byte</span>[]&nbsp;bytes&nbsp;=&nbsp;response.getBytes(StandardCharsets.UTF_8);&nbsp;<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">//&nbsp;转换为字节数组，指定编码为UTF-8</span>
&nbsp;&nbsp;&nbsp;&nbsp;httpExchange.sendResponseHeaders(<span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">200</span>,&nbsp;bytes.length);
&nbsp;&nbsp;&nbsp;&nbsp;OutputStream&nbsp;os&nbsp;=&nbsp;httpExchange.getResponseBody();
&nbsp;&nbsp;&nbsp;&nbsp;os.write(bytes);
&nbsp;&nbsp;&nbsp;&nbsp;os.close();
&nbsp;&nbsp;}
}
</code></pre>
</section>
</div>
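<div>
	One way to see why the first handler fails: response.length() counts UTF-16 code units, whereas the length passed to sendResponseHeaders is a byte count. For ASCII text the two coincide; for Chinese text encoded as UTF-8 they do not. The small sketch below prints both values. The class name ContentLengthDemo and the helper utf8Length are illustrative names, not part of the original code.</div>

```java
import java.nio.charset.StandardCharsets;

class ContentLengthDemo {
    // Number of bytes the body occupies once encoded as UTF-8.
    static int utf8Length(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "test";
        String chinese = "中文";
        // prints "4 chars, 4 bytes"
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes");
        // prints "2 chars, 6 bytes"
        System.out.println(chinese.length() + " chars, " + utf8Length(chinese) + " bytes");
    }
}
```

<div>
	For "中文" the character count is 2 but the UTF-8 body is 6 bytes, so a handler that declares response.length() announces fewer bytes than it then tries to write.</div>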
<div>
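	With this change, the length passed to sendResponseHeaders is the byte count of the encoded body rather than the character count, so requests to the server return the correct content instead of null.</p>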
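<div>
	A slightly fuller variant, offered as a sketch rather than a definitive fix: it also declares the charset in a Content-Type header, which the handler above does not set, so that clients need not guess the encoding, and it closes the stream with try-with-resources. The class name Utf8TextHandler and the start helper are assumptions made for illustration; the calls themselves come from the JDK's com.sun.net.httpserver package.</div>

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class Utf8TextHandler implements HttpHandler {
    private final String body;

    Utf8TextHandler(String body) {
        this.body = body;
    }

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        // Encode once, then reuse the same byte array for the length and the body.
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        // Tell the client which charset the bytes are in.
        exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=UTF-8");
        exchange.sendResponseHeaders(200, bytes.length);
        // try-with-resources closes the stream even if write() throws.
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(bytes);
        }
    }

    // Hypothetical helper: starts a server on a free local port that always answers with body.
    static HttpServer start(String body) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", new Utf8TextHandler(body));
        server.start();
        return server;
    }
}
```

<div>
	Calling Utf8TextHandler.start("中文") and requesting the root path on the port reported by getAddress().getPort() should return the two characters intact; call stop(0) on the returned server when done.</div>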
<p>
		<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
		Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
		Thanks for following my WeChat official account (scan with WeChat):<br />
		<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
		and my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
		<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-java-sun-httpserver%e5%9c%a8handler%e4%b8%ad%e8%bf%94%e5%9b%9e%e5%90%ab%e4%b8%ad%e6%96%87%e7%9a%84response%e5%ba%94%e8%af%a5%e6%80%8e%e4%b9%88%e5%a4%84%e7%90%86/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>[Original] &quot;Xiezuocat&quot; (秘塔写作猫): loving you is not easy</title>
		<link>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%a7%98%e5%a1%94%e5%86%99%e4%bd%9c%e7%8c%ab%ef%bc%8c%e6%83%b3%e8%af%b4%e7%88%b1%e4%bd%a0%e4%b8%8d%e5%ae%b9%e6%98%93/</link>
					<comments>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%a7%98%e5%a1%94%e5%86%99%e4%bd%9c%e7%8c%ab%ef%bc%8c%e6%83%b3%e8%af%b4%e7%88%b1%e4%bd%a0%e4%b8%8d%e5%ae%b9%e6%98%93/#respond</comments>
		
		<dc:creator><![CDATA[learnhard]]></dc:creator>
		<pubDate>Fri, 16 Dec 2022 03:17:11 +0000</pubDate>
				<category><![CDATA[原创]]></category>
		<category><![CDATA[综合]]></category>
		<category><![CDATA[AIGC]]></category>
		<category><![CDATA[AI写作助手]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[秘塔写作猫]]></category>
		<guid isPermaLink="false">https://www.codelast.com/?p=13789</guid>

					<description><![CDATA[<div style="text-align: center;">
	<img decoding="async" alt="" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/ai_xiezuocat.png" style="width: 200px; height: 200px;" /></div>
<p>&#8220;<a href="https://xiezuocat.com/" rel="noopener" target="_blank">秘塔写作猫</a>&#8221;是一款基于 AI 的中文、英文纠错工具，它可以帮助&#8220;不会写文章&#8221;的你从0开始撰写文章大纲甚至全文。它是最近比较流行的一款AIGC产品。</p>
<blockquote>
<p>
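		As AI technology keeps iterating, AIGC (AI-generated content) has emerged as a new mode of production alongside UGC and PGC and is moving from concept to real-world use. Literally, it means content produced by AI: AI-written poetry, AI-composed music, AI painting, AI face swapping, and so on.</p>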
</blockquote>
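<p>For example, if your boss asks you to produce a marketing plan at very short notice and you cannot come up with enough words however hard you try, you can hand the job to &#8220;Xiezuocat&#8221;.<br />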
<span id="more-13789"></span><br />
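AIGC products have appeared one after another over the past year. I don't write for a living, but Xiezuocat has been praised by many tech media outlets, so I became quite interested and tried it out.</p>
<p>In my view, Xiezuocat has several very obvious problems:<br />
(1) It readily fabricates claims that contradict the facts<br />
I gave it the title <span style="color:#0000ff;">Box-office analysis of the film Avatar 2</span> and asked it to write the article. This is what it produced:</p>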
<p><img decoding="async" alt="" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/xiezuocat_avatar_2.png" style="width: 700px; height: 623px;" /></p>
<p>Look how absurd this is. Avatar 2 was only released in December 2022, yet the AI says it came out in 2015, and it even made up plausible-looking box-office figures.<br />
(2) Even when given the preceding text, it does not know how to continue<br />
I wrote an article title, <span style="color:#0000ff;">The arrival of ChatGPT makes it possible for machines to replace Cambly</span>, followed by two paragraphs as the opening:</p>
<blockquote>
<div>
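		Cambly is a platform for practicing spoken English through video chats with native English speakers.</div>
<div>
		ChatGPT is an AI chatbot developed by OpenAI and launched in November 2022. It uses a large language model based on the GPT-3.5 architecture and is trained with reinforcement learning.</div>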
</blockquote>
<p>I then asked the AI to continue, and it produced the following:</p>
<blockquote>
<div>
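		<span style="color:#ff0000;">Users can video chat with Cambly on ChatGPT</span>.</div>
<div>
		ChatGPT can complete many different tasks while holding a natural-language conversation, such as images, voice, text, phone calls, video, and chat.</div>
<div>
		ChatGPT is a multimodal dialogue system that can classify different inputs through a multilayer perceptron. ChatGPT can understand different data streams and has strong self-learning capabilities to improve its performance.</div>
<div>
		In a dialogue system, chat content related to a specific user is handled by providing a natural-language generation model.</div>
<div>
		ChatGPT is OpenAI's first system that can handle natural-language interaction by training and deploying multilayer perceptrons without any other code.</div>
<div>
		&#8220;We are the first to truly build a multimodal chatbot system centered on trained models,&#8221; said Hilary Wu, OpenAI project lead.</div>
<div>
		&#8220;We have a complete, robust, and efficient multimodal chat system and an efficient, robust, and highly scalable end-to-end architecture.&#8221;</div>
<div>
		&#8220;Besides ChatGPT, we have many other multimodal and data-driven language generation models to choose from,&#8221; said Jay McCoy, OpenAI project manager.</div>
<div>
		&#8220;OpenAI is working to build a complete and extensible human-computer interaction framework, making it a world-leading, most reliable, predictable, and efficient platform for developing and using models based on natural-language understanding and generation (such as knowledge representation and reinforcement learning) to handle natural-language interaction,&#8221; said Jay McCoy, OpenAI project lead.</div>
<div>
		<span style="color:#ff0000;">As one of the most influential and pioneering development projects in this field, Cambly achieved great success in just a few years: it is a fully open-source robot application that runs on top of the open-source community without any code and offers users rich natural-language interaction.</span></div>
<div>
		<span style="color:#ff0000;">However, the program recently lost its customer base during the COVID-19 pandemic, which means many companies have now gone back to using Cambly for their business communication.</span></div>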
</blockquote>
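<p>All I can say is that this is absurd; the result is unusable. At the start of the article I had already told the AI that Cambly is a platform for learning English with real people, yet it ignored this key fact entirely, treated Cambly as a &#8220;robot application&#8221;, and went further off the rails from there.</p>
<p>How truthful an AI's output is depends first on how recent its training data is, and second on the model architecture or paradigm it uses.<br />
Training an AI on all the data on the internet is not feasible, so the ability to learn from feedback during interaction is very important; that is the reinforcement-learning paradigm.<br />
My unsubstantiated guess is that Xiezuocat does not use reinforcement learning and is built only on traditional machine-learning models.<br />
An advanced AI that uses reinforcement learning would, like ChatGPT, have some ability to learn through interaction.<br />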
<span style="color: rgb(255, 255, 255);">文章来源：</span><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><span style="color: rgb(255, 255, 255);">https://www.codelast.com/</span></a>&#8230; <a href="https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%a7%98%e5%a1%94%e5%86%99%e4%bd%9c%e7%8c%ab%ef%bc%8c%e6%83%b3%e8%af%b4%e7%88%b1%e4%bd%a0%e4%b8%8d%e5%ae%b9%e6%98%93/" class="read-more">Read More </a></p>]]></description>
										<content:encoded><![CDATA[<div style="text-align: center;">
	<img decoding="async" alt="" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/ai_xiezuocat.png" style="width: 200px; height: 200px;" /></div>
<p>&ldquo;<a href="https://xiezuocat.com/" rel="noopener" target="_blank">秘塔写作猫</a>&rdquo;是一款基于 AI 的中文、英文纠错工具，它可以帮助&ldquo;不会写文章&rdquo;的你从0开始撰写文章大纲甚至全文。它是最近比较流行的一款AIGC产品。</p>
<blockquote>
<p>
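		As AI technology keeps iterating, AIGC (AI-generated content) has emerged as a new mode of production alongside UGC and PGC and is moving from concept to real-world use. Literally, it means content produced by AI: AI-written poetry, AI-composed music, AI painting, AI face swapping, and so on.</p>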
</blockquote>
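<p>For example, if your boss asks you to produce a marketing plan at very short notice and you cannot come up with enough words however hard you try, you can hand the job to &ldquo;Xiezuocat&rdquo;.<br />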
<span id="more-13789"></span><br />
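AIGC products have appeared one after another over the past year. I don't write for a living, but Xiezuocat has been praised by many tech media outlets, so I became quite interested and tried it out.</p>
<p>In my view, Xiezuocat has several very obvious problems:<br />
(1) It readily fabricates claims that contradict the facts<br />
I gave it the title <span style="color:#0000ff;">Box-office analysis of the film Avatar 2</span> and asked it to write the article. This is what it produced:</p>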
<p><img decoding="async" alt="" src="https://www.codelast.com/wp-content/uploads/ckfinder/images/xiezuocat_avatar_2.png" style="width: 700px; height: 623px;" /></p>
<p>Look how absurd this is. Avatar 2 was only released in December 2022, yet the AI says it came out in 2015, and it even made up plausible-looking box-office figures.<br />
(2) Even when given the preceding text, it does not know how to continue<br />
I wrote an article title, <span style="color:#0000ff;">The arrival of ChatGPT makes it possible for machines to replace Cambly</span>, followed by two paragraphs as the opening:</p>
<blockquote>
<div>
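		Cambly is a platform for practicing spoken English through video chats with native English speakers.</div>
<div>
		ChatGPT is an AI chatbot developed by OpenAI and launched in November 2022. It uses a large language model based on the GPT-3.5 architecture and is trained with reinforcement learning.</div>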
</blockquote>
<p>I then asked the AI to continue, and it produced the following:</p>
<blockquote>
<div>
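		<span style="color:#ff0000;">Users can video chat with Cambly on ChatGPT</span>.</div>
<div>
		ChatGPT can complete many different tasks while holding a natural-language conversation, such as images, voice, text, phone calls, video, and chat.</div>
<div>
		ChatGPT is a multimodal dialogue system that can classify different inputs through a multilayer perceptron. ChatGPT can understand different data streams and has strong self-learning capabilities to improve its performance.</div>
<div>
		In a dialogue system, chat content related to a specific user is handled by providing a natural-language generation model.</div>
<div>
		ChatGPT is OpenAI's first system that can handle natural-language interaction by training and deploying multilayer perceptrons without any other code.</div>
<div>
		&ldquo;We are the first to truly build a multimodal chatbot system centered on trained models,&rdquo; said Hilary Wu, OpenAI project lead.</div>
<div>
		&ldquo;We have a complete, robust, and efficient multimodal chat system and an efficient, robust, and highly scalable end-to-end architecture.&rdquo;</div>
<div>
		&ldquo;Besides ChatGPT, we have many other multimodal and data-driven language generation models to choose from,&rdquo; said Jay McCoy, OpenAI project manager.</div>
<div>
		&ldquo;OpenAI is working to build a complete and extensible human-computer interaction framework, making it a world-leading, most reliable, predictable, and efficient platform for developing and using models based on natural-language understanding and generation (such as knowledge representation and reinforcement learning) to handle natural-language interaction,&rdquo; said Jay McCoy, OpenAI project lead.</div>
<div>
		<span style="color:#ff0000;">As one of the most influential and pioneering development projects in this field, Cambly achieved great success in just a few years: it is a fully open-source robot application that runs on top of the open-source community without any code and offers users rich natural-language interaction.</span></div>
<div>
		<span style="color:#ff0000;">However, the program recently lost its customer base during the COVID-19 pandemic, which means many companies have now gone back to using Cambly for their business communication.</span></div>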
</blockquote>
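<p>All I can say is that this is absurd; the result is unusable. At the start of the article I had already told the AI that Cambly is a platform for learning English with real people, yet it ignored this key fact entirely, treated Cambly as a &ldquo;robot application&rdquo;, and went further off the rails from there.</p>
<p>How truthful an AI's output is depends first on how recent its training data is, and second on the model architecture or paradigm it uses.<br />
Training an AI on all the data on the internet is not feasible, so the ability to learn from feedback during interaction is very important; that is the reinforcement-learning paradigm.<br />
My unsubstantiated guess is that Xiezuocat does not use reinforcement learning and is built only on traditional machine-learning models.<br />
An advanced AI that uses reinforcement learning would, like ChatGPT, have some ability to learn through interaction.<br />
Of course, I was using the free version of &ldquo;Xiezuocat&rdquo;. The official price list indicates that the higher the paid tier, the stronger the AI capability, so the results might well be better with a paid version.<br />
Finally, I did not do rigorous or comprehensive testing; I only tried it briefly on a few things I wanted to write. My view is that for ordinary writing needs it can probably do a decent job, but a person must spend considerable time checking the output, or the result could be an embarrassing mistake.</p>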
<p>
	<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;Copyright notice&nbsp;<span style="color: rgb(255, 0, 0);">➤➤</span>&nbsp;<br />
	Please credit the source when reposting: <u><a href="https://www.codelast.com/" rel="noopener noreferrer" target="_blank"><em><span style="color: rgb(0, 0, 255);"><strong style="font-size: 16px;"><span style="font-family: arial, helvetica, sans-serif;">codelast.com</span></strong></span></em></a></u>&nbsp;<br />
	Thanks for following my WeChat official account (scan with WeChat):<br />
	<img decoding="async" alt="wechat qrcode of codelast" src="https://www.codelast.com/codelast_wechat_qr_code.jpg" style="color: rgb(77, 77, 77); font-size: 13px; width: 200px; height: 200px;" /><br />
	and my WeChat Channels account:</p>
<p style="border: 0px; font-size: 13px; margin: 0px 0px 9px; outline: 0px; padding: 0px; color: rgb(77, 77, 77);">
	<img decoding="async" alt="" src="https://www.codelast.com/wechat_shipinhao_qr_code.jpg" style="text-align: center; width: 200px; height: 199px;" /></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.codelast.com/%e5%8e%9f%e5%88%9b-%e7%a7%98%e5%a1%94%e5%86%99%e4%bd%9c%e7%8c%ab%ef%bc%8c%e6%83%b3%e8%af%b4%e7%88%b1%e4%bd%a0%e4%b8%8d%e5%ae%b9%e6%98%93/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
