
List:       hadoop-user
Subject:    RE: An issue in MapReduce Tutorial
From:       Gino Gu01 <Gino_Gu01 () infosys ! com>
Date:       2014-11-24 9:42:07
Message-ID: 5A819A5998DDA542AB3436F57CD47B33481E4DAA () PRCSGIMBX11 ! ad ! infosys ! com

Adding the link to the tutorial.
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html


From: Gino Gu01
Sent: Monday, November 24, 2014 5:29 PM
To: 'user@hadoop.apache.org'
Subject: An issue in MapReduce Tutorial

There is a bug in WordCount v2.0, which is part of the MapReduce Tutorial.

How to reproduce:

Run the application without a patterns file:

$ bin/hadoop jar wc.jar WordCount2 /user/joe/wordcount/input /user/joe/wordcount/output

It throws a NullPointerException during the map phase.

Reason:
In setup(), the line if (conf.getBoolean("wordcount.skip.patterns", true)) sets the
default value of wordcount.skip.patterns to true. Because no patterns file was passed
in the arguments, patternsURIs is null and the line for (URI patternsURI : patternsURIs)
throws the exception.

    public void setup(Context context) throws IOException,
        InterruptedException {
      conf = context.getConfiguration();
      caseSensitive = conf.getBoolean("wordcount.case.sensitive", true);
      if (conf.getBoolean("wordcount.skip.patterns", true)) {
        URI[] patternsURIs = Job.getInstance(conf).getCacheFiles();
        for (URI patternsURI : patternsURIs) {
          Path patternsPath = new Path(patternsURI.getPath());
          String patternsFileName = patternsPath.getName().toString();
          parseSkipFile(patternsFileName);
        }
      }
    }

How to fix it:
Change the if condition in setup() above to
if (conf.getBoolean("wordcount.skip.patterns", false)) {
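
The effect of the default value can be sketched without a Hadoop cluster. The example below is only an illustration under stated assumptions: the class SkipPatternsDefaultDemo, the helper names, and the Map used in place of Hadoop's Configuration are all hypothetical, and it assumes, as the report implies, that the cache file array is null when no patterns file was registered.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SkipPatternsDefaultDemo {

    // Hypothetical stand-in for Configuration.getBoolean(name, defaultValue):
    // returns the default when the property is not set.
    static boolean getBoolean(Map<String, String> conf, String name, boolean defaultValue) {
        String value = conf.get(name);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    // Mirrors the shape of setup(): collects the file names it would parse.
    // cacheFiles plays the role of the getCacheFiles() result (assumed null
    // when the job was started without a patterns file).
    static List<String> patternFileNames(Map<String, String> conf, URI[] cacheFiles,
                                         boolean defaultSkip) {
        List<String> names = new ArrayList<>();
        if (getBoolean(conf, "wordcount.skip.patterns", defaultSkip)) {
            for (URI uri : cacheFiles) { // throws NullPointerException if cacheFiles is null
                String path = uri.getPath();
                names.add(path.substring(path.lastIndexOf('/') + 1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>(); // property not set, as in the report

        // Default true: the loop runs over a null array and fails.
        try {
            patternFileNames(conf, null, true);
        } catch (NullPointerException e) {
            System.out.println("default true, no patterns file: NullPointerException");
        }

        // Default false: the block is skipped and nothing is parsed.
        System.out.println("default false, no patterns file: "
                + patternFileNames(conf, null, false));

        // Property set explicitly and a file registered: the loop runs normally.
        conf.put("wordcount.skip.patterns", "true");
        URI[] files = { URI.create("/user/joe/wordcount/patterns.txt") };
        System.out.println("property set, file registered: "
                + patternFileNames(conf, files, false));
    }
}
```

With the default changed to false, the block only runs when the property is set explicitly. A null check on patternsURIs would be another defensive option, though it is not part of the fix proposed here.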


**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely
for the use of the addressee(s). If you are not the intended recipient, please
notify the sender by e-mail and delete the original message. Further, you are not
to copy, disclose, or distribute this e-mail or its contents to any other person and
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken
every reasonable precaution to minimize this risk, but is not liable for any damage
you may sustain as a result of any virus in this e-mail. You should carry out your
own virus checks before opening the e-mail or attachment. Infosys reserves the
right to monitor and review the content of all messages sent to or from this e-mail
address. Messages sent to or from this e-mail address may be stored on the
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***





