List: hadoop-user
Subject: Re: Simple MapReduce logic using Java API
From: Shahab Yunus <shahab.yunus () gmail ! com>
Date: 2015-03-31 18:21:45
Message-ID: CAEo-6+RqHBbEXL6RYKkNZpqcd9SFGWN1Dyw4qyn3nRtFyxGmbQ () mail ! gmail ! com
What is the reason for using the queue?

job.getConfiguration().set("mapred.job.queue.name", "exp_dsa");

Is your mapper or reducer even being called?

Try adding the @Override annotation to the map/reduce methods, as below:

@Override
public void map(Object key, Text value, Context context) throws
IOException, InterruptedException {
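
To show why the annotation helps: in Hadoop's new API, Reducer.reduce takes (KEYIN, Iterable<VALUEIN>, Context), so a reduce method declared with a single Text value is an overload that the framework never calls, and the default pass-through reduce runs instead. With @Override on it, the compiler would reject the method and expose the mismatch. Below is a minimal sketch of that effect that runs without Hadoop on the classpath. BaseReducer, OverloadedReducer, OverridingReducer and OverrideDemo are hypothetical stand-ins (not Hadoop classes), and a List<String> stands in for Context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a framework reducer base class. The "framework"
// only ever calls reduce(key, Iterable<V>, out); the default implementation
// passes every value through unchanged.
class BaseReducer<K, V> {
    protected void reduce(K key, Iterable<V> values, List<String> out) {
        for (V v : values) {
            out.add(key + "=" + v);
        }
    }

    // What the framework does for each key group.
    final void run(K key, Iterable<V> values, List<String> out) {
        reduce(key, values, out);
    }
}

// Same shape as the posted reducer: the second parameter is a single value,
// not an Iterable. This is an overload, not an override, so run() never
// calls it. Putting @Override on this method would be a compile error.
class OverloadedReducer extends BaseReducer<String, String> {
    public void reduce(String key, String value, List<String> out) {
        out.add(key + "=custom:" + value);
    }
}

// Correct signature: @Override compiles, and run() dispatches to this method.
class OverridingReducer extends BaseReducer<String, String> {
    @Override
    protected void reduce(String key, Iterable<String> values, List<String> out) {
        for (String v : values) {
            out.add(key + "=custom:" + v);
        }
    }
}

class OverrideDemo {
    // Runs one key group through the given reducer and returns what it emitted.
    static List<String> drive(BaseReducer<String, String> reducer) {
        List<String> out = new ArrayList<>();
        reducer.run("UUID_1", Arrays.asList("1"), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drive(new OverloadedReducer())); // base reduce ran
        System.out.println(drive(new OverridingReducer())); // custom reduce ran
    }
}
```

In this sketch the overloaded class silently falls back to the base behavior, which is the kind of mistake that otherwise only shows up at runtime as unexpected output.
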
Regards,
Shahab
On Tue, Mar 31, 2015 at 3:26 AM, bradford li <bradfordli@gmail.com> wrote:
> I'm not sure why my Mapper and Reducer have no output. The logic behind my
> code is: given a file of UUIDs (newline-separated), I want to use
> `globStatus` to list the paths of all files that a UUID might be in, then
> open and read each file. Each file contains 1-n lines of JSON.
> The UUID is in `event_header.event_id` in the JSON.
>
> Right now the MapReduce job runs without errors. However, something is
> wrong because I don't get any output. I'm also not sure how to debug
> MapReduce jobs. If someone could point me to a source, that would be awesome!
> The expected output from this program should be
>
> UUID_1 1
> UUID_2 1
> UUID_3 1
> UUID_4 1
> ...
> ...
> UUID_n 1
>
> In my logic, the output file should be the UUIDs with a 1 next to them,
> because a 1 is written when the UUID is found and a 0 when it is not. They
> should all be 1's because I pulled the UUIDs from the source.
>
> My Reducer currently does not do anything; I just wanted to see if I
> could get some simple logic working. There are most likely bugs in my code,
> as I don't have an easy way to debug MapReduce jobs.
>
> Driver:
>
> public class SearchUUID {
>
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         Job job = Job.getInstance(conf, "UUID Search");
>         job.getConfiguration().set("mapred.job.queue.name", "exp_dsa");
>         job.setJarByClass(SearchUUID.class);
>         job.setMapperClass(UUIDMapper.class);
>         job.setReducerClass(UUIDReducer.class);
>         job.setOutputKeyClass(Text.class);
>         job.setOutputValueClass(Text.class);
>         FileInputFormat.addInputPath(job, new Path(args[0]));
>         FileOutputFormat.setOutputPath(job, new Path(args[1]));
>         System.exit(job.waitForCompletion(true) ? 0 : 1);
>     }
> }
>
>
> UUIDMapper:
>
> public class UUIDMapper extends Mapper<Object, Text, Text, Text> {
>     public void map(Object key, Text value, Context context)
>             throws IOException, InterruptedException {
>
>         try {
>             Text one = new Text("1");
>             Text zero = new Text("0");
>
>             FileSystem fs = FileSystem.get(new Configuration());
>             FileStatus[] paths = fs.globStatus(
>                     new Path("/data/path/to/file/d_20150330-1650"));
>             for (FileStatus path : paths) {
>                 BufferedReader br = new BufferedReader(
>                         new InputStreamReader(fs.open(path.getPath())));
>                 String json_string = br.readLine();
>                 while (json_string != null) {
>                     JsonElement jelement = new JsonParser().parse(json_string);
>                     JsonObject jsonObject = jelement.getAsJsonObject();
>                     jsonObject = jsonObject.getAsJsonObject("event_header");
>                     jsonObject = jsonObject.getAsJsonObject("event_id");
>
>                     if (value.toString().equals(jsonObject.getAsString())) {
>                         System.out.println(value.toString() +
>                                 "slkdjfksajflkjsfdkljsadfk;ljasklfjklasjfklsadl;sjdf");
>                         context.write(value, one);
>                     } else {
>                         context.write(value, zero);
>                     }
>
>                     json_string = br.readLine();
>                 }
>             }
>         } catch (IOException failed) {
>         }
>     }
> }
>
>
> Reducer:
>
> public class UUIDReducer extends Reducer<Text, Text, Text, Text> {
>
>     public void reduce(Text key, Text value, Context context)
>             throws IOException, InterruptedException {
>         context.write(key, value);
>     }
> }
>
>