List:       hadoop-user
Subject:    Re: How does reducer get intermediate output?
From:       Inifok Song <hadoop.inifok () gmail ! com>
Date:       2009-08-27 6:40:49
Message-ID: 3b1311780908262340i3cbdd80er9f9568ec6a46de60 () mail ! gmail ! com


Hello Harish,

I find that taskLogUrl.openConnection() often causes an IOException, and I
suspect that the connection pool is too small. Could you tell me how I can
find the Jetty settings that Hadoop uses?

Thank you.

Inifok
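
One way to narrow down the IOException above is to probe the same URL with explicit timeouts and look at which kind of exception comes back. The following is only a minimal sketch: the class name `ConnectionProbe`, the labels it returns, and the timeout values are hypothetical and not taken from Hadoop, and the mapping from exception type to likely cause is a heuristic, not a diagnosis.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ConnectException;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.net.UnknownHostException;

public class ConnectionProbe {

    // Map an IOException to a coarse label (labels are illustrative only).
    static String classify(IOException e) {
        if (e instanceof SocketTimeoutException) return "timeout";
        if (e instanceof ConnectException) return "connect-failed";
        if (e instanceof UnknownHostException) return "unknown-host";
        return "other";
    }

    // Try to read one byte from the URL with explicit timeouts and report what happened.
    static String probe(URL url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(connectTimeoutMs);
            conn.setReadTimeout(readTimeoutMs);
            try (InputStream in = conn.getInputStream()) {
                in.read();
            } finally {
                conn.disconnect();
            }
            return "ok";
        } catch (IOException e) {
            return classify(e) + ": " + e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ConnectionProbe <url>; the timeout values are arbitrary assumptions.
        if (args.length == 1) {
            System.out.println(probe(new URL(args[0]), 5000, 10000));
        }
    }
}
```

Frequent "timeout" results against a host that is otherwise reachable would be consistent with the server running short of worker threads, while "connect-failed" points more toward the server not listening at all. As far as I recall, the TaskTracker's HTTP worker-thread count in 0.20-era releases is set by the `tasktracker.http.threads` property; treat that name as something to verify against the mapred-default.xml of your version.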

2009/8/27 Harish Mallipeddi <harish.mallipeddi@gmail.com>

> On Thu, Aug 27, 2009 at 8:34 AM, inifok.song <hadoop.inifok@gmail.com> wrote:
>
> > Hi all,
> >
> > In my cluster, the reducers often can't fetch the mappers' output. I know
> > there are many possible reasons for this, so I think I need to understand
> > how a reducer gets the intermediate output. I have read the source code,
> > but I'm not clear about the whole process. Could you explain it? How do
> > the nodes communicate with each other, and how does the class
> > ReduceCopier work?
> >
> > Thank you.
> >
> > Inifok
> >
>
> Each TaskTracker runs a Jetty webserver that serves requests for
> intermediate map outputs. The ReduceTask process receives notifications
> about completed MapTasks from its TaskTracker (which in turn receives that
> information from the JobTracker). Once it receives these notifications, the
> ReduceTask starts fetching the map outputs over HTTP from the Jetty
> webserver of the corresponding TaskTracker (TT).
>
> --
> Harish Mallipeddi
> http://blog.poundbang.in
>
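
The HTTP fetch described in the quoted reply can be sketched roughly as follows. This is a minimal illustration, not the actual ReduceCopier code: the class and method names are hypothetical, and the servlet path `/mapOutput`, the query parameters `job`, `map`, and `reduce`, and the port 50060 are assumptions based on 0.20-era TaskTracker behaviour that should be checked against the source tree of your version. The host, job ID, and attempt ID in `main` are made up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class MapOutputFetchSketch {

    // Build the URL a reducer would request from a TaskTracker's HTTP server.
    // Assumption: path and parameter names as recalled from 0.20-era Hadoop.
    static URL buildMapOutputUrl(String ttHost, int httpPort, String jobId,
                                 String mapAttemptId, int reducePartition)
            throws MalformedURLException {
        return new URL("http://" + ttHost + ":" + httpPort
                + "/mapOutput?job=" + jobId
                + "&map=" + mapAttemptId
                + "&reduce=" + reducePartition);
    }

    // Stream the response body and return the number of bytes received.
    static long fetchAndCount(URL url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        long total = 0;
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        } finally {
            conn.disconnect();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical identifiers, shown only to illustrate the URL shape.
        URL url = buildMapOutputUrl("tt-host.example", 50060,
                "job_200908270001_0001",
                "attempt_200908270001_0001_m_000000_0", 3);
        System.out.println(url);
        // fetchAndCount(url, 5000, 30000) would perform the real request.
    }
}
```

Each reducer asks only for its own partition of a given map's output, which is why the request names the job, the map attempt, and the reduce partition.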

