List: hadoop-dev
Subject: [jira] Created: (HADOOP-5367) After some jobs have finished,
From: "Thibaut (JIRA)" <jira () apache ! org>
Date: 2009-02-28 13:10:13
Message-ID: 860192477.1235826613394.JavaMail.jira () brutus
After some jobs have finished, Reducer will run new job's reduce tasks sequentially and not in parallel (mapred.JobTracker: Serious problem. While updating status, cannot find taskid...)
--------------------------------------------------------------------------------
Key: HADOOP-5367
URL: https://issues.apache.org/jira/browse/HADOOP-5367
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.19.1
Environment: State: RUNNING
Started: Fri Feb 27 17:00:07 CET 2009
Version: 0.19.1, r745977
Compiled: Fri Feb 20 00:16:34 UTC 2009 by ndaley
Reporter: Thibaut
Priority: Critical
Hi,
After a while, my cluster runs the reduce tasks only sequentially (each reducer running on the same node), while the other nodes stay idle. The map phase, however, still runs on all the nodes. This happens in my cluster after about 160 successfully completed jobs. (Some jobs have the number of reducers set to 0!) As a workaround, I have to restart the mapreduce service.
I didn't notice this behaviour in version 0.19.0. I can't use version 0.19.0 because of the multipleoutput bug when setting reducers to 0.
Another side note, which might be related: I also tried running the jobs with speculative execution turned on. My cluster would always hold back one reducer and only run it (in multiple instances) after the first of the other 6 reducers had finished, instead of launching all of them at the same time.
Below is a short extract from the related log file. It is full of entries like these:
09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0105_r_000006_1
09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0102_r_000006_1
09/02/28 12:48:12 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
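Since the full log contains many such entries, it may help to tally them per job. The following is only a rough sketch, not part of Hadoop: the function name `summarize` and the regular expression are my own assumptions, and it expects each "cannot find taskid" message and its attempt ID to appear on the same line of the log text.

```python
import re
from collections import Counter

# Attempt IDs in the extract have the shape
#   attempt_<jobtracker start time>_<job number>_<m|r>_<task number>_<attempt number>
# The pattern below is an assumption based on that shape.
PATTERN = re.compile(
    r"cannot find taskid\s+(attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+))"
)


def summarize(log_text):
    """Count 'cannot find taskid' messages per job and per attempt ID."""
    per_job = Counter()
    per_attempt = Counter()
    for match in PATTERN.finditer(log_text):
        attempt_id, started, job, kind, task, attempt = match.groups()
        # Rebuild the job ID (job_<start time>_<job number>) from the attempt ID.
        per_job["job_%s_%s" % (started, job)] += 1
        per_attempt[attempt_id] += 1
    return per_job, per_attempt
```

Calling `summarize` on the contents of the JobTracker log and printing `per_job.most_common()` would show which jobs the unknown attempts belong to. In the extract above, every reported ID is a reduce attempt (`_r_`).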
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.