List: hadoop-commits
Subject: svn commit: r652207 - in /hadoop/core/branches/branch-0.16: CHANGES.txt
From: mukund@apache.org
Date: 2008-04-30 00:52:50
Message-ID: <20080430005250.6A77B23889F5@eris.apache.org>
Author: mukund
Date: Tue Apr 29 17:52:49 2008
New Revision: 652207
URL: http://svn.apache.org/viewvc?rev=652207&view=rev
Log:
HADOOP-3304. [HOD] Fixes the way the logcondense.py utility searches for log files that need to be deleted. (yhemanth via mukund)
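
For context, the utility previously listed one fixed DFS directory; it now globs
per-user hod-logs directories under a configurable prefix. A minimal sketch of the
before/after listing paths (illustrative only; the script composes the real
invocation through its getDfsCommand helper):

    # Sketch, not part of the commit: old vs. new DFS listing paths.
    old_cmd = "hadoop dfs -lsr /user/hod/logs"        # old --log default
    log_prefix = "/user"                              # new --log default
    new_cmd = "hadoop dfs -lsr %s/\\*/hod-logs/" % log_prefix
    print(new_cmd)  # -> hadoop dfs -lsr /user/\*/hod-logs/ (the \* is shell-escaped, as in the script)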
Modified:
hadoop/core/branches/branch-0.16/CHANGES.txt
hadoop/core/branches/branch-0.16/src/contrib/hod/support/logcondense.py
hadoop/core/branches/branch-0.16/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml
Modified: hadoop/core/branches/branch-0.16/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.16/CHANGES.txt?rev=652207&r1=652206&r2=652207&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.16/CHANGES.txt (original)
+++ hadoop/core/branches/branch-0.16/CHANGES.txt Tue Apr 29 17:52:49 2008
@@ -10,6 +10,9 @@
HADOOP-3294. Fix distcp to check the destination length and retry the copy
if it doesn't match the src length. (Tsz Wo (Nicholas), SZE via mukund)
+ HADOOP-3304. [HOD] Fixes the way the logcondense.py utility searches for log
+ files that need to be deleted. (yhemanth via mukund)
+
Release 0.16.3 - 2008-04-16
BUG FIXES
Modified: hadoop/core/branches/branch-0.16/src/contrib/hod/support/logcondense.py
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.16/src/contrib/hod/support/logcondense.py?rev=652207&r1=652206&r2=652207&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.16/src/contrib/hod/support/logcondense.py (original)
+++ hadoop/core/branches/branch-0.16/src/contrib/hod/support/logcondense.py Tue Apr 29 17:52:49 2008
@@ -1,3 +1,5 @@
+#!/bin/sh
+
#Licensed to the Apache Software Foundation (ASF) under one
#or more contributor license agreements. See the NOTICE file
#distributed with this work for additional information
@@ -13,7 +15,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
-#!/bin/sh
""":"
work_dir=$(dirname $0)
base_name=$(basename $0)
@@ -84,8 +85,8 @@
'action' : "store",
'dest' : "log",
'metavar' : " ",
- 'default' : "/user/hod/logs",
- 'help' : "directory where the logs are stored"},
+ 'default' : "/user",
+ 'help' : "directory prefix under which logs are stored per user"},
{'short' : "-n",
'long' : "--dynamicdfs",
@@ -118,57 +119,64 @@
     deletedNamePrefixes.append('1-tasktracker-*')
     deletedNamePrefixes.append('0-datanode-*')
 
-  cmd = getDfsCommand(options, "-lsr " + options.log)
+  filepath = '%s/\*/hod-logs/' % (options.log)
+  cmd = getDfsCommand(options, "-lsr " + filepath)
   (stdin, stdout, stderr) = popen3(cmd)
   lastjobid = 'none'
   toPurge = { }
   for line in stdout:
-    m = re.match("^(.*?)\s.*$", line)
-    filename = m.group(1)
-    # file name format: <prefix>/<user>/hod-logs/<jobid>/[0-1]-[jobtracker|tasktracker|datanode|namenode|]-hostname-YYYYMMDDtime-random.tar.gz
-    # first strip prefix:
-    if filename.startswith(options.log):
-      filename = filename.lstrip(options.log)
-      if not filename.startswith('/'):
-        filename = '/' + filename
-    else:
-      continue
-
-    # Now get other details from filename.
-    k = re.match("/(.*)/.*/(.*)/.*-.*-([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9]).*$", filename)
-    if k:
-      username = k.group(1)
-      jobid = k.group(2)
-      datetimefile = datetime(int(k.group(3)), int(k.group(4)), int(k.group(5)))
-      datetimenow = datetime.utcnow()
-      diff = datetimenow - datetimefile
-      filedate = k.group(3) + k.group(4) + k.group(5)
-      newdate = datetimenow.strftime("%Y%m%d")
-      print "%s %s %s %d" % (filename, filedate, newdate, diff.days)
-
-      # if the cluster is used to bring up dynamic dfs, we must also leave NameNode logs.
-      foundFilteredName = False
-      for name in filteredNames:
-        if filename.find(name) >= 0:
-          foundFilteredName = True
-          break
-
-      if foundFilteredName:
-        continue
-
-      if (diff.days > options.days):
-        desttodel = filename
-        if not toPurge.has_key(jobid):
-          toPurge[jobid] = options.log.rstrip("/") + "/" + username + "/hod-logs/" + jobid
+    try:
+      m = re.match("^(.*?)\s.*$", line)
+      filename = m.group(1)
+      # file name format: <prefix>/<user>/hod-logs/<jobid>/[0-1]-[jobtracker|tasktracker|datanode|namenode|]-hostname-YYYYMMDDtime-random.tar.gz
+      # first strip prefix:
+      if filename.startswith(options.log):
+        filename = filename.lstrip(options.log)
+        if not filename.startswith('/'):
+          filename = '/' + filename
+      else:
+        continue
+
+      # Now get other details from filename.
+      k = re.match("/(.*)/hod-logs/(.*)/.*-.*-([0-9][0-9][0-9][0-9])([0-9][0-9])([0-9][0-9]).*$", filename)
+      if k:
+        username = k.group(1)
+        jobid = k.group(2)
+        datetimefile = datetime(int(k.group(3)), int(k.group(4)), int(k.group(5)))
+        datetimenow = datetime.utcnow()
+        diff = datetimenow - datetimefile
+        filedate = k.group(3) + k.group(4) + k.group(5)
+        newdate = datetimenow.strftime("%Y%m%d")
+        print "%s %s %s %d" % (filename, filedate, newdate, diff.days)
+
+        # if the cluster is used to bring up dynamic dfs, we must also leave NameNode logs.
+        foundFilteredName = False
+        for name in filteredNames:
+          if filename.find(name) >= 0:
+            foundFilteredName = True
+            break
+
+        if foundFilteredName:
+          continue
+
+        if (diff.days > options.days):
+          desttodel = filename
+          if not toPurge.has_key(jobid):
+            toPurge[jobid] = options.log.rstrip("/") + "/" + username + "/hod-logs/" + jobid
+    except Exception, e:
+      print >> sys.stderr, e
 
   for job in toPurge.keys():
-    for prefix in deletedNamePrefixes:
-      cmd = getDfsCommand(options, "-rm " + toPurge[job] + '/' + prefix)
-      print cmd
-      ret = 0
-      ret = os.system(cmd)
-      if (ret != 0):
-        print >> sys.stderr, "Command failed to delete file " + cmd
+    try:
+      for prefix in deletedNamePrefixes:
+        cmd = getDfsCommand(options, "-rm " + toPurge[job] + '/' + prefix)
+        print cmd
+        ret = 0
+        ret = os.system(cmd)
+        if (ret != 0):
+          print >> sys.stderr, "Command failed to delete file " + cmd
+    except Exception, e:
+      print >> sys.stderr, e
 
 def process_args():
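
For readers skimming the patch: the parsing that the new try block protects boils
down to one regular expression over each lsr output line. Below is a small
self-contained sketch of the same idea (the sample path and the parse_log_path
helper are invented for illustration; they are not part of the script):

    import re
    from datetime import datetime

    # Same shape of pattern as the patched script applies after stripping the
    # --log prefix ("/user" by default):
    #   /<user>/hod-logs/<jobid>/<0|1>-<daemon>-<host>-YYYYMMDD<time>-<random>.tar.gz
    LOG_RE = re.compile(r"/(.*)/hod-logs/(.*)/.*-.*-([0-9]{4})([0-9]{2})([0-9]{2}).*$")

    def parse_log_path(path):
        # Returns (user, jobid, upload date) or None when the path doesn't match.
        m = LOG_RE.match(path)
        if not m:
            return None
        return (m.group(1), m.group(2),
                datetime(int(m.group(3)), int(m.group(4)), int(m.group(5))))

    # Invented example path, as it looks once the "/user" prefix is stripped.
    sample = "/hemanth/hod-logs/123.host/1-tasktracker-node1-20080429120000-ab12.tar.gz"
    user, jobid, uploaded = parse_log_path(sample)
    age_days = (datetime.utcnow() - uploaded).days
    print("%s %s %s -> %s" % (user, jobid, uploaded.date(),
                              "purge" if age_days > 7 else "keep"))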
Modified: hadoop/core/branches/branch-0.16/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.16/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml?rev=652207&r1=652206&r2=652207&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.16/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml (original)
+++ hadoop/core/branches/branch-0.16/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml Tue Apr 29 17:52:49 2008
@@ -1,238 +1,318 @@
-<?xml version="1.0"?>
-
-<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
- "http://forrest.apache.org/dtd/document-v20.dtd">
-
-
-<document>
-
- <header>
- <title>
- Hadoop On Demand
- </title>
- </header>
-
- <body>
-<section>
-<title>Overview</title>
-
-<p>The Hadoop On Demand (HOD) project is a system for provisioning and
-managing independent Hadoop MapReduce and HDFS instances on a shared cluster
-of nodes. HOD is a tool that makes it easy for administrators and users to
-quickly setup and use Hadoop. It is also a very useful tool for Hadoop developers
-and testers who need to share a physical cluster for testing their own Hadoop
-versions.
-</p>
-
-<p>HOD relies on a resource manager (RM) for allocation of nodes that it can use for
-running Hadoop instances. At present it runs with the <a \
href="ext:hod/torque">Torque
-resource manager</a>.
-</p>
-
-<p>
-The basic system architecture of HOD includes components from:</p>
-<ul>
- <li>A Resource manager (possibly together with a scheduler),</li>
- <li>HOD components, and </li>
- <li>Hadoop Map/Reduce and HDFS daemons.</li>
-</ul>
-
-<p>
-HOD provisions and maintains Hadoop Map/Reduce and, optionally, HDFS instances
-through interaction with the above components on a given cluster of nodes. A cluster \
of
-nodes can be thought of as comprising of two sets of nodes:</p>
-<ul>
- <li>Submit nodes: Users use the HOD client on these nodes to allocate clusters, \
and then
-use the Hadoop client to submit Hadoop jobs. </li>
- <li>Compute nodes: Using the resource manager, HOD components are run on these \
nodes to
-provision the Hadoop daemons. After that Hadoop jobs run on them.</li>
-</ul>
-
-<p>
-Here is a brief description of the sequence of operations in allocating a cluster \
and
-running jobs on them.
-</p>
-
-<ul>
- <li>The user uses the HOD client on the Submit node to allocate a required number \
of
-cluster nodes, and provision Hadoop on them.</li>
- <li>The HOD client uses a Resource Manager interface, (qsub, in Torque), to submit \
a HOD
-process, called the RingMaster, as a Resource Manager job, requesting the user \
desired number
-of nodes. This job is submitted to the central server of the Resource Manager \
(pbs_server, in Torque).</li>
- <li>On the compute nodes, the resource manager slave daemons, (pbs_moms in \
Torque), accept
-and run jobs that they are given by the central server (pbs_server in Torque). The \
RingMaster
-process is started on one of the compute nodes (mother superior, in Torque).</li>
- <li>The Ringmaster then uses another Resource Manager interface, (pbsdsh, in \
Torque), to run
-the second HOD component, HodRing, as distributed tasks on each of the compute
-nodes allocated.</li>
- <li>The Hodrings, after initializing, communicate with the Ringmaster to get \
Hadoop commands,
-and run them accordingly. Once the Hadoop commands are started, they register with \
the RingMaster,
-giving information about the daemons.</li>
- <li>All the configuration files needed for Hadoop instances are generated by HOD \
itself,
-some obtained from options given by user in its own configuration file.</li>
- <li>The HOD client keeps communicating with the RingMaster to find out the \
location of the
-JobTracker and HDFS daemons.</li>
-</ul>
-
-<p>The rest of the document deals with the steps needed to setup HOD on a physical \
cluster of nodes.</p>
-
-</section>
-
-<section>
-<title>Pre-requisites</title>
-
-<p>Operating System: HOD is currently tested on RHEL4.<br/>
-Nodes : HOD requires a minimum of 3 nodes configured through a resource \
manager.<br/></p>
-
-<p> Software </p>
-<p>The following components are to be installed on *ALL* the nodes before using \
HOD:</p>
-<ul>
- <li>Torque: Resource manager</li>
- <li><a href="ext:hod/python">Python</a> : HOD requires version 2.5.1 of \
Python.</li>
-</ul>
-
-<p>The following components can be optionally installed for getting better
-functionality from HOD:</p>
-<ul>
- <li><a href="ext:hod/twisted-python">Twisted Python</a>: This can be
- used for improving the scalability of HOD. If this module is detected to be
- installed, HOD uses it, else it falls back to default modules.</li>
- <li><a href="ext:site">Hadoop</a>: HOD can automatically
- distribute Hadoop to all nodes in the cluster. However, it can also use a
- pre-installed version of Hadoop, if it is available on all nodes in the cluster.
- HOD currently supports Hadoop 0.15 and above.</li>
-</ul>
-
-<p>NOTE: HOD configuration requires the location of installs of these
-components to be the same on all nodes in the cluster. It will also
-make the configuration simpler to have the same location on the submit
-nodes.
-</p>
-</section>
-
-<section>
-<title>Resource Manager</title>
-<p> Currently HOD works with the Torque resource manager, which it uses for its \
node
- allocation and job submission. Torque is an open source resource manager from
- <a href="ext:hod/cluster-resources">Cluster Resources</a>, a community effort
- based on the PBS project. It provides control over batch jobs and distributed \
compute nodes. Torque is
- freely available for download from <a href="ext:hod/torque-download">here</a>.
- </p>
-
-<p> All documentation related to torque can be seen under
- the section TORQUE Resource Manager <a
- href="ext:hod/torque-docs">here</a>. You can
- get wiki documentation from <a
- href="ext:hod/torque-wiki">here</a>.
- Users may wish to subscribe to TORQUEâ € ™s mailing list or view the archive for \
questions,
- comments <a
- href="ext:hod/torque-mailing-list">here</a>.
-</p>
-
-<p>For using HOD with Torque:</p>
-<ul>
- <li>Install Torque components: pbs_server on one node(head node), pbs_mom on all
- compute nodes, and PBS client tools on all compute nodes and submit
- nodes. Perform atleast a basic configuration so that the Torque system is up and
- running i.e pbs_server knows which machines to talk to. Look <a
- href="ext:hod/torque-basic-config">here</a>
- for basic configuration.
-
- For advanced configuration, see <a
- href="ext:hod/torque-advanced-config">here</a></li>
- <li>Create a queue for submitting jobs on the pbs_server. The name of the queue is \
the
- same as the HOD configuration parameter, resource-manager.queue. The Hod client \
uses this queue to
- submit the Ringmaster process as a Torque job.</li>
- <li>Specify a 'cluster name' as a 'property' for all nodes in the cluster.
- This can be done by using the 'qmgr' command. For example:
- qmgr -c "set node node properties=cluster-name". The name of the cluster is the \
same as
- the HOD configuration parameter, hod.cluster. </li>
- <li>Ensure that jobs can be submitted to the nodes. This can be done by
- using the 'qsub' command. For example:
- echo "sleep 30" | qsub -l nodes=3</li>
-</ul>
-
-</section>
-
-<section>
-<title>Installing HOD</title>
-
-<p>Now that the resource manager set up is done, we proceed on to obtaining and
-installing HOD.</p>
-<ul>
- <li>If you are getting HOD from the Hadoop tarball,it is available under the
- 'contrib' section of Hadoop, under the root directory 'hod'.</li>
- <li>If you are building from source, you can run ant tar from the Hadoop root
- directory, to generate the Hadoop tarball, and then pick HOD from there,
- as described in the point above.</li>
- <li>Distribute the files under this directory to all the nodes in the
- cluster. Note that the location where the files are copied should be
- the same on all the nodes.</li>
- <li>Note that compiling hadoop would build HOD with appropriate permissions
- set on all the required script files in HOD.</li>
-</ul>
-</section>
-
-<section>
-<title>Configuring HOD</title>
-
-<p>After HOD installation is done, it has to be configured before we start using
-it.</p>
-<section>
- <title>Minimal Configuration to get started</title>
-<ul>
- <li>On the node from where you want to run hod, edit the file hodrc
- which can be found in the <install dir>/conf directory. This file
- contains the minimal set of values required for running hod.</li>
- <li>
-<p>Specify values suitable to your environment for the following
- variables defined in the configuration file. Note that some of these
- variables are defined at more than one place in the file.</p>
-
- <ul>
- <li>${JAVA_HOME}: Location of Java for Hadoop. Hadoop supports Sun JDK
- 1.5.x and above.</li>
- <li>${CLUSTER_NAME}: Name of the cluster which is specified in the
- 'node property' as mentioned in resource manager configuration.</li>
- <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
- submit nodes.</li>
- <li>${RM_QUEUE}: Queue configured for submiting jobs in the resource
- manager configuration.</li>
- <li>${RM_HOME}: Location of the resource manager installation on the
- compute and submit nodes.</li>
- </ul>
-</li>
-
-<li>
-<p>The following environment variables *may* need to be set depending on
- your environment. These variables must be defined where you run the
- HOD client, and also be specified in the HOD configuration file as the
- value of the key resource_manager.env-vars. Multiple variables can be
- specified as a comma separated list of key=value pairs.</p>
-
- <ul>
- <li>HOD_PYTHON_HOME: If you install python to a non-default location
- of the compute nodes, or submit nodes, then, this variable must be
- defined to point to the python executable in the non-standard
- location.</li>
- </ul>
-</li>
-</ul>
-</section>
-
- <section>
- <title>Advanced Configuration</title>
- <p> You can review other configuration options in the file and modify them to \
suit
- your needs. Refer to the <a href="hod_config_guide.html">Configuration Guide</a> \
for information about the HOD
- configuration.
- </p>
- </section>
-</section>
-
- <section>
- <title>Running HOD</title>
- <p>You can now proceed to <a href="hod_user_guide.html">HOD User Guide</a> for \
information about how to run HOD,
- what are the various features, options and for help in trouble-shooting.</p>
- </section>
-</body>
-</document>
+<?xml version="1.0"?>
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
+ "http://forrest.apache.org/dtd/document-v20.dtd">
+
+
+<document>
+
+ <header>
+ <title>
+ Hadoop On Demand
+ </title>
+ </header>
+
+ <body>
+<section>
+<title>Overview</title>
+
+<p>The Hadoop On Demand (HOD) project is a system for provisioning and
+managing independent Hadoop MapReduce and HDFS instances on a shared cluster
+of nodes. HOD is a tool that makes it easy for administrators and users to
+quickly setup and use Hadoop. It is also a very useful tool for Hadoop developers
+and testers who need to share a physical cluster for testing their own Hadoop
+versions.
+</p>
+
+<p>HOD relies on a resource manager (RM) for allocation of nodes that it can use for
+running Hadoop instances. At present it runs with the <a \
href="ext:hod/torque">Torque +resource manager</a>.
+</p>
+
+<p>
+The basic system architecture of HOD includes components from:</p>
+<ul>
+ <li>A Resource manager (possibly together with a scheduler),</li>
+ <li>HOD components, and </li>
+ <li>Hadoop Map/Reduce and HDFS daemons.</li>
+</ul>
+
+<p>
+HOD provisions and maintains Hadoop Map/Reduce and, optionally, HDFS instances
+through interaction with the above components on a given cluster of nodes. A cluster \
of +nodes can be thought of as comprising of two sets of nodes:</p>
+<ul>
+ <li>Submit nodes: Users use the HOD client on these nodes to allocate clusters, \
and then +use the Hadoop client to submit Hadoop jobs. </li>
+ <li>Compute nodes: Using the resource manager, HOD components are run on these \
nodes to +provision the Hadoop daemons. After that Hadoop jobs run on them.</li>
+</ul>
+
+<p>
+Here is a brief description of the sequence of operations in allocating a cluster \
and +running jobs on them.
+</p>
+
+<ul>
+ <li>The user uses the HOD client on the Submit node to allocate a required number \
of +cluster nodes, and provision Hadoop on them.</li>
+ <li>The HOD client uses a Resource Manager interface, (qsub, in Torque), to submit \
a HOD +process, called the RingMaster, as a Resource Manager job, requesting the user \
desired number +of nodes. This job is submitted to the central server of the \
Resource Manager (pbs_server, in Torque).</li> + <li>On the compute nodes, the \
resource manager slave daemons, (pbs_moms in Torque), accept +and run jobs that they \
are given by the central server (pbs_server in Torque). The RingMaster +process is \
started on one of the compute nodes (mother superior, in Torque).</li> + <li>The \
Ringmaster then uses another Resource Manager interface, (pbsdsh, in Torque), to run \
+the second HOD component, HodRing, as distributed tasks on each of the compute \
+nodes allocated.</li> + <li>The Hodrings, after initializing, communicate with the \
Ringmaster to get Hadoop commands, +and run them accordingly. Once the Hadoop \
commands are started, they register with the RingMaster, +giving information about \
the daemons.</li> + <li>All the configuration files needed for Hadoop instances are \
generated by HOD itself, +some obtained from options given by user in its own \
configuration file.</li> + <li>The HOD client keeps communicating with the \
RingMaster to find out the location of the +JobTracker and HDFS daemons.</li>
+</ul>
+
+<p>The rest of the document deals with the steps needed to setup HOD on a physical \
cluster of nodes.</p> +
+</section>
+
+<section>
+<title>Pre-requisites</title>
+
+<p>Operating System: HOD is currently tested on RHEL4.<br/>
+Nodes : HOD requires a minimum of 3 nodes configured through a resource \
manager.<br/></p> +
+<p> Software </p>
+<p>The following components are to be installed on *ALL* the nodes before using \
HOD:</p> +<ul>
+ <li>Torque: Resource manager</li>
+ <li><a href="ext:hod/python">Python</a> : HOD requires version 2.5.1 of \
Python.</li> +</ul>
+
+<p>The following components can be optionally installed for getting better
+functionality from HOD:</p>
+<ul>
+ <li><a href="ext:hod/twisted-python">Twisted Python</a>: This can be
+ used for improving the scalability of HOD. If this module is detected to be
+ installed, HOD uses it, else it falls back to default modules.</li>
+ <li><a href="ext:site">Hadoop</a>: HOD can automatically
+ distribute Hadoop to all nodes in the cluster. However, it can also use a
+ pre-installed version of Hadoop, if it is available on all nodes in the cluster.
+ HOD currently supports Hadoop 0.15 and above.</li>
+</ul>
+
+<p>NOTE: HOD configuration requires the location of installs of these
+components to be the same on all nodes in the cluster. It will also
+make the configuration simpler to have the same location on the submit
+nodes.
+</p>
+</section>
+
+<section>
+<title>Resource Manager</title>
+<p> Currently HOD works with the Torque resource manager, which it uses for its \
node + allocation and job submission. Torque is an open source resource manager from
+ <a href="ext:hod/cluster-resources">Cluster Resources</a>, a community effort
+ based on the PBS project. It provides control over batch jobs and distributed \
compute nodes. Torque is + freely available for download from <a \
href="ext:hod/torque-download">here</a>. + </p>
+
+<p> All documentation related to torque can be seen under
+ the section TORQUE Resource Manager <a
+ href="ext:hod/torque-docs">here</a>. You can
+ get wiki documentation from <a
+ href="ext:hod/torque-wiki">here</a>.
+ Users may wish to subscribe to TORQUEâ € ™s mailing list or view the archive for \
questions, + comments <a
+ href="ext:hod/torque-mailing-list">here</a>.
+</p>
+
+<p>For using HOD with Torque:</p>
+<ul>
+ <li>Install Torque components: pbs_server on one node(head node), pbs_mom on all
+ compute nodes, and PBS client tools on all compute nodes and submit
+ nodes. Perform atleast a basic configuration so that the Torque system is up and
+ running i.e pbs_server knows which machines to talk to. Look <a
+ href="ext:hod/torque-basic-config">here</a>
+ for basic configuration.
+
+ For advanced configuration, see <a
+ href="ext:hod/torque-advanced-config">here</a></li>
+ <li>Create a queue for submitting jobs on the pbs_server. The name of the queue is \
the + same as the HOD configuration parameter, resource-manager.queue. The Hod \
client uses this queue to + submit the Ringmaster process as a Torque job.</li>
+ <li>Specify a 'cluster name' as a 'property' for all nodes in the cluster.
+ This can be done by using the 'qmgr' command. For example:
+ qmgr -c "set node node properties=cluster-name". The name of the cluster is the \
same as + the HOD configuration parameter, hod.cluster. </li>
+ <li>Ensure that jobs can be submitted to the nodes. This can be done by
+ using the 'qsub' command. For example:
+ echo "sleep 30" | qsub -l nodes=3</li>
+</ul>
+
+</section>
+
+<section>
+<title>Installing HOD</title>
+
+<p>Now that the resource manager set up is done, we proceed on to obtaining and
+installing HOD.</p>
+<ul>
+ <li>If you are getting HOD from the Hadoop tarball,it is available under the
+ 'contrib' section of Hadoop, under the root directory 'hod'.</li>
+ <li>If you are building from source, you can run ant tar from the Hadoop root
+ directory, to generate the Hadoop tarball, and then pick HOD from there,
+ as described in the point above.</li>
+ <li>Distribute the files under this directory to all the nodes in the
+ cluster. Note that the location where the files are copied should be
+ the same on all the nodes.</li>
+ <li>Note that compiling hadoop would build HOD with appropriate permissions
+ set on all the required script files in HOD.</li>
+</ul>
+</section>
+
+<section>
+<title>Configuring HOD</title>
+
+<p>After HOD installation is done, it has to be configured before we start using
+it.</p>
+<section>
+ <title>Minimal Configuration to get started</title>
+<ul>
+ <li>On the node from where you want to run hod, edit the file hodrc
+ which can be found in the <install dir>/conf directory. This file
+ contains the minimal set of values required for running hod.</li>
+ <li>
+<p>Specify values suitable to your environment for the following
+ variables defined in the configuration file. Note that some of these
+ variables are defined at more than one place in the file.</p>
+
+ <ul>
+ <li>${JAVA_HOME}: Location of Java for Hadoop. Hadoop supports Sun JDK
+ 1.5.x and above.</li>
+ <li>${CLUSTER_NAME}: Name of the cluster which is specified in the
+ 'node property' as mentioned in resource manager configuration.</li>
+ <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
+ submit nodes.</li>
+ <li>${RM_QUEUE}: Queue configured for submiting jobs in the resource
+ manager configuration.</li>
+ <li>${RM_HOME}: Location of the resource manager installation on the
+ compute and submit nodes.</li>
+ </ul>
+</li>
+
+<li>
+<p>The following environment variables *may* need to be set depending on
+ your environment. These variables must be defined where you run the
+ HOD client, and also be specified in the HOD configuration file as the
+ value of the key resource_manager.env-vars. Multiple variables can be
+ specified as a comma separated list of key=value pairs.</p>
+
+ <ul>
+ <li>HOD_PYTHON_HOME: If you install python to a non-default location
+ of the compute nodes, or submit nodes, then, this variable must be
+ defined to point to the python executable in the non-standard
+ location.</li>
+ </ul>
+</li>
+</ul>
+</section>
+
+ <section>
+ <title>Advanced Configuration</title>
+ <p> You can review other configuration options in the file and modify them to \
suit + your needs. Refer to the <a href="hod_config_guide.html">Configuration \
Guide</a> for information about the HOD + configuration.
+ </p>
+ </section>
+</section>
+
+ <section>
+ <title>Running HOD</title>
+ <p>You can now proceed to <a href="hod_user_guide.html">HOD User Guide</a> for \
information about how to run HOD, + what are the various features, options and for \
help in trouble-shooting.</p> + </section>
+
+ <section>
+ <title>Supporting Tools and Utilities</title>
+ <p>This section describes certain supporting tools and utilities that can be \
used in managing HOD deployments.</p> +
+ <section>
+ <title>logcondense.py - Tool for removing log files uploaded to DFS</title>
+ <p>As mentioned in
+ <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">this \
section</a> of the + <a href="hod_user_guide.html">HOD User Guide</a>, HOD \
can be configured to upload + Hadoop logs to a statically configured HDFS. \
Over time, the number of logs uploaded + to DFS could increase. \
logcondense.py is a tool that helps administrators to clean-up + the log \
files older than a certain number of days. </p> + <section>
+ <title>Running logcondense.py</title>
+ <p>logcondense.py is available under hod_install_location/support folder. \
You can either + run it using python, for e.g. <em>python logcondense.py</em>, \
or give execute permissions + to the file, and directly run it as \
<em>logcondense.py</em>. logcondense.py needs to be + run by a user who has \
sufficient permissions to remove files from locations where log + files are \
uploaded in the DFS, if permissions are enabled. For e.g. as mentioned in the + \
<a href="hod_config_guide.html#3.7+hodring+options">configuration guide</a>, the logs \
could + be configured to come under the user's home directory in HDFS. In that \
case, the user + running logcondense.py should have super user privileges to \
remove the files from under + all user home directories.</p>
+ </section>
+ <section>
+ <title>Command Line Options for logcondense.py</title>
+ <p>The following command line options are supported for logcondense.py.</p>
+ <table>
+ <tr>
+ <td>Short Option</td>
+ <td>Long option</td>
+ <td>Meaning</td>
+ <td>Example</td>
+ </tr>
+ <tr>
+ <td>-p</td>
+ <td>--package</td>
+ <td>Complete path to the hadoop script. The version of hadoop must be \
the same as the + one running HDFS.</td>
+ <td>/usr/bin/hadoop</td>
+ </tr>
+ <tr>
+ <td>-d</td>
+ <td>--days</td>
+ <td>Delete log files older than the specified number of days</td>
+ <td>7</td>
+ </tr>
+ <tr>
+ <td>-c</td>
+ <td>--config</td>
+ <td>Path to the Hadoop configuration directory, under which \
hadoop-site.xml resides. + The hadoop-site.xml must point to the HDFS \
NameNode from where logs are to be removed.</td> + \
<td>/home/foo/hadoop/conf</td> + </tr>
+ <tr>
+ <td>-l</td>
+ <td>--logs</td>
+ <td>A HDFS path, this must be the same HDFS path as specified for the \
log-destination-uri, + as mentioned in the <a \
href="hod_config_guide.html#3.7+hodring+options">configuration guide</a>, + \
without the hdfs:// URI string</td> + <td>/user</td>
+ </tr>
+ <tr>
+ <td>-n</td>
+ <td>--dynamicdfs</td>
+ <td>If true, this will indicate that the logcondense.py script should \
delete HDFS logs + in addition to Map/Reduce logs. Otherwise, it only \
deletes Map/Reduce logs, which is also the + default if this option is \
not specified. This option is useful if dynamic DFS installations + are \
being provisioned by HOD, and the static DFS installation is being used only to \
collect + logs - a scenario that may be common in test clusters.</td>
+ <td>false</td>
+ </tr>
+ </table>
+ <p>So, for example, to delete all log files older than 7 days using a \
hadoop-site.xml stored in + ~/hadoop-conf, using the hadoop installation under \
~/hadoop-0.17.0, you could say:</p> + <p><em>python logcondense.py -p \
~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user</em></p> + </section>
+ </section>
+ </section>
+</body>
+</document>
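
As a closing note for administrators reading the new guide section: the purge pass
in logcondense.py issues one dfs -rm per job directory per log-name prefix. A
hedged sketch of the commands it ends up printing, using the guide's example
hadoop location and an invented job directory (the real script composes these
through its getDfsCommand helper):

    # Sketch only: mirrors the purge loop's command composition.
    hadoop = "~/hadoop-0.17.0/bin/hadoop"    # -p/--package, per the guide's example
    to_purge = {"123.host": "/user/hemanth/hod-logs/123.host"}   # invented entry
    deleted_name_prefixes = ["1-tasktracker-*", "0-datanode-*"]  # as seen in the script's context lines

    for jobid in to_purge.keys():
        for prefix in deleted_name_prefixes:
            print("%s dfs -rm %s/%s" % (hadoop, to_purge[jobid], prefix))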