List: hadoop-commits
Subject: [Lucene-hadoop Wiki] Update of "GettingStartedWithHadoop" by SameerParanjpye
From: Apache Wiki <wikidiffs () apache ! org>
Date: 2006-09-20 7:22:36
Message-ID: 20060920072236.9884.9715 () ajax ! apache ! org
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for \
change notification.
The following page has been changed by SameerParanjpye:
http://wiki.apache.org/lucene-hadoop/GettingStartedWithHadoop
------------------------------------------------------------------------------
* {{{mapred.local.dir}}}
=== Formatting the Namenode ===
-
The first step to starting up your Hadoop installation is formatting the filesystem. You need to do this only the first time you set up a Hadoop installation. '''Do not''' format a running filesystem; doing so will erase all your data. To format the filesystem, run the command: [[BR]] {{{% $HADOOP_INSTALL/hadoop/bin/hadoop namenode -format}}}
=== Starting a Single node cluster ===
@@ -59, +58 @@
=== Stopping a Single node cluster ===
Run the command [[BR]] {{{% $HADOOP_INSTALL/hadoop/bin/stop-all.sh}}} [[BR]] to \
stop all the daemons running on your machine.
+ === Separating Configuration from Installation ===
+ In the example described above, the configuration files used by the Hadoop cluster all lie in the Hadoop installation. This can become cumbersome when upgrading to a new release, since all custom configuration has to be re-created in the new installation. It is possible to separate the config from the install. To do so, select a directory to house the Hadoop configuration (let's say {{{/foo/bar/hadoop-config}}}). Copy the {{{hadoop-site.xml}}}, {{{slaves}}} and {{{hadoop-env.sh}}} files to this directory. You can either set the {{{HADOOP_CONF_DIR}}} environment variable to refer to this directory or pass it directly to the Hadoop scripts with the {{{--config}}} option.
+ In this case, the cluster start and stop commands specified in the above two sub-sections become [[BR]] {{{% $HADOOP_INSTALL/hadoop/bin/start-all.sh --config /foo/bar/hadoop-config}}} and [[BR]] {{{% $HADOOP_INSTALL/hadoop/bin/stop-all.sh --config /foo/bar/hadoop-config}}}. [[BR]] Only the absolute path to the config directory should be passed to the scripts.
+ 
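The relocation described above can be sketched as follows. This is a hypothetical illustration, not part of the original page: {{{HADOOP_INSTALL}}} and {{{/foo/bar/hadoop-config}}} are placeholders, and scratch directories are used here so the commands can be tried anywhere.

```shell
# Sketch: move Hadoop config out of the install tree (placeholder paths).
HADOOP_INSTALL=$(mktemp -d)        # stand-in for your real install root
mkdir -p "$HADOOP_INSTALL/hadoop/conf"
touch "$HADOOP_INSTALL/hadoop/conf/hadoop-site.xml" \
      "$HADOOP_INSTALL/hadoop/conf/slaves" \
      "$HADOOP_INSTALL/hadoop/conf/hadoop-env.sh"

HADOOP_CONF_DIR=$(mktemp -d)       # stand-in for /foo/bar/hadoop-config
for f in hadoop-site.xml slaves hadoop-env.sh; do
    cp "$HADOOP_INSTALL/hadoop/conf/$f" "$HADOOP_CONF_DIR/"
done
export HADOOP_CONF_DIR             # or pass --config "$HADOOP_CONF_DIR" instead
ls "$HADOOP_CONF_DIR"
```

With the variable exported, the start/stop scripts pick up the relocated config; alternatively, omit the export and pass {{{--config}}} explicitly on each invocation.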
- === Starting up a real cluster ===
+ === Starting up a larger cluster ===
- * After formatting the namenode run bin/start-dfs.sh on the Namenode. This will bring up the dfs with Namenode running on the machine you ran the command on and Datanodes on the machines listed in the slaves file mentioned above.
+ 
+ * Ensure that the Hadoop package is accessible from the same path on all nodes that are to be included in the cluster. If you have separated configuration from the install, then ensure that the config directory is also accessible in the same way.
+ * Populate the {{{slaves}}} file with the nodes to be included in the cluster, one node per line.
+ * Follow the steps in the ''Basic Configuration'' section above.
+ * Format the Namenode
+ * Run the command {{{% $HADOOP_INSTALL/hadoop/bin/start-dfs.sh}}} on the node you \
want the Namenode to run on. This will bring up HDFS with the Namenode running on the \
machine you ran the command on and Datanodes on the machines listed in the slaves \
file mentioned above.
- * Run bin/start-mapred.sh on the machine you plan to run the Jobtracker on. This will bring up the map reduce cluster with Jobtracker running on the machine you ran the command on and Tasktrackers running on machines listed in the slaves file.
+ * Run the command {{{% $HADOOP_INSTALL/hadoop/bin/start-mapred.sh}}} on the machine you plan to run the Jobtracker on. This will bring up the Map/Reduce cluster with the Jobtracker running on the machine you ran the command on and Tasktrackers running on the machines listed in the slaves file.
+ * The above two commands can also be executed with a {{{--config}}} option.
- * In case you have not set the HADOOP_CONF_DIR variable, you can use \
bin/start-mapred.sh (bin/start-dfs.sh) --config configure_directory.
- * Try executing bin/hadoop dfs -lsr / to see if it is working.
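Populating the {{{slaves}}} file mentioned in the steps above can be sketched as follows. The hostnames and the config directory are made-up examples, not values from the original page:

```shell
# Sketch: one worker hostname per line in the slaves file.
# node0*.example.com and CONF_DIR are placeholder values.
CONF_DIR=$(mktemp -d)              # stand-in for your config directory
printf '%s\n' node01.example.com node02.example.com node03.example.com \
    > "$CONF_DIR/slaves"
cat "$CONF_DIR/slaves"
```

The start scripts read this file and launch a Datanode and Tasktracker on each listed host, so the same file drives both {{{start-dfs.sh}}} and {{{start-mapred.sh}}}.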
=== Stopping the cluster ===
- * You can stop the cluster by running bin/stop-mapred.sh and then bin/stop-dfs.sh on your Jobtracker and Namenode respectively. You can specify the configure directory by using the --config option.
+ * The cluster can be stopped by running {{{% $HADOOP_INSTALL/hadoop/bin/stop-mapred.sh}}} and then {{{% $HADOOP_INSTALL/hadoop/bin/stop-dfs.sh}}} on your Jobtracker and Namenode respectively. These commands also accept the {{{--config}}} option.