
List:       hadoop-dev
Subject:    [jira] Updated: (HADOOP-435) Encapsulating startup scripts and jars
From:       "Benjamin Reed (JIRA)" <jira@apache.org>
Date:       2007-01-31 18:01:05
Message-ID: 10754075.1170266465783.JavaMail.jira@brutus


     [ https://issues.apache.org/jira/browse/HADOOP-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated HADOOP-435:
---------------------------------

    Attachment: hadoopit.patch

Patch against version 0.10.1. Added a target, hadoopit, that builds a self-contained
jar file with a defined main class.
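
For illustration, a minimal sketch of what the jar's Main-Class could look
like (class, property, and dispatch-target names are assumptions, not the
contents of hadoopit.patch): it peels off the optional -l logdir flag and
hands the remaining arguments to the matching entry point.

    // Hypothetical entry point; names are illustrative only.
    public class HadoopLoader {
        public static void main(String[] args) throws Exception {
            int i = 0;
            // Optional "-l logdir", as in: java -jar hadoop.jar -l /tmp/log jobtracker
            if (args.length >= 2 && "-l".equals(args[0])) {
                System.setProperty("hadoop.log.dir", args[1]); // assumed property name
                i = 2;
            }
            if (i >= args.length) {
                System.err.println("USAGE: hadoop [-l logdir] command");
                System.exit(1);
            }
            String[] rest = new String[args.length - i - 1];
            System.arraycopy(args, i + 1, rest, 0, rest.length);

            // Dispatch the command to the class that implements it.
            if ("jobtracker".equals(args[i])) {
                org.apache.hadoop.mapred.JobTracker.main(rest);
            } else if ("tasktracker".equals(args[i])) {
                org.apache.hadoop.mapred.TaskTracker.main(rest);
            } else if ("namenode".equals(args[i])) {
                org.apache.hadoop.dfs.NameNode.main(rest);
            } else if ("datanode".equals(args[i])) {
                org.apache.hadoop.dfs.DataNode.main(rest);
            } else {
                System.err.println("Unknown command: " + args[i]);
                System.exit(1);
            }
        }
    }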

> Encapsulating startup scripts and jars in a single Jar file.
> ------------------------------------------------------------
> 
> Key: HADOOP-435
> URL: https://issues.apache.org/jira/browse/HADOOP-435
> Project: Hadoop
> Issue Type: New Feature
> Affects Versions: 0.5.0
> Reporter: Benjamin Reed
> Attachments: hadoopit.patch, hadoopit.patch, start.sh, stop.sh
> 
> 
> Currently, hadoop is a set of scripts, configurations, and jar files. It
> makes it a pain to install on compute nodes and datanodes. It also makes it
> a pain to set up clients so that they can use hadoop. Every time things are
> updated the pain begins again. I suggest that we should be able to build a
> single Jar file that has a Main-Class defined, with the configuration built
> in, so that we can distribute that one file to nodes and clients on updates.
> One nice thing that I haven't done would be to make the jarfile downloadable
> from the JobTracker webpage so that clients can easily submit jobs. I
> currently use such a setup on my small cluster. To start the jobtracker I
> use "java -jar hadoop.jar -l /tmp/log jobtracker"; to submit a job I use
> "java -jar hadoop.jar jar wordcount.jar". I use the client on my Linux and
> Mac OS X machines, and all I need installed is Java and the hadoop.jar file.
> hadoop.jar helps with logfiles and configurations. The default of pulling
> the config files from the jar file can be overridden by specifying a config
> directory, so that you can easily have machine-specific configs and still
> have the same hadoop.jar on all machines (see the lookup sketch after the
> command list). Here are the available commands from hadoop.jar:
> USAGE: hadoop [-l logdir] command
> User commands:
> dfs          run a DFS admin client
> jar          run a JAR file
> job          manipulate MapReduce jobs
> fsck         run a DFS filesystem check utility
> Runtime startup commands:
> datanode     run a DFS datanode
> jobtracker   run the MapReduce job Tracker node
> namenode     run the DFS namenode (namenode -format formats the FS)
> tasktracker  run a MapReduce task Tracker node
> HadoopLoader commands:
> buildJar     builds the HadoopLoader jar file
> conf         dump hadoop configuration
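> 
> The jar-or-directory config lookup can be sketched like this (the class and
> method names here are illustrative assumptions, not the actual Configuration
> change in the patch):
> 
>     // Sketch: prefer an explicit config directory, fall back to the copy
>     // bundled inside hadoop.jar on the classpath.
>     import java.io.File;
>     import java.io.FileInputStream;
>     import java.io.IOException;
>     import java.io.InputStream;
> 
>     public class ConfigLocator {
>         public static InputStream openSiteConfig(String confDir)
>                 throws IOException {
>             if (confDir != null) {
>                 File f = new File(confDir, "hadoop-site.xml");
>                 if (f.exists()) {
>                     return new FileInputStream(f); // machine-specific override
>                 }
>             }
>             // Null if the resource is missing; the build bundles it in the jar.
>             return ConfigLocator.class.getClassLoader()
>                     .getResourceAsStream("hadoop-site.xml");
>         }
>     }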
> Note, I don't have the classes for hadoop streaming built into this Jar
> file, but if I had, that would also be an option (the loader checks for the
> needed classes before displaying an option). It makes it very easy for users
> who just write scripts to use hadoop straight from their machines. I'm also
> attaching the start.sh and stop.sh scripts that I use. These are the only
> scripts I use to start up the daemons. They are very simple, and the
> start.sh script uses the config file to figure out whether or not to start
> the jobtracker and the namenode. The attached patch adds the hadoopit
> target, modifies the Configuration class to find the config files correctly,
> and modifies the build to make a fully contained hadoop.jar. To update the
> configuration in a hadoop.jar you simply use "zip hadoop.jar
> hadoop-site.xml".
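> 
> The class check is just a classpath probe along these lines (a sketch; the
> streaming class name is an assumption about its entry point):
> 
>     // Sketch: advertise a command only when its implementing class is
>     // actually on the classpath.
>     public class CommandProbe {
>         public static boolean available(String className) {
>             try {
>                 Class.forName(className);
>                 return true;
>             } catch (ClassNotFoundException e) {
>                 return false;
>             }
>         }
> 
>         public static void main(String[] args) {
>             // Class name is illustrative, not confirmed by the patch.
>             if (available("org.apache.hadoop.streaming.StreamJob")) {
>                 System.out.println("streaming    run a streaming job");
>             }
>         }
>     }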

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

