
List:       hadoop-user
Subject:    AutoInputFormat
From:       <thuhuangs09 () gmail ! com>
Date:       2012-04-26 9:08:08
Message-ID: CF205E58-112B-4EC8-AAD4-C046C7122940 () gmail ! com


I use org.apache.hadoop.streaming.AutoInputFormat to handle SequenceFile input for
streaming, but I found that it presents each <key, value> pair in the format below
(the key is a string, the value is binary):

"keystring\tvalue\n"

Since the value is binary, it contains many '\n' bytes, so my mapper cannot tell
where one record ends and the next begins.

In other words, I need the value presented as length + raw bytes, or as typed bytes.
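
For illustration, here is a minimal Python sketch of the kind of framing I mean. It
assumes each key and each value is written as a 4-byte big-endian length followed by
that many raw bytes. The framing and the names write_record / read_records are
hypothetical, chosen only for this example; this is not what AutoInputFormat
currently emits.

```python
import io
import struct


def write_record(stream, key, value):
    """Write key and value, each as a 4-byte big-endian length plus raw bytes."""
    for field in (key, value):
        stream.write(struct.pack(">I", len(field)))
        stream.write(field)


def _read_exact(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended in the middle of a record")
    return data


def read_records(stream):
    """Yield (key, value) byte pairs from a length-prefixed binary stream."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of input
        if len(header) < 4:
            raise EOFError("truncated length prefix")
        (key_len,) = struct.unpack(">I", header)
        key = _read_exact(stream, key_len)
        (value_len,) = struct.unpack(">I", _read_exact(stream, 4))
        value = _read_exact(stream, value_len)
        yield key, value


if __name__ == "__main__":
    # A value containing '\n' and '\t' survives the round trip intact.
    buf = io.BytesIO()
    write_record(buf, b"keystring", b"\x00bin\nary\tdata\n")
    buf.seek(0)
    for key, value in read_records(buf):
        print(key, value)
```

With framing like this, a mapper would read from the binary stdin stream
(sys.stdin.buffer in Python 3) rather than splitting its input on newlines, so
embedded '\n' or '\t' bytes in the value would no longer matter.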

I invoke streaming as follows:

         $HADOOP_HOME/bin/hadoop jar \
         $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.0.2.jar \
         -input data.seq \
         -output output \
         -mapper mapper \
         -reducer reducer \
         -inputformat org.apache.hadoop.streaming.AutoInputFormat \
         -file mapper \
         -file reducer





huangs
thuhuangs09@gmail.com


