List:       hadoop-user
Subject:    having a directory as input split
From:       akhil1988 <akhilanger () gmail ! com>
Date:       2010-04-30 6:18:31
Message-ID: 28408886.post () talk ! nabble ! com

How can I make a directory, rather than a file, serve as an InputSplit? I want
the input split handed to each map task to be a directory, not a file. I will
implement my own RecordReader, which will read the appropriate data from the
directory and supply the records to the map task.

To put it another way:
I have a list of directories distributed over HDFS, and I know that each of
these directories is small enough to fit on a single node. I want each map
task to be given one whole directory rather than the individual files in it.
How can I do this?

Thanks,
 Akhil
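
One possible way to picture the grouping asked about above is sketched
below. It is a minimal illustration in plain Java with no Hadoop dependency:
the class DirectorySplits and its methods parentOf and groupByDirectory are
hypothetical names, not part of any Hadoop API, and the file paths are made
up. In a real job the same grouping would presumably live in the getSplits
method of a custom InputFormat, with the custom RecordReader iterating over
the files of the directory it receives; that wiring is an assumption and is
not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): models "one split per directory"
// by grouping file paths under their parent directory.
final class DirectorySplits {
    private DirectorySplits() {}

    // Returns the parent directory of a slash-separated path.
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        if (idx < 0) {
            return ".";   // no slash: treat as relative to the current directory
        }
        if (idx == 0) {
            return "/";   // file sits directly under the root
        }
        return path.substring(0, idx);
    }

    // Groups files by parent directory, keeping first-seen directory order.
    // Each map entry stands for one split: a directory plus the files in it.
    static Map<String, List<String>> groupByDirectory(List<String> files) {
        Map<String, List<String>> splits = new LinkedHashMap<>();
        for (String file : files) {
            splits.computeIfAbsent(parentOf(file), d -> new ArrayList<>()).add(file);
        }
        return splits;
    }
}
```

For example, groupByDirectory(List.of("/data/a/part-0", "/data/a/part-1",
"/data/b/part-0")) yields two groups: /data/a with two files and /data/b
with one file, so this input would call for two map tasks.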
