List:       flume-user
Subject:    flume hdfs sink notify / callback to add partition
From:       Viral Bajaria <viral.bajaria () gmail ! com>
Date:       2014-07-28 21:40:49
Message-ID: CALckxSMSOTTUT-KEV_NUjFxTu1JFuJ3AwLzVn5Hd1eocEimW-w () mail ! gmail ! com

Hi,

Is there a way to get the HDFS sink to signal that a file was just closed,
and then use that signal to add a partition to Hive if one does not already
exist?

Right now, what I do is:

- move files to S3
- run recover partitions <--- this step takes forever

But given how much historical data I have, it's not feasible to run
recover partitions every single day.

I would much rather add a partition whenever I see a file in that
partition for the first time.
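
A minimal sketch of that "add on first sight" idea, assuming the partition
directories follow Hive's key=value naming (for example dt=2014-07-28/hour=21)
and that something reports the path of each closed file. PartitionTracker and
its methods are hypothetical names for illustration, not part of Flume or
Hive, and the table name and S3 path are made up. It only builds the DDL
string; running it against Hive is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.StringJoiner;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not a Flume or Hive class): remembers which partition
// directories it has already seen and builds DDL only for new ones.
final class PartitionTracker {
    private final String table;
    private final Set<String> seenDirs = ConcurrentHashMap.newKeySet();

    PartitionTracker(String table) {
        this.table = table;
    }

    // Collects key=value directory segments of the path, in path order.
    // The last segment is treated as the file name and skipped.
    static Map<String, String> partitionSpec(String filePath) {
        Map<String, String> spec = new LinkedHashMap<>();
        String[] segments = filePath.split("/");
        for (int i = 0; i < segments.length - 1; i++) {
            int eq = segments[i].indexOf('=');
            if (eq > 0 && eq < segments[i].length() - 1) {
                spec.put(segments[i].substring(0, eq), segments[i].substring(eq + 1));
            }
        }
        return spec;
    }

    // Returns an ALTER TABLE statement the first time a partition directory
    // is seen, and an empty Optional for repeats or unpartitioned paths.
    Optional<String> onFileClosed(String filePath) {
        Map<String, String> spec = partitionSpec(filePath);
        if (spec.isEmpty()) {
            return Optional.empty();
        }
        String dir = filePath.substring(0, filePath.lastIndexOf('/'));
        if (!seenDirs.add(dir)) {
            return Optional.empty(); // this partition was already handled
        }
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        spec.forEach((k, v) -> cols.add(k + "='" + v.replace("'", "\\'") + "'"));
        return Optional.of("ALTER TABLE " + table
                + " ADD IF NOT EXISTS PARTITION " + cols
                + " LOCATION '" + dir + "'");
    }
}
```

The in-memory set only avoids repeat statements within one process; the
IF NOT EXISTS clause is what keeps the statement harmless after a restart.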

I looked around the code base and it seems Flume-OG had something like
this, but I don't see the capability in Flume-NG.

I can see a way to add this by adding a callback parameter to
HDFSEventSink and creating a custom wrapper around it.
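
The shape of such a wrapper could look roughly like the sketch below. It is
deliberately independent of Flume: FileClosedCallback and NotifyingCloseable
are hypothetical names, and nothing here is an existing Flume-NG interface.
It only illustrates the decorator idea of closing the underlying file first
and notifying a listener afterwards, at most once.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical callback; Flume-NG's HDFS sink does not expose this interface.
interface FileClosedCallback {
    void onFileClosed(String path);
}

// Decorator around any Closeable (standing in for the sink's file writer):
// closes the delegate, then reports the path to the callback exactly once.
final class NotifyingCloseable implements Closeable {
    private final Closeable delegate;
    private final String path;
    private final FileClosedCallback callback;
    private boolean closed;

    NotifyingCloseable(Closeable delegate, String path, FileClosedCallback callback) {
        this.delegate = delegate;
        this.path = path;
        this.callback = callback;
    }

    @Override
    public synchronized void close() throws IOException {
        if (closed) {
            return; // already closed and reported
        }
        delegate.close(); // if this throws, the callback is not invoked
        closed = true;
        callback.onFileClosed(path);
    }
}
```

Notifying only after a successful close means the listener never sees a file
that is still being written.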

Any other suggestions?

Thanks,
Viral
