List:       flume-user
Subject:    Re: Transferring another server using flume
From:       ed <edorsey () gmail ! com>
Date:       2014-01-31 4:18:34
Message-ID: CAGO3nXJmw-ejYJW2pKL6znBjBtjrd6_yQ-BdjKvVFe-g4=t9-w () mail ! gmail ! com

Hi Burak,

Unfortunately I don't have any experience with Scribe, so I can't offer any
advice there.  I briefly checked its GitHub site and there does not appear
to be much (if any) activity on that project at this point.  I think all of
the Flume sources use a push model (rather than pull), so it will be tough
if you can't install any software or scripts on the remote servers to push
data to Flume.  Can you configure the remote servers to write their logs to
some sort of shared storage?  Other than that, I'm not sure whether there
are other rsync-like programs you could use to pull the logs and get them
into a Flume source such as the Spooling Directory Source.  Maybe someone on
the mailing list with more Linux tool experience will have suggestions for
rsync alternatives that might work for you.
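
Whatever tool ends up pulling the files, the hand-off into the Spooling
Directory Source needs some care, because that source expects files to be
complete and unchanged once they appear in the watched directory, and it
expects file names not to be reused.  Below is a minimal sketch of one way
to do that hand-off, not a tested setup.  It assumes the pull job writes
into a separate staging directory on the same filesystem as the spool
directory, that it preserves source modification times, that in-progress
transfers show up as dot-files (rsync's default temp-file naming), and that
it does not fetch the same file again after it has been moved.  The
function name hand_off, the directory arguments, and the 60-second
threshold are all made up for illustration.

```python
import os
import time


def hand_off(staging_dir, spool_dir, min_age_seconds=60.0, now=None):
    """Move finished files from staging_dir into spool_dir.

    Returns the list of file names created in spool_dir.
    """
    now = time.time() if now is None else now
    moved = []
    for name in sorted(os.listdir(staging_dir)):
        src = os.path.join(staging_dir, name)
        # Assumption: dot-files are partial transfers still being written.
        if name.startswith(".") or not os.path.isfile(src):
            continue
        # Assumption: a file untouched for min_age_seconds is finished.
        if now - os.path.getmtime(src) < min_age_seconds:
            continue
        # Prefix a timestamp so a later file with the same base name
        # does not collide with one handed off in an earlier run.
        dst_name = f"{int(now)}-{name}"
        # os.replace is a rename, so the file appears in the spool
        # directory all at once (both directories on one filesystem).
        os.replace(src, os.path.join(spool_dir, dst_name))
        moved.append(dst_name)
    return moved
```

Run from cron or a small loop after each pull, this keeps half-written
files out of the directory Flume is watching.  If the staging and spool
directories are on different filesystems the rename is no longer atomic, so
this sketch would not be safe as written.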

Best,

Ed


On Thu, Jan 30, 2014 at 9:34 AM, burakkk <burak.isikli@gmail.com> wrote:

> Hi Ed,
> Syslog isn't available on the remote machines, and I'd like to avoid
> installing any application or library on them if at all possible. I have
> to pull data from the remote servers without depending on anything on the
> remote side.
>
> The problem with rsync is that the remote servers generate so many small
> files that rsync gets stuck at some point. It doesn't fail; it just
> waits, doing nothing. So the problem is in getting the files from the
> remote servers.
>
> After a brief review of flume, using scribe+flume may solve my problem.
> What do you think?
>
> Thanks
> Best regards...
>
>
>
> On Thu, Jan 30, 2014 at 1:58 AM, ed <edorsey@gmail.com> wrote:
>
>> Hi Burak,
>>
>> Do the machines with the logs on them have syslog available  (e.g.,
>> rsyslog for RedHat/CentOS)?  Can the remote servers do any kind of push or
>> do you have to pull data from them?  If you have a syslog daemon
>> available on the remote servers then I would try configuring those to send
>> the logs to the Flume multiport syslog TCP source.
>>
>> With regard to pulling data from the remote servers, what part of rsync is
>> causing issues (assuming you're using rsync to pull data)?  Is the problem
>> with rsync itself, in getting the files from the remote servers,
>> or is it an issue related to getting the files into HDFS once you've pulled
>> the files to the main server?  If the problem is related to getting the
>> files into HDFS you could try using the Spooling Directory Source and point
>> it at the directory on your main server where you are aggregating the logs
>> via rsync.
>>
>> Best,
>>
>> Ed
>>
>>
>> On Wed, Jan 29, 2014 at 11:24 PM, burakkk <burak.isikli@gmail.com> wrote:
>>
>>> Hi folks,
>>> I have a question about flume-ng. Several different machines generate
>>> log files. These log files are small (around 4-5 MB per file). I want to
>>> pull these files from the remote servers into a specific directory on my
>>> main server and then put them into HDFS. I can't install any kind of
>>> application on the remote servers, so I can't use the Avro or Thrift
>>> sources.
>>>
>>> For now I use rsync to sync files between the machines and then load
>>> them with HDFS file commands such as hdfs dfs -put. But there are some
>>> issues with rsync.
>>>
>>> In order to solve this problem, what kind of source should I use and how
>>> can I do that?
>>>
>>>
>>> Thanks
>>> Best Regards...
>>>
>>> --
>>>
>>> *BURAK ISIKLI* | *http://burakisikli.wordpress.com
>>> <http://burakisikli.wordpress.com>*
>>>
>>>
>>
>
>
> --
>
> *BURAK ISIKLI* | *http://burakisikli.wordpress.com
> <http://burakisikli.wordpress.com>*
>
>
