
List:       flume-user
Subject:    Re: Two parallel agents from same source to same sink
From:       Gonzalo Herreros <gherreros@gmail.com>
Date:       2016-01-21 15:23:10
Message-ID: CAM-G-DUZd4zOseLsijkTM8i_CYTdaxWLyT9_ExEOvTDhaP5DyQ@mail.gmail.com

You can configure rsyslog to do the failover and send to only one of them,
using "$ActionExecOnlyWhenPreviousIsSuspended on" I think.
If you can live with an occasional duplicate, that should do; otherwise you
need something more complex.
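A minimal sketch of that failover setup, in legacy rsyslog syntax (the hostnames and port are placeholders, not from this thread):

```
# Forward everything to the primary Flume agent on s1.
*.* @@s1.example.com:5140
# Send to s2 only while the previous action (s1) is suspended,
# i.e. s1 is unreachable. Expect occasional duplicates on failback.
$ActionExecOnlyWhenPreviousIsSuspended on
*.* @@s2.example.com:5140
$ActionExecOnlyWhenPreviousIsSuspended off
```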

Regards,
Gonzalo

On 21 January 2016 at 15:05, Margus Roo <margus@roo.ee> wrote:

> Hi
>
> I am trying to set up Flume high availability.
> The same feed from rsyslog goes to two different servers, s1 and s2.
> On both servers, Flume agents are configured to listen to the feed from rsyslog.
> Both agents write the feed to HDFS.
> What I get in HDFS is different files with duplicated content.
>
> Is there any best-practice architecture for using Flume in situations
> like this?
> What I want to handle is the situation where one server is down: syslog
> is forwarded to two servers so that at least one can still transport
> events to HDFS.
>
> At the moment, my plan is to clean out duplicates after some time, before
> Hive reads the directory.
>
> --
> Margus (margusja) Roo
> http://margus.roo.ee
> skype: margusja
> +372 51 48 780
>
>
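For the clean-up idea in the quoted message, one approach is to merge the files from both agents and drop exact duplicate event lines before Hive reads the directory. A minimal sketch in Python (function name and in-memory representation are illustrative; a real job over HDFS would stream the files):

```python
def merge_dedupe(*files_lines):
    """Merge event lines from several Flume output files, dropping exact
    duplicates while preserving first-seen order. Assumes identical events
    are byte-identical lines in both files."""
    seen = set()
    merged = []
    for lines in files_lines:
        for line in lines:
            if line not in seen:
                seen.add(line)
                merged.append(line)
    return merged
```

Note this only removes exact duplicates; if the two agents add differing timestamps or headers, the lines would first need normalizing.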



