List:       flume-user
Subject:    Re: Flume source channel sink tunning
From:       Roshan Naik <roshan () hortonworks ! com>
Date:       2015-10-08 19:12:20
Message-ID: D23C0D2F.1E47A%roshan () hortonworks ! com

You can connect multiple sinks to the same memory channel. Each sink (and source) gets its own thread within the same Flume agent process, not a separate process. Each sink should write to a different destination file. Configure each sink separately in the config file and set its ...channel property to the same channel.
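
A minimal sketch of what that could look like, assuming an agent named "agent", a channel "c1" and two HDFS sinks "k1" and "k2" (all of these names, the path and the file prefixes are hypothetical; the source definitions are left out):

```properties
# Hypothetical names: agent "agent", channel "c1", sinks "k1" and "k2".
agent.channels = c1
agent.sinks = k1 k2

agent.channels.c1.type = memory

# Both sinks drain the same channel; each one runs in its own thread.
agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = /flume/events
# A distinct prefix per sink keeps the two sinks from writing the same file.
agent.sinks.k1.hdfs.filePrefix = events-k1

agent.sinks.k2.type = hdfs
agent.sinks.k2.channel = c1
agent.sinks.k2.hdfs.path = /flume/events
agent.sinks.k2.hdfs.filePrefix = events-k2
```

Using the same hdfs.path with different hdfs.filePrefix values is just one way to keep the destination files apart; separate directories per sink would work as well.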

-roshan


From: IT CTO <goi.cto@gmail.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Thursday, October 8, 2015 10:45 AM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Flume source channel sink tunning


Will each of the sinks run in a different process?
Is there any way I can share the same sink configuration for both sinks?
Eran

On Thu, Oct 8, 2015 at 19:50, Roshan Naik <roshan@hortonworks.com> wrote:

1 - Bump up the -Xmx in flume-env.sh, as the default is quite small.
2 - Increase the capacity on the channel. It looks like your source is running much faster than the sink can keep up with. You can try adding more sinks to improve the drain rate.
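
As a rough sketch of the two knobs above: the heap is usually raised through JAVA_OPTS in flume-env.sh (for example export JAVA_OPTS="-Xmx2g", where 2g is only an illustrative value), and the channel capacity is set in the agent config. The agent name "agent", the channel name "c1" and the numbers below are assumptions for illustration, not recommended values:

```properties
# Hypothetical names and sizes; tune them to your event size and heap.
agent.channels = c1
agent.channels.c1.type = memory

# Maximum number of events held in the channel (the default is small).
agent.channels.c1.capacity = 100000

# Maximum number of events per put/take transaction; it must not exceed
# capacity, and the sink's batch size should not exceed it.
agent.channels.c1.transactionCapacity = 1000
```

A larger capacity means more events held on the heap, so the two settings need to be raised together.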

From: IT CTO <goi.cto@gmail.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Thursday, October 8, 2015 9:46 AM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Flume source channel sink tunning

Hi,

I am using a SpoolDir source with a memory channel to write to an HDFS sink.
When I use a single spoolDir source I get single-threaded performance, so based on some mails I read, I split the source into 5 spoolDir sources, all writing to the same memory channel, which writes to HDFS.

Now I am getting different errors:
1) GC error => not enough memory for the channel => increase Xmx for the agent
2) Channel is full => the sink is not keeping up with the channel

So I find myself playing with the different parameters.
Is there any best practice here, or a path to follow, to get it tuned?
I feel that even if it works, it will easily break given other events.

Eran
--
Eran | "You don't need eyes to see, you need vision" (Faithless)




