
List:       bro
Subject:    [Zeek] Re: Possible memory leak in logger process?
From:       Dheeraj Gupta <dheeraj.gupta4 () gmail ! com>
Date:       2021-12-14 5:15:23
Message-ID: CAOsL98NuEAN62nNRCyAfB2Cz3AWTrHJ00kn8UCaCKaM36pVrsQ () mail ! gmail ! com

Thanks for the pointer, Tim.

I will try to run jemalloc profiling and post back on the GitHub issue.

- Dheeraj

On Mon, Dec 13, 2021 at 11:18 PM Tim Wojtulewicz <tim@corelight.com> wrote:

> We also have another report of the same in
> https://github.com/zeek/zeek/issues/1856. Is it possible for you to
> rebuild with jemalloc support and run the jemalloc profiling plugin on your
> logger node? That should give more information about what's causing the
> bloat. We can use that issue to discuss more in depth what's going on with
> it, if that's easier than email.
>
> Tim
>
> On Dec 12, 2021, at 11:15 PM, Dheeraj Gupta <dheeraj.gupta4@gmail.com>
> wrote:
>
> Hi,
>
> We have a Zeek node that sees high volumes on working days. Due to our
> internal network configuration, certain endpoints generate a lot of
> connections to our internal DNS servers (because our DNS does not
> resolve any external domains, and certain applications keep repeating the
> DNS requests at astronomical rates). The node is a 16-core, 128 GB VM, and
> we use the ASCII logger.
>
> We have observed that under high load (~40k writes/s), the logger process
> starts lagging behind and its memory usage goes up. Once the machine is
> using >60% of its memory, Zeek starts dropping packets and performance
> degrades in general. The only solution is to restart the Zeek process.
>
> My understanding is that the logger is buffering the unwritten lines in
> memory, and so memory usage goes up.
>
> To work around this, I split the output files so that all connections to
> the DNS server and all DNS requests to high-velocity domains are logged to
> separate files (conn-noise.log and dns-noise.log). These two files account
> for nearly 80% of the disk usage under the current directory (e.g., in 30
> minutes the current directory uses 4.9G, of which these two files use
> 4.0G). By doing this, I hoped that any lag would be limited to these two
> files and that I would lose less data on a restart. Also, by using separate
> threads for heavily written files, I may be able to get better performance.
> The idea has partially worked: lags for other files are generally low now,
> although we still need to restart Zeek if memory usage goes beyond 55%.
>
> The problem is that logger memory usage does not decrease on its own when
> the load drops (e.g., at night). For example, if Zeek was using 40G of
> memory on Friday evening and dns-noise was showing a lag of 1800 seconds,
> the memory usage on Monday morning is still 40G, although the lag is only
> around 1 second. Has anyone experienced anything similar? I am running
> Zeek 4.1.1.
>
> Thanks,
> Dheeraj
>
> --
> zeek mailing list -- zeek@lists.zeek.org
> To unsubscribe send an email to zeek-leave@lists.zeek.org
>
>
>
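
For anyone following the thread, the log split described in the quoted message might look roughly like the sketch below. This is not the poster's actual configuration: the `noisy_dns_servers` set, its placeholder address, the function name `conn_noise_path`, and the filter name `conn-split` are all assumptions made for illustration. It relies only on the logging framework's `$path_func` filter field, and it covers conn.log only; a DNS split would follow the same pattern with `DNS::LOG` and `DNS::Info`.

```zeek
# Hypothetical: servers whose connections should go to conn-noise.log.
# The address is a documentation placeholder, not taken from the thread.
const noisy_dns_servers: set[addr] = { 192.0.2.53 } &redef;

# Pick the output path per record: noisy traffic goes to conn-noise.log,
# everything else stays in conn.log.
function conn_noise_path(id: Log::ID, path: string, rec: Conn::Info): string
    {
    if ( rec$id$resp_h in noisy_dns_servers )
        return "conn-noise";

    return "conn";
    }

event zeek_init()
    {
    # Replace the default filter so each record is written exactly once.
    Log::remove_filter(Conn::LOG, "default");
    Log::add_filter(Conn::LOG, [$name="conn-split",
                                $path_func=conn_noise_path]);
    }
```

Each distinct path gets its own writer, which is what confines any lag to the noisy files. It does not by itself change how much the logger buffers in total.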
