'Re: [opennms-devel] jrobin file format'


List:       opennms-devel
Subject:    Re: [opennms-devel] jrobin file format
From:       DJ Gregor <dj () gregor ! com>
Date:       2011-04-26 1:27:52
Message-ID: 4540C4D2-A611-4C50-9CCF-63D556FE3A73 () gregor ! com


Hi Ron,

Great insight into how OpenNMS works with JRobin. I'm not familiar with the internals \
of JRobin, but am somewhat familiar with the internals of rrdtool, and the two are \
very similar and have similar challenges.

I've done some analysis with dtrace on Solaris to see what happens at the system call \
level and also at the disk IO level. What we see is similar to rrdtool. You are right \
that the system memory (or IO cache) is important because it affects how much can be \
cached to prevent reads from physical disks. I'd suggest you look at this paper:

http://www.usenix.org/event/lisa07/tech/full_papers/plonka/

I'd love to chat more now, but I'm getting ready to move in two days, so I need to \
get back to packing. Please look at the paper, and let's keep in touch--ideally on \
list so what we learn can be archived for those with curiosity in the future.


        - djg

On Apr 21, 2011, at 7:41 PM, "Roskens, Ronald" <Ronald.Roskens@biworldwide.com> \
wrote:

> I’ve been doing a little bit of code path analysis to see why my OpenNMS instance \
> is reporting over 70,000 total operations pending. That number fluctuates, \
> but it generally stays above about 69,000 operations. During the \
> consolidation points, it grows to around 82,000, but then over time slowly comes \
> back down to the 70k point.
> 
> 
> So let’s review what happens when we do one update operation in the queue. It’s \
> actually kind of interesting because it’s not as simple as read here, write there. \
> There’s a lot of reading going on, with the writes generally confined to small \
> areas.
> 
> 
> QueueingRrdStrategy essentially ends up calling two RrdStrategy methods for storing \
> the data out:  openFile() and updateFile(). 
> 
> 
> JRobinRrdStrategy.openFile() is pretty simple: it just creates a new \
> org.jrobin.core.RrdDb object for the filename given. The RrdDb contains 3 sections: \
> a header, datasource blocks, and data archive blocks. When each section is \
> processed, objects are created for all blocks in that section, and a space \
> allocation is made, but no I/O from the file is done unless needed.
> 
> 
> To process the header, it only needs to read the first 40 bytes for the file \
> signature to validate that we have a jrobin rrd file. 
> 
> 
> To process the datasource definitions, it reads the number of datasources from the \
> header section, then just creates the objects that map onto the datasources in the \
> file. Since each datasource definition is fixed at 128 bytes, no further I/O is \
> required to locate them.
> 
> 
> To process the archive definitions, it reads the number of archives, then proceeds \
> to walk through the rest of the file, reading in the number of rows for each \
> archive so it knows where to skip ahead for the next one. It has to do this for \
> each archive since each archive’s data could have a different size than the \
> previous one. The data collection doc shows an rrd configuration with 4 archives: 1 \
> with 8928 data points, and 3 more with 8784 data points.
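
The archive walk described above can be sketched with a toy file. This is only an
illustration, not JRobin's real on-disk format: the layout (a 4-byte row count
followed by rows * 8 bytes of values) and the names build_toy_file and
walk_archives are assumptions made up for the example. It shows why each archive
costs one small read at a far-apart offset before the next one can be located.

```python
import io
import struct

def build_toy_file(row_counts):
    """Toy layout (NOT JRobin's): per archive, a 4-byte row count, then rows * 8 bytes."""
    buf = io.BytesIO()
    for rows in row_counts:
        buf.write(struct.pack(">i", rows))
        buf.write(b"\x00" * (rows * 8))
    buf.seek(0)
    return buf

def walk_archives(f, archive_count):
    """Return (offset, rows) per archive, reading only the row counts."""
    found = []
    offset = 0
    for _ in range(archive_count):
        f.seek(offset)
        (rows,) = struct.unpack(">i", f.read(4))
        found.append((offset, rows))
        # The next archive starts after this one's row count and its values.
        offset += 4 + rows * 8
    return found

if __name__ == "__main__":
    toy = build_toy_file([8928, 8784, 8784, 8784])
    for offset, rows in walk_archives(toy, 4):
        print(offset, rows)
```

With the row counts from the data collection doc, the four reads in this toy layout
land roughly 70 KB apart, so none of them share a disk block or memory page.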
> 
> 
> At this point we’re pretty much done with openFile(), and moving on to \
> updateFile(). 
> 
> 
> The updateFile() method is pretty simple too. Get a jrobin sample object to store \
> our data in, then set & update with our current data. 
> 
> 
> The sample object wants to record the names of all the datasources when it’s \
> constructed, so it needs to read in the 40-byte dsName strings for each of the \
> datasources in the file. (You’d have to have more than 31 datasources to go beyond \
> 4k in the rrd file.)
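
The 4k remark above can be checked with a little arithmetic. The 128-byte
datasource block and the 40-byte dsName come from the description above; the
64-byte header size and the position of dsName at the start of each block are
assumptions made for this sketch, and the function names are hypothetical.

```python
HEADER_SIZE = 64    # assumption: 40-byte signature plus step, counts and lastUpdateTime
DS_SIZE = 128       # fixed datasource block size, from the description above
DS_NAME_SIZE = 40   # dsName string size, from the description above
PAGE_SIZE = 4096    # the "4k" mentioned above

def ds_name_offset(index):
    """Byte offset of the dsName of datasource `index` (0-based).

    Assumes dsName is the first field of each datasource block.
    """
    return HEADER_SIZE + index * DS_SIZE

def datasources_within(limit=PAGE_SIZE):
    """How many whole datasource blocks fit in the first `limit` bytes."""
    return (limit - HEADER_SIZE) // DS_SIZE

if __name__ == "__main__":
    print([ds_name_offset(i) for i in range(3)])  # [64, 192, 320]
    print(datasources_within())                   # 31
```

Under these assumptions, 31 whole datasource blocks fit in the first 4096 bytes,
and the 32nd block is the first one to cross that boundary.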
> 
> 
> RrdDb does a check on the lastUpdateTime (in the header) to make sure our latest \
> update isn’t older than it.
> 
> 
> We then proceed to walk through the list of datasources in the file and update each \
> one with our updates. So for each datasource we grab the step and lastUpdateTime \
> from the header, then proceed to read and update the values in the datasource \
> block. Then we proceed to walk through all the archives for the datasource and \
> update the archive header and the archive state & rrd data for the current \
> datasource. 
> 
> 
> 
> 
> I guess what I take from this is that tuning OpenNMS for its RRD usage isn’t easy. \
> NIO & MNIO might not be the best methods if your rrd data size is larger than the \
> memory in your system, as they cause the OS to greatly inflate the amount of IO \
> necessary for just a “simple update”.
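
One way to picture that inflation: with a memory-mapped file, touching even a few
bytes brings in the whole page that contains them, if it is not already cached.
The sketch below only counts pages; the 4096-byte page size and the sample offsets
are assumptions, and the function names are made up for illustration. It does not
model readahead, which could add further I/O.

```python
PAGE_SIZE = 4096  # assumption: a common page size; the real value is platform dependent

def pages_touched(accesses, page_size=PAGE_SIZE):
    """Return the set of page numbers covered by (offset, length) accesses."""
    pages = set()
    for offset, length in accesses:
        first = offset // page_size
        last = (offset + length - 1) // page_size
        pages.update(range(first, last + 1))
    return pages

def amplification(accesses, page_size=PAGE_SIZE):
    """Bytes paged in (cold cache) divided by bytes actually requested."""
    requested = sum(length for _, length in accesses)
    return len(pages_touched(accesses, page_size)) * page_size / requested

if __name__ == "__main__":
    # Hypothetical update: one 64-byte header read plus three scattered 8-byte values.
    accesses = [(0, 64), (100000, 8), (200000, 8), (300000, 8)]
    print(sorted(pages_touched(accesses)))  # [0, 24, 48, 73]
    print(round(amplification(accesses)))   # 186
```

In this made-up case 88 requested bytes pull in four pages, or 16384 bytes, when
nothing is cached.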
> 
> 
> Data structure layout inside the jrobin rrd files could also benefit from some more \
> analysis. Because of how we try to read specific areas from the file, if we could \
> restructure things we might be able to make it so we’re not having to read \
> from random spots around the file. Based on how JRobin does its object creation & \
> byte allocation, I think it would take a fair bit of work to be able to \
> restructure where the bits get laid out in the file. As an example of how this \
> isn’t optimal, to update the datapoints for the current step using the default config \
> would require updating 4 different 512-byte blocks because the datapoints are laid \
> out sequentially for each archive. If we just had to update one 512-byte block, I’d \
> expect to see a noticeable improvement in writing out the queued updates.
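
The 512-byte block example above can be worked through numerically. This sketch
assumes a single datasource, 8 bytes per data point, archives stored back to back
with their headers ignored, and an arbitrary current row of 100 in each archive.
The function names, and the interleaved layout used for comparison, are
hypothetical and not something JRobin provides.

```python
BLOCK_SIZE = 512
VALUE_SIZE = 8                    # assumption: one 8-byte value per data point
ROWS = [8928, 8784, 8784, 8784]   # archive sizes from the config described above

def archive_starts(rows, base=0):
    """Start offset of each archive's values when archives are stored back to back."""
    starts = []
    offset = base
    for r in rows:
        starts.append(offset)
        offset += r * VALUE_SIZE
    return starts

def blocks_for_update(rows, current_rows, base=0):
    """Block numbers written when each archive's current row gets one new value."""
    starts = archive_starts(rows, base)
    return [(start + row * VALUE_SIZE) // BLOCK_SIZE
            for start, row in zip(starts, current_rows)]

def blocks_if_interleaved(archive_count, base=0):
    """Hypothetical layout: the current values of all archives stored side by side."""
    first = base // BLOCK_SIZE
    last = (base + archive_count * VALUE_SIZE - 1) // BLOCK_SIZE
    return list(range(first, last + 1))

if __name__ == "__main__":
    print(blocks_for_update(ROWS, [100, 100, 100, 100]))  # [1, 141, 278, 415]
    print(blocks_if_interleaved(4))                       # [0]
```

With the sequential layout the four writes fall in four different blocks; with
the side-by-side layout the same 32 bytes would fit in one.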
> 
> 
> Ron
> _______________________________________________
> Please read the OpenNMS Mailing List FAQ:
> http://www.opennms.org/index.php/Mailing_List_FAQ
> 
> opennms-devel mailing list
> 
> To *unsubscribe* or change your subscription options, see the bottom of this page:
> https://lists.sourceforge.net/lists/listinfo/opennms-devel

