List:       netsaint-devel
Subject:    [netsaint-devel] Re: [netsaint] re: high service latencies  scratch that!!
From:       Marcus Hildenbrand <Marcus.Hildenbrand () sap ! com>
Date:       2002-01-21 12:59:30

After running netsaint the whole weekend without the free_memory() call,
there seems to be no memory leak.

The kernel should free all memory of a program when it exits.

I think large installations could also benefit from using a signal
handler instead of 2 forks per check. In my configuration the CPU
spends more time in system mode than in user mode.

Ethan Galstad wrote:
> 
> Glenn, you are correct.  My reason for calling free_memory() was to
> save memory since there is no reason the child process needs it.  If
> memory wasn't freed until after the check was actually run, the child
> proc could be eating up a lot of unnecessary memory.  Plus, there are
> really two fork()s per check, so the kernel has less data to copy
> when the grandchild forks if memory is freed beforehand.
> 
> On smaller systems this is more important than on those with 4GB of
> RAM, so it might make sense to comment this line out in some cases.
> However, I'm not really sure if the kernel frees memory when a
> program exits.  Ideally it would, but it seems better to stay on the
> safe side of things and clean up gracefully rather than leave it up
> to the OS - particularly if this is not a standard feature found in
> all *NIX kernels.  Still, if you don't see any signs of a memory leak
> when running with this line commented out, I'd say go for it.
> 
> Obviously this points to a need for future tweaks related to large
> installations.  Linked lists are easy to work with and fine for
> smaller lists, but definitely break down performance-wise as the size
> of the list increases.  New data structures are probably a good idea,
> as well as a memory allocation scheme that malloc()s enough mem for
> say 100 contiguous items at a time.  That way the kernel doesn't have
> as many blocks to free to clean things up.  At any rate, I'll keep it
> in mind for the future.
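
A rough sketch of the "malloc() enough for say 100 contiguous items at
a time" idea from the paragraph above. The item, chunk and pool types
and the pool_alloc() and pool_free_all() functions are hypothetical and
do not come from the NetSaint source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ITEMS_PER_CHUNK 100    /* the "say 100" figure from the message */

/* Hypothetical stand-in for one list entry, e.g. a service definition. */
typedef struct item {
    int id;
    struct item *next;
} item;

/* One malloc()ed block holding ITEMS_PER_CHUNK contiguous items. */
typedef struct chunk {
    struct chunk *next;
    size_t used;
    item items[ITEMS_PER_CHUNK];
} chunk;

typedef struct pool {
    chunk *head;
    size_t chunk_count;
} pool;

/* Hand out the next free item, allocating a new chunk only when needed. */
item *pool_alloc(pool *p)
{
    assert(p != NULL);
    if (p->head == NULL || p->head->used == ITEMS_PER_CHUNK) {
        chunk *c = malloc(sizeof(*c));
        if (c == NULL)
            return NULL;
        c->next = p->head;
        c->used = 0;
        p->head = c;
        p->chunk_count++;
    }
    return &p->head->items[p->head->used++];
}

/* Free everything: one free() per chunk rather than one per item.
   Returns the number of chunks released. */
size_t pool_free_all(pool *p)
{
    size_t freed = 0;

    while (p->head != NULL) {
        chunk *next = p->head->next;
        free(p->head);
        p->head = next;
        freed++;
    }
    p->chunk_count = 0;
    return freed;
}
```

For a configuration with 11000 items, cleanup under this scheme takes
110 free() calls instead of 11000.
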
> 
> From:                   "Glenn A. Thompson" <glenn@cdrguys.com>
> > Hey:
> >
> > Now I see what's going on.
> > Service checks are done through forking a new process.
> > In the child process, prior to running a new program in the process space, he is freeing
> > the structure.
> > I don't believe that it is required. Could be wrong, of course.
> > When the child dies after the service check completes, the memory should be freed by the
> > system.
> >
> > So in checks.c:
> >
> >         /* if we are in the child process... */
> >         else if(pid==0){
> >
> >                 /* free allocated memory */
> >                 free_memory();
> >
> > Try commenting out:
> > /* free_memory(); */
> > See if all works OK. It should be even faster if it still works.
> >
> > This means my comment from before about the list construction is totally moot.
> >
> > I think he is trying to conserve on memory usage.
> > If I remember right, the memory copied from the parent will stay around until the process
> > dies (unless replaced by a new program, and maybe not even then).
> > So if you had 100 service checks running at the same time, you would have some rather
> > large processes doing pings etc.
> > Since he forks twice it could be even worse.
> >
> > Ethan, if you are reading this feel free to flame my assessment:-)
> >
> > Glenn
> >
> > Nicholas Tang wrote:
> >
> > > I think this was meant to go to the list as well as me... :)
> > >
> > > Ethan, or anyone else, have any ideas?  I'm not a C programmer either.
> > > (I can read it some, but...)
> > >
> > > Nicholas
> > >
> > > -----Forwarded Message-----
> > >
> > > From: Marcus Hildenbrand <Marcus.Hildenbrand@sap.com>
> > > To: Nicholas Tang <ntang@mail.communityconnect.com>
> > > Subject: Re: [netsaint] Re: high service latencies monitoring large number of servers
> > > Date: 18 Jan 2002 15:53:27 +0100
> > >
> > > The environment (ulimits, server config (4x750MHz, 4GB RAM, HW
> > > RAID controller)) should be OK. I also tried using a RAM drive. The
> > > kernel version is 2.4.16, which should also be OK.
> > >
> > > Trying to find the bottleneck, I found that the function
> > > free_object_data in common/objects.c is worth looking at. This function
> > > is executed for every service check. In this function the while loop
> > > after the section "/* free memory for the service list */" iterates
> > > over every service in the whole configuration (in my config 11000
> > > times). The function is called from the main netsaint process and loops
> > > 11000 times for every service check, using a lot of CPU resources and
> > > blocking the start of other service checks.
> > >
> > > Unfortunately I'm not a C programmer, so I only commented this section
> > > out without knowing what the effect is on the rest of the world. But the
> > > effect on the number of service checks per second was great. Now it is
> > > possible to execute more than 40 checks per second with the same config.
> > > Does anybody know what the side effects are? Is someone able to tune
> > > this specific section?
> > >
> > > Marcus
> > >
> > > Nicholas Tang wrote:
> > > >
> > > > On Wed, 2002-01-16 at 13:49, Marcus Hildenbrand wrote:
> > > > > Hi Glenn,
> > > > >
> > > > > the only plugin for all checks (host and services) I'm using at the moment
> > > > > is check_dummy. I'm using Suse 7.2. I wonder if it's worth updating to Suse
> > > > > 7.3.
> > > >
> > > > Are the ulimits set appropriately?  It's possible the netsaint user
> > > > doesn't have enough filehandles or processes allocated to him to be able
> > > > to run everything.
> > > >
> > > > Also, play with your system's tools, check io usage, cpu usage, etc.,
> > > > try to find out what's taking it so long.  How much RAM does the system
> > > > have?  It probably couldn't hurt to put in some more - even if the procs
> > > > don't need it, the system could always use it as cache.
> > > >
> > > > > Next, you might consider nice'ing up the netsaint processes, it might
> > > > help.  I'd need to read up on it but I'm pretty sure there's a way to
> > > > make child procs inherit the priority of the parent, which might help as
> > > > well.
> > > >
> > > > Finally, consider tracing some of the processes and see if there's any
> > > > specific thing that's slowing it down.  Maybe there's some blocking
> > > > going on; for instance, maybe all of the procs are waiting for it to
> > > > write to the netsaint.log file and so are waiting there.  Consider
> > > > making a RAM drive and putting all of the logs (at least the
> > > > non-archived ones) on that.  If you don't have it set up to only write
> > > > to the logs periodically, set that.
> > > >
> > > > Nicholas
> > > _______________________________________________
> > > Netsaint-users mailing list
> > > Netsaint-users@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/netsaint-users
> >
> >
> 
> Ethan Galstad
> NetSaint Developer
> ---
> Email:   netsaint@netsaint.org
> Website: http://www.netsaint.org
> 

_______________________________________________
Netsaint-devel mailing list
Netsaint-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/netsaint-devel
