
List:       netsaint-devel
Subject:    [netsaint-devel] [Fwd: Re: [netsaint] Re: high service latencies monitoring large
From:       Nicholas Tang <ntang () mail ! communityconnect ! com>
Date:       2002-01-18 15:09:27

I think this was meant to go to the list as well as me... :)

Ethan, or anyone else, have any ideas?  I'm not a C programmer either.
(I can read it some, but...)

Nicholas


-----Forwarded Message-----

From: Marcus Hildenbrand <Marcus.Hildenbrand@sap.com>
To: Nicholas Tang <ntang@mail.communityconnect.com>
Subject: Re: [netsaint] Re: high service latencies monitoring large number of servers
Date: 18 Jan 2002 15:53:27 +0100

The environment (ulimits, server config (4x750MHz, 4GB RAM, HW
RAID controller)) should be ok. I also tried using a RAM drive. The
kernel version is 2.4.16, which should also be ok.

While looking for the bottleneck, I found that the function
free_object_data in common/objects.c is worth a closer look. This function
is executed for every service check. Within it, the while loop that follows
the comment "/* free memory for the service list */" iterates over every
service in the whole configuration (11000 in my config). The function is
called from the main netsaint process, so it loops 11000 times for every
service check, using a lot of CPU resources and blocking the start of
other service checks.
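
To illustrate the cost pattern described above, here is a minimal C sketch.
It is not the NetSaint source: the `service` struct and the functions
`build_service_list`, `free_service_list` and `total_visits` are
hypothetical names made up for this example. It only shows that walking
and freeing a linked list of N entries costs N node visits, so doing that
once per check costs N visits per check.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a service entry; NOT NetSaint's real struct. */
typedef struct service {
    struct service *next;
} service;

/* Walk the list, free every entry, and return how many nodes were visited. */
size_t free_service_list(service *head) {
    size_t visited = 0;
    while (head != NULL) {
        service *next = head->next;  /* save the link before freeing the node */
        free(head);
        head = next;
        visited++;
    }
    return visited;
}

/* Build a singly linked list with `count` entries.
 * Returns NULL if count is 0 or if an allocation fails. */
service *build_service_list(size_t count) {
    service *head = NULL;
    for (size_t i = 0; i < count; i++) {
        service *s = malloc(sizeof *s);
        if (s == NULL) {
            free_service_list(head);  /* release what was built so far */
            return NULL;
        }
        s->next = head;
        head = s;
    }
    return head;
}

/* Node visits in total if the whole list is walked once per check. */
unsigned long long total_visits(unsigned long long services,
                                unsigned long long checks) {
    return services * checks;
}
```

With 11000 services, `free_service_list(build_service_list(11000))` visits
11000 nodes. If such a walk really happens once per check, one round of
checking every service once comes to `total_visits(11000, 11000)`, that is
121000000 node visits.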

Unfortunately I'm not a C programmer, so I simply commented this section
out without knowing what effect that has on the rest of the program. The
effect on the number of service checks per second was great, though: it is
now possible to execute more than 40 checks per second with the same config.
Does anybody know what the side effects are? Is someone able to tune
this specific section?

Marcus

Nicholas Tang wrote:
> 
> On Wed, 2002-01-16 at 13:49, Marcus Hildenbrand wrote:
> > Hi Glenn,
> >
> > the only plugin for all checks (host and services) I'm using at the moment
> > is check_dummy. I'm using Suse 7.2. I wonder if it's worth updating to Suse
> > 7.3.
> 
> Are the ulimits set appropriately?  It's possible the netsaint user
> doesn't have enough filehandles or processes allocated to him to be able
> to run everything.
> 
> Also, play with your system's tools, check io usage, cpu usage, etc.,
> try to find out what's taking it so long.  How much RAM does the system
> have?  It probably couldn't hurt to put in some more - even if the procs
> don't need it, the system could always use it as cache.
> 
> Next, you might consider nice'ing up the netsaint processes, it might
> help.  I'd need to read up on it but I'm pretty sure there's a way to
> make child procs inherit the priority of the parent, which might help as
> well.
> 
> Finally, consider tracing some of the processes and see if there's any
> specific thing that's slowing it down.  Maybe there's some blocking
> going on; for instance, maybe all of the procs are waiting for it to
> write to the netsaint.log file and so are waiting there.  Consider
> making a RAM drive and putting all of the logs (at least the
> non-archived ones) on that.  If you don't have it set up to only write
> to the logs periodically, set that.
> 
> Nicholas

-- 
SAP AG                         E-Mail: Marcus.Hildenbrand@sap.com
IT-PSS 01 / Server Management  Phone:  [49] (6227) 7-46953
Raiffeisenring 45              Fax:    [49] (6227) 78-22099
D-68789 St. Leon Rot           URL:    http://www.sap.com




_______________________________________________
Netsaint-devel mailing list
Netsaint-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/netsaint-devel