
List:       ntop
Subject:    [Ntop] RE: Ntop spawns too many child processes and die eventually
From:       "Clive Luk" <clive () ilanet ! net ! au>
Date:       2005-09-22 23:21:43
Message-ID: 086201c5bfcc$67bf48b0$cd0010ac () ad ! sl ! nsw ! gov ! au

Sorry Burton,

I am using version 3.1.

Here is the info.

========= vmstat ============

bash-2.05# vmstat 
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m1 m2 m3 m4   in   sy   cs us sy id
 0 0 0 1172392 323536 7  40  0  1  1  0  0  2  0  2  0  557  344  128  3  1 96

========= vmstat ============

========= Ntop.conf ============

--user ntop

--db-file-path /usr/local/share/ntop

--interface hme0,qfe0

-M

--trace-level 7

--use-syslog=local3

--http-server 3000

--disable-schedyield

========= Ntop.conf ============

I may go and try 3.2rc1. The qfe0 interface is monitoring the
incoming/outgoing traffic on the router. Any recommendations on limiting
the traffic that ntop has to process?

Thanks,
Clive

 You neglected to tell us which version of ntop this is. Please try
3.2rc1.

Read the back traffic and docs/FAQ discussions on memory - it's quite
possible you are trying to monitor too many hosts and are legitimately
running out of memory. In the info.html / textinfo.html page are some
rough estimates of the per-host memory usage for your configuration -
you should be able to tell from that.

Tools like vmstat will tell you if you are swapping. Swapping is very,
very bad.
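
As a rough illustration of that check, the Python sketch below pulls the
memory-related columns out of one Solaris-style vmstat data line. The
helper name memory_pressure and the column labels d1..d4 and sy_calls are
made up for this example; the sample line is the vmstat output posted at
the top of this message. On Solaris a sustained non-zero "sr" (page scan
rate) is the usual sign of memory pressure. Note that a bare "vmstat" with
no interval reports averages since boot, so something like "vmstat 5"
gives a better picture of what is happening right now.

```python
# Column order as printed by Solaris vmstat. The four disk columns are
# named after the local disks (m1..m4 in the post), so generic labels are
# used here, and the syscall column is renamed to avoid clashing with the
# CPU "sy" column.
COLUMNS = ["r", "b", "w", "swap", "free", "re", "mf", "pi", "po", "fr",
           "de", "sr", "d1", "d2", "d3", "d4", "in", "sy_calls", "cs",
           "us", "sy", "id"]


def memory_pressure(vmstat_line):
    """Return the memory-related fields from one vmstat data line."""
    values = [int(v) for v in vmstat_line.split()]
    row = dict(zip(COLUMNS, values))
    return {
        "swap_kb": row["swap"],        # available swap, in Kbytes
        "free_kb": row["free"],        # free list size, in Kbytes
        "page_outs": row["po"],        # Kbytes paged out per second
        "scan_rate": row["sr"],        # pages scanned per second
        "under_pressure": row["sr"] > 0,
    }


sample = "0 0 0 1172392 323536 7 40 0 1 1 0 0 2 0 2 0 557 344 128 3 1 96"
print(memory_pressure(sample))
```

For the posted sample the scan rate is 0, which is what you would expect
from a since-boot average on a box that has mostly been idle.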

It's also possible that ntop is creating zombies - I haven't done any
testing under Solaris for quite some time.

-----Burton 

-----Original Message-----
From: Clive Luk [mailto:clive@ilanet.net.au] 
Sent: Thursday, 22 September 2005 12:58 PM
To: ntop@unipi.it
Subject: Ntop spawns too many child processes and die eventually


Hi all ntop gurus,

I have a big problem with my ntop. I like ntop very much: it collects all
the data I need to present to my boss in a meaningful report format.

First, let me tell you what system ntop is running on:

#####################
System Configuration:  Sun Microsystems  sun4u Netra t1 (UltraSPARC-IIi 440MHz)
System clock frequency: 110 MHz
Memory size: 512 Megabytes

========================= CPUs =========================

                    Run   Ecache   CPU    CPU
Brd  CPU   Module   MHz     MB    Impl.   Mask
---  ---  -------  -----  ------  ------  ----
 0     0     0      440     2.0   12       9.1

SunOS monitor 5.9 Generic_118558-10 sun4u sparc SUNW,UltraSPARC-IIi-cEngine
#####################

I compiled ntop with no problems. Here is what I installed:

-rw-r--r--   1 root     other    7182336 Aug 30  2004 gawk-3.1.4-sol9-sparc-local
drwxrwxrwx   6 200      300         4096 Sep  8 09:51 gd-2.0.33
-rw-r--r--   1 root     other    2519040 Sep  8 09:33 gd-2.0.33.tar
-rw-r--r--   1 root     other    1317376 May  4  2003 gdbm-1.8.3-sol9-sparc-local
-rw-r--r--   1 clive    other    1119232 Aug 31 19:15 libpcap-0.9.3-sol9-sparc-local
-rw-r--r--   1 root     other    1184768 Dec 13  2004 libpng-1.2.8-sol9-sparc-local
-rw-r--r--   1 root     other    2396672 Oct  5  2002 make-3.80-sol9-sparc-local
drwxr-xr-x  15 root     other       4096 Sep  8 10:43 ntop
-rw-r--r--   1 clive    other    9809920 Sep  7 17:48 ntop-3.1.tar

I have a quad-port NIC in the system as well. I have mirrored the router
port on a switch to one of the ports on the quad card so that ntop can
collect data.

The problem occurs when I load the default page of ntop. It first loads
the "Traffic Summary", which has a few charts. The charts never all load
successfully on the first few attempts; I need to refresh the browser
manually a few times before all of them load completely. However, when I
run "ps -ef | grep ntop", I see that the parent ntop process has spawned
many child processes. (I assume those child processes were spawned to
generate the charts/images while I was refreshing the browser.) If I
don't refresh the browser until it spawns enough processes, the browser
just hangs, waiting for the charts to load.

    ntop 29629     1  0 15:18:52 pts/2   204:36 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27032 29629  0 09:10:53 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 26999 29629  0 09:10:03 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 26998 29629  0 09:10:03 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27027 29629  0 09:10:44 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27025 29629  0 09:10:44 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 26996 29629  0 09:10:02 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27033 29629  0 09:10:53 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27026 29629  0 09:10:44 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27024 29629  0 09:10:44 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27031 29629  0 09:10:53 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
    ntop 27000 29629  0 09:10:03 ?        0:00 /usr/local/bin/ntop @/etc/ntop.conf
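
To keep an eye on how many children pile up under the parent, a listing
like the one above can be tallied by parent PID. The Python sketch below
is only an illustration: the function name children_per_parent is made
up, and it assumes the "ps -ef" column order UID PID PPID ... It skips
any line whose second and third fields are not numbers, so header lines
and wrapped continuation lines are ignored. Exited children that the
parent has not yet waited for would normally show up as <defunct> in the
command column, which is not the case in this listing.

```python
from collections import Counter


def children_per_parent(ps_lines):
    """Count processes per parent PID from `ps -ef` style lines."""
    counts = Counter()
    for line in ps_lines:
        fields = line.split()
        # Expect: UID PID PPID ...; skip headers and wrapped fragments.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        counts[int(fields[2])] += 1
    return counts


sample = [
    "     UID   PID  PPID  C    STIME TTY      TIME CMD",
    "    ntop 29629     1  0 15:18:52 pts/2   204:36 /usr/local/bin/ntop",
    "@/etc/ntop.conf",
    "    ntop 27032 29629  0 09:10:53 ?        0:00 /usr/local/bin/ntop",
    "@/etc/ntop.conf",
    "    ntop 26999 29629  0 09:10:03 ?        0:00 /usr/local/bin/ntop",
    "@/etc/ntop.conf",
]
counts = children_per_parent(sample)
print(counts[29629])  # number of children of PID 29629 in the sample
```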


Here is the log. Please ignore the "local3.error" tag ([ID 702911
local3.error]); it is just part of the log format, not a real error. The
key messages are:

**ERROR** An error occurred while forking ntop [errno=12]..
**FATAL_ERROR** malloc(10560) @ pbuf.c:122 returned NULL [no more memory?]

I have a browser open on the first page all the time, reloading the page
every few minutes. It had been running for probably 5-6 hours.

**ERROR** An error occurred while forking ntop [errno=12]..

This message just keeps coming up. If anyone can tell me what the error
means, I would greatly appreciate it!
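
For reference, errno 12 is ENOMEM on Solaris (and on Linux). When fork()
fails with it, the system could not find the memory or swap space needed
for the new process. A quick way to look up an errno value is sketched
below in Python; the helper name describe_errno is made up for this
example, and the message text returned by os.strerror() differs between
operating systems.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, platform message) for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = describe_errno(12)
print(name, "-", message)
```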

Then, after 5-6 hours of collecting data, ntop just dies with

**FATAL_ERROR** malloc(10560) @ pbuf.c:122 returned NULL [no more memory?]

After this error message comes up, the parent process is terminated, but
all the child processes remain on the system.


######################
.
.
.
Sep 21 15:08:10 monitor last message repeated 7 times
Sep 21 15:08:33 monitor ntop[18258]: [ID 702911 local3.error] [MSGID0825709] [hash:708] IDLE_PURGE: Device 0 [hme0] FINISHED selection, 7 [out of 402] hosts selected
Sep 21 15:08:33 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8477291] [hash:733] IDLE_PURGE: Device 0 [hme0]: 7/401 hosts deleted, elapsed time is 0.682183 seconds (0.097455 per host)
Sep 21 15:08:34 monitor ntop[18258]: [ID 702911 local3.error] [MSGID0825709] [hash:708] IDLE_PURGE: Device 1 [qfe0] FINISHED selection, 387 [out of 3637] hosts selected
Sep 21 15:08:34 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8477291] [hash:733] IDLE_PURGE: Device 1 [qfe0]: 387/3636 hosts deleted, elapsed time is 0.842609 seconds (0.002177 per host)
Sep 21 15:10:12 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8644672] [http:2546] **ERROR** An error occurred while forking ntop [errno=12]..
Sep 21 15:10:13 monitor last message repeated 7 times
Sep 21 15:10:35 monitor ntop[18258]: [ID 702911 local3.error] [MSGID0825709] [hash:708] IDLE_PURGE: Device 0 [hme0] FINISHED selection, 10 [out of 398] hosts selected
Sep 21 15:10:37 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8477291] [hash:733] IDLE_PURGE: Device 0 [hme0]: 10/397 hosts deleted, elapsed time is 1.412442 seconds (0.141244 per host)
Sep 21 15:10:38 monitor ntop[18258]: [ID 702911 local3.error] [MSGID0825709] [hash:708] IDLE_PURGE: Device 1 [qfe0] FINISHED selection, 427 [out of 4073] hosts selected
Sep 21 15:10:40 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8477291] [hash:733] IDLE_PURGE: Device 1 [qfe0]: 427/4062 hosts deleted, elapsed time is 3.242642 seconds (0.007594 per host)
Sep 21 15:10:59 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8757584] [ntop:699] OSFP: scanFingerprintLoop() checked 774, resolved 774
Sep 21 15:11:34 monitor ntop[18258]: [ID 702911 local3.error] [MSGID9233555] [vendor:355] MAC prefix '00:13:21' not found in vendor database
Sep 21 15:12:29 monitor ntop[18258]: [ID 702911 local3.error] [MSGID8483603] [leaks:512] **FATAL_ERROR** malloc(10560) @ pbuf.c:122 returned NULL [no more memory?]
Sep 21 15:12:29 monitor ntop[18258]: [ID 702911 local3.error] [MSGID9061761] [leaks:516] **WARNING** ntop packet capture STOPPED
Sep 21 15:12:29 monitor ntop[18258]: [ID 702911 local3.error] [MSGID0816631] [leaks:517] NOTE: ntop web server remains up
Sep 21 15:12:29 monitor ntop[18258]: [ID 702911 local3.error] [MSGID0816631] [leaks:518] NOTE: Shutdown gracefully and restart with more memory
#####################################
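
The IDLE_PURGE entries in this log give the number of hosts ntop is
tracking on each interface: on qfe0 it went from 3637 at 15:08:34 to 4073
at 15:10:38. A small Python sketch for pulling those counts out of a
syslog excerpt follows. The function name host_counts and the regular
expression are made up for this example and are based only on the message
format seen in this log. Whitespace is collapsed first so that entries
wrapped across lines by a mail client still match.

```python
import re

# Matches e.g. "IDLE_PURGE: Device 1 [qfe0] FINISHED selection,
# 427 [out of 4073] hosts selected" (format taken from the log above).
PURGE_RE = re.compile(
    r"IDLE_PURGE: Device \d+ \[(?P<dev>\w+)\] FINISHED selection, "
    r"(?P<selected>\d+) \[out of (?P<total>\d+)\] hosts selected"
)


def host_counts(log_text):
    """Return (device, hosts selected for purge, hosts tracked) tuples."""
    flat = " ".join(log_text.split())  # undo line wrapping
    return [
        (m.group("dev"), int(m.group("selected")), int(m.group("total")))
        for m in PURGE_RE.finditer(flat)
    ]


sample = """
[hash:708] IDLE_PURGE: Device 1 [qfe0] FINISHED selection, 387 [out of 3637] hosts
selected
[hash:708] IDLE_PURGE: Device 1 [qfe0] FINISHED selection, 427 [out of
4073] hosts selected
"""
for dev, selected, total in host_counts(sample):
    print(dev, selected, total)
```

Multiplying the tracked-host count by the per-host estimate shown on the
info.html page should give a rough idea of whether 512 MB is enough.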

My questions are:

1. Is the system I am running ntop on powerful enough? Does it need more
   CPU power? More memory?
2. Is there any chance I have a compilation problem?
3. Is the libpng library I used incorrect? (I installed libpng as a Sun
   package. But wouldn't ntop complain if a library were missing when I
   compiled it?)


Thanks in advance!
Hope someone can save me here!

Cheers,
Clive





_______________________________________________
Ntop mailing list
Ntop@unipi.it
http://listgateway.unipi.it/mailman/listinfo/ntop