
List:       ntop
Subject:    [Ntop] Hang in freeing in termIPServices() (was: ssl not working)
From:       "Burton M. Strauss III" <Burton () ntopsupport ! com>
Date:       2002-05-23 17:41:01
Message-ID: JIEPJGFPFMFIGBNCPKGGKECCCIAA.Burton () ntopsupport ! com

Luca - this one is too weird for me, help!  (I'm not talking about the ssl
issue, but rather the hang in termIPServices...)
-----Burton
============================================================================

That's very interesting ... and I see much the same - I'm researching if it
has to do with openssl's handling of alias interfaces.

I have actually seen the same hang from another type of termination.  The
key thing is that after the ctrl-c, some of ntop's threads actually do
stop... you can attach gdb to the running process and see this:

(gdb) info thread
  6 Thread 4101 (LWP 5739)  0x405da5a1 in __libc_nanosleep () from /lib/i686/libc.so.6
  5 Thread 3076 (LWP 5738)  0x4054dba5 in __sigsuspend (set=0x420a797c)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
  4 Thread 2051 (LWP 5737)  0x405a057e in chunk_free (ar_ptr=0x406542a0, p=0x8223970)
    at malloc.c:3252
  3 Thread 1026 (LWP 5736)  0x4054dba5 in __sigsuspend (set=0x410a767c)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
  2 Thread 2049 (LWP 5735)  0x406053e7 in __poll (fds=0x82b08a4, nfds=1, timeout=2000)
    at ../sysdeps/unix/sysv/linux/poll.c:63
  1 Thread 1024 (LWP 5734)  0x405da5a1 in __libc_nanosleep () from /lib/i686/libc.so.6
(gdb)

The web server and packet sniffers are down...

23/May/2002 12:19:33 Started thread (1026) for network packet analyser.
23/May/2002 12:19:33 Started thread (2051) for idle hosts detection.
23/May/2002 12:19:33 Started thread (3076) for DNS address resolution.
23/May/2002 12:19:33 Started thread (4101) for address purge.
23/May/2002 12:19:33 Initializing plugins (if any)...
23/May/2002 12:19:33 NetFlow export disabled
23/May/2002 12:19:33 Waiting for HTTP connections on 192.168.0.34 port 3000...
23/May/2002 12:19:33 Waiting for HTTPS (SSL) connections on port 3001...
23/May/2002 12:19:33 Started thread (5126) for web server.
23/May/2002 12:19:33 Sniffying...
23/May/2002 12:19:33 Started thread (6151) for network packet sniffing on eth0.
23/May/2002 12:19:33 Error while reading packets: recvfrom: Network is down.
23/May/2002 12:19:33 Started thread (7176) for network packet sniffing on eth1.

It looks like it's hung up freeing something in the idle hosts thread...

(gdb) thread 4
[Switching to thread 4 (Thread 2051 (LWP 5737))]#0  0x405a057e in chunk_free (ar_ptr=0x406542a0, p=0x8223970)
    at malloc.c:3252
3252    malloc.c: No such file or directory.
        in malloc.c
(gdb) info stack
#0  0x405a057e in chunk_free (ar_ptr=0x406542a0, p=0x8223970) at malloc.c:3252
#1  0x405a03e4 in __libc_free (mem=0x8223fb8) at malloc.c:3154
#2  0x402a02ad in ntop_safefree (ptr=0x8222b58, file=0x402bb66c "term.c", line=40) at leaks.c:465
#3  0x402b0b69 in termIPServices () at term.c:40
#4  0x402a3a8c in cleanup (signo=2) at ntop.c:870
#5  0x404fdac5 in pthread_sighandler (signo=2, ctx=
      {gs = 31, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43,
__dsh = 0, edi = 65536, esi = 1099593956, ebp = 1099594388, esp =
1099593928, ebx = 1099593956, edx = 1080384980, ecx = 1099593956, eax =
4294967292, trapno = 0, err = 0, eip = 1079879073, cs = 35, __csh = 0,
eflags = 518, esp_at_signal = 1099593928, ss = 43, __ssh = 0, fpstate =
0x418a7648, oldmask = 2147549184, cr2 = 0})
    at signals.c:97
#6  <signal handler called>
#7  0x405da5a1 in __libc_nanosleep () from /lib/i686/libc.so.6
#8  0x405da4db in __sleep (seconds=60) at ../sysdeps/unix/sysv/linux/sleep.c:70
#9  0x402b7521 in ntop_sleep (secs=60) at util.c:3203
#10 0x402a3648 in scanIdleLoop (notUsed=0x0) at ntop.c:624
#11 0x404fac6f in pthread_start_thread (arg=0x418a7be0) at manager.c:284
(gdb)
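
Frames #2 through #5 above show free() being reached from inside the
signal handler (pthread_sighandler -> cleanup -> termIPServices). POSIX
does not list free() among the async-signal-safe functions, so this is one
thing worth ruling out; nothing in this thread establishes that it is the
actual cause of the hang. A minimal sketch of the usual alternative, where
the handler only sets a flag and a normal thread does the cleanup later.
All names here (shutdown_requested, on_terminate, install_terminate_handler,
should_shut_down) are hypothetical and are not ntop identifiers:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shutdown flag; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t shutdown_requested = 0;

/* The handler only records the request. It does not free or log anything. */
static void on_terminate(int signo) {
  (void)signo;
  shutdown_requested = 1;
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success. */
static int install_terminate_handler(void) {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_terminate;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGINT, &sa, NULL) != 0) return -1;
  if (sigaction(SIGTERM, &sa, NULL) != 0) return -1;
  return 0;
}

/* Polled from ordinary thread context, e.g. after each sleep in a loop;
   the caller then runs the cleanup outside the handler. */
static int should_shut_down(void) {
  return shutdown_requested != 0;
}
```

A loop such as the idle-host scan would check should_shut_down() after each
sleep returns and call the cleanup routine itself, so that free() never runs
in signal context.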

What's interesting is I've seen this before in termIPServices, where the
structure didn't have a name but did have a port...  I put a patch in
cvs to skip the free of the name if it's NULL (it's line #37, below), but
it doesn't always seem to work.

(gdb) frame 3
#3  0x402b0b69 in termIPServices () at term.c:40
40            free(myGlobals.tcpSvc[i]);
(gdb) list
35
36          if(myGlobals.tcpSvc[i] != NULL) {
37            if (myGlobals.tcpSvc[i]->name != NULL) {
38                free(myGlobals.tcpSvc[i]->name);
39            }
40            free(myGlobals.tcpSvc[i]);
41          }
42        }
43
44        free(myGlobals.udpSvc);
(gdb)
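
If the cleanup path can run more than once (for instance, if more than one
thread handles the signal), the loop above would free the same pointers
twice, since the slots are left pointing at freed memory. That is only a
possibility; it has not been confirmed here. A sketch of the same loop made
safe to repeat by clearing each slot after freeing it. ServiceEntry,
make_entry and free_service_table are hypothetical stand-ins, not the real
ntop types or functions:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a service table entry (a port and a name). */
typedef struct {
  int port;
  char *name;
} ServiceEntry;

/* Build one heap-allocated entry; name may be NULL. Returns NULL on failure. */
static ServiceEntry *make_entry(int port, const char *name) {
  ServiceEntry *e = malloc(sizeof(*e));
  if (e == NULL) return NULL;
  e->port = port;
  e->name = NULL;
  if (name != NULL) {
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) { free(e); return NULL; }
    strcpy(e->name, name);
  }
  return e;
}

/* Free every entry and clear its slot, so a second call finds only NULLs.
   Returns the number of entries freed by this call. */
static int free_service_table(ServiceEntry **table, int size) {
  int i, freed = 0;
  assert(table != NULL);
  for (i = 0; i < size; i++) {
    if (table[i] != NULL) {
      free(table[i]->name);   /* free(NULL) is a no-op */
      table[i]->name = NULL;
      free(table[i]);
      table[i] = NULL;        /* a repeated pass now skips this slot */
      freed++;
    }
  }
  return freed;
}
```

Clearing the slots does not by itself make concurrent calls from two threads
safe; it only protects against the same code running a second time after the
first pass has finished.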

For me, it always hangs on the same record, #70, regardless of whether I
comment it out in /etc/services.  If I add an if test to skip freeing
that ONE record, it runs to completion!!!
Seriously, make the code in term.c look like this:

35
            if (i != 70) {
36          if(myGlobals.tcpSvc[i] != NULL) {
37            if (myGlobals.tcpSvc[i]->name != NULL) {
38                free(myGlobals.tcpSvc[i]->name);
39            }
40            free(myGlobals.tcpSvc[i]);
41          }
            }
42        }
43
44        free(myGlobals.udpSvc);

(At least that's on MY system - you might have to put a
traceEvent(TRACE_INFO, "processing %d\n", i); in to see what entry it's
hanging on...)

I know it sounds stupid, but please try it and let me know.  Meanwhile, I'll
look further at openssl and alias handling.

-----Burton


-----Original Message-----
From: ntop-admin@unipi.it [mailto:ntop-admin@unipi.it]On Behalf Of
Razvan Cosma
Sent: Thursday, May 23, 2002 7:26 AM
To: ntop@Unipi.IT
Subject: RE: [Ntop] ssl not working (solved?)


Turns out if I use -w 0 I have to specify any OTHER port than 3000 for
-W. Another problem seems to be in using an interface with multiple IPs
for the -i param.
I have
eth1   = 10.1.0.1
eth0   = 10.2.0.1
eth0:0 = 10.3.0.1
...
(netmask/16)
and if I use -i eth1, everything works fine: ssl binds to the correct
ip/port and I can shut it down with the web interface, kill,
^C, whatever. Now if I use -i eth0 or -i eth0:0, the https server binds
to 0.0.0.0 (yes, I know what it means:), and again it can't be shut
down. The logs:

Wait please: ntop is coming up...
23/May/2002 15:07:08 Initializing IP services...
23/May/2002 15:07:08 Initializing SSL...
23/May/2002 15:07:08 SSL initialized successfully
23/May/2002 15:07:08 Initializing GDBM...
23/May/2002 15:07:08 Initializing network devices...
23/May/2002 15:07:08 ntop v.2.0.99 MT (SSL) [i686-pc-linux-gnu]
(05/20/02 10:01:02 PM build)
23/May/2002 15:07:08 Listening on [eth0,eth0:0,eth0:1]
23/May/2002 15:07:08 Copyright 1998-2002 by Luca Deri <deri@ntop.org>
23/May/2002 15:07:08 Get the freshest ntop from http://www.ntop.org/
23/May/2002 15:07:08 Initializing...
23/May/2002 15:07:08 Loading plugins (if any)...
23/May/2002 15:07:08 Searching plugins in /usr/local/lib/ntop/plugins
23/May/2002 15:07:08 Welcome to icmpWatchPlugin. (C) 1999 by Luca Deri.
23/May/2002 15:07:08 Welcome to LastSeenWatchPlugin. (C) 1999 by Andrea
Marangoni.
23/May/2002 15:07:08 Welcome to NetFlow. (C) 2002 by Luca Deri.
23/May/2002 15:07:08 Welcome to nfsWatchPlugin. (C) 1999 by Luca Deri.
23/May/2002 15:07:08 Welcome to PDAPlugin. (C) 2001-2002 by L.Deri and
W.Brock
23/May/2002 15:07:08 Welcome to sFlowPlugin. (C) 2002 by Luca Deri.
23/May/2002 15:07:08 Resetting traffic statistics...

^^^^ This is another issue, how do I keep the traffic statistics
between sessions? I did specify -S 1 ...

23/May/2002 15:07:08 Started thread (1026) for network packet analyser.
23/May/2002 15:07:08 Started thread (2051) for idle hosts detection.
23/May/2002 15:07:08 Started thread (3076) for DB update.
23/May/2002 15:07:08 Started thread (4101) for DNS address resolution.
23/May/2002 15:07:08 Started thread (5126) for address purge.
23/May/2002 15:07:08 Initializing plugins (if any)...
23/May/2002 15:07:08 NetFlow export disabled
23/May/2002 15:07:08 Waiting for HTTPS (SSL) connections on port 4000...
23/May/2002 15:07:08 Started thread (6151) for web server.
23/May/2002 15:07:08 Sniffying...
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 Cleaning up...
23/May/2002 15:07:23 Waiting until threads terminate...
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 Terminating Web connections...
23/May/2002 15:07:23 ntop caught signal 2
23/May/2002 15:07:23 ntop caught signal 2
^^^ ^C...
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 ntop caught signal 15
23/May/2002 15:07:26 Freeing hash host instances... (3 device(s) to
save)
...still hanging, ps shows all ntop processes are still running, and
netstat:
tcp      0     0 0.0.0.0:4000         0.0.0.0:*             LISTEN

ntop.sh: line 1: 19911 Killed                  ntop -r 30 -a
/var/log/ntop/access.log -e 50 -i eth0 -m 10.0.0.0/8 -u ntop -v
user:pass:db:host -S 1 -W 10.1.0.1:4000 -w 0 -P /usr/local/share/ntop/
^^^ and kill -9 :)



_______________________________________________
Ntop mailing list
Ntop@unipi.it
http://listgateway.unipi.it/mailman/listinfo/ntop

