[prev in list] [next in list] [prev in thread] [next in thread] 

List:       ntop-dev
Subject:    [Ntop-dev] New showMutex.html report
From:       "Burton M. Strauss III" <Burton () ntopsupport ! com>
Date:       2003-10-31 17:13:48
Message-ID: JIEPJGFPFMFIGBNCPKGGOEHGEIAA.Burton () ntopsupport ! com
[Download RAW message or body]

(Yeah, this message is in html.  I don't usually do it, but for showing an
example of an hmtl page, it's the easiest way...)

For tracking down deadlocks, I've added this new page
http://127.0.0.1:3000/showMutex.html to the version in the ntop cvs.
(previously, the mutex report was only generated if the -K flag was set,
which has other consequences).

Note that just because something is 'blocked' for an instant doesn't imply a
deadlock.  That is COMPLETELY NORMAL behavior.

The normal output looks like this:

      Mutex Name State Last Lock Blocked Last UnLock # Locks/Releases Max
Lock
      gdbmMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
util.c:3386(13349)   util.c:3393(13349) 100 100 0 sec [util.c:3413]
      packetProcessMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
pbuf.c:1631(13350)   pbuf.c:1646(13350) 477 477 1 sec [pbuf.c:1646]
      packetQueueMutex unlocked at Fri 31 Oct 2003 10:53:37 AM CST
pbuf.c:1744(13344)   pbuf.c:1764(13344) 144 144 0 sec [pbuf.c:1689]
      purgeMutex locked at Fri 31 Oct 2003 10:53:41 AM CST
http.c:2584(13349)   :0(0) 1 0 0 sec [:0]
      hostsHashMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
pbuf.c:1998(13350)   pbuf.c:2703(13350) 477 477 1 sec [pbuf.c:2703]
      tcpSessionsMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
sessions.c:584(13350)   sessions.c:1975(13350) 30 30 0 sec [sessions.c:1975]
      purgePortsMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
pbuf.c:597(13350)   pbuf.c:618(13350) 57 57 0 sec [pbuf.c:618]
      securityItemsMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
http.c:2210(13349)   http.c:2223(13349) 2 2 0 sec [http.c:2205]


  The give-away of a deadlock would be a line like this:

      Mutex Name State Last Lock Blocked Last UnLock # Locks/Releases Max
Lock
      gdbmMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
util.c:3386(13349)   util.c:3393(13349) 100 100 0 sec [util.c:3413]
      packetProcessMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
pbuf.c:1631(13350)   pbuf.c:1646(13350) 477 477 1 sec [pbuf.c:1646]
      packetQueueMutex unlocked at Fri 31 Oct 2003 10:53:37 AM CST
pbuf.c:1744(13344)   pbuf.c:1764(13344) 144 144 0 sec [pbuf.c:1689]
      purgeMutex locked at Fri 31 Oct 2003 10:53:41 AM CST
http.c:2584(13349)   :0(0) 1 0 0 sec [:0]
      hostsHashMutex locked at Fri 31 Oct 2003 10:52:17 AM CST
hash.c:552(13350) pbuf.c:1998 (13355)  hash.c:601 (13350) 477 477 1 sec
[pbuf.c:2703]
      tcpSessionsMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
sessions.c:584(13350)   sessions.c:1975(13350) 30 30 0 sec [sessions.c:1975]
      purgePortsMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
pbuf.c:597(13350)   pbuf.c:618(13350) 57 57 0 sec [pbuf.c:618]
      securityItemsMutex unlocked at Fri 31 Oct 2003 10:53:41 AM CST
http.c:2210(13349)   http.c:2223(13349) 2 2 0 sec [http.c:2205]


See how the 'at' tag is OLD and there's something in the blocked line?  If
you refresh a few times over 30s or a minute and that doesn't change, you're
deadlocked.

The KEY to reporting these is to go into the source and pull up the actual
lines - plus some context - where the Last Lock, Last UnLock and Block show
up.  (Since the cvs changes often, line #s aren't very useful without the
corresponding code).

For example, the report of the above should be:

Deadlock (web server active), showMutex.html shows hostsHashMutex, locked at
hash.c:552:

  for(idx=0; idx<myGlobals.device[actDevice].actualHashSize; idx++) {
#ifdef CFG_MULTITHREADED
    accessMutex(&myGlobals.hostsHashMutex, "scanIdleLoop");
#endif

Last Unlock at hash.c:601:

      if(numHosts >= (maxHosts-1)) break;

Blocked at pbuf.c:1998:


incrementTrafficCounter(&myGlobals.device[actualDeviceId].rcvdPktStats.tooLo
ng, 1);
  }

#ifdef CFG_MULTITHREADED
  accessMutex(&myGlobals.hostsHashMutex, "processPacket");
#endif

#ifdef DEBUG
  traceEvent(CONST_TRACE_INFO, "actualDeviceId = %d", actualDeviceId);
#endif



-----Burton



[Attachment #3 (text/html)]

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1264" name=GENERATOR></HEAD>
<BODY>
<P>(Yeah, this message is in html.&nbsp; I don't usually do it, but for showing 
an example of an hmtl page, it's the easiest way...)</P>
<P><FONT size=2><FONT size=3>For tracking down deadlocks, I've added this new 
page <A 
href="http://127.0.0.1:3000/showMutex.html">http://127.0.0.1:3000/showMutex.html</A> 
to the version in the ntop cvs.&nbsp; (previously, the mutex report was only 
generated if the -K flag was set, which has other 
consequences).</FONT></FONT></P>
<P align=center><FONT color=#ff0000><STRONG>Note that just because something is 
'blocked' for an instant doesn't imply a deadlock.&nbsp; That is COMPLETELY 
NORMAL behavior</STRONG></FONT>.</P>
<P><FONT size=2><FONT size=3>The normal output looks like this:</FONT></P>
<P>
<TABLE border=1>
  <TBODY>
  <TR>
    <TH>Mutex Name</TH>
    <TH>State</TH>
    <TH>Last Lock</TH>
    <TH>Blocked</TH>
    <TH>Last UnLock</TH>
    <TH colSpan=2># Locks/Releases</TH>
    <TH>Max Lock</TH></TR>
  <TR>
    <TH align=left>gdbmMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST util.c:3386(13349)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>util.c:3393(13349)</TD>
    <TD align=right>100</TD>
    <TD align=right>100</TD>
    <TD align=right>0 sec [util.c:3413]</TD></TR>
  <TR>
    <TH align=left>packetProcessMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:1631(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:1646(13350)</TD>
    <TD align=right>477</TD>
    <TD align=right>477</TD>
    <TD align=right>1 sec [pbuf.c:1646]</TD></TR>
  <TR>
    <TH align=left>packetQueueMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:37 AM CST pbuf.c:1744(13344)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:1764(13344)</TD>
    <TD align=right>144</TD>
    <TD align=right>144</TD>
    <TD align=right>0 sec [pbuf.c:1689]</TD></TR>
  <TR>
    <TH align=left>purgeMutex</TH>
    <TD align=middle><FONT color=red>locked</FONT></TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST http.c:2584(13349)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>:0(0)</TD>
    <TD align=right>1</TD>
    <TD align=right>0</TD>
    <TD align=right>0 sec [:0]</TD></TR>
  <TR>
    <TH align=left>hostsHashMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:1998(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:2703(13350)</TD>
    <TD align=right>477</TD>
    <TD align=right>477</TD>
    <TD align=right>1 sec [pbuf.c:2703]</TD></TR>
  <TR>
    <TH align=left>tcpSessionsMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST 
sessions.c:584(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>sessions.c:1975(13350)</TD>
    <TD align=right>30</TD>
    <TD align=right>30</TD>
    <TD align=right>0 sec [sessions.c:1975]</TD></TR>
  <TR>
    <TH align=left>purgePortsMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:597(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:618(13350)</TD>
    <TD align=right>57</TD>
    <TD align=right>57</TD>
    <TD align=right>0 sec [pbuf.c:618]</TD></TR>
  <TR>
    <TH align=left>securityItemsMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST http.c:2210(13349)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>http.c:2223(13349)</TD>
    <TD align=right>2</TD>
    <TD align=right>2</TD>
    <TD align=right>0 sec [http.c:2205]</TD></TR></TBODY></TABLE></P><FONT 
face=Arial size=4>
<P><FONT size=2>&nbsp;</FONT><FONT face="Times New Roman" size=3> The give-away 
of a deadlock would be a line like this:</FONT></P>
<P><FONT size=2>
<TABLE border=1>
  <TBODY>
  <TR>
    <TH>Mutex Name</TH>
    <TH>State</TH>
    <TH>Last Lock</TH>
    <TH>Blocked</TH>
    <TH>Last UnLock</TH>
    <TH colSpan=2># Locks/Releases</TH>
    <TH>Max Lock</TH></TR>
  <TR>
    <TH align=left>gdbmMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST util.c:3386(13349)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>util.c:3393(13349)</TD>
    <TD align=right>100</TD>
    <TD align=right>100</TD>
    <TD align=right>0 sec [util.c:3413]</TD></TR>
  <TR>
    <TH align=left>packetProcessMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:1631(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:1646(13350)</TD>
    <TD align=right>477</TD>
    <TD align=right>477</TD>
    <TD align=right>1 sec [pbuf.c:1646]</TD></TR>
  <TR>
    <TH align=left>packetQueueMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:37 AM CST pbuf.c:1744(13344)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:1764(13344)</TD>
    <TD align=right>144</TD>
    <TD align=right>144</TD>
    <TD align=right>0 sec [pbuf.c:1689]</TD></TR>
  <TR>
    <TH align=left>purgeMutex</TH>
    <TD align=middle><FONT color=red>locked</FONT></TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST http.c:2584(13349)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>:0(0)</TD>
    <TD align=right>1</TD>
    <TD align=right>0</TD>
    <TD align=right>0 sec [:0]</TD></TR>
  <TR>
    <TH align=left>hostsHashMutex</TH>
    <TD align=middle><FONT color=#ff0000>locked</FONT></TD>
    <TD align=right>at Fri 31 Oct 2003 10:52:17 AM CST hash.c:552(13350)</TD>
    <TD align=right>pbuf.c:1998 (13355)&nbsp;</TD>
    <TD align=right>hash.c:601 (13350)</TD>
    <TD align=right>477</TD>
    <TD align=right>477</TD>
    <TD align=right>1 sec [pbuf.c:2703]</TD></TR>
  <TR>
    <TH align=left>tcpSessionsMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST 
sessions.c:584(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>sessions.c:1975(13350)</TD>
    <TD align=right>30</TD>
    <TD align=right>30</TD>
    <TD align=right>0 sec [sessions.c:1975]</TD></TR>
  <TR>
    <TH align=left>purgePortsMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:597(13350)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>pbuf.c:618(13350)</TD>
    <TD align=right>57</TD>
    <TD align=right>57</TD>
    <TD align=right>0 sec [pbuf.c:618]</TD></TR>
  <TR>
    <TH align=left>securityItemsMutex</TH>
    <TD align=middle>unlocked</TD>
    <TD align=right>at Fri 31 Oct 2003 10:53:41 AM CST http.c:2210(13349)</TD>
    <TD align=right>&nbsp;</TD>
    <TD align=right>http.c:2223(13349)</TD>
    <TD align=right>2</TD>
    <TD align=right>2</TD>
    <TD align=right>0 sec 
[http.c:2205]</TD></TR></TBODY></TABLE></FONT></FONT></P></FONT>
<P><FONT face=Arial>See how the 'at' tag is OLD and there's something in the 
blocked line?&nbsp; If you refresh a few times over 30s or a minute and that 
doesn't change, you're deadlocked.</FONT></P>
<P><FONT face=Arial>The KEY to reporting these is to go into the source and pull 
up the actual lines - plus some context - where the Last Lock, Last UnLock and 
Block show up.&nbsp; (Since the cvs changes often, line #s aren't very useful 
without the corresponding code).</FONT></P>
<P><FONT face=Arial>For example, the report of the above should be:</FONT></P>
<P><FONT face=Arial>Deadlock (web server active), showMutex.html shows 
</FONT><FONT face=Arial>hostsHashMutex, locked at hash.c:552:</FONT></P>
<P><FONT face="Courier New" size=2>&nbsp; for(idx=0; 
idx&lt;myGlobals.device[actDevice].actualHashSize; idx++) {<BR>#ifdef 
CFG_MULTITHREADED<BR>&nbsp;&nbsp;&nbsp; 
accessMutex(&amp;myGlobals.hostsHashMutex, "scanIdleLoop");<BR>#endif</FONT></P>
<P><FONT face=Arial>Last Unlock at hash.c:601:</FONT></P>
<P><FONT face="Courier New" size=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if(numHosts 
&gt;= (maxHosts-1))&nbsp;break;</FONT></P>
<P><FONT face=Arial>Blocked at pbuf.c:1998:</FONT></P>
<P><FONT face="Courier New" size=2>&nbsp;&nbsp;&nbsp; 
incrementTrafficCounter(&amp;myGlobals.device[actualDeviceId].rcvdPktStats.tooLong, 
1);<BR>&nbsp; }</FONT></P>
<P><FONT face="Courier New" size=2>#ifdef CFG_MULTITHREADED<BR>&nbsp; 
accessMutex(&amp;myGlobals.hostsHashMutex, 
"processPacket");<BR>#endif</FONT></P>
<P><FONT face="Courier New" size=2>#ifdef DEBUG<BR>&nbsp; 
traceEvent(CONST_TRACE_INFO, "actualDeviceId = %d", 
actualDeviceId);<BR>#endif</FONT></P>
<P><FONT face=Arial size=2></FONT>&nbsp;</P>
<P><FONT face=Arial color=#0000ff 
size=4><STRONG><EM>-----Burton</EM></STRONG></FONT></P>
<P><FONT face=Arial size=2></FONT>&nbsp;</P></BODY></HTML>


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic