List:       ssic-linux-devel
Subject:    [SSI-devel] [ ssic-linux-Bugs-2095071 ] From time to time /proc
From:       "SourceForge.net" <noreply () sourceforge ! net>
Date:       2009-03-28 21:37:43
Message-ID: E1LngDn-0001zD-8X () d55xhf1 ! ch3 ! sourceforge ! com

Bugs item #2095071, was opened at 2008-09-05 10:07
Message generated for change (Comment added) made by rogertsang
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=2095071&group_id=32541

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Process Management
Group: None
Status: Open
Resolution: Accepted
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Nobody/Anonymous (nobody)
Summary: From time to time /proc deadlocks

Initial Comment:
For quite some time (maybe even on the old 2.4-based kernel) we've occasionally
seen a problem where the /proc filesystem seems to be deadlocked: any attempt
to read /proc hangs.

When this happens, rebooting one of the nodes (not any node, it has to be the
"right" one) will free up the system and things will continue as normal.

Today I just noticed that when I rebooted the node that was "causing" the
problem I had the following messages on the init node:

Node 6 has gone down!!!
Assertion failed! origin_lock != ((void *)0),
cluster/ssi/vproc/dvp_pvpsops.c, pvpsop_get_execnode, line=376
nm_add_node: Node 6 added

Is this a clue?
 

----------------------------------------------------------------------

> Comment By: Roger Tsang (rogertsang)
Date: 2009-03-28 17:37

Message:
Please try the latest CVS (March 24th).

----------------------------------------------------------------------

Comment By: John Hughes (hughesj)
Date: 2008-12-03 10:59

Message:
Well. I finally found a (crazy) way to duplicate this: launch a Windows
app with Wine and hit Ctrl-C before it gets going.  Eventually it will
provoke the hang.

A "bta A" trace of the running processes is attached.

----------------------------------------------------------------------

Comment By: John Hughes (hughesj)
Date: 2008-11-20 05:14

Message:
Sorry, wasn't clear above - this is not straight CVS, it's my port of
current CVS to 2.6.12.  However I'm pretty sure this part of the port is
good.


----------------------------------------------------------------------

Comment By: John Hughes (hughesj)
Date: 2008-11-20 05:13

Message:
With current CVS (20/11/2008) I still see this bug. On coming in to work I
found my (non-init) node stuck, apparently in the screensaver. When I tried
to see what was going on from the initnode, every stat on "/proc/1" hung.
stat on other things in /proc was working, for example stat /proc/self or
stat /proc/$$.

When I turned off the node that was stuck the hung "stat" operations on
the initnode sprang back to life and I see messages like this in the log:

Node 6 has gone down!!!
Assertion failed! origin_lock != ((void *)0),
cluster/ssi/vproc/dvp_pvpsops.c, pvpsop_get_execnode, line=379
Assertion failed! origin_lock != ((void *)0),
cluster/ssi/vproc/dvp_pvpsops.c, pvpsop_get_execnode, line=379
Assertion failed! origin_lock != ((void *)0),
cluster/ssi/vproc/dvp_pvpsops.c, pvpsop_get_execnode, line=379
nm_add_node: Node 6 added
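
The symptom described in this comment (a stat on one /proc entry blocking
while stat on others returns) can be checked without wedging the shell that
runs the check. The following is a minimal sketch, assuming Python is
available on the node. The helper name probe and the timeout value are
illustrative and not part of OpenSSI.

```python
import subprocess
import sys
import time


def probe(path, timeout=2.0):
    """Stat `path` in a child process; return "ok", "error" or "hung".

    The stat runs in a separate process so that a call blocked inside
    the kernel cannot block the caller. `probe` is a hypothetical helper.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        rc = child.poll()  # non-blocking check for child exit
        if rc is not None:
            return "ok" if rc == 0 else "error"
        time.sleep(0.05)
    # Best effort only: a child stuck in the kernel may not die until
    # whatever it is waiting for is released, so we do not wait for it.
    child.kill()
    return "hung"


if __name__ == "__main__":
    for p in sys.argv[1:]:
        print(p, probe(p))
```

Run with the paths of interest as arguments, for example /proc/1 and
/proc/self. A result of "hung" for /proc/1 alongside "ok" for /proc/self
would match the behaviour reported here.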

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-10-02 06:24

Message:
A fix is going into CVS.

----------------------------------------------------------------------

Comment By: Roger Tsang (rogertsang)
Date: 2008-09-23 22:07

Message:
Related to this origin_lock assertion is a possible race in
vproc_origin_list traversal, supposedly fixed under #ifdef VOD_HLIST
since SSI-1.9.x. However, that fix introduced a possible deadlock, which
should be fixed in 1.9.6.

Lock ordering pre-1.9.6 (with #ifdef VOD_HLIST):
        -> vproc_origin_cleanup         (down_read origin list)
         -> vproc_origin_fgpgrp_cleanup
          -> pvpop_getctty
           -> rpvpop_start_op
            -> pvpsop_get_execnode
             -> vproc_lock_origin_node
              -> vproc_origin_find      (down_read origin list)
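
Why the nested down_read in the chain above can hang is easier to see in a
toy model. The sketch below assumes the origin list is guarded by a
read/write semaphore that queues new readers behind a waiting writer, as
Linux rw_semaphores generally do. The class FairRwSem, the task names and
the competing writer are all hypothetical. This is not OpenSSI code.

```python
from collections import deque


class FairRwSem:
    """Toy fair reader/writer semaphore: any new request queues behind
    earlier waiters, so a waiting writer blocks later readers."""

    def __init__(self):
        self.readers = []       # tasks holding a read lock (may repeat)
        self.writer = None      # task holding the write lock, if any
        self.waiters = deque()  # (task, mode) in arrival order

    def down_read(self, task):
        """Return True if the read lock is granted, False if queued."""
        if self.writer is None and not self.waiters:
            self.readers.append(task)
            return True
        self.waiters.append((task, "read"))
        return False

    def down_write(self, task):
        """Return True if the write lock is granted, False if queued."""
        if self.writer is None and not self.readers and not self.waiters:
            self.writer = task
            return True
        self.waiters.append((task, "write"))
        return False

    def deadlocked(self):
        """True if every lock holder is itself blocked in the queue,
        so nobody is left to release the lock."""
        holders = set(self.readers)
        if self.writer is not None:
            holders.add(self.writer)
        blocked = {task for task, _ in self.waiters}
        return bool(holders) and holders <= blocked


def nested_read_scenario(writer_arrives):
    """Replay the call chain: an outer and an inner down_read by the
    same task, optionally with a writer arriving in between."""
    sem = FairRwSem()
    log = [("cleanup outer down_read", sem.down_read("cleanup"))]
    if writer_arrives:
        log.append(("writer down_write", sem.down_write("writer")))
    log.append(("cleanup inner down_read", sem.down_read("cleanup")))
    return sem.deadlocked(), log


if __name__ == "__main__":
    for arrives in (False, True):
        dead, log = nested_read_scenario(arrives)
        print("writer arrives:", arrives, "deadlocked:", dead)
        for step, granted in log:
            print("   ", step, "granted" if granted else "BLOCKED")
```

In the model the nested read succeeds as long as no writer shows up between
the two down_read calls, which would be consistent with the hang appearing
only from time to time.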

----------------------------------------------------------------------

------------------------------------------------------------------------------
_______________________________________________
ssic-linux-devel mailing list
ssic-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ssic-linux-devel