List: evms-devel
Subject: Re: [Evms-devel] Daemon Patch for Large Clusters
From: "Robert Wipfel" <rawipfel () novell ! com>
Date: 2006-09-09 14:26:19
Message-ID: 45027AC0020000CF0001A1FC () sinclair ! provo ! novell ! com
> Steve Dobbelstein <steved@us.ibm.com> 09/06/06 9:16 AM:
> > "Changju Gao" <CGAO@novell.com> wrote on 09/05/2006 11:26:59 AM:
> > After further tests, I found some other cases when open/close threads
> > interfering with each other. So I added code to fend off other open/close
> > requests while bringing up or shutting down worker.
[...]
> Thanks for reporting these issues and for suggesting fixes. As you can
> tell, the EVMS support for the clustered environment has the basic
> functionality but could use more work in the area of robustness. What I
> would like to do is step back and look at this as a protocol design issue
> rather than applying patches in local places where particular scenarios
> fail. If the design is correct the code should be simpler and smaller than
> putting lots of conditional checks in various places. Fixing the design
> will, of course, take more time, but it should result in better code in the
> end. Once I have something in place I'll run it by you to make sure it
> satisfies your particular scenarios.
Hi Steve,
We found this because some higher-layer code introduced a side effect: all
nodes opened the engine at more or less exactly the same time. The resulting
race between engines, each failing to open (acquire) all workers and then
having to back out with a distributed close, exposed some windows. We agree
that closing these windows is a short-term code fix and would rather consider
some protocol design alternatives. For example, suppose open_engine could be
implemented as "acquire distributed cluster lock" through the ECE. The lock
would be released by close_engine, engine process failure, or node failure.
Lock primitives would be useful for the Cluster Segment Manager (CSM) too...
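
To make the lock idea a bit more concrete, here is a rough sketch of the
semantics I have in mind. It is only a toy in-process model in Python, not ECE
or engine code: the ClusterLock class, its method signatures, and the
engine_failed/node_failed callbacks are names made up for illustration, and a
real version would need the ECE to provide the distributed agreement and the
membership events.

```python
class ClusterLock:
    """Toy in-process model of one cluster-wide engine lock (hypothetical API)."""

    def __init__(self):
        # (node, pid) of the engine that holds the lock, or None if free.
        self.owner = None

    def open_engine(self, node, pid):
        """Acquire the lock. All-or-nothing, so there is no partial open
        of workers that later needs a distributed close to back out."""
        if self.owner is not None:
            return False
        self.owner = (node, pid)
        return True

    def close_engine(self, node, pid):
        """Release the lock, but only if this engine is the holder."""
        if self.owner == (node, pid):
            self.owner = None
            return True
        return False

    def engine_failed(self, node, pid):
        """Assumed membership event: an engine process died."""
        self.close_engine(node, pid)

    def node_failed(self, node):
        """Assumed membership event: a node left the cluster."""
        if self.owner is not None and self.owner[0] == node:
            self.owner = None


lock = ClusterLock()
print(lock.open_engine("node1", 100))  # first opener gets the lock
print(lock.open_engine("node2", 200))  # simultaneous opener is refused
lock.node_failed("node1")              # holder's node drops out
print(lock.open_engine("node2", 200))  # lock is free again
```

The point of the sketch is that a losing opener simply gets a refusal and has
nothing to undo, which is where the current open/close windows come from.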
Thanks,
Robert