
List:       openais
Subject:    [Openais] Patch II design review revision 2 !!
From:       "Muni Bajpai" <muniba () nortel ! com>
Date:       2005-02-25 22:10:20
Message-ID: CFCE7C3BDB79204092974B5B50AD71941002A1 () zrc2hxm0 ! corp ! nortel ! com

Hey Steve,

Could you please review the patch and comment?

Also, totemsrp.c line 1524: was that intended? "goto error_mcast" leads to a
return (0), which says everything went well. There is also an assert(0) a few
lines above.

One more thing: most node exec functions in ckpt.c always return 0 regardless
of error. Is that also intended?

Thanks

Muni


-----Original Message-----
From: Steven Dake [mailto:sdake@mvista.com] 
Sent: Thursday, February 24, 2005 2:56 AM
To: Bajpai, Muni [NGC:B670:EXCH]
Cc: 'openais@lists.osdl.org'; Smith, Kristen [NGC:B675:EXCH]
Subject: Re: Patch II design review revision 1 !!


Muni,
My apologies for not responding earlier; I've been swamped by my day job. :(

On Wed, 2005-02-23 at 13:18, Muni Bajpai wrote:
> Hey Steve,
> 
> So I have made the changes as itemized by you.
> Some Notes:
> 1.) Have removed saCkptCheckpointSection from 
> req_exec_ckpt_synchronize_state and added the following
>         SaCkptSectionDescriptorT sectionDescriptor;
>         SaUint32T dataOffSet;
>         SaUint32T dataSize;

I think this makes sense, but probably as another message.  See later
comments.

> 2.) Could not think of any other way to do 
> ckpt_checkpoint_remove_cleanup. Let me know if you have any 
> suggestions
> 
Hmm, I think it's OK for now; we can always refine it later.

> Thanks,
> 
> Muni
> P.S. I haven't tested any of this and plan to do that as soon as the
> base changes for recovery are in. When do you estimate that to be
> merged in?
> 

I am hopeful I will have something available Monday.  I'd suggest, however,
trying to test without the base recovery code by adding the calls at the
locations specified in previous emails.  This should at least allow you to
start testing your implementation.

comments on patch:
findProcessorIndex style is still somewhat wacky and has a C++ comment.

"//save off the elements": this C++ comment should be changed to a C comment.

Please do not declare variables in the middle of a code body.  This is a
C++-ism (also allowed by C99) and only works on newer GCC compilers, not on
2.95 or earlier.

+               SaUint32T sync_sequence_number = 0;
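
To illustrate the point about declarations, here is a minimal sketch. The helper count_nonzero() is hypothetical and not part of the patch; it only shows every local declared at the top of the block, before the first statement, which is what C89 compilers require.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: counts nonzero entries.
 * All locals are declared before the first statement of the block,
 * so the function also compiles as C89.
 */
static unsigned int count_nonzero (const unsigned int *values, size_t entries)
{
	unsigned int count = 0;
	size_t i;

	assert (values != NULL || entries == 0);

	for (i = 0; i < entries; i++) {
		if (values[i] != 0) {
			count++;
		}
	}
	return (count);
}
```

A declaration such as sync_sequence_number above would move up next to count and i in the same way.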

The sectionData in req_exec_ckpt_synchronize_state is wrong.  It should be a
zero-length array such as char sectionData[0] (an array of void is not valid
C) and sit at the end of the structure.  Then, when creating the syncstate
msg, allocate sizeof syncstate + (dataSize + dataOffset) and address the
section data as a normal array.
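
A minimal sketch of the trailing-array layout follows. The struct sync_section_msg and the function sync_section_msg_create() are hypothetical names, not the real req_exec_ckpt_synchronize_state, and the sketch assumes only data_size payload bytes travel in the message. It uses the C99 flexible array member spelling; older GCC code writes the same thing as section_data[0].

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical message layout, not the real req_exec_ckpt_synchronize_state.
 * The payload is a flexible array member and must be the last member.
 */
struct sync_section_msg {
	unsigned int data_offset;
	unsigned int data_size;
	unsigned char section_data[];
};

/*
 * Allocate header and payload as one block and copy the payload in.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static struct sync_section_msg *sync_section_msg_create (
	const void *data, unsigned int data_offset, unsigned int data_size)
{
	struct sync_section_msg *msg;

	assert (data != NULL || data_size == 0);

	msg = malloc (sizeof (struct sync_section_msg) + data_size);
	if (msg == NULL) {
		return (NULL);
	}
	msg->data_offset = data_offset;
	msg->data_size = data_size;
	if (data_size > 0) {
		memcpy (msg->section_data, data, data_size);
	}
	return (msg);
}
```

Because header and payload share one allocation, the whole message can be handed to the transport as a single contiguous buffer.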

There are still a lot of C++ style comments; please axe them in favor of C comments.

When specifying a TODO, use uppercase TODO instead of todo; searching is
easier if all TODOs are consistent.

Use a memcpy when initializing the checkpoint section descriptor:
example:
// Configure checkpoint section

I don't like the overloading of section data updates with checkpoint and
refcount updates.  I believe these two things should be two separate
messages.  Logically that makes more sense than updating the
creation attributes and reference counts on every message.  The refcount and
creation attributes message should only be sent once, while a section update
should be sent for each section.  It is ok to have multiple messages for
synchronization.  We have a big address space to use, so might as well use
it. :)
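
A minimal sketch of the split, assuming one state message per checkpoint followed by one section message per section. The enum and sync_plan_build() are hypothetical names chosen for illustration; they only show the intended message sequence, not the real message formats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds for checkpoint synchronization. */
enum sync_msg_type {
	SYNC_MSG_STATE,		/* creation attributes and refcounts, sent once */
	SYNC_MSG_SECTION	/* one section's data, sent once per section */
};

/*
 * Fill plan[] with the messages to send for one checkpoint.
 * Returns the number of messages, or 0 if plan[] is too small.
 */
static size_t sync_plan_build (
	enum sync_msg_type *plan, size_t plan_entries, size_t section_count)
{
	size_t i;

	assert (plan != NULL);

	if (plan_entries < section_count + 1) {
		return (0);
	}
	plan[0] = SYNC_MSG_STATE;
	for (i = 0; i < section_count; i++) {
		plan[i + 1] = SYNC_MSG_SECTION;
	}
	return (section_count + 1);
}
```

A checkpoint with two sections therefore produces three messages: one state message and two section messages.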

Also why do you need sync_msg_sequence_number?  All messages are in agreed
order, so the order you send them will be the order they arrive. 
There should be no need to do any sequencing beyond that.

Good work

Regards
-steve



> -----Original Message-----
> From: Steven Dake [mailto:sdake@mvista.com]
> Sent: Tuesday, February 22, 2005 3:27 PM
> To: Bajpai, Muni [NGC:B670:EXCH]
> Cc: openais@lists.osdl.org; Smith, Kristen [NGC:B675:EXCH]
> Subject: RE: Patch II design review !!
> 
> 
> On Tue, 2005-02-22 at 14:08, Muni Bajpai wrote:
> > Thanks Steven,
> > 
> > On the last item you mentioned for multiple checkpoints per section.
> > The design as it exists today is that a sync_message is sent out for
> > every section in a checkpoint, for all checkpoints. So if a checkpoint
> > has 2 sections then 2 syncs are sent out for that checkpoint.
> > 
> 
> ok this works
> 
> > Will start work on the remaining items.
> > 
> > Thanks
> > 
> Thanks Muni
> 
> regards
> -steve
> 
> > Muni
> > 
> > -----Original Message-----
> > From: Steven Dake [mailto:sdake@mvista.com]
> > Sent: Tuesday, February 22, 2005 1:58 PM
> > To: Bajpai, Muni [NGC:B670:EXCH]
> > Cc: openais@lists.osdl.org; Smith, Kristen [NGC:B675:EXCH]
> > Subject: Re: Patch II design review !!
> > 
> > On Mon, 2005-02-21 at 12:10, Muni Bajpai wrote:
> > > Hi Steven,
> > > 
> > > Please take a look at patch II and advise on the design. I haven't
> > > run this code yet; it is in an "In Progress" state. I wanted to get
> > > feedback before I continued down that path.
> > > 
> > >  -  Does checkpoint close get called on all the remaining processors
> > > when a processor fails?
> > > 
> > 
> > No.
> > 
> > When a processor fails, the remaining processors will receive a
> > configuration change.  The reason we keep a list of the processor
> > identifiers along with their checkpoint reference counts is so that
> > each processor can reduce the reference count the appropriate number
> > of times.  This is similar to closing, except that no actual close
> > message should be sent.
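
A minimal sketch of that reference-count reduction, assuming a per-checkpoint table of (processor address, count) pairs. The struct refcnt_entry, REFCNT_ENTRIES and refcount_release_departed() are hypothetical names; addresses are plain unsigned integers here rather than struct in_addr to keep the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define REFCNT_ENTRIES 4	/* stand-in for PROCESSOR_COUNT_MAX */

struct refcnt_entry {
	unsigned int addr;	/* 0 means the slot is unused */
	unsigned int count;	/* references held by that processor */
};

/*
 * For every processor in left_list, drop its references from the
 * checkpoint total and clear its slot. No close message is sent.
 * Returns the number of references released.
 */
static unsigned int refcount_release_departed (
	struct refcnt_entry *entries,
	unsigned int *checkpoint_refcount,
	const unsigned int *left_list,
	size_t left_list_entries)
{
	unsigned int released = 0;
	size_t i;
	size_t j;

	assert (entries != NULL && checkpoint_refcount != NULL);

	for (i = 0; i < left_list_entries; i++) {
		for (j = 0; j < REFCNT_ENTRIES; j++) {
			if (entries[j].addr != 0 && entries[j].addr == left_list[i]) {
				assert (*checkpoint_refcount >= entries[j].count);
				*checkpoint_refcount -= entries[j].count;
				released += entries[j].count;
				entries[j].addr = 0;
				entries[j].count = 0;
				break;
			}
		}
	}
	return (released);
}
```

Each surviving processor would run the same loop on the same configuration change, so all of them arrive at the same reference count without exchanging messages.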
> > 
> > > Thanks
> > > 
> > > Muni
> > > 
> > > <<patchtemp.txt>>
> > > 
> > 
> > Patch review:
> > 
> > Great work Muni!  I think the design is solid thus far.  I have a few
> > suggestions.
> > 
> > When using pointers, please use * instead of [].
> > 
> > ex:
> > +static void initialize_ckpt_refcount_array (struct ckpt_refcnt
> > ckpt_refcount[]) {
> > 
> > should be:
> > 
> > +static void initialize_ckpt_refcount_array (struct ckpt_refcnt
> > *ckpt_refcount) {
> > 
> > 
> > I know this doesn't seem to make a lot of sense when looking at the
> > code, but in the past we agreed to use mostly Linux kernel coding
> > style for the executive code, and SA Forum coding style for the
> > libraries. This makes debugging libraries easier for people using the
> > APIs, and makes the executive code easier to read for us
> > non-capitalized developers.
> > 
> > Unfortunately this group decision was made after the checkpoint code 
> > was developed.  Could you change the style to match this?  A good
> > example:
> > 
> > +for (checkpointList = checkpointListHead.next;
> > +        checkpointList != &checkpointListHead;
> > +        checkpointList = checkpointList->next) {
> > 
> > should be
> > 
> > for (list = checkpoint_list_head.next;
> >         list != &checkpoint_list_head;
> >         list = list->next) {
> > 
> > Please do not assign structures.  Use memcpy instead.  Example:
> > +                       sync_msg->request.previous_ring_id = saved_ring_id;
> > 
> > should be
> > memcpy (&sync_msg->request.previous_ring_id, &saved_ring_id,
> >         sizeof (struct memb_ring_id));
> > 
> > totempg_token_callback_create should use TOTEMPG_CALLBACK_TOKEN_SENT.
> > If you queue data on the received callback, then that data has to be
> > queued before the token can be processed, which introduces latency
> > into the protocol.
> > 
> > Please do not use C++ style comments.  evil evil.  I use them for
> > TODOs, which is acceptable, but that is the only valid case.  We intend
> > to delete all those TODOs someday :)  Example:
> > +       //Check for empty list here
> > This should be
> > /*
> >  * Check for empty list here
> >  */
> > 
> > In ckpt_recovery_finalize there is an extra list_init which is
> > unnecessary:
> > +       list_init(&checkpointListHead);
> > 
> > I don't really like the structure in ckpt_recovery_process_members_exit,
> > but if that is the only way to do it, then that's the only way to do
> > it.
> > 
> > synchronize_state needs some work.  Specifically, I don't think you
> > can just do a sectioncreate; you want to actually synchronize the
> > section data.  Allocating void *data is not optimal since memory
> > allocation can fail.  Nothing in the recovery path should allocate
> > memory if at all possible.  When comparing ring ids, use memcmp
> > instead of the if statement you have.  ex:
> > +       if ((req_exec_ckpt_sync_state->previous_ring_id.seq != saved_ring_id.seq)
> > +               || (req_exec_ckpt_sync_state->previous_ring_id.rep.s_addr != saved_ring_id.rep.s_addr)) {
> > +               return(0);
> > +       }
> > 
> > should be
> > 
> > if (memcmp (&req_exec_ckpt_sync_state->previous_ring_id,
> >         &saved_ring_id,
> >         sizeof (struct memb_ring_id)) != 0) {
> >         return (0);
> > }
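
A minimal sketch of copying and comparing ring ids with memcpy and memcmp. The struct ring_id_sketch is a hypothetical stand-in for struct memb_ring_id, whose real layout is not shown in this thread. One caveat worth stating: memcmp compares every byte, so this is only reliable when the structure has no padding bytes, which is why both members here have the same type.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct memb_ring_id.
 * Both members have the same type, so there are no padding bytes
 * between them for memcmp to trip over.
 */
struct ring_id_sketch {
	unsigned int rep;	/* representative address */
	unsigned int seq;	/* ring sequence number */
};

/* Copy a ring id without using structure assignment. */
static void ring_id_copy (struct ring_id_sketch *dst,
	const struct ring_id_sketch *src)
{
	assert (dst != NULL && src != NULL);
	memcpy (dst, src, sizeof (struct ring_id_sketch));
}

/* Returns 1 if both ring ids are byte-for-byte identical, else 0. */
static int ring_id_equal (const struct ring_id_sketch *a,
	const struct ring_id_sketch *b)
{
	assert (a != NULL && b != NULL);
	return (memcmp (a, b, sizeof (struct ring_id_sketch)) == 0);
}
```

A handler would return early when ring_id_equal() reports a mismatch, which matches the corrected if statement above.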
> > 
> > What should probably happen is that a new function
> > (ckpt_section_create) should be created that builds the checkpoint
> > section information from a saCkptCheckpointSection data structure.
> > 
> > The same thing should be done for the saCkptCheckpoint data structure
> > as well (in case the processor adding the checkpoint doesn't yet have
> > it in its database).
> > 
> > I'd also suggest somehow adding more than one section to the
> > synchronize message.  It is possible a checkpoint could have a lot of
> > sections.
> > 
> > You're off to a great start Muni...
> > 
> > Good work
> > -steve
> > 
> > 
> 
> 
>  





["patchtemp_rev2.txt" (text/plain)]

diff -uNr --exclude=SCCS --exclude=BitKeeper --exclude=ChangeSet --exclude=init --exclude=LICENSE --exclude=Makefile --exclude=man --exclude=README.devmap --exclude=SECURITY --exclude=TODO --exclude=CHANGELOG --exclude=conf --exclude=loc --exclude=Makefile.samples --exclude=QUICKSTART --exclude=test --exclude=.cdtproject --exclude=.project ../latest/exec/ckpt.c ../bk_openais/exec/ckpt.c
--- ../latest/exec/ckpt.c	2005-02-25 14:07:12.000000000 -0600
+++ ../bk_openais/exec/ckpt.c	2005-02-25 15:07:03.000000000 -0600
@@ -16,7 +16,6 @@
  *   this list of conditions and the following disclaimer in the documentation
  *   and/or other materials provided with the distribution.
  * - Neither the name of the MontaVista Software, Inc. nor the names of its
- *   contributors may be used to endorse or promote products derived from this
  *   software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
@@ -57,18 +56,36 @@
 #include "totempg.h"
 
 #define LOG_SERVICE LOG_SERVICE_CKPT
+#define CKPT_MAX_SECTION_DATA_SEND (1024*400)
 #include "print.h"
 
-DECLARE_LIST_INIT(checkpointListHead);
+DECLARE_LIST_INIT(checkpoint_list_head);
 
-DECLARE_LIST_INIT(checkpointIteratorListHead);
+DECLARE_LIST_INIT(checkpoint_iterator_list_head);
+
+DECLARE_LIST_INIT(recovery_sync_list_head);
+
+DECLARE_LIST_INIT(checkpoint_recovery_list_head);
 
 struct checkpoint_cleanup {
     struct list_head list;
     struct saCkptCheckpoint *checkpoint;
 };
 
-//TODO static totempg_recovery_plug_handle ckpt_checkpoint_recovery_plug_handle;
+struct checkpoint_sync {
+	struct list_head list;
+	enum nodeexec_message_types type;
+	union {
+		struct req_exec_ckpt_synchronize_state request_state;
+		struct req_exec_ckpt_synchronize_section request_section;
+	} base;	
+};
+
+#ifdef TODO
+static void *tok_call_handle = 0;
+#endif
+
+/* TODO static totempg_recovery_plug_handle ckpt_checkpoint_recovery_plug_handle; */
 
 static int ckpt_exec_init_fn (void);
 
@@ -78,6 +95,10 @@
 
 static int message_handler_req_exec_ckpt_checkpointopen (void *message, struct in_addr source_addr, int endian_conversion_required);
 
+static int message_handler_req_exec_ckpt_synchronize_state (void *message, struct in_addr source_addr, int endian_conversion_required);
+
+static int message_handler_req_exec_ckpt_synchronize_section (void *message, struct in_addr source_addr, int endian_conversion_required);
+
 static int message_handler_req_exec_ckpt_checkpointclose (void *message, struct in_addr source_addr, int endian_conversion_required);
 
 static int message_handler_req_exec_ckpt_checkpointunlink (void *message, struct in_addr source_addr, int endian_conversion_required);
@@ -133,24 +154,27 @@
 static int message_handler_req_lib_ckpt_sectioniteratorinitialize (struct conn_info *conn_info, void *message);
 static int message_handler_req_lib_ckpt_sectioniteratornext (struct conn_info *conn_info, void *message);
 
-static int ckpt_confchg_fn (
-	enum totempg_configuration_type configuration_type,
-	struct in_addr *member_list, void *member_list_private,
-		int member_list_entries,
-	struct in_addr *left_list, void *left_list_private,
-		int left_list_entries,
-	struct in_addr *joined_list, void *joined_list_private,
-		int joined_list_entries,
-	struct memb_ring_id *ring_id) {
+static void ckpt_recovery_inititialize ();
+static int ckpt_recovery_process (enum totempg_callback_token_type type, void *);
+static void ckpt_recovery_finalize();
+static int ckpt_recovery_abort();
+static void ckpt_recovery_process_members_exit(struct in_addr *left_list, int left_list_entries);
+static int recovery_section_create (SaCkptSectionDescriptorT *sectionDescriptor, SaNameT *checkpointName);
+static int recovery_section_write(SaCkptSectionIdT *sectionId, SaNameT *checkpointName,
+									void *newData, SaUint32T dataOffSet, SaUint32T dataSize);
 
-#ifdef TODO
-	if (configuration_type == TOTEMPG_CONFIGURATION_REGULAR) {
-		totempg_recovery_plug_unplug (ckpt_checkpoint_recovery_plug_handle);
-	}
-#endif
 
-	return (0);
-}
+static struct memb_ring_id saved_ring_id;
+
+static int ckpt_confchg_fn(
+		enum totempg_configuration_type configuration_type,
+		struct in_addr *member_list, void *member_list_private,
+			int member_list_entries,
+		struct in_addr *left_list, void *left_list_private,
+			int left_list_entries,
+		struct in_addr *joined_list, void *joined_list_private,
+			int joined_list_entries,
+		struct memb_ring_id *ring_id);
 
 struct libais_handler ckpt_libais_handlers[] =
 {
@@ -231,7 +255,7 @@
 	},
 	{ /* 15 */
 		.libais_handler_fn	= message_handler_req_lib_ckpt_checkpointsynchronizeasync,
-		.response_size		= sizeof (struct res_lib_ckpt_checkpointsynchronizeasync), // TODO RESPONSE
+		.response_size		= sizeof (struct res_lib_ckpt_checkpointsynchronizeasync), /* TODO RESPONSE */
 		.response_id		= MESSAGE_RES_CKPT_CHECKPOINT_CHECKPOINTSYNCHRONIZEASYNC,
 	},
 	{ /* 16 */
@@ -258,7 +282,9 @@
 	message_handler_req_exec_ckpt_sectionexpirationtimeset,
 	message_handler_req_exec_ckpt_sectionwrite,
 	message_handler_req_exec_ckpt_sectionoverwrite,
-	message_handler_req_exec_ckpt_sectionread
+	message_handler_req_exec_ckpt_sectionread,
+	message_handler_req_exec_ckpt_synchronize_state,
+	message_handler_req_exec_ckpt_synchronize_section
 };
 
 struct service_handler ckpt_service_handler = {
@@ -273,18 +299,395 @@
 	.exec_dump_fn				= 0
 };
 
-static struct memb_ring_id saved_ring_id;
+static int setProcessorIndex(struct in_addr *proc_addr, 
+								struct ckpt_refcnt *ckpt_refcount) {
+	int i;
+	for (i = 0; i < PROCESSOR_COUNT_MAX; i ++) {
+		/*
+		 * If the source addresses match then this processor index 
+		 * has already been set
+		 */
+		if (ckpt_refcount[i].addr.s_addr == proc_addr->s_addr) {
+			return -1;
+		} else if (ckpt_refcount[i].addr.s_addr == 0) {
+			/*
+			 * If the source addresses do not match and this element
+			 * has no stored value then store the new value and 
+			 * return the Index.
+		 	 */		
+			memcpy(&ckpt_refcount[i].addr, proc_addr, sizeof(struct in_addr));			
+			return i;
+		}
+	}
+	/* 
+	 * Could not Find an empty slot 
+	 * to store the new Processor.	
+	 */
+	return -1;
+}
+
+static int findProcessorIndex(struct in_addr *proc_addr,
+								struct ckpt_refcnt *ckpt_refcount) { 
+	int i;
+	for (i = 0; i < PROCESSOR_COUNT_MAX; i ++) {
+		/*
+		 * If the source addresses match then return the index
+		 */
+		
+		if (ckpt_refcount[i].addr.s_addr == proc_addr->s_addr) {
+			return i;
+		}				
+	}
+	/* 
+	 * Could not Find the Processor 
+	 */
+	return -1;
+}
+
+static void initialize_ckpt_refcount_array (struct ckpt_refcnt *ckpt_refcount) {
+	memset((char*)ckpt_refcount, 0, PROCESSOR_COUNT_MAX * sizeof(struct ckpt_refcnt));
+}
+
+static void ckpt_recovery_inititialize () {
+	struct list_head *checkpoint_list;
+	struct saCkptCheckpoint *checkpoint;
+	struct checkpoint_sync *sync_msg;
+	struct checkpoint_sync *sync_section_msg;
+	struct saCkptCheckpoint *savedCheckpoint;
+	struct list_head *checkpoint_section_list;
+    struct saCkptCheckpointSection *ckptCheckpointSection;
+    SaSizeT origSectionSize;  
+    SaSizeT sectionDataSent;          
+    SaSizeT newSectionSize;
+    int proc_index;
+	
+
+   	for (checkpoint_list = checkpoint_list_head.next;
+        checkpoint_list != &checkpoint_list_head;
+        checkpoint_list = checkpoint_list->next) {
+
+        checkpoint = list_entry (checkpoint_list,
+            struct saCkptCheckpoint, list);
+
+		/*
+		 * 	Save off the elements in the new list
+		 */
+		savedCheckpoint = 
+			(struct saCkptCheckpoint *) malloc (sizeof(struct saCkptCheckpoint));
+		assert(savedCheckpoint);
+		memcpy(savedCheckpoint, checkpoint, sizeof(struct saCkptCheckpoint));
+		list_init(&savedCheckpoint->list);		
+		list_add(&savedCheckpoint->list,&checkpoint_recovery_list_head);	
+		
+		for (checkpoint_section_list = checkpoint->checkpointSectionsListHead.next;
+        	checkpoint_section_list != &checkpoint->checkpointSectionsListHead;
+            checkpoint_section_list = checkpoint_section_list->next) {
+
+            ckptCheckpointSection = list_entry (checkpoint_section_list,
+            	struct saCkptCheckpointSection, list);          	
+            
+            proc_index = findProcessorIndex(&this_ip.sin_addr,savedCheckpoint->ckpt_refcount);
+
+            if (proc_index == -1 ) {	 
+		 		log_printf (LOG_LEVEL_ERROR, 
+		 			"CKPT: Could not find the checkpoint entry for this processor %p \n", 
+		 			checkpoint);
+		 		continue;
+		 	}		 	
+           	
+           	/*
+           	 * Create and save a new Sync message.
+           	 */
+			sync_msg = 
+				(struct checkpoint_sync *)malloc(sizeof(struct checkpoint_sync));
+			assert(sync_msg);
+			
+			sync_msg->type = MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESTATE;			
+			sync_msg->base.request_state.header.size =	sizeof (struct req_exec_ckpt_synchronize_state);
+			sync_msg->base.request_state.header.id = MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESTATE;
+			memcpy(&sync_msg->base.request_state.previous_ring_id, &saved_ring_id, sizeof(struct memb_ring_id));
+			memcpy(&sync_msg->base.request_state.checkpointName, &savedCheckpoint->name, sizeof(SaNameT));
+			memcpy(&sync_msg->base.request_state.checkpointCreationAttributes,
+					&savedCheckpoint->checkpointCreationAttributes,
+					sizeof(SaCkptCheckpointCreationAttributesT));
+			memcpy(&sync_msg->base.request_state.sectionDescriptor,
+					&ckptCheckpointSection->sectionDescriptor,
+					sizeof(SaCkptSectionDescriptorT));
+			memcpy(&sync_msg->base.request_state.source_addr, &this_ip.sin_addr, sizeof(struct in_addr));
+			memcpy(&sync_msg->base.request_state.ref_count,
+		 			&savedCheckpoint->ckpt_refcount[proc_index].count,
+		 			sizeof(SaUint32T));
+			
+		 	list_init(&sync_msg->list);		 	
+		 	list_add(&sync_msg->list,&recovery_sync_list_head); 
+		 	
+		 	origSectionSize = ckptCheckpointSection->sectionDescriptor.sectionSize;  
+            sectionDataSent = 0;          
+            newSectionSize = 0;
+            
+            /*
+             * Now Create SyncSection messages in chunks of CKPT_MAX_SECTION_DATA_SEND or less
+             */
+            while (sectionDataSent < origSectionSize) {
+            	/*
+            	 * Send a Max of CKPT_MAX_SECTION_DATA_SEND of section data
+            	 */
+	        	if ((origSectionSize - sectionDataSent) > CKPT_MAX_SECTION_DATA_SEND) {
+	            	newSectionSize = CKPT_MAX_SECTION_DATA_SEND;
+	            }      
+	            else {
+	            	newSectionSize = (origSectionSize - sectionDataSent);
+	            }        
+	            
+	            /*
+	             * Create and save a new Sync Section message.
+	             */
+				sync_section_msg = 
+					(struct checkpoint_sync *)malloc(sizeof(struct checkpoint_sync) + newSectionSize);
+				assert(sync_section_msg);
+				
+				sync_section_msg->type = MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESECTION;
+				sync_section_msg->base.request_section.header.size =	sizeof (struct req_exec_ckpt_synchronize_section);
+				sync_section_msg->base.request_section.header.id = MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESECTION;
+	            memcpy (&sync_section_msg->base.request_section.checkpointName, &savedCheckpoint->name, sizeof(SaNameT));
+	            memcpy (&sync_section_msg->base.request_section.sectionId,
+						&ckptCheckpointSection->sectionDescriptor.sectionId,
+						sizeof(SaCkptSectionIdT));
+				memcpy (&sync_section_msg->base.request_section.dataOffSet, &sectionDataSent, sizeof(SaUint32T));
+				memcpy (&sync_section_msg->base.request_section.dataSize, &newSectionSize, sizeof(SaUint32T));
+				memcpy ((char*)sync_section_msg + sizeof(struct checkpoint_sync),
+						((char*)ckptCheckpointSection->sectionData + sectionDataSent),
+						newSectionSize);
+					
+            	sectionDataSent += newSectionSize;
+            	list_init(&sync_section_msg->list);
+            	list_add(&sync_section_msg->list,&recovery_sync_list_head);				
+            }           
+            	
+        }
+	}	
+#ifdef TODO
+	int res = totempg_token_callback_create(&tok_call_handle, 
+										TOTEMPG_CALLBACK_TOKEN_SENT,
+										0,
+										ckpt_recovery_process,
+										NULL);
+#endif
+}
+
+static int ckpt_recovery_process (enum totempg_callback_token_type type, void *data ) {
+
+	struct req_exec_ckpt_synchronize_state *request_exec_sync_state;
+	struct req_exec_ckpt_synchronize_section *request_exec_sync_section;
+	struct iovec iovecs[2];
+	struct list_head *sync_list;	
+	struct checkpoint_sync *sync_element;		
+	
+	/*
+	 * Check for empty list here
+	 */
+	if (list_empty(&recovery_sync_list_head)) {
+		return (0);
+	}	
+	
+	while (1) { /* Go for as long as the outbound queue is not full */
+		/*
+		 * Extract the element
+		 */
+		sync_list = recovery_sync_list_head.next;
+		sync_element = list_entry (sync_list,
+	            struct checkpoint_sync, list);
+	            
+	    log_printf (LOG_LEVEL_DEBUG, "callback recover_process.\n");
+	    if (sync_element->type == MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESTATE) {
+	    	/*
+		     * Populate the Sync State Request
+		     */
+	    	request_exec_sync_state = &sync_element->base.request_state;		
+			iovecs[0].iov_base = (char *)request_exec_sync_state;
+			iovecs[0].iov_len = sizeof (struct req_exec_ckpt_synchronize_state);
+			/*
+			 * Check to see if we can queue the new message and if you can
+			 * then mcast the message else break and create callback.
+			 */
+			if (totempg_send_ok(iovecs[0].iov_len)){				 
+				assert (totempg_mcast (iovecs, 1, TOTEMPG_AGREED) == 0);
+				log_printf (LOG_LEVEL_DEBUG, "CKPT: Multicasted Sync State Message.\n");
+			}			
+			else {
+				log_printf (LOG_LEVEL_DEBUG, "CKPT: Outbound Queue full need to Create Callback.\n");
+				break;
+			}
+	    }
+		
+		else  {		
+		    /*
+		     * Populate the Sync Section Request
+		     */
+		    request_exec_sync_section = &sync_element->base.request_section;	
+			iovecs[0].iov_base = (char *)request_exec_sync_section;
+			iovecs[0].iov_len = sizeof (struct req_exec_ckpt_synchronize_section);
+			/*
+			 * Populate the Section Data.
+			 */
+			iovecs[1].iov_base = (char *)sync_element
+								+ sizeof (struct checkpoint_sync);
+			iovecs[1].iov_len = request_exec_sync_section->dataSize;
+			/*
+			 * Check to see if we can queue the new message and if you can
+			 * then mcast the message else break and create callback.
+			 */
+			if (totempg_send_ok(iovecs[0].iov_len + iovecs[1].iov_len)){				 
+				assert (totempg_mcast (iovecs, 2, TOTEMPG_AGREED) == 0);
+				log_printf (LOG_LEVEL_DEBUG, "CKPT: Multicasted Sync Section Message.\n");
+			}			
+			else {
+				log_printf (LOG_LEVEL_DEBUG, "CKPT: Outbound Queue full need to Create Callback.\n");
+				break;
+			}									
+		}		
+		list_del(&sync_element->list);
+		free(sync_element);
+
+		/*
+		 * Check for empty list here
+		 */
+		if (list_empty(&recovery_sync_list_head)) {
+			ckpt_recovery_finalize();
+			return (0);
+		}		
+	}
+#ifdef TODO
+	/*
+	 * We have more to send ...
+	 */
+	int res = totempg_token_callback_create(&tok_call_handle, 
+										TOTEMPG_CALLBACK_TOKEN_SENT,
+										0,
+										ckpt_recovery_process,
+										NULL);
+#endif
+}
+
+static void ckpt_recovery_finalize () {
+	struct list_head *checkpoint_list;
+    struct saCkptCheckpoint *checkpoint;
+
+	/*
+	 * Remove All elements from old checkpoint
+	 * list
+	 */
+	checkpoint_list = checkpoint_list_head.next;	
+	while (!list_empty(&checkpoint_list_head)) {
+		checkpoint = list_entry (checkpoint_list,
+            			struct saCkptCheckpoint, list);
+		list_del(&checkpoint->list);
+		free(checkpoint);
+		checkpoint_list = checkpoint_list_head.next;
+	}
+	
+	/*
+	 * Initialize the old list again.
+	 */
+	list_init(&checkpoint_list_head);
+	
+	/*
+	 * Copy the contents of the new list_head into the old list head
+	 */
+	memcpy(&checkpoint_list_head, &checkpoint_recovery_list_head, sizeof(struct list_head));
+
+	/*
+	 * Initialize the new list head for reuse.
+	 */
+	list_init(&checkpoint_recovery_list_head);
+	
+}
+
+static int ckpt_recovery_abort () {
+/*
+ * TODO add some code here to abort the recovery process
+ */
+	return (0);
+}
+
+static void ckpt_recovery_process_members_exit(struct in_addr *left_list, int left_list_entries) {
+	struct list_head *checkpoint_list;
+	struct saCkptCheckpoint *checkpoint;
+	struct in_addr *member;
+	int index;
+	int i;
+	
+	if (left_list_entries == 0) {
+		return;
+	}
+	
+	/*
+	 *  Iterate left_list_entries. 
+	 */
+	member = left_list;
+	for (i = 0; i < left_list_entries; i++) {
+		for (checkpoint_list = checkpoint_list_head.next;
+			checkpoint_list != &checkpoint_list_head;
+			checkpoint_list = checkpoint_list->next) {
+
+			checkpoint = list_entry (checkpoint_list,
+				struct saCkptCheckpoint, list);			
+			index = findProcessorIndex(member, checkpoint->ckpt_refcount);			
+			if (index == -1) {
+				continue;
+			}		
+			/*
+			 * Decrement
+			 * 
+			 */
+			if (checkpoint->referenceCount > 0) {
+				checkpoint->referenceCount -= checkpoint->ckpt_refcount[index].count;
+			} else {
+				/*TODO Log Error here*/				
+			}						
+			checkpoint->ckpt_refcount[index].count = 0;
+			memset((char*)&checkpoint->ckpt_refcount[index].addr, 0, sizeof(struct in_addr));
+		}
+		member++;
+	}
+	return;
+}
+
+static int ckpt_confchg_fn (
+	enum totempg_configuration_type configuration_type,
+	struct in_addr *member_list, void *member_list_private,
+		int member_list_entries,
+	struct in_addr *left_list, void *left_list_private,
+		int left_list_entries,
+	struct in_addr *joined_list, void *joined_list_private,
+		int joined_list_entries,
+	struct memb_ring_id *ring_id) {
+
+	if (configuration_type == TOTEMPG_CONFIGURATION_REGULAR) {
+#ifdef TODO	
+		totempg_recovery_plug_unplug (ckpt_checkpoint_recovery_plug_handle);	
+#endif
+		memcpy (&saved_ring_id, ring_id, sizeof(struct memb_ring_id));
+	}	
+
+	else if (configuration_type == TOTEMPG_CONFIGURATION_TRANSITIONAL) {
+		ckpt_recovery_process_members_exit(left_list, left_list_entries);
+		ckpt_recovery_inititialize ();
+	}
+	
+	return (0);
+}
 
 static struct saCkptCheckpoint *ckpt_checkpoint_find_global (SaNameT *name)
 {
-	struct list_head *checkpointList;
+	struct list_head *checkpoint_list;
 	struct saCkptCheckpoint *checkpoint;
 
-   for (checkpointList = checkpointListHead.next;
-        checkpointList != &checkpointListHead;
-        checkpointList = checkpointList->next) {
+   for (checkpoint_list = checkpoint_list_head.next;
+        checkpoint_list != &checkpoint_list_head;
+        checkpoint_list = checkpoint_list->next) {
 
-        checkpoint = list_entry (checkpointList,
+        checkpoint = list_entry (checkpoint_list,
             struct saCkptCheckpoint, list);
 
 		if (name_match (name, &checkpoint->name)) {
@@ -320,15 +723,15 @@
 	char *id,
 	int idLen)
 {
-	struct list_head *checkpointSectionList;
+	struct list_head *checkpoint_section_list;
 	struct saCkptCheckpointSection *ckptCheckpointSection;
 
 	log_printf (LOG_LEVEL_DEBUG, "Finding checkpoint section id %s %d\n", id, idLen);
-	for (checkpointSectionList = ckptCheckpoint->checkpointSectionsListHead.next;
-		checkpointSectionList != &ckptCheckpoint->checkpointSectionsListHead;
-		checkpointSectionList = checkpointSectionList->next) {
+	for (checkpoint_section_list = ckptCheckpoint->checkpointSectionsListHead.next;
+		checkpoint_section_list != &ckptCheckpoint->checkpointSectionsListHead;
+		checkpoint_section_list = checkpoint_section_list->next) {
 
-		ckptCheckpointSection = list_entry (checkpointSectionList,
+		ckptCheckpointSection = list_entry (checkpoint_section_list,
 			struct saCkptCheckpointSection, list);
 	
 		log_printf (LOG_LEVEL_DEBUG, "Checking section id %*s\n", 
@@ -404,9 +807,12 @@
 
 static int ckpt_exec_init_fn (void)
 {
-	// Initialize the saved ring ID.
+	/*
+	 *  Initialize the saved ring ID.
+	 */
 	saved_ring_id.seq = 0;
-	saved_ring_id.rep.s_addr = this_ip.sin_addr.s_addr;	
+	saved_ring_id.rep.s_addr = this_ip.sin_addr.s_addr;		
+	
 #ifdef TODO
 	int res;
 	res = totempg_recovery_plug_create (&ckpt_checkpoint_recovery_plug_handle);
@@ -445,9 +851,11 @@
 	}
 
 #ifdef TODO
-/* todo close section iterators
+/* TODO close section iterators
+ */
+/* 
+ * TODO what about exit of open checkpoints
  */
-// TODO what about exit of open checkpoints
 
 	if (conn_info->ais_ci.u.libckpt_ci.sectionIterator.sectionIteratorEntries) {
 		free (conn_info->ais_ci.u.libckpt_ci.sectionIterator.sectionIteratorEntries);
@@ -511,10 +919,11 @@
 		ckptCheckpoint->unlinked = 0;
 		list_init (&ckptCheckpoint->list);
 		list_init (&ckptCheckpoint->checkpointSectionsListHead);
-		list_add (&ckptCheckpoint->list, &checkpointListHead);
+		list_add (&ckptCheckpoint->list, &checkpoint_list_head);
 		ckptCheckpoint->referenceCount = 0;
 		ckptCheckpoint->retention_timer = 0;
 		ckptCheckpoint->expired = 0;
+		initialize_ckpt_refcount_array(ckptCheckpoint->ckpt_refcount);
 
 		/*
 		 * Add in default checkpoint section
@@ -530,7 +939,7 @@
 		ckptCheckpointSection->sectionDescriptor.sectionSize = 0;
 		ckptCheckpointSection->sectionDescriptor.expirationTime = SA_TIME_END;
 		ckptCheckpointSection->sectionDescriptor.sectionState = SA_CKPT_SECTION_VALID;
-		ckptCheckpointSection->sectionDescriptor.lastUpdate = 0; // current time
+		ckptCheckpointSection->sectionDescriptor.lastUpdate = 0; /*current time*/
 		ckptCheckpointSection->sectionData = 0;
 		ckptCheckpointSection->expiration_timer = 0;
 	}
@@ -548,7 +957,26 @@
 	 */
 	log_printf (LOG_LEVEL_DEBUG, "CHECKPOINT opened is %p\n", ckptCheckpoint);
 	ckptCheckpoint->referenceCount += 1;
-
+	
+	/*
+	 * Add the connection reference information to the Checkpoint to be
+	 * sent out later as a part of the sync process.
+	 * 
+	 */
+	 
+	 int proc_index = findProcessorIndex(&source_addr,ckptCheckpoint->ckpt_refcount);
+	 if (proc_index == -1) {/* Could not find, lets set the processor to an index.*/
+	 	proc_index = setProcessorIndex(&source_addr,ckptCheckpoint->ckpt_refcount);
+	 }
+	 if (proc_index != -1 ) {	 
+	 	ckptCheckpoint->ckpt_refcount[proc_index].addr = source_addr;
+	 	ckptCheckpoint->ckpt_refcount[proc_index].count++;
+	 }
+	 else {
+	 	log_printf (LOG_LEVEL_ERROR, 
+	 				"CKPT: MAX LIMIT OF PROCESSORS reached. Cannot store new proc %p info.\n", 
+	 				ckptCheckpoint);
+	 }
 	/*
 	 * Reset retention duration since this checkpoint was just opened
 	 */
@@ -581,10 +1009,62 @@
 			sizeof (struct res_lib_ckpt_checkpointopen));
 	}
 
-//	return (error == SA_AIS_OK ? 0 : -1);
+/*	return (error == SA_AIS_OK ? 0 : -1); */
+	return (0);
+}
+
+/**/
+static int message_handler_req_exec_ckpt_synchronize_state (void *message, struct in_addr source_addr, int endian_conversion_required) {
+	int retcode;
+	struct req_exec_ckpt_checkpointopen request_open_exec;	
+	struct req_lib_ckpt_checkpointopen request_open_lib;		
+	struct req_exec_ckpt_synchronize_state *req_exec_ckpt_sync_state 
+					= (struct req_exec_ckpt_synchronize_state *)message;
+					
+	/*
+	 * If the Incoming message's previous ring id == saved_ring_id
+	 * Ignore because we have seen this message before.
+	 */
+	if (memcmp (&req_exec_ckpt_sync_state->previous_ring_id, &saved_ring_id,sizeof (struct memb_ring_id)) != 0) {
+			return(0);
+	}
+	request_open_lib.checkpointName = req_exec_ckpt_sync_state->checkpointName;
+	request_open_lib.checkpointCreationAttributes = req_exec_ckpt_sync_state->checkpointCreationAttributes;
+	request_open_exec.req_lib_ckpt_checkpointopen = request_open_lib;
+	message_handler_req_exec_ckpt_checkpointopen(&request_open_exec, req_exec_ckpt_sync_state->source_addr, 0);
+
+	retcode = recovery_section_create (&req_exec_ckpt_sync_state->sectionDescriptor,
+										&req_exec_ckpt_sync_state->checkpointName);
+	if (retcode != SA_AIS_OK) {
+		log_printf(LOG_LEVEL_ERROR, "CKPT: message_handler_req_exec_ckpt_synchronize_state\n");
+		log_printf(LOG_LEVEL_ERROR, "CKPT: recovery_section_create returned %d\n",retcode);
+	}
+	
 	return (0);
 }
 
+static int message_handler_req_exec_ckpt_synchronize_section (void *message, struct in_addr source_addr, int endian_conversion_required) {
+	int retcode;
+	struct req_exec_ckpt_synchronize_section *req_exec_ckpt_sync_section 
+					= (struct req_exec_ckpt_synchronize_section *)message;
+	/*
+	 * Write the contents of the section to the checkpoint section.
+	 */
+	retcode = recovery_section_write(&req_exec_ckpt_sync_section->sectionId, 
+							&req_exec_ckpt_sync_section->checkpointName,
+							(char*)req_exec_ckpt_sync_section
+								+ sizeof (struct req_exec_ckpt_synchronize_section), 
+							req_exec_ckpt_sync_section->dataOffSet, 
+							req_exec_ckpt_sync_section->dataSize);
+	if (retcode != SA_AIS_OK) {
+		log_printf(LOG_LEVEL_ERROR, "CKPT: message_handler_req_exec_ckpt_synchronize_section\n");
+		log_printf(LOG_LEVEL_ERROR, "CKPT: recovery_section_write returned %d\n",retcode);
+	}
+	
+	return (0);
+}
+
+
 unsigned int abstime_to_msec (SaTimeT time)
 {
 	struct timeval tv;
@@ -644,6 +1124,20 @@
 	}
 
 	checkpoint->referenceCount--;
+	/*
+	 * Modify the connection reference information to the Checkpoint to be
+	 * sent out later as a part of the sync process.	 
+	 */
+	
+	int proc_index = findProcessorIndex(&source_addr, checkpoint->ckpt_refcount);
+	if (proc_index != -1 ) {	 		
+	 	checkpoint->ckpt_refcount[proc_index].count--;
+	}
+	else {
+		log_printf (LOG_LEVEL_ERROR, 
+	 				"CKPT: Could Not find Processor Info %p info.\n", 
+	 				checkpoint);
+	}
 	assert (checkpoint->referenceCount >= 0);
 	log_printf (LOG_LEVEL_DEBUG, "disconnect called, new CKPT ref count is %d\n", 
 		checkpoint->referenceCount);
@@ -652,10 +1146,10 @@
 	 * If checkpoint has been unlinked and this is the last reference, delete it
 	 */
 	if (checkpoint->unlinked && checkpoint->referenceCount == 0) {
-		log_printf (LOG_LEVEL_DEBUG, "Unlinking checkpoint.\n");
+		log_printf (LOG_LEVEL_DEBUG, "Unlinking checkpoint.\n");		
 		checkpoint_release (checkpoint);
 	} else
-	if (checkpoint->referenceCount == 0) {
+	if (checkpoint->referenceCount == 0) {		
 		poll_timer_add (aisexec_poll_handle,
 			checkpoint->checkpointCreationAttributes.retentionDuration / 1000000,
 			checkpoint,
@@ -791,6 +1285,96 @@
 	return (0);
 }
 
+static int recovery_section_create (SaCkptSectionDescriptorT *sectionDescriptor,
+									SaNameT *checkpointName) {
+	struct saCkptCheckpoint *ckptCheckpoint;
+	struct saCkptCheckpointSection *ckptCheckpointSection;
+	void *initialData;
+	void *sectionId;
+	SaErrorT error = SA_AIS_OK;		
+	
+	ckptCheckpoint = ckpt_checkpoint_find_global (checkpointName);
+	if (ckptCheckpoint == 0) {		
+		error = SA_AIS_ERR_NOT_EXIST;
+		goto error_exit;
+	}
+
+	/*
+	 * Determine if user-specified checkpoint ID already exists
+	 */	
+	ckptCheckpointSection = ckpt_checkpoint_find_globalSection (ckptCheckpoint,
+								((char *)sectionDescriptor->sectionId.id),
+								(int)sectionDescriptor->sectionId.idLen);
+	if (ckptCheckpointSection) {
+		error = SA_AIS_ERR_EXIST;
+		goto error_exit;
+	}
+
+	/*
+	 * Allocate checkpoint section	
+	 */
+	ckptCheckpointSection = malloc (sizeof (struct saCkptCheckpointSection));
+	if (ckptCheckpointSection == 0) {
+		error = SA_AIS_ERR_NO_MEMORY;
+		goto error_exit;
+	}
+	/*
+	 * Allocate checkpoint section data
+	 */
+	initialData = malloc (sectionDescriptor->sectionSize);
+	if (initialData == 0) {
+		free (ckptCheckpointSection);
+		error = SA_AIS_ERR_NO_MEMORY;
+		goto error_exit;
+	}
+	/*
+	 * Allocate checkpoint section id
+	 */
+	sectionId = malloc ((int)sectionDescriptor->sectionId.idLen);
+	if (sectionId == 0) {
+		free (ckptCheckpointSection);
+		free (initialData);
+		error = SA_AIS_ERR_NO_MEMORY;
+		goto error_exit;
+	}
+	/*
+	 * Copy checkpoint section ID and initialize data.
+	 */
+	memcpy (sectionId, ((char *)sectionDescriptor->sectionId.id),
+		(int)sectionDescriptor->sectionId.idLen);
+	
+	memset (initialData, 0, sectionDescriptor->sectionSize);
+	
+	/*
+	 * Configure checkpoint section
+	 */
+	memcpy(&ckptCheckpointSection->sectionDescriptor, 
+			sectionDescriptor,
+			sizeof(SaCkptSectionDescriptorT));	
+	ckptCheckpointSection->sectionDescriptor.sectionState = SA_CKPT_SECTION_VALID;	
+	ckptCheckpointSection->sectionData = initialData;
+	ckptCheckpointSection->expiration_timer = 0;
+
+	if (sectionDescriptor->expirationTime != SA_TIME_END) {
+		poll_timer_add (aisexec_poll_handle,
+			abstime_to_msec (ckptCheckpointSection->sectionDescriptor.expirationTime),
+			ckptCheckpointSection,
+			timer_function_section_expire,
+			&ckptCheckpointSection->expiration_timer);
+	}
+
+	/*
+	 * Add checkpoint section to checkpoint
+	 */
+	list_init (&ckptCheckpointSection->list);
+	list_add (&ckptCheckpointSection->list,
+		&ckptCheckpoint->checkpointSectionsListHead);
+
+error_exit:
+	return (error);				
+
+}
+
 static int message_handler_req_exec_ckpt_sectioncreate (void *message, struct in_addr source_addr, int endian_conversion_required) {
 	struct req_exec_ckpt_sectioncreate *req_exec_ckpt_sectioncreate = (struct req_exec_ckpt_sectioncreate *)message;
 	struct req_lib_ckpt_sectioncreate *req_lib_ckpt_sectioncreate = (struct req_lib_ckpt_sectioncreate *)&req_exec_ckpt_sectioncreate->req_lib_ckpt_sectioncreate;
@@ -804,7 +1388,7 @@
 	log_printf (LOG_LEVEL_DEBUG, "Executive request to create a checkpoint section.\n");
 	ckptCheckpoint = ckpt_checkpoint_find_global (&req_exec_ckpt_sectioncreate->checkpointName);
 	if (ckptCheckpoint == 0) {
-		error = SA_AIS_ERR_LIBRARY; // TODO find the right error for this
+		error = SA_AIS_ERR_LIBRARY; /* TODO find the right error for this*/
 		goto error_exit;
 	}
 
@@ -867,7 +1451,7 @@
 	ckptCheckpointSection->sectionDescriptor.sectionSize = req_lib_ckpt_sectioncreate->initialDataSize;
 	ckptCheckpointSection->sectionDescriptor.expirationTime = req_lib_ckpt_sectioncreate->expirationTime;
 	ckptCheckpointSection->sectionDescriptor.sectionState = SA_CKPT_SECTION_VALID;
-	ckptCheckpointSection->sectionDescriptor.lastUpdate = 0; // TODO current time
+	ckptCheckpointSection->sectionDescriptor.lastUpdate = 0; /* TODO current time */
 	ckptCheckpointSection->sectionData = initialData;
 	ckptCheckpointSection->expiration_timer = 0;
 
@@ -1015,6 +1599,58 @@
 	return (0);
 }
 
+static int recovery_section_write(SaCkptSectionIdT *sectionId,
+									SaNameT *checkpointName,
+									void *newData,
+									SaUint32T dataOffSet,
+									SaUint32T dataSize) {
+	struct saCkptCheckpoint *ckptCheckpoint;
+	struct saCkptCheckpointSection *ckptCheckpointSection;
+	int sizeRequired;	
+	SaErrorT error = SA_AIS_OK;
+	char *sd;	
+	
+	log_printf (LOG_LEVEL_DEBUG, "CKPT: recovery_section_write.\n");
+	ckptCheckpoint = ckpt_checkpoint_find_global (checkpointName);
+	if (ckptCheckpoint == 0) {
+		error = SA_AIS_ERR_NOT_EXIST;
+		goto error_exit;
+	}
+
+	/*
+	 * Find checkpoint section to be written
+	 */
+	ckptCheckpointSection = ckpt_checkpoint_find_globalSection (ckptCheckpoint,
+								((char *)sectionId->id),
+								(int)sectionId->idLen);
+	if (ckptCheckpointSection == 0) {		
+		error = SA_AIS_ERR_NOT_EXIST;
+		goto error_exit;
+	}
+
+	/*
+	 * If write would extend past end of section data, return error;
+	 */
+	sizeRequired = dataOffSet + dataSize;
+	if (sizeRequired > ckptCheckpointSection->sectionDescriptor.sectionSize) {
+		error = SA_AIS_ERR_ACCESS;
+		goto error_exit;		
+	}
+	
+	/*
+	 * Write checkpoint section to section data
+	 */
+	if (dataSize > 0) {			
+		sd = (char *)ckptCheckpointSection->sectionData;
+		memcpy (&sd[dataOffSet],
+			newData,
+			dataSize);
+	}	
+error_exit:	
+	return (error);	
+}
+
+
 static int message_handler_req_exec_ckpt_sectionwrite (void *message, struct in_addr source_addr, int endian_conversion_required) {
 	struct req_exec_ckpt_sectionwrite *req_exec_ckpt_sectionwrite = (struct req_exec_ckpt_sectionwrite *)message;
 	struct req_lib_ckpt_sectionwrite *req_lib_ckpt_sectionwrite = (struct req_lib_ckpt_sectionwrite *)&req_exec_ckpt_sectionwrite->req_lib_ckpt_sectionwrite;
@@ -1032,7 +1668,9 @@
 		goto error_exit;
 	}
 
-//printf ("writing checkpoint section is %s\n", ((char *)req_lib_ckpt_sectionwrite) + sizeof (struct req_lib_ckpt_sectionwrite));
+/*
+	printf ("writing checkpoint section is %s\n", ((char *)req_lib_ckpt_sectionwrite) + sizeof (struct req_lib_ckpt_sectionwrite));
+*/
 	/*
 	 * Find checkpoint section to be written
 	 */
@@ -1147,7 +1785,7 @@
 	 */
 	ckptCheckpointSection->sectionDescriptor.sectionSize = req_lib_ckpt_sectionoverwrite->dataSize;
 	ckptCheckpointSection->sectionDescriptor.sectionState = SA_CKPT_SECTION_VALID;
-	ckptCheckpointSection->sectionDescriptor.lastUpdate = 0; // TODO current time
+	ckptCheckpointSection->sectionDescriptor.lastUpdate = 0; /* TODO current time */
 	ckptCheckpointSection->sectionData = sectionData;
 
 	/*
@@ -1178,7 +1816,7 @@
 
 	ckptCheckpoint = ckpt_checkpoint_find_global (&req_exec_ckpt_sectionread->checkpointName);
 	if (ckptCheckpoint == 0) {
-		error = SA_AIS_ERR_LIBRARY; // TODO find the right error for this
+		error = SA_AIS_ERR_LIBRARY; /* TODO find the right error for this */
 		goto error_exit;
 	}
 
@@ -1258,7 +1896,7 @@
 		conn_info->ais_ci.u.libckpt_ci.sectionIterator.iteratorCount = 0;
 		conn_info->ais_ci.u.libckpt_ci.sectionIterator.iteratorPos = 0;
 		list_add (&conn_info->ais_ci.u.libckpt_ci.sectionIterator.list,
-			&checkpointIteratorListHead);
+			&checkpoint_iterator_list_head);
 		list_init (&conn_info->ais_ci.u.libckpt_ci.checkpoint_list);
 		error = SA_AIS_OK;
 	}
@@ -1400,7 +2038,7 @@
 	struct saCkptCheckpoint *checkpoint;
 	int memoryUsed = 0;
 	int numberOfSections = 0;
-	struct list_head *checkpointSectionList;
+	struct list_head *checkpoint_section_list;
 	struct saCkptCheckpointSection *checkpointSection;
 
 	log_printf (LOG_LEVEL_DEBUG, "in status get\n");
@@ -1410,11 +2048,11 @@
 	 */
 	checkpoint = ckpt_checkpoint_find_global (&req_lib_ckpt_checkpointstatusget->checkpointName);
 
-	for (checkpointSectionList = checkpoint->checkpointSectionsListHead.next;
-		checkpointSectionList != &checkpoint->checkpointSectionsListHead;
-		checkpointSectionList = checkpointSectionList->next) {
+	for (checkpoint_section_list = checkpoint->checkpointSectionsListHead.next;
+		checkpoint_section_list != &checkpoint->checkpointSectionsListHead;
+		checkpoint_section_list = checkpoint_section_list->next) {
 
-		checkpointSection = list_entry (checkpointSectionList,
+		checkpointSection = list_entry (checkpoint_section_list,
 			struct saCkptCheckpointSection, list);
 
 		memoryUsed += checkpointSection->sectionDescriptor.sectionSize;
@@ -1614,7 +2252,9 @@
 	iovecs[1].iov_base = ((char *)req_lib_ckpt_sectionwrite) + sizeof (struct req_lib_ckpt_sectionwrite);
 	iovecs[1].iov_len = req_lib_ckpt_sectionwrite->header.size - sizeof (struct req_lib_ckpt_sectionwrite);
 
-//printf ("LIB writing checkpoint section is %s\n", ((char *)req_lib_ckpt_sectionwrite) + sizeof (struct req_lib_ckpt_sectionwrite));
+/*
+	printf ("LIB writing checkpoint section is %s\n", ((char *)req_lib_ckpt_sectionwrite) + sizeof (struct req_lib_ckpt_sectionwrite));
+*/
 	if (iovecs[1].iov_len > 0) {
 		assert (totempg_mcast (iovecs, 2, TOTEMPG_AGREED) == 0);
 	} else {
@@ -1730,7 +2370,7 @@
 	struct saCkptCheckpointSection *ckptCheckpointSection;
 	struct saCkptSectionIteratorEntry *ckptSectionIteratorEntries;
 	struct saCkptSectionIterator *ckptSectionIterator;
-	struct list_head *checkpointSectionList;
+	struct list_head *checkpoint_section_list;
 	int addEntry = 0;
 	int iteratorEntries = 0;
 	SaErrorT error = SA_AIS_OK;
@@ -1747,11 +2387,11 @@
 	/*
 	 * Iterate list of checkpoint sections
 	 */
-	for (checkpointSectionList = ckptCheckpoint->checkpointSectionsListHead.next;
-		checkpointSectionList != &ckptCheckpoint->checkpointSectionsListHead;
-		checkpointSectionList = checkpointSectionList->next) {
+	for (checkpoint_section_list = ckptCheckpoint->checkpointSectionsListHead.next;
+		checkpoint_section_list != &ckptCheckpoint->checkpointSectionsListHead;
+		checkpoint_section_list = checkpoint_section_list->next) {
 
-		ckptCheckpointSection = list_entry (checkpointSectionList,
+		ckptCheckpointSection = list_entry (checkpoint_section_list,
 			struct saCkptCheckpointSection, list);
 
 		addEntry = 1;
diff -uNr --exclude=SCCS --exclude=BitKeeper --exclude=ChangeSet --exclude=init --exclude=LICENSE --exclude=Makefile --exclude=man --exclude=README.devmap --exclude=SECURITY --exclude=TODO --exclude=CHANGELOG --exclude=conf --exclude=loc --exclude=Makefile.samples --exclude=QUICKSTART --exclude=test --exclude=.cdtproject --exclude=.project ../latest/include/ipc_ckpt.h ../bk_openais/include/ipc_ckpt.h
--- ../latest/include/ipc_ckpt.h	2005-02-25 14:07:12.000000000 -0600
+++ ../bk_openais/include/ipc_ckpt.h	2005-02-24 14:55:42.000000000 -0600
@@ -37,6 +37,8 @@
 #include "../include/ipc_gen.h"
 #include "../include/ais_types.h"
 #include "../include/saCkpt.h"
+#include "../exec/totemsrp.h"
+#include "../exec/ckpt.h"
 
 enum req_lib_ckpt_checkpoint_types {
 	MESSAGE_REQ_CKPT_CHECKPOINT_CHECKPOINTOPEN = 1,
@@ -326,4 +328,24 @@
 	struct res_header header;
 };
 
+struct req_exec_ckpt_synchronize_state {
+	struct req_header header;
+	struct memb_ring_id previous_ring_id;
+	SaNameT checkpointName;
+	SaCkptCheckpointCreationAttributesT checkpointCreationAttributes;
+	SaCkptSectionDescriptorT sectionDescriptor;	
+	struct in_addr source_addr;
+	SaUint32T ref_count;	
+};
+
+struct req_exec_ckpt_synchronize_section {
+	struct req_header header;
+	SaNameT checkpointName;
+	SaCkptSectionIdT sectionId;	
+	SaUint32T dataOffSet;
+	SaUint32T dataSize;	
+};
+
+
+
 #endif /* IPC_CKPT_H_DEFINED */
diff -uNr --exclude=SCCS --exclude=BitKeeper --exclude=ChangeSet --exclude=init --exclude=LICENSE --exclude=Makefile --exclude=man --exclude=README.devmap --exclude=SECURITY --exclude=TODO --exclude=CHANGELOG --exclude=conf --exclude=loc --exclude=Makefile.samples --exclude=QUICKSTART --exclude=test --exclude=.cdtproject --exclude=.project ../latest/include/ipc_gen.h ../bk_openais/include/ipc_gen.h
--- ../latest/include/ipc_gen.h	2005-02-25 14:07:12.000000000 -0600
+++ ../bk_openais/include/ipc_gen.h	2005-02-24 13:42:21.000000000 -0600
@@ -69,6 +69,8 @@
 	MESSAGE_REQ_EXEC_CKPT_SECTIONWRITE,
 	MESSAGE_REQ_EXEC_CKPT_SECTIONOVERWRITE,
 	MESSAGE_REQ_EXEC_CKPT_SECTIONREAD,
+	MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESTATE,
+	MESSAGE_REQ_EXEC_CKPT_SYNCHRONIZESECTION,
 	MESSAGE_REQ_EXEC_EVT_EVENTDATA,
 	MESSAGE_REQ_EXEC_EVT_CHANCMD,
 	MESSAGE_REQ_EXEC_EVT_RECOVERY_EVENTDATA



_______________________________________________
Openais mailing list
Openais@lists.osdl.org
http://lists.osdl.org/mailman/listinfo/openais

