
List:       linux-ha-dev
Subject:    RE: [Linux-ha-dev] data checkpoint design document (draft)
From:       "Pan, Deng" <deng.pan () intel ! com>
Date:       2003-07-17 3:56:08

The spec's definition of "partial update" is not clear enough, so I will
not distinguish between SA_CKPT_WR_ACTIVE_REPLICAS and
SA_CKPT_WR_ACTIVE_REPLICAS_WEAK. The following is what I am trying to
implement.

1. The checkpoint is created with SA_CKPT_WR_ALL_REPLICAS
If the update fails on any replica (active or standby), an error is
returned to the client, and the replicas that were already updated
successfully have to be rolled back. Otherwise, OK is returned to the
client.
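
The all-replicas rule above can be sketched roughly as follows. This is
only an illustration, not the actual implementation: the Replica class,
the write_all_replicas function and the "OK"/"ERROR" return strings are
hypothetical names, and replicas are modelled as plain in-memory
dictionaries.

```python
class Replica:
    """Hypothetical in-memory replica; not part of any real checkpoint API."""

    def __init__(self, name, fail_on_write=False):
        self.name = name
        self.fail_on_write = fail_on_write  # simulate an update failure
        self.sections = {}                  # section id -> data

    def write(self, section_id, data):
        if self.fail_on_write:
            raise IOError(f"write failed on {self.name}")
        self.sections[section_id] = data


_MISSING = object()  # marks "section did not exist before the update"


def write_all_replicas(replicas, section_id, data):
    """Apply the update on every replica, or undo it everywhere on failure."""
    undo = []  # (replica, previous value) for each replica already updated
    for replica in replicas:
        previous = replica.sections.get(section_id, _MISSING)
        try:
            replica.write(section_id, data)
        except IOError:
            # Roll back the replicas that were updated successfully.
            for done, old in reversed(undo):
                if old is _MISSING:
                    done.sections.pop(section_id, None)
                else:
                    done.sections[section_id] = old
            return "ERROR"
        undo.append((replica, previous))
    return "OK"
```

The sketch keeps the previous value of the section per replica so that a
failure on a later replica can restore the earlier ones before the error
is reported to the client.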

2. The checkpoint is created with SA_CKPT_WR_ACTIVE_REPLICAS
If the update fails on the active replica, the update is rolled back and
an error is returned to the client. Otherwise, OK is returned to the
client.
If the update fails on a standby replica, that replica tries to recover
by fetching a full copy of the data from the active replica. If the
recovery succeeds, everything is OK. If the recovery fails, the section
is marked as "corrupted". The checkpoint can still be read and written
because the active replica is available. However, if the active replica
fails and the replica that holds the corrupted section becomes the
active replica, that corrupted section can no longer be read or written.
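
A minimal sketch of the active-replica rule above, under the same kind
of assumptions: Replica, recover_from, write_active_replica and the
"OK"/"ERROR" return strings are hypothetical names chosen for
illustration, and failures are simulated with flags rather than real
network or storage errors.

```python
class Replica:
    """Hypothetical in-memory replica used only for illustration."""

    def __init__(self, name, fail_on_write=False, fail_on_recover=False):
        self.name = name
        self.fail_on_write = fail_on_write
        self.fail_on_recover = fail_on_recover
        self.sections = {}      # section id -> data
        self.corrupted = set()  # ids of sections marked "corrupted"

    def write(self, section_id, data):
        if self.fail_on_write:
            raise IOError(f"write failed on {self.name}")
        self.sections[section_id] = data

    def recover_from(self, active):
        """Replace local data with a full copy taken from the active replica."""
        if self.fail_on_recover:
            raise IOError(f"recovery failed on {self.name}")
        self.sections = dict(active.sections)
        self.corrupted.clear()

    def read(self, section_id):
        if section_id in self.corrupted:
            raise IOError(f"section {section_id!r} is corrupted on {self.name}")
        return self.sections[section_id]


def write_active_replica(active, standbys, section_id, data):
    """The client result depends only on the active replica."""
    had_old = section_id in active.sections
    old = active.sections.get(section_id)
    try:
        active.write(section_id, data)
    except IOError:
        # Roll back the active replica and report the error.
        if had_old:
            active.sections[section_id] = old
        else:
            active.sections.pop(section_id, None)
        return "ERROR"
    for standby in standbys:
        try:
            standby.write(section_id, data)
        except IOError:
            try:
                standby.recover_from(active)        # full copy from active
            except IOError:
                standby.corrupted.add(section_id)   # recovery failed
    return "OK"
```

In this sketch a standby failure never changes what the client sees. The
"corrupted" mark only matters later, when a read or write reaches a
replica that holds the marked section, for example after it has been
promoted to active.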


As for version control, there is a version number in every message, so
that the software can be upgraded live.
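
One way a per-message version number could be used is sketched below.
The header layout (a 16-bit version followed by a 16-bit payload
length), the constants and the function names are all assumptions made
for illustration; the actual message format is not described here.

```python
import struct

PROTOCOL_VERSION = 2        # hypothetical version spoken by this node
MIN_SUPPORTED_VERSION = 1   # hypothetical oldest version still understood
_HEADER = struct.Struct("!HH")  # version, payload length (network byte order)


def encode_message(payload, version=PROTOCOL_VERSION):
    """Prefix the payload with a header that carries the version number."""
    return _HEADER.pack(version, len(payload)) + payload


def decode_message(raw):
    """Return (version, payload), rejecting versions this node cannot handle."""
    version, length = _HEADER.unpack_from(raw)
    if not MIN_SUPPORTED_VERSION <= version <= PROTOCOL_VERSION:
        raise ValueError(f"unsupported message version {version}")
    payload = raw[_HEADER.size:_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return version, payload
```

Because the receiver learns the sender's version from each message, an
upgraded node can keep accepting messages from nodes that still run the
older release while the cluster is upgraded one node at a time.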


Pan, Deng
------------------------------------------------
Opinions are of my own, not of my employer
 
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.community.tummy.com
http://lists.community.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
