List: drbd-user
Subject: [DRBD-user] Is it normal that we can't directly remove the secondary node when fencing is set?
From: <mzlld1988 () 163 ! com>
Date: 2016-09-10 6:46:01
Message-ID: 7976b4af.38c6.15712d88101.Coremail.mzlld1988 () 163 ! com
Hi everyone,

I have a question about removing a secondary node in DRBD 9.
When fencing is set, is it normal that we cannot take down a secondary node in DRBD 9,
although the same operation succeeds in DRBD 8.4.6?

The DRBD kernel source is the newest version (9.0.4-1). The DRBD utils version is 8.9.6.

Description:
There are 3 nodes. One of them is primary, the disk state is UpToDate, and fencing is set.
I get the error message 'State change failed: (-7) State change was refused by
peer node' when executing the command 'drbdadm down <res-name>' on any of the
secondary nodes.

Analysis:
When the down command is executed on one of the secondary nodes,
that node runs the function change_cluster_wide_state() in drbd_state.c:

    change_cluster_wide_state()
    {
        ...
        if (have_peers) {
            if (wait_event_timeout(resource->state_wait,
                                   cluster_wide_reply_ready(resource),
                                   twopc_timeout(resource))) {
                /* ① Wait for the peer node to reply; the thread sleeps
                 *   until the peer node replies. */
                rv = get_cluster_wide_reply(resource);  /* ② Get the reply info. */
            } else {
            }
        }
        ...
    }

Process ①
The primary node runs through the following call chain:

    ... -> try_state_change -> is_valid_soft_transition -> __is_valid_soft_transition

In the end, __is_valid_soft_transition returns the error code SS_PRIMARY_NOP:

    if (peer_device->connection->fencing_policy >= FP_RESOURCE &&
        !(role[OLD] == R_PRIMARY && repl_state[OLD] < L_ESTABLISHED &&
          !(peer_disk_state[OLD] <= D_OUTDATED)) &&
        (role[NEW] == R_PRIMARY && repl_state[NEW] < L_ESTABLISHED &&
         !(peer_disk_state[NEW] <= D_OUTDATED)))
            return SS_PRIMARY_NOP;

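To make the condition above easier to reason about, here is a minimal standalone sketch of it. The enums are simplified stand-ins, not the real kernel definitions: only their relative order matters, and that order (in particular D_OUTDATED < D_UNKNOWN and L_OFF < L_ESTABLISHED) is an assumption based on my reading of the headers. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-ins for the DRBD enums (assumed ordering, not the real values). */
enum fencing_policy { FP_DONT_CARE, FP_RESOURCE, FP_STONITH };
enum role { R_UNKNOWN, R_PRIMARY, R_SECONDARY };
enum repl_state { L_OFF, L_ESTABLISHED };
enum disk_state { D_DISKLESS, D_INCONSISTENT, D_OUTDATED, D_UNKNOWN, D_UP_TO_DATE };

/* Hypothetical snapshot of the values the check looks at (OLD or NEW). */
struct state_view {
    enum role role;            /* role of the node evaluating the check */
    enum repl_state repl;      /* replication state towards the peer */
    enum disk_state peer_disk; /* disk state of that peer */
};

/* True if we are Primary, not replicating, and the peer disk is not
 * known to be Outdated or worse. */
static bool primary_without_outdated_peer(struct state_view s)
{
    return s.role == R_PRIMARY && s.repl < L_ESTABLISHED &&
           !(s.peer_disk <= D_OUTDATED);
}

/* Mirrors the quoted condition: with fencing enabled, refuse a
 * transition that newly enters the situation above. */
static bool refused_with_primary_nop(enum fencing_policy fp,
                                     struct state_view old_s,
                                     struct state_view new_s)
{
    return fp >= FP_RESOURCE &&
           !primary_without_outdated_peer(old_s) &&
           primary_without_outdated_peer(new_s);
}
```

With old = {R_PRIMARY, L_ESTABLISHED, D_UP_TO_DATE} and new = {R_PRIMARY, L_OFF, D_UNKNOWN}, the sketch refuses the transition. If the new peer disk state were D_OUTDATED instead, it would be accepted.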
The primary node then sets the reply packet to P_TWOPC_NO. When the secondary node
receives this reply, it sets the connection status to TWOPC_NO. At this point, process ①
finishes.

Process ②
rv is set to SS_CW_FAILED_BY_PEER.
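
The way I understand processes ① and ②, a single "no" from any peer is enough to fail the whole cluster-wide state change. Below is a rough sketch of that decision only, not the real get_cluster_wide_reply(). TWOPC_PENDING, the function name, and the enum values are hypothetical; SS_SUCCESS and SS_TIMEOUT are assumed to be the other possible outcomes.

```c
#include <stdbool.h>

/* Hypothetical per-connection reply status; TWOPC_PENDING means no reply yet. */
enum twopc_reply { TWOPC_PENDING, TWOPC_YES, TWOPC_NO };

/* Placeholder result codes; the numeric values are not the kernel's. */
enum state_rv { SS_SUCCESS, SS_TIMEOUT, SS_CW_FAILED_BY_PEER };

/* Combine the replies of all peers into one result for the initiator. */
static enum state_rv cluster_wide_result(const enum twopc_reply *replies, int n)
{
    bool pending = false;

    for (int i = 0; i < n; i++) {
        if (replies[i] == TWOPC_NO)
            return SS_CW_FAILED_BY_PEER; /* one refusal vetoes the change */
        if (replies[i] == TWOPC_PENDING)
            pending = true;
    }
    /* No refusal: succeed only if every peer has answered. */
    return pending ? SS_TIMEOUT : SS_SUCCESS;
}
```

In my 3-node case the other secondary may well answer yes, but the primary's TWOPC_NO alone makes the down command fail.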

==== Version 8.4.6 ====
One node is primary, the other is secondary.
When 'drbdadm down <res-name>' is executed on the secondary node, the same error
message is recorded in the log file on the first attempt, which tries to change the
peer disk state to D_UNKNOWN.
But the command then succeeds on the second attempt, which changes the peer disk
state to D_OUTDATED.

The following code reports the error:

    is_valid_state()
    {
        ...
        if (fp >= FP_RESOURCE &&
            ns.role == R_PRIMARY && ns.conn < C_CONNECTED &&
            ns.pdsk >= D_UNKNOWN) {    /* ① */
                rv = SS_PRIMARY_NOP;
        }
        ...
    }

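For comparison, here is a small standalone sketch of the 8.4 check. It looks only at the new state, so the result depends on whether the peer disk ends up as D_UNKNOWN or D_OUTDATED. As before, the enums are simplified stand-ins with an assumed ordering (D_OUTDATED < D_UNKNOWN), and the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-ins for the DRBD 8.4 enums (assumed ordering, not the real values). */
enum fencing_policy { FP_DONT_CARE, FP_RESOURCE, FP_STONITH };
enum role { R_UNKNOWN, R_PRIMARY, R_SECONDARY };
enum conn_state { C_STANDALONE, C_CONNECTED };
enum disk_state { D_DISKLESS, D_INCONSISTENT, D_OUTDATED, D_UNKNOWN, D_UP_TO_DATE };

/* Hypothetical subset of the new state (ns) used by the check. */
struct new_state {
    enum role role;
    enum conn_state conn;
    enum disk_state pdsk; /* peer disk state */
};

/* Mirrors the quoted condition from is_valid_state(). */
static bool primary_nop_84(enum fencing_policy fp, struct new_state ns)
{
    return fp >= FP_RESOURCE &&
           ns.role == R_PRIMARY && ns.conn < C_CONNECTED &&
           ns.pdsk >= D_UNKNOWN;
}
```

For ns = {R_PRIMARY, C_STANDALONE, D_UNKNOWN} the sketch returns true (SS_PRIMARY_NOP), and for ns = {R_PRIMARY, C_STANDALONE, D_OUTDATED} it returns false. This would be consistent with the first attempt failing and the second one succeeding.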
After executing the command 'drbdadm down <res-name>' on the secondary node, the
status of the primary node is:

    [root@drbd846 drbd-8.4.6]# cat /proc/drbd
    version: 8.4.6 (api:1/proto:86-101)
    GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by root@drbd846.node1, 2016-09-08 08:51:45
     0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/Outdated r-----
        ns:1048508 nr:0 dw:0 dr:1049236 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

The peer disk state is Outdated, not DUnknown.
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user