
List:       gluster-users
Subject:    [Gluster-users] rpc/glusterd-locks error
From:       Vineet Khandpur <khandpur () ualberta ! ca>
Date:       2018-02-26 14:41:34
Message-ID: CAOy1j0gNc+h=H2FEbq7mCZfF3v2SqdKnNtXYZ5EXH3w06YV+eQ () mail ! gmail ! com


Good morning.

We have a 6 node cluster. 3 nodes are participating in a replica 3 volume.
Naming convention:
xx01 - 3 nodes participating in ovirt_vol
xx02 - 3 nodes NOT participating in ovirt_vol

Last week, we restarted glusterd on each node in the cluster, one at a
time, to apply an update.
Since then, the three xx01 nodes all show the following in glusterd.log:

[2018-02-26 14:31:47.330670] E [socket.c:2020:__socket_read_frag] 0-rpc:
wrong MSG-TYPE (29386) received from 172.26.30.9:24007
[2018-02-26 14:31:47.330879] W
[glusterd-locks.c:843:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2322a)
[0x7f46020e922a]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2d198)
[0x7f46020f3198]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0xe4755)
[0x7f46021aa755] ) 0-management: Lock for vol ovirtprod_vol not held
[2018-02-26 14:31:47.331066] E [rpc-clnt.c:350:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f460d64dedb] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f460d412e6e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f460d412f8e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f460d414710] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f460d415200] )))))
0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called
at 2018-02-26 14:31:47.330496 (xid=0x72e0)
[2018-02-26 14:31:47.333993] E [socket.c:2020:__socket_read_frag] 0-rpc:
wrong MSG-TYPE (84253) received from 172.26.30.8:24007
[2018-02-26 14:31:47.334148] W
[glusterd-locks.c:843:glusterd_mgmt_v3_unlock]
(-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2322a)
[0x7f46020e922a]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0x2d198)
[0x7f46020f3198]
-->/usr/lib64/glusterfs/3.12.5/xlator/mgmt/glusterd.so(+0xe4755)
[0x7f46021aa755] ) 0-management: Lock for vol ovirtprod_vol not held
[2018-02-26 14:31:47.334317] E [rpc-clnt.c:350:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f460d64dedb] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f460d412e6e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f460d412f8e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7f460d414710] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7f460d415200] )))))
0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called
at 2018-02-26 14:31:47.333824 (xid=0x1494b)
[2018-02-26 14:31:48.511390] E [socket.c:2632:socket_poller]
0-socket.management: poll error on socket

Additionally, each node shows connectivity to only 2 of the three hosts
(itself and one other), and no two nodes agree on which host is
unreachable: xx01 shows connectivity to itself and yy01, yy01 shows
connectivity to itself and zz01, and zz01 shows itself and xx01.
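
For reference, we were judging connectivity from the standard glusterd
CLI output, roughly like this (a sketch; the hostnames in the output are
our xx01/yy01/zz01 nodes):

```shell
# Run on each of the three volume nodes in turn.
# Peer state should read "Peer in Cluster (Connected)" for both peers;
# on our nodes, one peer shows up as Disconnected instead.
gluster peer status

# Compact one-line-per-peer view (UUID, hostname, Connected/Disconnected).
gluster pool list
```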

However, the xx02 hosts (in the same cluster but not participating in
the volume) report the volume info as healthy, with all xx01 hosts
participating in the volume.

In our dev environment, we had to stop the volume and restart glusterd
on all hosts to recover. For prod, however, that would mean a
system-wide outage and downtime, which needs to be avoided.
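
The dev recovery, for reference, looked roughly like this (a sketch,
using our dev volume name in place of the actual one):

```shell
# Dev-only recovery -- this takes the volume offline, which is exactly
# what we want to avoid in prod.
gluster volume stop ovirt_vol

# Then, on every node in the cluster:
systemctl restart glusterd

# Finally, from any one node:
gluster volume start ovirt_vol
gluster volume status ovirt_vol
```

Note that restarting glusterd by itself does not stop the brick
processes, so if there is a way to clear the stale mgmt_v3 lock state
with a rolling glusterd restart only (no volume stop), that would be the
preferred path for prod.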

Any suggestions? Thanks.

vk
--------------------------------
Vineet Khandpur
UNIX System Administrator
Information Technology Services
University of Alberta Libraries
+1-780-492-4718




_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
