List: spread-users
Subject: Re: [Spread-users] daemon crash
From: John Schultz <jschultz () spreadconcepts ! com>
Date: 2011-07-17 22:00:44
Message-ID: CAD794F7-9FD5-46F4-982C-87728AC6C83B () spreadconcepts ! com
First, it looks like one or more of your daemons were connecting and disconnecting repeatedly.
Second, it looks like you tripped the infinite-EVS state bug that we tried to work around. We still aren't sure what causes it, and most of the time simply restarting the protocol seems to fix it, which is what the workaround does.
The state at the end looks like a bug. It has the ARU as 167 but the highest seq as 135. Then, in the next membership that it does establish, it tries to issue packet #136, which is lower than the ARU, so a packet with that ID already exists. I'm not 100% sure about this, because during memberships the token fields can mean different things than they do during regular operation.
The infinite-EVS bug and this exit might be related, or caused by the same logic issue.
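The ARU / highest-seq mismatch described above can be checked mechanically against a state dump. The following is only a minimal sketch, assuming the dump lines keep the "Name: value" layout shown in the quoted log; `parse_state` and `next_seq_collides` are hypothetical helper names, not part of Spread, and the check encodes the reasoning in this message rather than the daemon's actual logic.

```python
import re

# Matches state-dump lines such as "[Thu 23 Jun 2011 08:30:01] Aru: 167".
FIELD_RE = re.compile(r"\]\s*(\w+):\s*(-?\d+)\s*$")

def parse_state(lines):
    """Collect the first occurrence of each integer 'Name: value' field."""
    state = {}
    for line in lines:
        m = FIELD_RE.search(line)
        if m:
            state.setdefault(m.group(1), int(m.group(2)))
    return state

def next_seq_collides(state):
    """True if the next packet number (Highest_seq + 1) is not above the ARU.

    Assumption taken from the discussion above: a packet numbered at or
    below the ARU should already exist, so reusing that number is a conflict.
    """
    return state["Highest_seq"] + 1 <= state["Aru"]

# Three lines copied from the quoted log.
LOG = """\
[Thu 23 Jun 2011 08:30:01] Aru: 167
[Thu 23 Jun 2011 08:30:01] My_aru: 167
[Thu 23 Jun 2011 08:30:01] Highest_seq: 135
""".splitlines()

state = parse_state(LOG)
print(state["Highest_seq"] + 1, next_seq_collides(state))
```

With the values from the quoted log, the next packet number is 136 and the check reports a collision, which matches the "created packet 136 already exist" message. As noted above, the fields may mean something different while a membership is forming, so a hit here is a hint and not proof.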
Cheers!
-----
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200
On Jul 17, 2011, at 2:38 PM, Matt Garman wrote:
Hello,
We're using spread 4.0.0 on 64-bit CentOS 4 (Linux).
The other day a daemon crashed. Below is what was logged just prior
to the crash. I was wondering if anyone could help shed some light
on this?
Thanks,
Matt
Membership id is ( -1407973572, 1308835744)
[Thu 23 Jun 2011 08:29:03] --------------------
[Thu 23 Jun 2011 08:29:03] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:29:03] Num Segments 1
[Thu 23 Jun 2011 08:29:03] 4 172.20.7.63 4803
[Thu 23 Jun 2011 08:29:03] lnxsvr1 172.20.7.60
[Thu 23 Jun 2011 08:29:03] lnxsvr2 172.20.7.61
[Thu 23 Jun 2011 08:29:03] lnxsvr6 172.20.7.62
[Thu 23 Jun 2011 08:29:03] lnxsvr5 172.20.7.58
[Thu 23 Jun 2011 08:29:03] =====================
Membership id is ( -1407973572, 1308835795)
[Thu 23 Jun 2011 08:29:54] --------------------
[Thu 23 Jun 2011 08:29:54] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:29:54] Num Segments 1
[Thu 23 Jun 2011 08:29:54] 4 172.20.7.63 4803
[Thu 23 Jun 2011 08:29:54] lnxsvr1 172.20.7.60
[Thu 23 Jun 2011 08:29:54] lnxsvr2 172.20.7.61
[Thu 23 Jun 2011 08:29:54] lnxsvr6 172.20.7.62
[Thu 23 Jun 2011 08:29:54] lnxsvr5 172.20.7.58
[Thu 23 Jun 2011 08:29:54] =====================
Membership id is ( -1407973572, 1308835801)
[Thu 23 Jun 2011 08:30:00] --------------------
[Thu 23 Jun 2011 08:30:00] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:30:00] Num Segments 1
[Thu 23 Jun 2011 08:30:00] 4 172.20.7.63 4803
[Thu 23 Jun 2011 08:30:00] lnxsvr1 172.20.7.60
[Thu 23 Jun 2011 08:30:00] lnxsvr2 172.20.7.61
[Thu 23 Jun 2011 08:30:00] lnxsvr6 172.20.7.62
[Thu 23 Jun 2011 08:30:00] lnxsvr5 172.20.7.58
[Thu 23 Jun 2011 08:30:00] =====================
[Thu 23 Jun 2011 08:30:01] Prot_handle_token: BUG WORKAROUND: Too many rounds in EVS state; swallowing token; state:
[Thu 23 Jun 2011 08:30:01] Aru: 167
[Thu 23 Jun 2011 08:30:01] My_aru: 167
[Thu 23 Jun 2011 08:30:01] Highest_seq: 135
[Thu 23 Jun 2011 08:30:01] Highest_fifo_seq: 84
[Thu 23 Jun 2011 08:30:01] Last_discarded: 0
[Thu 23 Jun 2011 08:30:01] Last_delivered: 167
[Thu 23 Jun 2011 08:30:01] Last_seq: 3468
[Thu 23 Jun 2011 08:30:01] Token_rounds: 501
[Thu 23 Jun 2011 08:30:01] Last Token:
[Thu 23 Jun 2011 08:30:01] type: 0x80040080
[Thu 23 Jun 2011 08:30:01] transmiter_id: -1407973572
[Thu 23 Jun 2011 08:30:01] seq: 0
[Thu 23 Jun 2011 08:30:01] proc_id: -1407973572
[Thu 23 Jun 2011 08:30:01] aru: 167
[Thu 23 Jun 2011 08:30:01] aru_last_id: -1407973572
[Thu 23 Jun 2011 08:30:01] flow_control: 0
[Thu 23 Jun 2011 08:30:01] rtr_len: 0
[Thu 23 Jun 2011 08:30:01] conf_hash: 1007608523
Membership id is ( -1407973572, 1308835805)
[Thu 23 Jun 2011 08:30:01] --------------------
[Thu 23 Jun 2011 08:30:01] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:30:01] Num Segments 1
[Thu 23 Jun 2011 08:30:01] 4 172.20.7.63 4803
[Thu 23 Jun 2011 08:30:01] lnxsvr1 172.20.7.60
[Thu 23 Jun 2011 08:30:01] lnxsvr2 172.20.7.61
[Thu 23 Jun 2011 08:30:01] lnxsvr6 172.20.7.62
[Thu 23 Jun 2011 08:30:01] lnxsvr5 172.20.7.58
[Thu 23 Jun 2011 08:30:01] =====================
[Thu 23 Jun 2011 08:30:01] Send_new_packets: created packet 136 already exist 2
Exit caused by Alarm(EXIT)
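One way to see how quickly the memberships in the log above were changing is to pull out the "Membership id is" lines and compare their second fields. This is a rough sketch only: `membership_gaps` is a hypothetical helper, and treating the second field as something that advances like a seconds counter is an assumption, not something stated in this thread.

```python
import re

# Matches lines such as "Membership id is ( -1407973572, 1308835744)".
MEMB_RE = re.compile(r"Membership id is \(\s*(-?\d+),\s*(\d+)\)")

def membership_gaps(lines):
    """Return the differences between consecutive membership-id second fields."""
    stamps = []
    for line in lines:
        m = MEMB_RE.search(line)
        if m:
            stamps.append(int(m.group(2)))
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

# The four membership lines copied from the log above.
LOG = [
    "Membership id is ( -1407973572, 1308835744)",
    "Membership id is ( -1407973572, 1308835795)",
    "Membership id is ( -1407973572, 1308835801)",
    "Membership id is ( -1407973572, 1308835805)",
]

print(membership_gaps(LOG))  # [51, 6, 4]
```

Under that assumption, four memberships were installed within about a minute, with the last three only a few units apart.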
_______________________________________________
Spread-users mailing list
Spread-users@lists.spread.org
http://lists.spread.org/mailman/listinfo/spread-users