
List:       spread-users
Subject:    Re: [Spread-users] daemon crash
From:       John Schultz <jschultz () spreadconcepts ! com>
Date:       2011-07-17 22:00:44
Message-ID: CAD794F7-9FD5-46F4-982C-87728AC6C83B () spreadconcepts ! com

First, it looks like one or more of your daemons was connecting and
disconnecting repeatedly.
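
The repeated membership changes show up in the membership ids in the log quoted below. Here is a minimal sketch for pulling them out, assuming the second field of each id is a Unix timestamp in seconds and the first field is the daemon's IPv4 address stored as a signed 32-bit integer. That reading matches 172.20.7.60 for lnxsvr1, but it is an interpretation, not something the log states. The function names are hypothetical.

```python
import ipaddress
import re

# Membership id lines copied from the quoted log.
LOG = """\
Membership id is ( -1407973572, 1308835744)
Membership id is ( -1407973572, 1308835795)
Membership id is ( -1407973572, 1308835801)
Membership id is ( -1407973572, 1308835805)
"""

MEMB_RE = re.compile(r"Membership id is \(\s*(-?\d+),\s*(-?\d+)\)")


def parse_membership_ids(text):
    """Return a list of (proc_id, time) integer pairs found in the log text."""
    return [(int(p), int(t)) for p, t in MEMB_RE.findall(text)]


def proc_id_to_ip(proc_id):
    """Assumption: proc_id is an IPv4 address printed as a signed 32-bit int."""
    return str(ipaddress.IPv4Address(proc_id & 0xFFFFFFFF))


def gaps(ids):
    """Seconds between consecutive membership ids (assumes time is in seconds)."""
    times = [t for _, t in ids]
    return [b - a for a, b in zip(times, times[1:])]


ids = parse_membership_ids(LOG)
print(proc_id_to_ip(ids[0][0]))
print(gaps(ids))
```

Short gaps between ids with an unchanged configuration would be consistent with a daemon dropping out and rejoining.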

Second, it looks like you tripped the infinite-EVS state bug that we tried
to work around.  We still aren't sure what causes it, and most of the time
simply restarting the protocol seems to fix it, which is what the workaround
does.

The state at the end looks like a bug.  It has the ARU as 167 but the
highest seq as 135.  Then, in the next membership it does establish, it tries
to issue packet #136, but that is lower than the ARU, so that packet id
already exists.  I'm not 100% sure about this, because during memberships the
token fields can mean different things than during regular operation.
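
The inconsistency can be checked mechanically against the state dump. This is only a sketch, assuming that during regular operation the ARU should never exceed the highest seq and that the next packet issued is highest seq + 1. As noted, the fields may mean something different during a membership change, so a hit is a hint, not proof. The helper names are hypothetical.

```python
import re

# State lines copied from the quoted log.
STATE_DUMP = """\
[Thu 23 Jun 2011 08:30:01]      Aru:              167
[Thu 23 Jun 2011 08:30:01]      My_aru:           167
[Thu 23 Jun 2011 08:30:01]      Highest_seq:      135
"""

FIELD_RE = re.compile(r"\]\s+(\w+):\s+(-?\d+)\s*$", re.MULTILINE)


def parse_state(text):
    """Map field names in the dump to integer values."""
    return {name: int(value) for name, value in FIELD_RE.findall(text)}


def check_state(state):
    """Return a list of suspected inconsistencies (assumed invariants only)."""
    problems = []
    aru = state["Aru"]
    highest_seq = state["Highest_seq"]
    if aru > highest_seq:
        problems.append(f"ARU {aru} is above highest seq {highest_seq}")
    next_seq = highest_seq + 1  # assumed id of the next packet to be created
    if next_seq <= aru:
        problems.append(f"next packet {next_seq} is not above ARU {aru}")
    return problems


state = parse_state(STATE_DUMP)
for problem in check_state(state):
    print(problem)
```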

The infinite-EVS bug and this exit might be related or caused by the same
logic issue.

Cheers!

-----
John Lane Schultz
Spread Concepts LLC
Phn: 301 830 8100
Cell: 443 838 2200

On Jul 17, 2011, at 2:38 PM, Matt Garman wrote:


Hello,

We're using spread 4.0.0 on 64-bit CentOS 4 (Linux).

The other day a daemon crashed.  Below is what was logged just prior
to the crash.  I was wondering if anyone could help shed some light
on this?

Thanks,
Matt


Membership id is ( -1407973572, 1308835744)
[Thu 23 Jun 2011 08:29:03] --------------------
[Thu 23 Jun 2011 08:29:03] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:29:03] Num Segments 1
[Thu 23 Jun 2011 08:29:03]      4       172.20.7.63       4803
[Thu 23 Jun 2011 08:29:03]              lnxsvr1                 172.20.7.60
[Thu 23 Jun 2011 08:29:03]              lnxsvr2                 172.20.7.61
[Thu 23 Jun 2011 08:29:03]              lnxsvr6                 172.20.7.62
[Thu 23 Jun 2011 08:29:03]              lnxsvr5                 172.20.7.58
[Thu 23 Jun 2011 08:29:03] =====================
Membership id is ( -1407973572, 1308835795)
[Thu 23 Jun 2011 08:29:54] --------------------
[Thu 23 Jun 2011 08:29:54] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:29:54] Num Segments 1
[Thu 23 Jun 2011 08:29:54]      4       172.20.7.63       4803
[Thu 23 Jun 2011 08:29:54]              lnxsvr1                 172.20.7.60
[Thu 23 Jun 2011 08:29:54]              lnxsvr2                 172.20.7.61
[Thu 23 Jun 2011 08:29:54]              lnxsvr6                 172.20.7.62
[Thu 23 Jun 2011 08:29:54]              lnxsvr5                 172.20.7.58
[Thu 23 Jun 2011 08:29:54] =====================
Membership id is ( -1407973572, 1308835801)
[Thu 23 Jun 2011 08:30:00] --------------------
[Thu 23 Jun 2011 08:30:00] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:30:00] Num Segments 1
[Thu 23 Jun 2011 08:30:00]      4       172.20.7.63       4803
[Thu 23 Jun 2011 08:30:00]              lnxsvr1                 172.20.7.60
[Thu 23 Jun 2011 08:30:00]              lnxsvr2                 172.20.7.61
[Thu 23 Jun 2011 08:30:00]              lnxsvr6                 172.20.7.62
[Thu 23 Jun 2011 08:30:00]              lnxsvr5                 172.20.7.58
[Thu 23 Jun 2011 08:30:00] =====================
[Thu 23 Jun 2011 08:30:01] Prot_handle_token: BUG WORKAROUND: Too many rounds in EVS state; swallowing token; state:
[Thu 23 Jun 2011 08:30:01]      Aru:              167
[Thu 23 Jun 2011 08:30:01]      My_aru:           167
[Thu 23 Jun 2011 08:30:01]      Highest_seq:      135
[Thu 23 Jun 2011 08:30:01]      Highest_fifo_seq: 84
[Thu 23 Jun 2011 08:30:01]      Last_discarded:   0
[Thu 23 Jun 2011 08:30:01]      Last_delivered:   167
[Thu 23 Jun 2011 08:30:01]      Last_seq:         3468
[Thu 23 Jun 2011 08:30:01]      Token_rounds:     501
[Thu 23 Jun 2011 08:30:01] Last Token:
[Thu 23 Jun 2011 08:30:01]      type:             0x80040080
[Thu 23 Jun 2011 08:30:01]      transmiter_id:    -1407973572
[Thu 23 Jun 2011 08:30:01]      seq:              0
[Thu 23 Jun 2011 08:30:01]      proc_id:          -1407973572
[Thu 23 Jun 2011 08:30:01]      aru:              167
[Thu 23 Jun 2011 08:30:01]      aru_last_id:      -1407973572
[Thu 23 Jun 2011 08:30:01]      flow_control:     0
[Thu 23 Jun 2011 08:30:01]      rtr_len:          0
[Thu 23 Jun 2011 08:30:01]      conf_hash:        1007608523
Membership id is ( -1407973572, 1308835805)
[Thu 23 Jun 2011 08:30:01] --------------------
[Thu 23 Jun 2011 08:30:01] Configuration at lnxsvr1 is:
[Thu 23 Jun 2011 08:30:01] Num Segments 1
[Thu 23 Jun 2011 08:30:01]      4       172.20.7.63       4803
[Thu 23 Jun 2011 08:30:01]              lnxsvr1                 172.20.7.60
[Thu 23 Jun 2011 08:30:01]              lnxsvr2                 172.20.7.61
[Thu 23 Jun 2011 08:30:01]              lnxsvr6                 172.20.7.62
[Thu 23 Jun 2011 08:30:01]              lnxsvr5                 172.20.7.58
[Thu 23 Jun 2011 08:30:01] =====================
[Thu 23 Jun 2011 08:30:01] Send_new_packets: created packet 136 already exist 2
Exit caused by Alarm(EXIT)


_______________________________________________
Spread-users mailing list
Spread-users@lists.spread.org
http://lists.spread.org/mailman/listinfo/spread-users

