List:       openbsd-bugs
Subject:    Re: Relayd crash when disabling host
From:       Camiel Dobbelaar <cd () sentia ! nl>
Date:       2011-11-23 19:40:33
Message-ID: 4ECD4C31.1030500 () sentia ! nl

On 23-11-2011 9:17, aniyokshuffle wrote:
> I've got the same problem as described here:
> 
> http://old.nabble.com/system-6627%3A-relayd---desynchronization-on-host-disable-during-a-running-check-td31783978.html
> 
> with this message:
> 
> "pfe_dispatch_imsg: desynchronized "

I'm looking into this too on 5.0, but I'm no longer making much progress.
I'm not entirely sure it's related to your problem, as the configuration
mechanism got a big rewrite between 4.9 and 5.0.

It looks like relayd has a nasty race when the configuration is (re)loaded.

The hce usually finishes loading first, and when its checks also run
fast it may send status messages about hosts that the pfe and relays do
not know about yet (because they are still loading the new configuration).

I've experimented with an extra sleep in the hce, but that still does
not guarantee that the pfe and relays have finished reloading.  The
problem can still be triggered, only less often.  I guess it needs some
kind of synchronization mechanism...
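
To illustrate what such a synchronization mechanism could look like, here
is a minimal sketch in C.  It is not relayd's code: the struct names, the
function names and the fixed-size queue are all hypothetical.  The idea is
that a receiver (standing in for the pfe or a relay) defers host status
messages that arrive before its configuration is loaded, and replays them
afterwards, instead of treating an unknown host id as fatal right away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOSTS   64	/* hypothetical limits, for the sketch only */
#define MAX_PENDING 64

/* Hypothetical status message, loosely modelled on a host up/down update. */
struct host_status {
	int id;
	int up;
};

/* Hypothetical receiver state (stands in for a pfe or relay process). */
struct receiver {
	bool   config_loaded;
	bool   known[MAX_HOSTS];
	int    state[MAX_HOSTS];	/* -1 unknown, 0 down, 1 up */
	struct host_status pending[MAX_PENDING];
	size_t npending;
};

void
receiver_init(struct receiver *r)
{
	size_t i;

	r->config_loaded = false;
	r->npending = 0;
	for (i = 0; i < MAX_HOSTS; i++) {
		r->known[i] = false;
		r->state[i] = -1;
	}
}

/* Returns 0 if applied, 1 if deferred, -1 on a real error. */
int
receiver_status(struct receiver *r, struct host_status st)
{
	if (st.id < 0 || st.id >= MAX_HOSTS)
		return -1;
	if (!r->config_loaded) {
		/* Config not loaded yet: queue instead of failing. */
		if (r->npending == MAX_PENDING)
			return -1;
		r->pending[r->npending++] = st;
		return 1;
	}
	if (!r->known[st.id])
		return -1;	/* genuinely invalid once config is loaded */
	r->state[st.id] = st.up;
	return 0;
}

/*
 * Called once the new configuration has been loaded: register the hosts,
 * then replay the deferred messages.  Returns how many were applied.
 */
int
receiver_config_done(struct receiver *r, const int *ids, size_t nids)
{
	size_t i, n;
	int applied = 0;

	for (i = 0; i < nids; i++) {
		assert(ids[i] >= 0 && ids[i] < MAX_HOSTS);
		r->known[ids[i]] = true;
	}
	r->config_loaded = true;

	n = r->npending;
	r->npending = 0;
	for (i = 0; i < n; i++)
		if (receiver_status(r, r->pending[i]) == 0)
			applied++;
	return applied;
}
```

With this shape, a status message that arrives early is held back rather
than triggering a fatal error, while an unknown host id seen after the
configuration is loaded is still reported as an error.  An alternative
would be for the sender to wait for an explicit "config done"
acknowledgement from every receiver before starting its checks.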

The following relayd.conf triggers it quite easily on my system.  But
I've also seen it crash with only 2 servers in the pool when relayd is
started at boot.


s01="127.0.0.1"
s02="127.0.0.1"
s03="127.0.0.1"
s04="127.0.0.1"
s05="127.0.0.1"
s06="127.0.0.1"
s07="127.0.0.1"
s08="127.0.0.1"
s09="127.0.0.1"
s10="127.0.0.1"
s11="127.0.0.1"
s12="127.0.0.1"
s13="127.0.0.1"
s14="127.0.0.1"
s15="127.0.0.1"
s16="127.0.0.1"
s17="127.0.0.1"
s18="127.0.0.1"
s19="127.0.0.1"
s20="127.0.0.1"
s21="127.0.0.1"
s22="127.0.0.1"
s23="127.0.0.1"
s24="127.0.0.1"
s25="127.0.0.1"
s26="127.0.0.1"
s27="127.0.0.1"
s28="127.0.0.1"
s29="127.0.0.1"
s30="127.0.0.1"
s31="127.0.0.1"
s32="127.0.0.1"

s_vip = "1.1.1.1"

prefork 10

table <spool> {
        $s01 $s02 $s03 $s04 $s05 $s06 $s07 $s08
        $s09 $s10 $s11 $s12 $s13 $s14 $s15 $s16
        $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24
        $s25 $s26 $s27 $s28 $s29 $s30 $s31 $s32
}

redirect smtp {
        listen on $s_vip port 25
        forward to <spool> check icmp
}


camield@xmts $ sudo /usr/src/usr.sbin/relayd/obj/relayd -dvf relayd.conf
startup
host 127.0.0.1, check icmp (4ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (6ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (6ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (6ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (7ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (7ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (7ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (7ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (8ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (8ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (9ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (9ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (9ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (10ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (10ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (10ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (10ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (11ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (11ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (12ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (12ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (12ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (12ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (13ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (13ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (14ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (14ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (15ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (15ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (15ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (15ms), state unknown -> up, availability 100.00%
host 127.0.0.1, check icmp (16ms), state unknown -> up, availability 100.00%
fatal: relay_dispatch_pfe: invalid host id
pfe exiting, pid 24741
hce exiting, pid 28610
relay exiting, pid 32765
relay exiting, pid 16820
relay exiting, pid 8584
relay exiting, pid 15381
relay exiting, pid 12431
relay exiting, pid 23773
relay exiting, pid 4384
relay exiting, pid 16606
relay exiting, pid 18721
parent terminating, pid 23880

