List:       pgp-keyserver-folk
Subject:    Re: pks 092 corruption problem
From:       Ruben Martinez <ruben () rediris ! es>
Date:       1998-04-29 11:37:13

> I have changed pksdctl to return an error code when it couldn't
> contact the keyserver. If the pksdctl fails, pksd is restarted by
> the mail delivery script.
>
> But it might be better to change the pksd startup into an endless
> loop (pksd luckily doesn't background itself).

Our approach, since our last database corruption, is something like this:

<run pksd>
while true
do
	<log a server crash>
	<backup the database>
	<restart pksd>
	<force a queue run>
	<run the following in background, with a high 'nice' value>
		<pkscheck the backup database>
		<if OK, store it as 'last reliable DB backup'>
		<if not, store it as 'last DB backup'>
		<send the result by email>
done
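
For anyone who wants to try something similar, the outline above could be
sketched in plain sh roughly as follows. This is only an illustration, not
our actual script: every variable, path, label and function name here
(RUN_PKSD, QUEUE_RUN, CHECK_DB, backup_db, classify_backup, supervise) is
a made-up placeholder, the defaults are no-ops that must be replaced with
the real local commands, and the email notification is left out.

```shell
#!/bin/sh
# Sketch only: all names and defaults below are placeholder assumptions.
RUN_PKSD=${RUN_PKSD:-true}        # command that runs pksd in the foreground
QUEUE_RUN=${QUEUE_RUN:-true}      # command that forces a queue run
CHECK_DB=${CHECK_DB:-true}        # stands in for pkscheck; gets the backup path
DB_DIR=${DB_DIR:-./db}
BACKUP_ROOT=${BACKUP_ROOT:-./backups}
CRASH_LOG=${CRASH_LOG:-./pksd-crash.log}

# Copy the database aside and print the path of the copy.
backup_db() {
    dest="$BACKUP_ROOT/db.$(date +%Y%m%d%H%M%S).$$"
    mkdir -p "$BACKUP_ROOT" && cp -R "$DB_DIR" "$dest" && echo "$dest"
}

# Check a backup at low priority, file it under one of two labels,
# and print the label that was chosen.
classify_backup() {
    if nice -n 19 $CHECK_DB "$1" >/dev/null 2>&1; then
        label=last-reliable
    else
        label=last
    fi
    rm -rf "$BACKUP_ROOT/$label"
    mv "$1" "$BACKUP_ROOT/$label"
    echo "$label"
}

# Endless loop: start pksd, and after every exit log it, back up the
# database, restart, force a queue run and check the backup in background.
supervise() {
    snap=""
    while true; do
        $RUN_PKSD &
        pksd_pid=$!
        if [ -n "$snap" ]; then
            $QUEUE_RUN
            classify_backup "$snap" >>"$CRASH_LOG" 2>&1 &
        fi
        wait "$pksd_pid"
        status=$?
        echo "$(date): pksd exited with status $status" >>"$CRASH_LOG"
        snap=$(backup_db) || snap=""
    done
}
```

The block only defines functions; calling supervise at the end of the
script would start the loop. The backup is taken while pksd is down, so
the copy is not modified while it is being made.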

This is in addition to regular filesystem backups. The scheme has several
performance problems, and we may drop the pkscheck step altogether
in the future, given the current rate of database growth.

Examining the crash log, we find something curious: the server usually
runs for days without a problem. But whenever something makes it die,
then it crashes several times over the following 2-3 days. After that
it regains stability. We have found this to happen even when
pkscheck didn't detect any DB problem, and thought it might have to
do with the queue runs. Now, when we process the queue, we force a
1-second delay between messages, and that seems to have helped reduce
the number of crashes (and the pksdctl 'can't write to socket' messages).
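
A throttled queue run of that kind might look roughly like the sketch
below. It assumes, purely for illustration, that each queued message is
a separate file in one directory; QUEUE_DIR, DELIVER, DELAY and run_queue
are hypothetical names, and DELIVER defaults to a no-op that has to be
replaced with whatever actually hands a message to the keyserver.

```shell
#!/bin/sh
# Sketch only: names, paths and the one-file-per-message layout are assumptions.
QUEUE_DIR=${QUEUE_DIR:-./queue}
DELIVER=${DELIVER:-true}          # placeholder; receives the message file path
DELAY=${DELAY:-1}                 # seconds to pause between messages

# Hand each queued message to the delivery command, pausing after each one.
# Delivered messages are removed; failed ones stay queued for the next run.
# Prints the number of messages delivered.
run_queue() {
    count=0
    for msg in "$QUEUE_DIR"/*; do
        [ -f "$msg" ] || continue
        if $DELIVER "$msg"; then
            rm -f "$msg"
            count=$((count + 1))
        fi
        sleep "$DELAY"
    done
    echo "$count"
}
```

The pause is applied after every message, whether or not delivery
succeeded, so a failing server is not hit again immediately.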