List: intermezzo-devel
Subject: Dieing lentos
From: "Peter J. Braam" <braam () cs ! cmu ! edu>
Date: 2000-03-19 18:26:18
>>>>> "Andreas" == Andreas J Koenig <andreas.koenig@anima.de> writes:
Andreas> Is it possible, all errors are due to clock skew? I had a
Andreas> working intermezzo for many, many hours when the two
Andreas> clocks were in sync. I changed one clock by several
Andreas> minutes and from then on I could watch four lentos
Andreas> die.
Hi,
Clock skew between the machines is OK. However, changing the time on a machine can
cause a disconnect, and that can have bad consequences.
Oh boy, looking at the logs below, there really is some bug eh?
- Peter -
Andreas> Interestingly, the errors I had reported earlier all
Andreas> happened with unsynced clocks.
Andreas> Here are the memos of the four dying lentos:
Andreas> After hours of flawless operation, I had a lento dying
Andreas> at line 55 of Reintegrate. The line reads
Andreas> open(FILE, ">$filename") || die;
Andreas> I had no debugging on, I wasn't even watching, so no
Andreas> information available:-( Please change these die()s to
Andreas> contain at least something.
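A minimal sketch of what a more informative die() could look like, assuming the failing calls are plain open() and rmdir() calls like the ones quoted in this report. The helper names open_for_write and remove_dir are hypothetical and not part of Lento; $! is Perl's standard variable holding the system error text.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Lento): open a file for writing and,
# on failure, die with the filename and the system error in $!.
sub open_for_write {
    my ($filename) = @_;
    open(my $fh, '>', $filename)
        or die "cannot open '$filename' for writing: $!";
    return $fh;
}

# Hypothetical helper (not from Lento): remove a directory and,
# on failure, die with the path and the system error in $!.
sub remove_dir {
    my ($dir) = @_;
    rmdir($dir)
        or die "rmdir('$dir') failed: $!";
    return 1;
}
```

Because the messages do not end in a newline, Perl still appends the file name and line number, so a log would show which path failed and why rather than a bare "Died at ... line 160."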
Andreas> ----------------------------------------------------------------------
Andreas> I had a server die with debuglevel=1. Last words were:
Andreas> [Lento/Replicator.pm: 168] Replicator p66 shared:
Andreas> SENDING SET 0 (CURRENT STATE 3)
Andreas> POE::Session::POE/Session.pm (line 241) [Lento/List.pm:
Andreas> 94] Iterator on (POE::Session=HASH(0x8adf11c)) class
Andreas> Lento::Replicator, file Lento/Replicator.pm, line 145
>>>>>> POE::Session=HASH(0x8a958a4); POE::Session=HASH(0x8a51260);
>>>>>> POE::Session=HASH(0x8ad1540); POE::Session=HASH(0x8aeee00);
>>>>>> POE::Session=HASH(0x8acc014); POE::Session=HASH(0x8938868);
>>>>>> POE::Session=HASH(0x8b22170); POE::Session=HASH(0x8a513b0);
>>>>>> POE::Session=HASH(0x8ae25ac); POE::Kernel=ARRAY(0x83c8940);
>>>>>> POE::Session=HASH(0x8ad73e0) <<<<<
Andreas> -- No target - SOURCE: shared to p66 CML for send_done
Andreas> to POE::Session=HASH(0x8adf11c) -- Poll Journal -
Andreas> POE::Session=HASH(0x8a958a4) -- Acceptor -- FSDB shared
Andreas> to p66 CML -- ReqDispatcher -- Journal -- shared to p66
Andreas> CML -- Connection (from from ) -- GetML No such
Andreas> pseudo-hash field "name" at POE/Kernel.pm line 898.
Andreas> -------------------------------------------------------------------------
Andreas> A client died with
Andreas> ==> [06:20:55] Reintegrate -> do_one_record [sender:
Andreas> Reintegrate] [Lento/InterMezzo/ReqHandler.pm: 513]
Andreas> record: 953443625, 1024, 0, 41952, -1072506109,
Andreas> 953443631, 953443631, 953443631, 16895, 136136136, 65,
Andreas> /koenigs/t/84>.D@4DB-6, SETATTR bad lstat -
Andreas> /izo0/koenigs/t/84>.D@4DB-6
Andreas> Yes, I am using funny filenames during testing.
Andreas> --------------------------------------------------------------------------
Andreas> Another client died with
Andreas> ==> [06:41:47] Reintegrate -> do_one_record [sender:
Andreas> Reintegrate] [Lento/InterMezzo/ReqHandler.pm: 513]
Andreas> record: 953444874, 1024, /koenigs/t/232, 7D94, UNLINK
Andreas> [Lento/Reintegrate.pm: 147] Executing:
Andreas> unlink(/izo0/koenigs/t/232/7D94) ==> [06:41:47]
Andreas> Reintegrate -> complete_record [sender: Reintegrate]
Andreas> ***RH*** NTS= 0 NTE= 9505 ==> [06:41:47] Reintegrate ->
Andreas> do_one_record [sender: Reintegrate]
Andreas> [Lento/InterMezzo/ReqHandler.pm: 513] record: 953444882,
Andreas> 1024, /koenigs/t, 232, RMDIR [Lento/Reintegrate.pm: 159]
Andreas> Executing: rmdir(/izo0/koenigs/t/232) Died at
Andreas> Lento/Reintegrate.pm line 160.
Andreas> --------------------------------------------------------------------------
Andreas> Logfiles are available on request, but they are big:
Andreas> # ls -l /home/lento@71.* -rw-r--r-- 1 root root 2057772
Andreas> Mar 19 06:27 /home/lento@71.poe-kernel-name.out
Andreas> -rw-r--r-- 1 root root 88118077 Mar 19 07:10
Andreas> /home/lento@71.server_of_rmdir.out # ls -l
Andreas> /usr/raid/lento@66.* -rw-r--r-- 1 root root 1288225 Mar
Andreas> 19 06:20 /usr/raid/lento@66.bad-lstat.out -rw-r--r-- 1
Andreas> root root 49500471 Mar 19 06:41
Andreas> /usr/raid/lento@66.rmdir.out
Andreas> -- andreas