
List:       amanda-users
Subject:    RE: client was working, now suddenly is getting self check "host down?" errors
From:       "Ron Bauman" <RBauman () HatterasNetworks ! com>
Date:       2003-05-30 19:04:48


I have a random problem like this as well, running RH Linux.  The client
occasionally fails amcheck in the afternoon. (Backups run at night.)  When I
look at portland, the client, I find the selfcheck task "stuck" and I am
unable to kill it, even with kill -9.  See if you have the same problem.  On
the client, try

ps -ef | grep amand

or grep with whatever your amanda user account is.

If you see selfcheck running, you'll be unable to get amcheck on the server
to finish until it's gone.  Just something to check.
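
As a rough sketch of that check, the following also prints each process's
state. The column list and the helper name filter_selfcheck are my own
choices, not anything Amanda ships. A STAT value beginning with D means
uninterruptible sleep, usually a process blocked in the kernel on I/O, which
would be consistent with kill -9 having no effect; treat that as a guess at
the cause, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: keep the ps header line plus any line that
# mentions selfcheck. The [s] bracket trick stops awk from matching
# its own entry in the process list.
filter_selfcheck() {
    awk 'NR == 1 || /[s]elfcheck/'
}

# pid, process state, elapsed time, owner, and full command line
ps -eo pid,stat,etime,user,args | filter_selfcheck
```

If a selfcheck line shows a large elapsed time, it has probably been hanging
since an earlier amcheck run.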

Ron Bauman
Hatteras Networks, Inc.

-----Original Message-----
From: Martin, Jeremy [mailto:jmartin@gsi-kc.com]
Sent: Friday, May 30, 2003 2:14 PM
To: amanda-users@amanda.org
Subject: client was working, now suddenly is getting self check "host
down?" errors


Hi,

This is confusing me a bit; I hope someone here has an idea of what might be
happening.

I have been running an amanda server, backing itself up + one other amanda
client (jayhawker), for about a week now. It works great every night when I
have the amdump run. Yesterday I added a third amanda client, "bcc1". bcc1
and jayhawker are both fresh RedHat 9 installs.

I configured bcc1 exactly the same way as jayhawker, with the same entries in
hosts.allow / hosts.deny / xinetd.conf / /home/amanda/.amandahosts etc. Both
mybox (by name and by ip just in case) and localhost (by localhost /
localhost.localdomain / 127.0.0.1) are allowed in hosts.allow for the user
amanda... I know a lot of that is redundant but I wanted to be 100% sure I
allowed the right things, since at least the .amandahosts file has been a bit
picky. Also of course my amanda server "mybox" is set up ok in /etc/hosts.

At first "mybox" could back up bcc1 just fine. I ran amcheck and there were 0
problems in 3 clients found. The first amdump worked yesterday afternoon.
Then overnight amdump ran from cron and was unable to connect to bcc1.
Actually 95% of the DLEs were backed up ok but /usr on bcc1 failed:

  192.168.2. /usr lev 0 FAILED 20030530[could not connect to 192.168.2.200]

This morning after reading that in the amanda report, I ran amcheck and it
said selfcheck host down when trying bcc1. Just to see if I could get to it,
I tried "ping bcc1", which started pinging the right IP immediately, no
problems at all. I ran amcheck again without changing anything else and it
found 0 problems. Then I ran amdump and somehow by the time it had finished,
the problem came back, because *all* of the DLEs had FAILED messages saying
could not connect. I had to leave the building for a bit, and when I came
back, amcheck repeatedly said host down, even after I pinged bcc1 (which
still works great). I checked /var/log/secure and /var/log/messages but I
don't see anything strange at all, as far as I can tell. The amanda service
is still running on the client and nothing has changed in the firewalls etc.
I double checked all the things mentioned in the FAQ but everything seems to
still be set up just fine.
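
A side note on the ping test: ping uses ICMP, whereas amcheck talks to
amandad over UDP, so a successful ping does not show that the Amanda service
itself is answering. Below is a minimal sketch for checking, on the client,
whether anything is bound to the Amanda port. It assumes the default port
10080/udp and that net-tools netstat is installed; the helper name
amanda_udp_listening is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -anu` style output on stdin and
# succeeds if some UDP socket has a local address ending in :PORT.
# Assumption: amandad is served by xinetd on the default UDP port 10080.
amanda_udp_listening() {
    port="${1:-10080}"
    awk -v p=":$port" '
        $1 ~ /^udp/ && $4 ~ (p "$") { found = 1 }
        END { exit !found }
    '
}

if netstat -anu | amanda_udp_listening 10080; then
    echo "something is bound to UDP 10080"
else
    echo "nothing is bound to UDP 10080"
fi
```

A bound port only shows that xinetd is listening; it does not prove that
amandad starts cleanly when a request arrives.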

My disklist file on the server uses "bcc1" for the client name, but just for
kicks I tried changing it to the client's IP, and now it's saying that ip
won't do a selfcheck either. Also, my timeouts were set to at least 30
seconds in amanda.conf, and amcheck was waiting a good long while before
giving up. The boxes are all on a LAN, so when amcheck works it usually takes
less than a second to finish.

Any ideas why a client would work for a while and then randomly not be able
to do a selfcheck? The other amanda client is still working great...

Thanks!
_______________________
Jeremy Martin
Network Technician
http://www.gsi-kc.com
mailto:jmartin@gsi-kc.com

