
List:       amd-dev
Subject:    6.0.1s5 sometimes clones itself
From:       Rune Mossige <Rune.Mossige () waii ! com>
Date:       1999-04-20 9:09:27


From time to time, perhaps once a month or slightly more often, we
notice that amd on one of the 14 hosts we run it on appears to clone
itself and then stops behaving properly.

I'm not sure if this is the correct description, but that is what we
see...

A typical 'ps -ef' then shows:
ps -ef | grep amd
    root  10974      1   0   Mar 26      - 97:47 /usr/local/sbin/amd -F
/etc/amd.conf -- /preview /etc/amd_maps/amd.direct.preview -type:=direct /work
/etc/amd_maps/amd.direct.work -type:=direct /work1 /etc/amd_maps/amd.direct.work1
-type:=direct /home1 /etc/amd_maps/amd.direct.home1 -type:=direct /user
/etc/amd_maps/amd.direct.user -type:=direct 
    root  71884  10974   0 10:01:12      -  0:00 /usr/local/sbin/amd -F
/etc/amd.conf -- /preview /etc/amd_maps/amd.direct.preview -type:=direct /work
/etc/amd_maps/amd.direct.work -type:=direct /work1 /etc/amd_maps/amd.direct.work1
-type:=direct /home1 /etc/amd_maps/amd.direct.home1 -type:=direct /user
/etc/amd_maps/amd.direct.user -type:=direct

Apparently amd cloned itself at 10:01:12, and from that point onwards no
new mounts are performed.
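
To catch this state without eyeballing the process list, the check can be scripted. The following is only a rough sketch: it assumes the 'ps -ef' column layout shown above (UID, PID, PPID, ..., TIME, CMD, with TIME in minutes:seconds form), and the function name find_forked_children is made up for illustration. It reports every amd process whose parent is also an amd process.

```python
import re

# Assumption: the TIME column looks like "97:47" or "0:00" (one colon),
# which distinguishes it from an STIME value such as "10:01:12".
TIME_RE = re.compile(r"\d+:\d{2}")


def find_forked_children(ps_lines, command="amd"):
    """Return (child_pid, parent_pid) pairs where both processes run `command`."""
    procs = {}
    for line in ps_lines:
        fields = line.split()
        # Real process rows have numeric PID and PPID in columns 2 and 3;
        # wrapped continuation lines of a long command line do not.
        if len(fields) < 4 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        # The command is the field right after the TIME column.
        for i, field in enumerate(fields[3:-1], start=3):
            if TIME_RE.fullmatch(field):
                if fields[i + 1].rsplit("/", 1)[-1] == command:
                    procs[int(fields[1])] = int(fields[2])
                break
    return [(pid, ppid) for pid, ppid in procs.items() if ppid in procs]
```

Fed the output of 'ps -ef' one line at a time, an empty result would mean that no amd process currently has another amd as its parent. A match is not proof of a fault by itself, only a prompt to look closer.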

So far, the only 'workaround' I have found is to reboot the host.

We are now running AIX 4.2.1, but I have seen this on AIX 4.1.5 also, and
with several am-utils versions.

I do not know where to start investigating this, or how to describe the
behaviour more precisely. Some mounts work, while others just never return.

If I telnet into the node and log on as root, I can then 'cd' into all the
suspect directories, but I still cannot 'rlogin' to the machine as an
ordinary user!

I have started amd with:

log_options =                   all

but no debug options. In the amd.log file, I do not see any errors. The
only suspicious thing I see is entries like:

Apr 20 10:02:53 svs04 amd[10974]/map:   Trying mount of
svnfs02:/homes/st01 on /home/st01fb fstype nfs
Apr 20 10:02:53 svs04 amd[10974]/map:   key st01fb: map selector hostd
(=svs04.norway.waii.com) did not match svnfs02.norway.waii.com
Apr 20 10:02:53 svs04 amd[10974]/map:   Trying mount of (error-hook) on
/home/st01fb fstype nfs
Apr 20 10:02:55 svs04 amd[10974]/map:   Trying mount of
svnfs02:/homes/st01 on /home/st01fb fstype nfs
Apr 20 10:02:55 svs04 amd[10974]/map:   key st01fb: map selector hostd
(=svs04.norway.waii.com) did not match svnfs02.norway.waii.com
Apr 20 10:02:55 svs04 amd[10974]/map:   Trying mount of (error-hook) on
/home/st01fb fstype nfs

The timestamps on these entries are very close to the time amd cloned
itself. Could this help indicate where the problem is?
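
To compare the log against the start time of the second amd process more systematically, something like the following could pull out every amd.log entry within a couple of minutes of that time. It is a sketch only: it assumes syslog-style "Mon DD HH:MM:SS" prefixes as in the excerpt above, English month names, and a year supplied by the caller (the log lines carry none). The function name entries_near and the 120-second default window are illustrative choices, not anything amd provides.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y %b %d %H:%M:%S"


def entries_near(log_lines, clone_time, window_seconds=120, year=1999):
    """Return log lines stamped within window_seconds of clone_time.

    clone_time is a string such as "Apr 20 10:01:12".
    """
    target = datetime.strptime(f"{year} {clone_time}", STAMP_FORMAT)
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in log_lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(f"{year} {' '.join(parts[:3])}", STAMP_FORMAT)
        except ValueError:
            # Wrapped continuation lines have no timestamp; skip them.
            continue
        if abs(stamp - target) <= window:
            hits.append(line)
    return hits
```

With the times above, entries_near(lines, "Apr 20 10:01:12") would keep the 10:02:53 and 10:02:55 entries, since both fall within 120 seconds of the start time that ps reports for the second process.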

Could someone please give some hints as to what is going on, and what
might be wrong?

-------------------------------------------------------------------
(-: Hiroshima 45, Tsjernobyl 86, Windows 95 :-)
Our ultimate goal is to make overloaded machines appear to be idle.
High performance, High reliability, Low cost -------- Pick any two.
-------------------------------------------------------------------
Rune Mossige, Systems Support, Western Geophysical, Stavanger, Norway
Tel: (+47)51598922    Fax:(+47)51598999    Mobile:(+47)90871024
