List:       linux-ha
Subject:    Re: [Linux-HA] crm_mon equivalent for serial and udp communications
From:       Dejan Muhamedagic <dejanmm () fastmail ! fm>
Date:       2008-05-30 18:40:34
Message-ID: 20080530184033.GA31355 () rondo ! suse ! de

Hi,

On Fri, May 30, 2008 at 11:14:14AM -0500, Matt Zagrabelny wrote:
> On Fri, 2008-05-30 at 07:01 -0500, Matt Zagrabelny wrote:
> > On Fri, 2008-05-30 at 11:47 +0200, Dejan Muhamedagic wrote:
> > > On Fri, May 30, 2008 at 09:34:34AM +0200, Robert Heinzmann (ml) wrote:
> > > > >We just developed the tool what you say.
> > > > >See attached. It's a screen shot of it.
> > > > >How about it?
> > > > 
> > > > Is there a command line tool on the current code base showing this
> > > > information (maybe not as integrated in crm_mon as your tool) ?
> > > 
> > > There's cl_status hblinkstatus.
> > 
> > Perfect. Thanks Dejan.
> 
> Well *almost* perfect.
> 
> It looks as though there are some discrepancies in the reporting (or in
> my understanding).
> 
> I have a two node cluster (squash and turnip).
> 
> +------------+           +------------+
> | /dev/ttyS1 |  <====>   | /dev/ttyS1 |
> |            |           |            |
> |  squash    |           |  turnip    |
> |            |           |            |
> |     eth2   |  <====>   |   eth2     |
> +------------+           +------------+
> 
> From squash I run:
> 
> squash% cl_status hblinkstatus squash /dev/ttyS1
> dead
> 
> squash% cl_status hblinkstatus turnip /dev/ttyS1
> up
> 
> squash% cl_status hblinkstatus squash eth2
> up
> 
> squash% cl_status hblinkstatus turnip eth2
> up
> 
> From turnip I run:
> 
> turnip% cl_status hblinkstatus squash /dev/ttyS1
> up
> 
> turnip% cl_status hblinkstatus turnip /dev/ttyS1
> dead
> 
> turnip% cl_status hblinkstatus squash eth2
> up
> 
> turnip% cl_status hblinkstatus turnip eth2
> up
> 
> So, a node will report that its own serial line is 'dead' but report
> that its partner's is 'up'. Is this by design?
> 
> I would think that if heartbeat can see the correct data on the serial
> line that *both* serial links would report 'up'.

Both? There's only one serial link. I really can't say exactly how
it works, but my guess is that a link is reported live (i.e. up)
only if actual heartbeats are arriving on it. For example, if you
ask about /dev/ttyS1 on host turnip to host turnip, you won't see
any heartbeats there, since a node doesn't receive its own
heartbeats over the serial line. As for eth2, I guess that it is
configured as bcast or mcast, so each node sees its own packets too.
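
If that guess is right, the only entries worth watching from a given node
are the ones for its peers. Here is a rough sketch of how you could collect
the same table you posted and check just those. It is an illustration, not
anything shipped with heartbeat: the function names are made up, and it
assumes that cl_status hblinkstatus prints a single word such as "up" or
"dead" on stdout, as in your output above.

```python
import subprocess


def run_cl_status(args):
    """Run cl_status with the given arguments and return its trimmed stdout."""
    # Assumption: the status word is printed on stdout. The exit code is
    # deliberately ignored here; only the printed word is used.
    result = subprocess.run(["cl_status", *args], capture_output=True, text=True)
    return result.stdout.strip()


def link_table(nodes, links, run=run_cl_status):
    """Return {(node, link): status} for every node/link pair."""
    return {
        (node, link): run(["hblinkstatus", node, link])
        for node in nodes
        for link in links
    }


def remote_links_up(table, local_node):
    """True if every link to a node other than local_node reports 'up'."""
    return all(
        status == "up"
        for (node, _link), status in table.items()
        if node != local_node
    )


# Example use on squash (hypothetical, needs a running heartbeat):
#   table = link_table(["squash", "turnip"], ["/dev/ttyS1", "eth2"])
#   remote_links_up(table, "squash")
```

The run parameter is only there so the lookup can be replaced with a stub
when heartbeat isn't running on the machine.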

Do you still find it confusing?

Thanks,

Dejan

> From the logs I get:
> 
> (on squash)
> heartbeat[2677]: 2008/05/30_07:43:52 info: Link turnip:/dev/ttyS1 up.
> 
> (on turnip)
> heartbeat[6429]: 2008/05/30_07:43:52 info: Link squash:/dev/ttyS1 up.
> 
> Thanks for any information,
> 
> -- 
> Matt Zagrabelny - mzagrabe@d.umn.edu - (218) 726 8844
> University of Minnesota Duluth
> Information Technology Systems & Services
> PGP key 1024D/84E22DA2 2005-11-07
> Fingerprint: 78F9 18B3 EF58 56F5 FC85  C5CA 53E7 887F 84E2 2DA2
> 
> He is not a fool who gives up what he cannot keep to gain what he cannot
> lose.
> -Jim Elliot



> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems