List:       linux-ha-dev
Subject:    [Linux-ha-dev] Re: Heartbeat API
From:       Alan Robertson <alanr () suse ! com>
Date:       2000-04-22 9:49:46

Lars Marowsky-Bree wrote:
> 
> On 2000-04-21T11:29:24,
>    Alan Robertson <alanr@suse.com> said:
> 
> > The general idea is to make an API which would allow programs to do
> > things like these:
> >
> >       Receive notification of asynchronous events:
> >
> >               like a machine becoming accessible, or becoming inaccessible,
> >               interfaces going quiet (failing?), or becoming active again
> 
> Ok.
> 
> >
> >       Get the current list of nodes in the cluster
> 
> Ok.
> 
> >
> >       Get the information about any given node in the cluster
> >               status (up/down)
> >               load average (/proc/loadavg info)
> >               other info?
> 
> I would remove the "get information about node" except for the
> up/down/undefined part - supplying additional information should be part of a
> module on top of this infrastructure.

Actually, the heartbeat code carries a certain number of attributes with
every heartbeat.  Load average is the only one currently defined.  I'm
intending this "other info" to be one that's sent with every heartbeat -
so it's readily accessible.  One could define it as the name of an
arbitrary field in a heartbeat message - like time of last send, or time
of last receive, or something like this.  In any case, Wensong said he'd
like to see load average - which I can perfectly understand...
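
To illustrate looking up "the name of an arbitrary field in a heartbeat
message", here is a minimal sketch in C.  The names hb_field, hb_msg,
hb_msg_add and hb_msg_get are hypothetical, made up for this example;
they are not heartbeat's actual data structures or API, and the real
message layout may differ.

```c
#include <stddef.h>
#include <string.h>

#define HB_MAX_FIELDS 16   /* arbitrary limit chosen for this sketch */

/* One name/value attribute carried in a message (hypothetical layout). */
struct hb_field {
    const char *name;
    const char *value;
};

/* A message modelled as a small, fixed-size list of attributes. */
struct hb_msg {
    size_t nfields;
    struct hb_field fields[HB_MAX_FIELDS];
};

/* Append an attribute; returns 0 on success, -1 if the message is full. */
static int hb_msg_add(struct hb_msg *m, const char *name, const char *value)
{
    if (m->nfields >= HB_MAX_FIELDS)
        return -1;
    m->fields[m->nfields].name = name;
    m->fields[m->nfields].value = value;
    m->nfields++;
    return 0;
}

/* Look an attribute up by name; returns NULL if it is not present. */
static const char *hb_msg_get(const struct hb_msg *m, const char *name)
{
    for (size_t i = 0; i < m->nfields; i++) {
        if (strcmp(m->fields[i].name, name) == 0)
            return m->fields[i].value;
    }
    return NULL;
}
```

With something like this, a client asking for "loadavg" and one asking
for a time-of-last-receive field would go through the same call, and a
NULL result would simply mean that the field wasn't in the heartbeat.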

> >       Send a message to the whole cluster
> >       Send a message to a node in the cluster
> 
> Maybe this could be generalised to "send a message to nodes with a specific
> attribute". This may be "nodename == foo" or "attributes includes
> CONNECTED_TO_SAN_1"...

Right now the only attribute that heartbeat knows about is the node's
name.  As it stands, not all messages apply to the whole cluster, and
which nodes a message is meant for is often left implicit.

For example, when the current code sends out an ip-addr-request message,
it only applies to nodes which own that ip-address (really now a
resource group), but heartbeat (at this level) has no idea who has this
resource group, so it gets *routed* to all nodes, and they ignore it if
it isn't applicable.

One *could* use the node name to do routing.  It's a common special
case (for replies), which is why I treat it specially, and I *do* use
it on reception to ignore packets that aren't meant for me.
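
The receive-side check described above might look roughly like this in
C.  This is only a sketch: the function name hb_msg_is_for_me and the
convention that a NULL destination means "addressed to everyone" are
assumptions for illustration, not what the heartbeat code actually does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide on reception whether a message is meant for this node.
 * dest is the destination node name carried in the message, or NULL
 * when the message names no particular node (assumed convention).
 */
static bool hb_msg_is_for_me(const char *dest, const char *my_name)
{
    if (dest == NULL)
        return true;                    /* no destination: everyone gets it */
    return strcmp(dest, my_name) == 0;  /* otherwise the names must match */
}
```

Every node still sees every packet on the wire; the filtering happens
only after reception, which is why no routing knowledge is needed.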
 
> >       Receive a message from the cluster
> 
> > This would allow lots of different uses for heartbeat, in addition to
> > what it does now.  It would allow you to write the management code in
> > "C", and not be restricted to shell or some kludgy combination of C and
> > shell.
> >
> > Next step:  Defining these APIs.
> >
> >       Comments?  Suggestions?
> 
> In essence, you are replacing the Cluster Membership Services and even parts
> of the Group Messaging Services in FailSafe with heartbeat by this, right?

I'm just making an API for what heartbeat already does.  I'm making sure
that it has sufficient power to replace the existing heartbeat scripts
with a new layer that does more interesting things.

As of today, I'm not intending to replace anything that FailSafe does
with it.  This is what's necessary for Luis and Horms to do what they've
been wanting to do with it.

However, I suspect that if one wanted to, one could probably do
pretty much what you've described with it.  I have not decided
that I want to do that with heartbeat.

You could also see this as a replacement for some of the things in
Stephen's proposal - that doesn't mean that I'm going to propose it to
Stephen ;-)

Adding/removing nodes sounds like a possibly good idea.  Links a little
less so, since it's potentially a reasonably traumatic experience for
heartbeat to carry out, and you may have to provide some more syntax in
the message.  Not to say that it's a bad idea, just that it's probably
not the place to start.  Other people have argued for allowing any node
that knows the secret handshake to join from the network side, and that
might make more sense.  I haven't decided in either case.

One can always add to an API to do new things with it (especially when
they're orthogonal - like nodes vs links, etc).  I'm going to start out
without it, and then see if anyone wants it, and what they're trying to
accomplish with it.

It also seems like we're going to eventually want guaranteed packet
delivery order as well.  It shouldn't be too hard, actually.  The
protocol on the wire doesn't have to change.
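
One way ordered delivery could be layered on the receiving side is a
small reorder buffer, sketched below in C.  It assumes each message
carries a per-sender sequence number starting at 0, which is an
assumption of this sketch rather than something stated above.  The
names reorder_buf and reorder_push, the int payload, and the window
size are all hypothetical.  It only reorders; it does not retransmit
lost packets.

```c
#include <stdbool.h>
#include <stddef.h>

#define REORDER_WINDOW 8   /* how far ahead of next_seq we will buffer */

/* Per-sender receive state (hypothetical; payload reduced to an int). */
struct reorder_buf {
    unsigned long next_seq;            /* next sequence number to deliver */
    bool present[REORDER_WINDOW];      /* slot holds a buffered message */
    int payload[REORDER_WINDOW];
};

/*
 * Accept one received message.  Any messages that are now deliverable
 * in order are copied to out[], which must have room for REORDER_WINDOW
 * entries.  Returns the number delivered, or -1 if the message was
 * rejected as old, duplicate, or too far ahead of the window.
 */
static int reorder_push(struct reorder_buf *rb, unsigned long seq,
                        int payload, int *out)
{
    if (seq < rb->next_seq || seq >= rb->next_seq + REORDER_WINDOW)
        return -1;

    size_t slot = seq % REORDER_WINDOW;
    if (rb->present[slot])
        return -1;                     /* already buffered: duplicate */
    rb->present[slot] = true;
    rb->payload[slot] = payload;

    /* Drain every consecutive message starting at next_seq. */
    int delivered = 0;
    while (rb->present[rb->next_seq % REORDER_WINDOW]) {
        slot = rb->next_seq % REORDER_WINDOW;
        rb->present[slot] = false;
        out[delivered++] = rb->payload[slot];
        rb->next_seq++;
    }
    return delivered;
}
```

If sequence numbers 1 and 2 arrive before 0, the first two calls
deliver nothing and the third hands back all three in order.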

	-- Alan Robertson
	   alanr@suse.com

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.tummy.com
http://lists.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/