List: linux-cluster
Subject: re[2]: HA and process migration (tru64)
From: Greg Freemyer <freemyer () NorcrossGroup ! com>
Date: 2001-07-24 18:11:01
> > That's it? I thought there was more to it, esp. since supposedly
> > VMS job queues can survive being moved to a different node.
> > I'd think the functionality described below can be done with a few shell
> > scripts.
VMS clustering is still ahead of Tru64 clustering, so yes, the below is all that Tru64 does today (VMS does more, but I am not VMS knowledgeable).
Tru64 is one of the top-rated UNIX HA cluster solutions, but it does not yet support migration (just failover/restart), and even the functionality below is less than two years old under Tru64.
(Prior to that, there was just one socket listener per (IP:port) per cluster, not one listener per (IP:port) per node. That is, it was a pure HA cluster, with high-performance (HP) capability only available through custom-written apps like Oracle (OPS). Most commercial HA clusters are of this type, i.e. pure HA with no parallelism built in.)
The good news is that the below makes it very easy to create an HA/HP configuration of a "stateless" application like Apache without adding a separate director (LVS, a load-balancing router, etc.).
Given that you also have a Cluster File System (i.e. a common root), which Tru64 has, all you have to do is:
1) Create a new service IP for Apache and configure Apache to use it.
2) Invoke Apache on all nodes.
Now the automatically created and maintained IP director will distribute the sockets between the nodes in round-robin fashion, and if a node dies the IP director will quit sending it new sockets.
Tru64 has this so automated and seamless that many Tru64 cluster administrators are not even aware it is happening. In particular, I have seen many high-level Tru64 cluster drawings which leave off the IP director.
On the other hand, with a standalone/separate director (LVS, a router, etc.), the director is one of the key conceptual pieces of functionality and is shown even on high-level diagrams.
My thoughts on why process migration would be nice:
One negative with the above is that even Apache is not truly "stateless": it maintains sockets for a brief duration, and those sockets time out when a node is shut down in an uncontrolled manner (and end users have to click the refresh button).
For basic Apache webserving, this is easily handled in the controlled-shutdown case by shutting down Apache and letting all new sockets get routed to the other nodes, so there is zero end-user-observable behavior.
Unfortunately, many commercial applications have long-lived sockets (i.e. hours or days), so the above technique doesn't work for them.
Thus, in my opinion, the ability to migrate the process/socket pair prior to a controlled shutdown would be highly beneficial.
Many of you will now be thinking about keepalives and retry logic. I have been down that road several times, and have found it very distasteful.
Retries work fairly well on active sockets, but many sockets sit idle for extended periods, and the TCP/IP keepalive mechanism leaves much to be desired. It is not even supported in many popular OSes, if I recall correctly.
(The times I have tried to use it, I have had to take it back out because the TCP/IP stack just did not have keepalive working correctly.)
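For reference, here is roughly what turning on keepalive looks like from an application. This is just a sketch: the basic `SO_KEEPALIVE` switch is widely available, but the fine-tuning options shown are Linux-specific and simply do not exist on many stacks, which is part of the portability problem described above.

```python
import socket

def make_keepalive_socket():
    """Create a TCP socket with keepalive enabled and (where possible) tuned."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Portable part: turn keepalive probing on at all.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific tuning; without it you are stuck with the stack's
    # defaults (traditionally two hours of idle time before the first probe).
    if hasattr(socket, "TCP_KEEPIDLE"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle secs before probing
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # secs between probes
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before reset
    return s
```

Even when all of this works, keepalive only detects a dead peer; it does nothing to preserve the connection across a node shutdown, which is why socket migration would still be the better answer.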
Greg Freemyer
Internet Engineer
Deployment and Integration Specialist
The Norcross Group
www.NorcrossGroup.com
> > Greg Freemyer wrote:
> >
> > > As far as Compaq's TruClusters, they may have the infrastructure to
> > > support moving an open socket, but they don't yet have process migration,
> > > nor socket migration, available in their released product. Process
> > > migration is on the roadmap. I'm not sure about socket migration.
> > >
> > > What Compaq TruClusters does have is the following, and it may make
> > > future socket migration easier:
> > >
> > > Given a cluster of several nodes (max. of 8 for now).
> > > They elect one of them to be the service IP director. (They use HA
> > > technology to make this reliable.)
> > > All service IP traffic goes to the service IP director.
> > > The director then forwards it across the interconnect to the
> > > appropriate node.
> > > For each open socket, the director maintains a mapping to the node
> > > the traffic goes to.
> > >
> > > The director also maintains a list of listeners on each node.
> > > Then when a SYN comes in for a specific port, it distributes it in
> > > round-robin fashion between the listening nodes.
> > >
> > > They have a separate director for each service IP and a separate
> > > election process for each director.
> > >
> > > Another interesting aspect is that on outbound SYNs, you have the
> > > choice to identify yourself either by your local node's IP or by a
> > > service IP. I'm not sure if they support this to make admin of
> > > external systems easier, or if there is some HA aspect to it, or
> > > maybe it is to allow the future process/socket migration to work.
> > >
> > > I ASSUME that much of the above is implemented in the kernel, and
> > > even in the tcp/ip stack.
> > >
> > > Greg Freemyer
> > --
> > David Nicol 816.235.1187
> > Linux-cluster: generic cluster infrastructure for Linux
> > Archive: http://mail.nl.linux.org/linux-cluster/
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/