
List:       linux-cluster
Subject:    Re: HA, SSI and system administration
From:       David Brower <david.brower () oracle ! com>
Date:       2001-07-19 18:05:48

Greg Lindahl wrote:
> 
> On Wed, Jul 18, 2001 at 04:25:01PM -0700, Bruce Walker wrote:
> 
> > While I agree that transparency can in some cases be confusing, SSI
> > clustering (with or without process migration)
> > should be the saviour for HA clustering, which to date suffers from
> > significant complexity  ("n" systems to administer as well as
> > administering the HA part).
> 
> The last bit is why people who administer "n" systems often use tools
> that make such administration simpler. I realize that there's a
> marketing battle among companies that don't like that (Compaq)
> vs. companies that do (IBM SP), but I would hope that marketing
> wouldn't intrude too much on this mailing list.

There are degrees of SSI as well.  Bruce's flavor is towards the
extreme end of the spectrum.  The TruCluster flavor does not go
that far, but still presents a good single system image from the
management point of view, and seems less complicated in many
ways.   The key to obtaining good SSI manageability seems to
me to be a decent CFS, so that there are single installs, etc.;
to do that you need appropriate membership and concurrency
control mechanisms.
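To make the membership/concurrency point concrete, here is a toy
lease-based lock in Python; the class and method names are illustrative,
not any real CFS interface.  A node holds a resource only while its
lease is fresh, so a dead or hung node loses the lock once the lease
expires:

```python
import time

class LeaseLock:
    """Toy lease-based lock, a sketch of the concurrency control a
    cluster filesystem needs.  A node holds the lock only while its
    lease is fresh; a crashed or hung node loses it on expiry.
    All names here are illustrative, not any real CFS API."""

    def __init__(self, lease_secs=5.0, clock=time.monotonic):
        self.lease_secs = lease_secs
        self.clock = clock          # injectable for testing
        self.holder = None          # node id currently holding the lock
        self.expires = 0.0          # lease expiry time

    def acquire(self, node_id):
        now = self.clock()
        if self.holder is None or now >= self.expires:
            # free, or the previous holder's lease lapsed: take it
            self.holder, self.expires = node_id, now + self.lease_secs
            return True
        return self.holder == node_id

    def renew(self, node_id):
        now = self.clock()
        if self.holder == node_id and now < self.expires:
            self.expires = now + self.lease_secs
            return True
        return False                # lost the lease; must re-acquire
```

With a fake clock you can watch a second node take the lock once the
first node's lease lapses, which is the membership property the CFS
relies on.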

I still haven't heard a compelling case for process migration.
The way I'm hearing it is that it is good for a class of
parallel programs that
	- are abstracted to be neutral about communication
		requirements between processes;
	- are insensitive to the performance of IPC, since
		sometimes it will be remote, and other
		times local;
	- do not interact with unmodified programs outside
		the cluster, as it is not possible to recover
		tcp state in the event of a crash.

I think these conditions hold only for some applications;
most parallel work is likely to be sensitive to placement for
latency and bandwidth reasons.  Migration seems useful only for
a class of applications where it is cheaper to throw more boxes
at the problem than to write the apps to use a smaller number of
boxes more efficiently.  Once you write for efficiency, I think
processes become difficult to migrate effectively.   It is the 
case that UNIX pipelines and Make processing, which are trivially
parallel operations, can work well spreading load across the 
available CPUs.   Those remain the perfect SMP apps as well.
But such applications are not particularly fault tolerant, 
except as they accidentally checkpoint their progress through the filesystem
(Make).
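A minimal sketch of that accidental checkpointing, in Python rather
than Make (the function and file names are hypothetical): a step whose
output file already exists is skipped on rerun, so a crashed pipeline
redoes only the work it lost:

```python
import os

def run_step(output_path, build_fn):
    """Make-style restart safety, a minimal sketch: a step is
    "checkpointed" through the filesystem by its output file.
    If the output already exists, the step is skipped on rerun.
    `build_fn` returns the text to write; both names are
    illustrative, not any real tool's API."""
    if os.path.exists(output_path):
        return False                    # already done in a prior run
    data = build_fn()
    tmp = output_path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
    os.replace(tmp, output_path)        # atomic rename: no partial output
    return True
```

The atomic rename matters: a crash mid-write leaves only the `.tmp`
file, so the step correctly reruns, which is exactly the property Make
gets from its target files.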

I cannot imagine a database server running efficiently on a cluster
that would usefully have a process moved from one node to another;
I have a hard time imagining that moving the whole collection of
processes forming a server's image from one node to another would
be particularly interesting.

The first uncrossed hurdle for cluster tcp improvement in my mind
would be sending a viable RST to the client when a surviving node
takes over the address of the one that died.   Then the client
won't be sitting in a 10 minute tcp timeout and can at least
get an error and know to reconnect.  Doing transparent
failover of tcp would require checkpoint of every ack-ed packet.
Once you send the ack, you aren't going to get the other end to
resend you the data, so you can't afford to lose it.  If you can do that
efficiently, then we can talk about generalized process 
checkpoint/recovery for fault tolerance.
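For concreteness, here is a sketch in Python of building such an RST
segment with a valid RFC 793 checksum.  The addresses, ports and
sequence number are made up; actually injecting the packet from the
takeover node would need a raw socket, privileges, and a sequence
number the client will accept:

```python
import socket
import struct

def tcp_checksum(src_ip, dst_ip, segment):
    """Standard TCP checksum (RFC 793): one's-complement sum over the
    IPv4 pseudo-header plus the segment.  Pure computation, so the
    takeover node could build the RST in userspace; emitting it would
    still need a raw socket."""
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip),
                         socket.inet_aton(dst_ip),
                         0, socket.IPPROTO_TCP, len(segment))
    data = pseudo + segment
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                  # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF

def build_rst(src_ip, dst_ip, src_port, dst_port, seq):
    """Build a bare TCP RST segment (no options, no payload) as a
    surviving node could send from the dead node's address.  A sketch,
    not any real cluster stack's interface."""
    flags = 0x04                        # RST
    hdr = struct.pack("!HHIIBBHHH",
                      src_port, dst_port,
                      seq, 0,           # seq, ack
                      5 << 4,           # data offset: 5 words, no options
                      flags,
                      0, 0, 0)          # window, checksum=0, urgent
    csum = tcp_checksum(src_ip, dst_ip, hdr)
    return hdr[:16] + struct.pack("!H", csum) + hdr[18:]
```

A receiver validating the segment sums it the same way, checksum field
included, and expects zero.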

-dB

Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/

