'Re: [Linux-ha-dev] [ANN] New Clustering System'


List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] [ANN] New Clustering System
From:       Ian McKellar <yakk-linux-ha-dev () yakk ! net ! au>
Date:       2000-10-10 10:04:33

On Tue, Oct 10, 2000 at 01:48:38AM +0200, Michael Moerz wrote:
> >HOW IT WORKS:
> 
> Who owns the cluster ip, where is it located?

No machine's IP stack binds to the cluster IP. All cluster members use
libpcap to listen for traffic to this IP.
> 
> > ...
> > This packet, the claim packet has IP
> > protocol number 150 and contains the source address, source port,
> > destination port and protocol of the new session. 
> Until this point I understand; now it's getting a bit confusing for me

I'm using a new protocol over IP (not UDP or TCP because they provide things
I don't need). Since this packet is destined for a local IP it won't be
routed out of the local network.
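
A minimal sketch of how the payload of such a claim packet could be
encoded. The field order follows the description above (source address,
source port, destination port, protocol), but the exact byte layout and
the names CLAIM_FORMAT, pack_claim and unpack_claim are assumptions made
for illustration, not the actual wire format used by the daemon.

```python
import socket
import struct

CLAIM_PROTO = 150  # IP protocol number named in the announcement

# Hypothetical layout: 4-byte source address, 2-byte source port,
# 2-byte destination port, 1-byte IP protocol, network byte order.
CLAIM_FORMAT = "!4sHHB"


def pack_claim(src_addr, src_port, dst_port, proto):
    """Encode a claim tuple into bytes."""
    return struct.pack(
        CLAIM_FORMAT, socket.inet_aton(src_addr), src_port, dst_port, proto
    )


def unpack_claim(payload):
    """Decode bytes back into (src_addr, src_port, dst_port, proto)."""
    raw_addr, src_port, dst_port, proto = struct.unpack(CLAIM_FORMAT, payload)
    return (socket.inet_ntoa(raw_addr), src_port, dst_port, proto)


payload = pack_claim("10.0.0.3", 1234, 80, socket.IPPROTO_TCP)
```

Actually sending this would need a raw socket opened with protocol
number 150, which normally requires root privileges.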

> 
> > When a claim packet is
> > received, the cluster member checks to see if the claim packet is for a
> > session that has already been claimed (if it is, then it is ignored) and

> How can the cluster member determine that a `session` has been
> claimed?

A session has been claimed if the cluster member has already seen a claim
packet for it.

> Actually what do you define to be a `session` ?

For TCP it's a TCP connection. For UDP it's a remote host:port address. These
might need some refining - especially for UDP based protocols.
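
One way to picture these two definitions is as a lookup key for a table
of sessions. This is only a sketch: the function name session_key and
the choice of tuple fields are assumptions, not taken from the clusterd
source.

```python
import socket


def session_key(proto, src_addr, src_port, dst_port):
    """Return a hashable key identifying the session a packet belongs to."""
    if proto == socket.IPPROTO_TCP:
        # A TCP connection to the cluster IP: remote endpoint plus
        # the service port it connected to.
        return ("tcp", src_addr, src_port, dst_port)
    if proto == socket.IPPROTO_UDP:
        # For UDP the session is just the remote host:port.
        return ("udp", src_addr, src_port)
    raise ValueError("unsupported protocol: %d" % proto)
```

With this keying, two UDP datagrams from the same remote host:port to
different service ports would share a session, which is one of the cases
where the UDP definition might need refining.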

> How often are claim packets for a `session` sent by the cluster member?

Currently only when a new incoming session appears (i.e. a TCP packet with
the SYN flag set, or a UDP packet from a source that hasn't been seen
before). I intend to extend the system so that cluster members might try
to claim sessions that seem to have stalled or failed (using a timeout of
some sort).
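
The current trigger rule could be sketched like this. The function name
starts_new_session and the idea of passing in a set of already known UDP
sources are assumptions for illustration; the timeout-based reclaiming
mentioned above is not covered.

```python
import socket

TCP_SYN = 0x02  # SYN bit in the TCP flags byte


def starts_new_session(proto, tcp_flags, udp_source, known_udp_sources):
    """Return True if this packet should cause a claim packet to be sent."""
    if proto == socket.IPPROTO_TCP:
        # Any TCP packet with the SYN flag set counts as a new session.
        return bool(tcp_flags & TCP_SYN)
    if proto == socket.IPPROTO_UDP:
        # A UDP packet counts only if its source has not been seen before.
        return udp_source not in known_udp_sources
    return False
```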
> 
> > if the packet originated from itself. If the first claim packet for a
> > particular session originated from the receiving server, then that
> > server
> > handles the session.

> Ah, now I am completely lost. Where do we receive that packet ? 
> (Somehow this description is a bit lacking the important details.)

Yeah, I realise that I didn't mention how packets are received. The code
currently uses libpcap (what tcpdump is based on). We're watching for:
 a) Traffic to the cluster address (so that as appropriate we can forward
    it to the real local services)
 b) Traffic from the real address to the virtual address (so that we can
    forward it to the appropriate remote address)
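
The two kinds of traffic above can be sketched as a capture filter plus
a classifier. The helper names build_filter and classify are
hypothetical, and whether clusterd installs a filter like this is an
assumption; the expression itself uses standard pcap filter syntax.

```python
def build_filter(cluster_ip, real_ip, virtual_ip):
    """Build a pcap filter expression matching cases (a) and (b)."""
    return "dst host %s or (src host %s and dst host %s)" % (
        cluster_ip, real_ip, virtual_ip)


def classify(src, dst, cluster_ip, real_ip, virtual_ip):
    """Return 'inbound' for case (a), 'outbound' for case (b), else None."""
    if dst == cluster_ip:
        return "inbound"    # (a) traffic to the cluster address
    if src == real_ip and dst == virtual_ip:
        return "outbound"   # (b) traffic from the real to the virtual address
    return None
```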

An example:
We have servers A and B and client C.

A has IP address 10.0.0.1
B has IP address 10.0.0.2
C has IP address 10.0.0.3

We allocate an IP for the cluster (10.0.0.10) and virtual IPs for A
(10.0.0.11) and B (10.0.0.12).

C wants to retrieve a web page. It opens a TCP connection from 10.0.0.3:1234
to 10.0.0.10:80.

The clusterd process on A sees a request go by on the ethernet for 10.0.0.10
and it knows that that's the cluster IP. It adds an entry to its table of
TCP sessions for 10.0.0.3:1234 and marks it unclaimed. It then sends a
claim packet to the network (destination will be 10.0.0.10) which is 
basically the tuple:
  (10.0.0.3,1234,80,IPPROTO_TCP)
Machine B does the same.

Machines A and B will both receive both those claim packets in the same
order (because ethernet is a shared broadcast medium). Let's assume that
A's packet arrives first. A will take an active role in this session and B
will take a passive role.
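
The first-claim-wins rule can be simulated in a few lines. The Member
class and its method names are made up for this sketch, and it assumes,
as stated above, that every member sees the claim packets in the same
order.

```python
class Member:
    def __init__(self, addr):
        self.addr = addr
        self.roles = {}  # session tuple -> "active" or "passive"

    def on_claim(self, sender, session):
        """Process one claim packet seen on the wire."""
        if session in self.roles:
            return  # session already claimed; later claims are ignored
        # The first claim decides: active if it was our own, else passive.
        self.roles[session] = "active" if sender == self.addr else "passive"


session = ("10.0.0.3", 1234, 80, 6)  # 6 is IPPROTO_TCP
a, b = Member("10.0.0.1"), Member("10.0.0.2")

# A's claim reaches the wire first, then B's; both members see both.
for sender in ("10.0.0.1", "10.0.0.2"):
    for member in (a, b):
        member.on_claim(sender, session)
```

Because both members apply the same rule to the same sequence of
packets, they reach the same decision without any further negotiation.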

A rewrites the headers of the IP and TCP packets that it has received for
the session from being 10.0.0.3:1234->10.0.0.10:80 to being
10.0.0.11:2000->10.0.0.1:80 (where 2000 is a unique port that clusterd
allocates) and sends the packets back out onto the wire.

Until it sees the end of the connection A will translate packets being
sent from 10.0.0.1:80 to 10.0.0.11:2000 into packets from 10.0.0.10:80 to
10.0.0.3:1234.
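
The address rewriting in both directions can be sketched as a small
translation table, using the addresses for machine A from the example.
The Translator class, its method names and the sequential port
allocation starting at 2000 are assumptions for illustration; packets
are modelled as plain (src, sport, dst, dport) tuples rather than real
headers.

```python
from itertools import count

CLUSTER_IP = "10.0.0.10"
REAL_IP = "10.0.0.1"      # machine A's real address
VIRTUAL_IP = "10.0.0.11"  # machine A's virtual address


class Translator:
    def __init__(self, first_port=2000):
        self._ports = count(first_port)
        self._by_client = {}  # (client_ip, client_port, service_port) -> port
        self._by_port = {}    # allocated port -> (client_ip, client_port)

    def inbound(self, src, sport, dst, dport):
        """Rewrite client -> cluster into virtual -> real."""
        key = (src, sport, dport)
        if key not in self._by_client:
            port = next(self._ports)  # unique port for this session
            self._by_client[key] = port
            self._by_port[port] = (src, sport)
        return (VIRTUAL_IP, self._by_client[key], REAL_IP, dport)

    def outbound(self, src, sport, dst, dport):
        """Rewrite real -> virtual into cluster -> client."""
        client_ip, client_port = self._by_port[dport]
        return (CLUSTER_IP, sport, client_ip, client_port)


t = Translator()
to_service = t.inbound("10.0.0.3", 1234, "10.0.0.10", 80)
to_client = t.outbound("10.0.0.1", 80, "10.0.0.11", 2000)
```

A real implementation would also have to recompute the IP and TCP
checksums after changing addresses and ports, and drop the table entry
when the connection ends.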

> 
> I had a look at your webaddress, but there isn't any more detailed 
> documentation about the program you are describing. I have also
> downloaded your tar.gz but the source files do also lack comments.
> Actually I am not willing to read your source and try to understand it,
> cause I have to do enough on my own.
> I would really appreciate if you could be so kind and write something
> more detailed about your daemon, otherwise I and perhaps some other 
> people won't be able to do anything with your work.

Yeah, I'm sorry about that. I made this release late at night and my
documentation is lacking. Perhaps I can better explain my technique through
this discussion and then I can turn that into documentation.

Ian

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.tummy.com
http://lists.tummy.com/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
