
List:       9fans
Subject:    Re: [9fans] xcpu fix for Plan9.c
From:       Ronald G Minnich <rminnich () lanl ! gov>
Date:       2005-10-25 20:56:58
Message-ID: 435E9C1A.8000308 () lanl ! gov

Kenji Okamoto wrote:

> term% xcpusrv -sxcpu
> ....
> term% mount -ac /srv/xcpu /mnt/xcpu
> term% openclone /mnt/xcpu
> (then on a newly created rio window)
> term% cat /mnt/xcpu/clone
> 65537term%    <=============here! not 65536!
> term% ls -l /mnt/xcpu
> --rw-rw-rw- M 66 ssh ssh 0 Jan  1  1970 /mnt/xcpu/65536  <========= not 65537!
> --rw-rw-rw- M 66 ssh ssh 0 Jan  1  1970 /mnt/xcpu/clone

Weird. I will have to try this on Plan 9. I think I know what I am doing 
wrong, but I want to verify it.
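For what it's worth, the mismatch above looks like a classic off-by-one between naming the session directory and answering the clone read. Here is a hypothetical sketch of that bug and its fix — the names (session_counter, clone_buggy, clone_fixed) are invented for illustration and are not xcpusrv's actual code:

```c
#include <stdio.h>

/* Hypothetical sketch of the off-by-one in the transcript above: the
 * session directory is named with the counter's old value, but the
 * clone read reports the value after the increment. */

static int session_counter = 65536;

/* buggy: directory becomes /mnt/xcpu/65536, but cat clone prints 65537 */
void clone_buggy(char dirname[32], char clonebuf[32])
{
	snprintf(dirname, 32, "%d", session_counter);
	session_counter++;
	snprintf(clonebuf, 32, "%d", session_counter);
}

/* fixed: the directory name and the clone read use the same id */
void clone_fixed(char dirname[32], char clonebuf[32])
{
	int id = session_counter++;
	snprintf(dirname, 32, "%d", id);
	snprintf(clonebuf, 32, "%d", id);
}
```

The convention elsewhere in Plan 9 (e.g. /net) is that reading clone returns the id of the directory just allocated, so the second form matches what a client expects.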

> 
> On the point of why only one copy of the executable file: I hadn't noticed
> the danger in your /bin/sh example.  Now I see why you
> chose that way.   So, in xcpu we should do only batch jobs, right?

No; if you look at xsh, you can see it can be used for interactive jobs 
(in the future).

The real issues are these. Suppose I do:
cp /bin/date exec
echo exec > ctl
cp /bin/uname exec
echo exec > ctl
cat stdout

OK, what does the output of stdout mean in this case? How do I 
distinguish the two jobs in a reasonable way? If I want to control the 
process, which one am I controlling? If I get an EOF on the stdout file, 
which one did I get an EOF for? Should stdout deliver more than one EOF, 
one for each process that ends? The whole situation turns into a confusing 
mess if you let more than one process run per 'exec' file.
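The steps above can be sketched as a small C driver for one session, assuming the per-session exec/ctl/stdout layout shown in the transcript. run_one and copyfile are invented names, and the sketch reads ordinary files rather than a live 9P mount; with one process per session, the EOF on stdout is unambiguous:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* copy a binary byte-for-byte into the session's exec file */
static int copyfile(const char *src, const char *dst)
{
	FILE *in, *out;
	char buf[8192];
	size_t n;

	if((in = fopen(src, "rb")) == NULL)
		return -1;
	if((out = fopen(dst, "wb")) == NULL){
		fclose(in);
		return -1;
	}
	while((n = fread(buf, 1, sizeof buf, in)) > 0)
		fwrite(buf, 1, n, out);
	fclose(in);
	fclose(out);
	return 0;
}

/* run one binary in a session directory: copy it to exec, write
 * "exec" to ctl, then read stdout until EOF; returns malloc'd text */
char *run_one(const char *sessdir, const char *binary)
{
	char path[512], buf[8192], *out = NULL;
	size_t n, len = 0;
	FILE *f;

	snprintf(path, sizeof path, "%s/exec", sessdir);
	if(copyfile(binary, path) < 0)
		return NULL;

	snprintf(path, sizeof path, "%s/ctl", sessdir);
	if((f = fopen(path, "w")) == NULL)
		return NULL;
	fprintf(f, "exec\n");
	fclose(f);

	snprintf(path, sizeof path, "%s/stdout", sessdir);
	if((f = fopen(path, "r")) == NULL)
		return NULL;
	while((n = fread(buf, 1, sizeof buf, f)) > 0){
		out = realloc(out, len + n + 1);
		memcpy(out + len, buf, n);
		len += n;
	}
	fclose(f);
	if(out != NULL)
		out[len] = '\0';
	return out;
}
```

Because the session ends when its one process ends, the caller knows exactly which process the EOF belongs to — the ambiguity only appears if a second exec is allowed on the same files.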

> On xsh.c, I still have a problem.
> Isn't this a program to be used in a cluster environment?

Yes, but I make no guarantees. I am still fighting what might be a p9p 
issue on Linux, so xsh is not quite ready yet. I am trying to get this 
thing ready for SC '05, but I am running out of time :-)

> 
> In the file, there is a line of
> 	dirno[nodeno++] = smprint("/%s/%s/xcpu/%s", base, s, buf);
> 
> here, I suppose we should name our cluster's cpu server by s,
> such as
> /mnt/xcpu/"s"/xcpu/"number", right?

Yes, exactly. I had no idea how to name things, so this is what I 
came up with. It is done this way so that it fits Linux as well. On 
Linux, we will have
/mnt/xcpu/"s"/xcpu ==> xcpu server for "s"
/mnt/xcpu/"s"/fs   ==> u9fs server for that node, so we can access files 
on the node "s"

Note that in this model, the node is the server, and it exports both the 
xcpu service and its own file system.
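As a sketch of that naming scheme, assuming base is "mnt/xcpu" (an assumption — the mail only shows the format string) and using snprintf in place of Plan 9's allocating smprint:

```c
#include <stdio.h>

/* mirrors the smprint call quoted from xsh.c:
 * "/%s/%s/xcpu/%s" -> /base/node/xcpu/session */
void session_path(const char *base, const char *node,
                  const char *session, char out[256])
{
	snprintf(out, 256, "/%s/%s/xcpu/%s", base, node, session);
}

/* the companion u9fs mount for the same node on Linux:
 * /base/node/fs gives access to that node's own files */
void fs_path(const char *base, const char *node, char out[256])
{
	snprintf(out, 256, "/%s/%s/fs", base, node);
}
```

So node "0", session "65536" yields /mnt/xcpu/0/xcpu/65536 for the xcpu service and /mnt/xcpu/0/fs for the node's file system.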
> 
> So, we have only one cpu server, then we should use the xsh command
> like 
> term% xsh 0 
> ?

yes.

> 
> Or if we have many cpu server, then
> 
> xsh 0 1 2 3 4 5 6 7
> ?

Yes. So to run date on all those nodes:
xsh 0 1 2 3 4 5 6 7 -- /bin/date

> 
> By the way, in your mkfile-plan9
> 
> $O.xsh: xsh.$O P9pshell.$O
> 	$LD -o $target $prereq $LDFLAGS
> 
> should be
> $O.xsh: xsh.$O Plan9shell.$O    <============
> 	$LD -o $target $prereq $LDFLAGS

Thanks for that fix; I just installed it.

thanks again

ron
