
List:       beowulf
Subject:    Re: [Beowulf] first cluster [was [OMPI users] trouble using
From:       Douglas Guptill <douglas.guptill@dal.ca>
Date:       2010-07-13 22:05:38
Message-ID: 20100713220538.GA15163@sopalepc

Hello Gus, list:

On Fri, Jul 09, 2010 at 07:06:05PM -0400, Gus Correa wrote:
> Douglas Guptill wrote:
> > On Thu, Jul 08, 2010 at 09:43:48AM -0400, Gus Correa wrote:
> > > Douglas Guptill wrote:
> > > > On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
> > > > 
> > > > > No....afraid not. Things work pretty well, but there are places
> > > > > where things just don't mesh. Sub-node allocation in particular is
> > > > > an issue as it implies binding, and slurm and ompi have conflicting
> > > > > methods.
> > > > > 
> > > > > It all can get worked out, but we have limited time and nobody cares
> > > > > enough to put in the effort. Slurm just isn't used enough to make it
> > > > > worthwhile (too small an audience).
> > > > I am about to get my first HPC cluster (128 nodes), and was
> > > > considering slurm.  We do use MPI.
> > > > 
> > > > Should I be looking at Torque instead for a queue manager?
> > > > 
> > > Hi Douglas
> > > 
> > > Yes, Torque works like a charm along with OpenMPI.
> > > I also have MVAPICH2 and MPICH2, no integration w/ Torque,
> > > but no conflicts either.
> > 
> > Thanks, Gus.
> > 
> > After some lurking and reading, I plan this:
> > Debian (lenny)
> > + fai                   - for compute-node operating system install
> > + Torque                - job scheduler/manager
> > + MPI (Intel MPI)       - for the application
> > + MPI (Open MPI)        - alternative MPI
> > 
> > Does anyone see holes in this plan?
> > 
> > Thanks,
> > Douglas
> 
> 
> Hi Douglas
> 
> I never used Debian, fai, or Intel MPI.
> 
> We have two clusters with cluster management software, i.e.,
> mostly the operating system install stuff.
> 
> I made a toy Rocks cluster out of old computers.
> Rocks is a minimum-hassle way to deploy and maintain a cluster.
> Of course you can do the same from scratch, or do more, or do better,
> which makes some people frown at Rocks.
> However, Rocks works fine, particularly if your network(s)
> are Gigabit Ethernet,
> and if you don't mix different processor architectures (i.e. only i386  
> or only x86_64, although there is some support for mixed stuff).
> It is developed/maintained by UCSD under an NSF grant (I think).
> It's been around for quite a while too.
> 
> You may want to take a look, perhaps experiment with a subset of your
> nodes before you commit:
> 
> http://www.rocksclusters.org/wordpress/

I am sure Rocks suits many, but not me, at first glance.  I am too
much of a tinkerer.  That comes, partly, from starting in this
business too early; my first computer was a Univac II - vacuum tubes,
no operating system.

> What is the interconnect/network hardware you have for MPI?
> Gigabit Ethernet?  Infiniband?  Myrinet? Other?

Infiniband - QLogic 12300-BS18

> If Infiniband you may need to add the OFED packages,

Gotcha.  Thanks.
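
For the record: my understanding is that the usual sanity checks,
once OFED is in, are its stock diagnostics, e.g.

  ibstat        - port state (should show "Active" once the fabric is up)
  ibv_devinfo   - HCA details as seen by the verbs library
  ibhosts       - hosts visible on the fabric

plus a point-to-point run of ib_write_bw between two nodes to confirm
the bandwidth is in the expected range.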

> If you are going to handle a variety of different compilers, MPI  
> flavors, with various versions, etc, I recommend using the
> "Environment module" package.

My one user has requested that.
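
With two MPI stacks and possibly several compilers, my understanding
is that each combination gets its own modulefile, and users then pick
with the stock Environment Modules commands (the module names below
are made-up examples):

  module avail                         - list installed modulefiles
  module load openmpi/1.4-gcc          - hypothetical Open MPI module
  module switch openmpi intel-mpi/4.0  - swap one MPI for another
  module list                          - show what is loaded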

> I hope this helps.

A big help.  Much appreciated.
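
In case it helps other first-cluster builders: the quickest end-to-end
smoke test I know of, once Torque and an MPI stack are up, is a tiny C
program compiled with whichever mpicc wrapper is loaded and submitted
through qsub with an mpirun launch line.  A sketch:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, len;
      char name[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);                  /* start the MPI runtime */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank   */
      MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total process count   */
      MPI_Get_processor_name(name, &len);      /* node we landed on     */
      printf("rank %d of %d on %s\n", rank, size, name);
      MPI_Finalize();
      return 0;
  }

If every rank reports in, and the node names span the allocation, then
the scheduler, the MPI library, and the interconnect are at least on
speaking terms.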

Douglas.
-- 
  Douglas Guptill                       voice: 902-461-9749
  Research Assistant, LSC 4640          email: douglas.guptill@dal.ca
  Oceanography Department               fax:   902-494-3877
  Dalhousie University
  Halifax, NS, B3H 4J1, Canada

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

