
List:       npaci-rocks-discussion
Subject:    [Rocks-Discuss] Re: Reinstall Confusion
From:       Aaron Carr <aaronhcarr () gmail ! com>
Date:       2015-06-26 2:44:08
Message-ID: CANyxMSrfMNY5ukzEnNj_HqYtRFmPFXNzyy8g3pHFnrsSPHWRGQ () mail ! gmail ! com

If you don't set the NAS nodes to reinstall, they won't reinstall.

If you're in doubt, do "rocks list host boot" and it will tell you exactly
what each node is set to do the next time it boots up, assuming that each
of the nodes is set to PXE boot first.

Here's a loop example.

First, I don't have a Rocks machine handy at the moment, so you might have
to adjust the rocks part of the command a little.

What you want is *just* the hostname of the compute nodes.

So something like this:
rocks list host|grep compute|awk -F ' ' '{print $1}'

Once that command shows just the list of names you want (change the grep
if your names are different; awk prints the first whitespace-separated
field), use it to set the nodes to install.
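
A minimal sketch of that filter as a reusable shell function. It assumes the
"rocks list host" output puts the host name in the first column, possibly
followed by a colon that would otherwise end up in the name. The function
name compute_hosts is made up for this example, so check the output on your
own frontend before relying on it.

```shell
# compute_hosts: read "rocks list host" style output on stdin and print
# only the compute node names, one per line.
# Assumption: the host name is the first column and may end in ":".
compute_hosts() {
    awk '$1 ~ /^compute/ { sub(/:$/, "", $1); print $1 }'
}

# Usage on a frontend (not run here):
#   rocks list host | compute_hosts
```

Matching on the first column only keeps the header row and the NAS nodes out
of the list, whatever the other columns happen to contain.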

for i in `rocks list host|grep compute|awk -F ' ' '{print $1}'`; do rocks set host boot $i action=install; done

Verify by doing "rocks list host boot".  The nodes you wanted should be set
to install now.
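
On a large cluster that check is easier to read if only the exceptions are
printed. A rough sketch, assuming "rocks list host boot" prints two columns
(host name, possibly with a trailing colon, then the action); the function
name not_set_to_install is hypothetical.

```shell
# not_set_to_install: read "rocks list host boot" style output on stdin and
# print every compute node whose action is not "install".
# Assumption: column 1 is the host name (maybe ending in ":"), column 2 the action.
not_set_to_install() {
    awk '$1 ~ /^compute/ && $2 != "install" { sub(/:$/, "", $1); print $1 }'
}

# Usage on a frontend (not run here); no output means every compute node
# is flagged for install:
#   rocks list host boot | not_set_to_install
```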

Now run the reboot loop.
for i in `rocks list host|grep compute|awk -F ' ' '{print $1}'`; do ssh $i reboot; done

If you need a delay between reboots, use this:

for i in `rocks list host|grep compute|awk -F ' ' '{print $1}'`; do ssh $i reboot; sleep 10; done
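
The same loop can be wrapped in a function with a dry-run switch, so the list
of nodes can be checked before anything is actually rebooted. This is only a
sketch: rolling_reboot and DRY_RUN are names invented for the example, and
the host names in the usage line are placeholders.

```shell
# rolling_reboot DELAY HOST...
# Reboot each host over ssh, pausing DELAY seconds after each one.
# With DRY_RUN=1 the ssh commands are printed instead of executed.
rolling_reboot() {
    delay=$1
    shift
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $host reboot"
        else
            ssh "$host" reboot
        fi
        sleep "$delay"
    done
}

# Usage (placeholder host names, dry run first):
#   DRY_RUN=1 rolling_reboot 10 compute-0-0 compute-0-1
#   rolling_reboot 10 compute-0-0 compute-0-1
```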





On Thu, Jun 25, 2015 at 5:28 PM, Michael Hsu <Michael.Hsu@faradayfuture.com>
wrote:

> Can you show me an example of using a for loop to reboot the nodes?  I
> also have NAS nodes, how can I make sure the reinstall is only done on the
> compute nodes?
> 
> 
> -----Original Message-----
> From: npaci-rocks-discussion-bounces+michael.hsu=faradayfuture.com@sdsc.edu
> [mailto:npaci-rocks-discussion-bounces+michael.hsu=faradayfuture.com@sdsc.edu]
> On Behalf Of Aaron Carr
> Sent: Thursday, June 25, 2015 5:18 PM
> To: Discussion of Rocks Clusters
> Subject: [Rocks-Discuss] Re: Reinstall Confusion
> 
> Compute nodes should be set to PXE boot first.
> 
> When a node PXE boots, Rocks tells it what to do.
> 
> If you've set a node for reinstall:
> rocks set host boot compute-1-01 action=install
> 
> Then the node will run the installer, along with any extra customization
> you've added via extend-compute.xml.
> 
> If the node is NOT set for reinstall, Rocks will tell it to boot to the
> local disk.
> 
> When you execute "rocks create distro", that updates the files that *will*
> be installed on the nodes.  It does not change anything on nodes that are
> already running.
> 
> For example, if I add a piece to my extend-compute.xml to install the
> Mellanox OFED on the compute nodes, then cd to /export/rocks/install and
> run "rocks create distro", any node that I install (or reinstall) after
> that point will have the Mellanox OFED.  Running nodes will be exactly as
> they were.
> 
> So do a for loop that sets each node to install after you've made changes,
> then reboot the nodes.
> 
> ALWAYS test your changes on a single node prior to doing all nodes.
> 
> If your cluster is large enough, rebuild the nodes in sections, by rack,
> or blade chassis for example.  I've also done a 1000+ node cluster by
> executing a for loop to reboot the nodes, with a 10 second sleep in the
> command.  That way it just gracefully rolled through the entire cluster
> until all the nodes had been rebuilt.
> 
> On Thu, Jun 25, 2015 at 4:53 PM, Michael Hsu <Michael.Hsu@faradayfuture.com>
> wrote:
> 
> > I'm not sure if I'm overthinking things here, but I'm very confused
> > about the entire reinstall process.
> > 
> > When I perform a reinstall should the compute nodes be connecting
> > using PXE boot?
> > 
> > I notice that if it doesn't do a PXE reinstall, the nodes will still go
> > to the reinstall screen and request the kickstart file from the head
> > node, right?  Is that the same thing as booting through PXE?
> > 
> > I also know that each node has a local copy of the base roll so when I
> > do a rocks create distro is it updating all the distros on all the nodes?
> > 
> > When I flag boot action=install on my nodes and then run shutdown -r
> > now on the nodes it's not reinstalling.  Instead if I run
> > /boot/kickstart/cluster-kickstart it will then reinstall.  The
> > documentation shows both methods but which one is right?