
List:       npaci-rocks-discussion
Subject:    Re: [Rocks-Discuss] second NIC configuration problem
From:       Bob Ball <ball () umich ! edu>
Date:       2007-03-29 17:19:34
Message-ID: 460BF526.1030604 () umich ! edu

This has worked on the problematic node.  Thanks.

bob

Mason J. Katz wrote:
> Sorry, to minimize the effort here do the following:
>
> $ mysql --user=apache cluster
>> delete from networks where device != "eth0";
>
> This will remove everything but the private network from the compute
> nodes.  At this point just re-install them (do not need to run
> insert-ethers).  The re-installation will populate the eth1 devices
> for all the nodes.  At this point you can try add-extra-nic again.
>
> You need to remove the inconsistency of multiple rows for the extra
> NIC.  The above will do this.  We will try to reproduce this
> inconsistency in the lab.  Thanks.
>
> -mjk
>
> On 3/21/07, Bob Ball <ball@umich.edu> wrote:
>> I just want to check on the command below and what it does as it broke
>> over the lines as you can see.  The command should be:
>> mysql --user=apache cluster delete from networks;
>>
>> Is that correct?  You seem to imply that all info on all the cluster
>> clients will be removed by this command, so then we go back to square
>> one in this case and re-install them all via "insert-ethers", right?
>>
>> No one has done any direct mods to the tables, at least, not that I'm
>> aware of.  What I noticed is that the extra eth1 entry came after I did
>> the "shoot-node".  It was not there before then.  I'm going to try one
>> other thing today, and see if it also happens if I just reset the node
>> and force a PXE boot.
>>
>> Also interesting, "dbreport ifcfg eth1 <node>" performed by hand on the
>> head node gives the correct output, but the install on the client picks
>> the bogus eth1 entry to configure from.
>>
>> bob
>>
>> Mason J. Katz wrote:
>> > There may be a bug in add-extra-nic and insert-ethers that you
>> > triggered.  Something left the database in an inconsistent state.
>> > Please do the following:
>> >
>> > # mysql --user=apache cluster
>> >> delete from networks;
>> >
>> > Now re-install the compute nodes.  This will re-populate the database
>> > with the correct NIC information.  Once this is done you can redo the
>> > add-extra-nic commands for the second NIC.
>> >
>> > Is there any chance someone modified the database tables directly?
>> >
>> > -mjk
>> >
>> > On 3/20/07, Bob Ball <ball@umich.edu> wrote:
>> >> We are having a problem in that I can't seem to get eth1 to come up
>> >> correctly on our public nets.  Some history: we moved and re-racked all
>> >> our machines, and are now in the process of re-naming them all, and
>> >> re-addressing them as well.  I used add-extra-nic to delete eth1,
>> >> rocks-partition to delete partition info, insert-ethers to remove the
>> >> node, did an "insert-ethers --update" to be safe, blew away the disk
>> >> content entirely, then PXE booted the node with insert-ethers running to
>> >> pick up the request.  All fine so far.  The node came up fine, as
>> >> expected, without eth1.  I then added eth1 according to the User's Guide
>> >> 5.4 directions, confirmed it was in place, and re-built the node.  Not
>> >> only did the NIC not come up, but the table got modified as shown below
>> >> with the extra entry.  Any ideas on this one?
>> >>
>> >> Note that this IP has not yet appeared in DNS as this is still
>> >> propagating through the hierarchy.  Should this make any difference?
>> >>
>> >> [umopt1:bobdist]# add-extra-nic --list ci-1-12
>> >> -------------------------------------------------------
>> >> |                       ci-1-12                       |
>> >> -------------------------------------------------------
>> >> | Adapter |       IP      |    Netmask    |    Name   |
>> >> -------------------------------------------------------
>> >> |    eth1 | 192.41.230.50 | 255.255.255.0 | umt3int03 |
>> >> |    eth0 |    10.10.1.50 | 255.255.254.0 |   ci-1-12 |
>> >> |    eth1 |               |               |           |
>> >> -------------------------------------------------------
>> >>
>> >>
>> >> bob
>> >>
>> >
>>
>
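
For anyone who wants to see what the suggested delete does before running
it against a live cluster database, here is a minimal sketch on a throwaway
in-memory SQLite table.  It is not the Rocks schema: the column names (node,
device, ip, name) and the duplicate_devices helper are assumptions made up
for illustration, and the rows simply mirror the add-extra-nic listing quoted
above.  It first reports any device that appears more than once for a node,
then applies the same "delete ... where device != eth0" statement.

```python
import sqlite3

# Toy stand-in for the "networks" table. Column names are assumed for
# illustration only and may not match the real Rocks database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE networks (node TEXT, device TEXT, ip TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO networks VALUES (?, ?, ?, ?)",
    [
        ("ci-1-12", "eth1", "192.41.230.50", "umt3int03"),
        ("ci-1-12", "eth0", "10.10.1.50", "ci-1-12"),
        ("ci-1-12", "eth1", None, None),  # the stray, empty eth1 row
    ],
)


def duplicate_devices(conn):
    """Return (node, device, count) for each device listed more than once per node."""
    return conn.execute(
        "SELECT node, device, COUNT(*) FROM networks "
        "GROUP BY node, device HAVING COUNT(*) > 1 "
        "ORDER BY node, device"
    ).fetchall()


# Check for the inconsistency before changing anything.
before = duplicate_devices(conn)

# Same statement as in the thread, written with standard single quotes.
conn.execute("DELETE FROM networks WHERE device != 'eth0'")
conn.commit()

# After the delete only the private (eth0) rows should remain.
after = duplicate_devices(conn)
remaining = conn.execute("SELECT node, device, ip FROM networks").fetchall()

print("duplicates before:", before)
print("duplicates after: ", after)
print("rows remaining:   ", remaining)
```

On the real head node the duplicate check would be the safer first step,
since it only reads the table; the delete should wait until the column
names have been confirmed against the actual database.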
