List: npaci-rocks-discussion
Subject: Re: [Rocks-Discuss] Ganglia Problem w/ new switch
From: Jim Kusznir <jkusznir () gmail ! com>
Date: 2011-07-28 20:34:49
Message-ID: CAA3eeYCUQtq9i2keS1JFefEsg-+id0nbaD_seOX3Y8SGzGEdOw () mail ! gmail ! com
Interestingly enough, IGMP snooping was on, and when I turned it off,
I didn't notice any change. I did find that when I removed an uplink to
another switch of mine (which I recall has IGMP snooping turned on),
then ganglia lost ALL hosts immediately, and they never came back...
So, I went the other way and turned all the IGMP snooping (and a few
other generic IGMP) options on, and after a few moments, everything
came back. I can't say I understand what the different options did,
nor which one(s) specifically fixed the problem, but I am now
operational.
Thanks!
--Jim
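
For anyone chasing the same symptom, here is a minimal sketch of a listener that joins the gmond multicast group and reports which hosts have gone quiet, which can help tell "the switch is dropping multicast" apart from "gmond is not sending". It assumes the stock gmond.conf defaults (group 239.2.11.71, UDP port 8649); check your own gmond.conf, since neither value was stated in this thread. The function names are made up for illustration, and on a multi-homed head node the join may land on the wrong interface unless you pass the private interface address instead of 0.0.0.0.

```python
import socket
import struct
import time

# Assumed values: the defaults shipped in Ganglia's gmond.conf.
# Verify against your own gmond.conf before trusting the results.
MCAST_GROUP = "239.2.11.71"
MCAST_PORT = 8649


def open_multicast_listener(group=MCAST_GROUP, port=MCAST_PORT):
    """Join the multicast group and return a UDP socket bound to the port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # ip_mreq: group address followed by the local interface (INADDR_ANY).
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def silent_hosts(last_seen, now, max_age):
    """Return sorted hosts whose most recent packet is older than max_age seconds."""
    return sorted(h for h, t in last_seen.items() if now - t > max_age)


def watch(duration=60.0, max_age=30.0):
    """Listen for `duration` seconds, then report hosts that went quiet."""
    sock = open_multicast_listener()
    sock.settimeout(1.0)
    last_seen = {}
    deadline = time.time() + duration
    while time.time() < deadline:
        try:
            _data, (addr, _port) = sock.recvfrom(65535)
        except socket.timeout:
            continue
        # Record the arrival time of the latest packet from each sender.
        last_seen[addr] = time.time()
    sock.close()
    return last_seen, silent_hosts(last_seen, time.time(), max_age)


if __name__ == "__main__":
    seen, quiet = watch()
    print(f"heard from {len(seen)} hosts; quiet for >30s: {quiet}")
```

Run it on the head node and on a compute node attached to a different switch. If pings are stable but the set of senders heard differs between the two, or hosts keep dropping into the quiet list, multicast forwarding (IGMP snooping or querier settings) on the switches is the likely suspect.
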
On Thu, Jul 28, 2011 at 10:23 AM, Philip Papadopoulos
<philip.papadopoulos@gmail.com> wrote:
> Ganglia uses multicast to send updates. You may have to turn igmp snooping
> off on your switches, especially if you see up/down/up/down .....
>
> -P
>
>
> On Thu, Jul 28, 2011 at 9:15 AM, Jim Kusznir <jkusznir@gmail.com> wrote:
>
> > Hi all:
> >
> > I just performed a major overhaul of my network infrastructure on our
> > cluster in preparation for some upgrades/expansions. After the dust
> > settled from this, I ended up with one very strange bug, and I'm not
> > exactly sure what to look to as potential causes. If you take a look
> > at our ganglia page, you should see what's up:
> >
> > https://aeolus.wsu.edu/ganglia
> >
> > On here, you'll notice that some percentage of our nodes are always
> > coming and going. Pinging to those nodes appears nice and stable, but
> > ganglia doesn't see it that way. As a side note, after I brought the
> > nodes online, I had some switch trauma that ended up forcing me to
> > reboot the switch. When it came back up, ganglia was showing all
> > nodes as down. When I reset gmetad on the head node, nodes started
> > re-appearing, but in their present state of up/down randomness. I'm
> > still 95% sure this is a switch problem, but I don't know what to look for
> > on the switch. Suggestions?
> >
> > Thanks!
> > -Jim
> >
> >
>
>
> --
> Philip Papadopoulos, PhD
> University of California, San Diego
> 858-822-3628 (Ofc)
> 619-331-2990 (Fax)