
List:       evms-devel
Subject:    Re: [Evms-devel] RAID5 problems - EVMS engine cannot see
From:       "Bryn Hughes" <bhughes () vcc ! ca>
Date:       2003-06-30 20:17:01

I never ended up having time to do more troubleshooting with this array;
I needed the server back up and running. However, I now suspect that the
mainboard in the system may have been bad, as it completely failed a week
or so later (it started having serious trouble identifying drives).

This server has now been moved to another board (an old one I had in
storage). After identifying some bad memory and replacing it, I ended up
with a dead RAID array again. I had rebooted several times while
reconstruction of the array was underway, and now EVMS cannot start the
array. Here's what I see in my boot messages:
evms: EVMS v1.2.1 initializing .... info level(5)
evms: md core: OUT OF DATE, freshest: hdb
evms: md core: kicking non-fresh hdd from array!
evms: md core: kick_rdev_from_array: (hdd)
evms: md core: kicking non-fresh hdc from array!
evms: md core: kick_rdev_from_array: (hdc)
Of course, at that point I've got only 1 out of 3 disks available and the
array cannot be started.

I'm assuming this happened because of the reboots during reconstruction?
Any hints for resuscitation? I managed to get a backup of MOST of the
array before this problem (it was the backup software that identified the
trouble with the memory in this box), but I don't want to go through
restoring everything again if at all avoidable!
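(For what it's worth, the "kicking non-fresh" messages mean the md
superblocks on hdc and hdd carry an older event counter than hdb's, so
the md core refuses to use them. A rough sketch of how one might inspect
the counters and, as a last resort, force assembly — using mdadm rather
than anything EVMS-specific; the device names match Bryn's array, but the
exact commands are an assumption, not something from this thread, and
--force can silently lose writes made after the disks diverged:)

```shell
# Compare the event counters recorded in each member's md superblock.
# The member(s) with the highest "Events" value hold the freshest metadata.
mdadm --examine /dev/hdb1 | grep -i events
mdadm --examine /dev/hdc1 | grep -i events
mdadm --examine /dev/hdd1 | grep -i events

# Last resort: force assembly from the members whose counters are closest.
# --force updates the stale event counters so the kernel accepts the disks;
# data written after the array degraded may be inconsistent afterward.
mdadm --assemble --force /dev/md0 /dev/hdb1 /dev/hdc1 /dev/hdd1
```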
Thanks in advance,
Bryn
>>> "Steve Dobbelstein" <steved@us.ibm.com> - 6/10/03 10:11 AM >>>
Bryn Hughes wrote:
> So I rebooted into single user mode and took a stab at things with
> evms (clui)... I was able to apply the changes to /dev/md0 but now I
> have NOTHING! /proc/evms/mdstat shows NO raid devices. Running evms
> says:
>
> MDRaid5RegMgr: region md/md0 object index 1 is faulty. Array may be
> degraded.
>
> Just caught a look at my boot logs; it appears that evms has decided
> that /dev/hdc1 has failed and is trying to use /dev/hdd1 as a spare
> disk as well as the third disk in the array (hence the reason it can't
> get the array up at all). This is proving to be a great 'worst case'
> example of what can go wrong with evms...
>
> ______________________________
> Bryn Hughes
>
> Computer Support Analyst
> Macintosh/Linux Support
> Information and Computing Services
> Vancouver Community College
>
> ph: (604) 871-7007
> email: bhughes@vcc.ca
> ______________________________
> >>> "Bryn Hughes" <bhughes@vcc.ca> 06/09/03 10:27 AM >>>
> After much thrashing about I finally got EVMS working on Mandrake 9.1.
> I have managed to mount all my old volumes (created under Mandrake 8.1,
> evms 1.2.1) but I can't administer anything.
>
> md0 is a RAID5 array made up of hdb1, hdc1 and hdd1 (3 x 20 GB,
> thereabouts). I've been using this array for some time with good luck.
> For whatever reason my array has started coming up in degraded mode
> after rebuilding this server. When I start evmsn or evmsgui I get a
> message saying that md0 has inconsistent metadata and I am given the
> option to repair it. Saying 'yes' lets me into the engine, but then I
> get a string of messages saying that my volumes were discovered by the
> kernel but not by the engine. Attempting to save changes (so the
> changes to md0 are applied) gives me an error saying that files are in
> use (the server is running with /usr and /var in lvm containers on
> md0).
>
> Attached is a log file from evmsgui...
>
> Any advice?
>
> Bryn
Argh! My apologies for not posting sooner. It was a good thing that your
original attempts failed to update the MD metadata. EVMS should not have
flagged any errors. I was hoping to debug EVMS to find out why it thought
the MD superblocks were messed up before it tried to "fix" them.

Looks like you found a way to make it do the "fix" by booting in single
user mode. Sorry about the lost array. Now that the superblocks have been
modified, it will be hard to tell why it thought they were bad and what
actions it took to "correct" the problems.

I had a similar problem reported to me off the mailing list -- a
three-disk array that was "fixed" with the result that disk index 1 was
marked faulty and disk index 2 was marked a spare. It looks like this may
be a repeatable bug.

I know you have already put a lot of work into getting EVMS to work on
Mandrake. Would it be possible to try to recreate the problem? I believe
the recreation scenario is to create the MD device under Mandrake without
using EVMS, then install EVMS. EVMS should then complain that the MD
device has inconsistent metadata. At that point the "Fix" option should
*not* be selected, so that we can delve into why EVMS thinks the metadata
are bad on a device that obviously worked before EVMS was installed.
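(The non-EVMS creation step Steve describes would have looked roughly
like this with the raidtools shipped on distributions of that era; the
raidtab values below are illustrative assumptions based on Bryn's disk
layout, not his actual configuration:)

```shell
# /etc/raidtab describing a 3-disk RAID5 set on hdb1/hdc1/hdd1
cat > /etc/raidtab <<'EOF'
raiddev /dev/md0
    raid-level            5
    nr-raid-disks         3
    persistent-superblock 1
    chunk-size            32
    device                /dev/hdb1
    raid-disk             0
    device                /dev/hdc1
    raid-disk             1
    device                /dev/hdd1
    raid-disk             2
EOF

# Write fresh md superblocks and kick off the initial reconstruction
mkraid /dev/md0
```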
Steve D.


______________________________
Bryn Hughes

Computer Support Analyst 
Macintosh/Linux Support
Information and Computing Services
Vancouver Community College

ph: (604) 871-7007
email: bhughes@vcc.ca 
______________________________

