
List:       evms-devel
Subject:    Re: [Evms-devel] Help... please! It's trying to kill me! (Serious
From:       Itai Tavor <itai () tavor ! net>
Date:       2006-11-22 0:39:11
Message-ID: A4C94CCC-54F8-42AE-A782-47ECE1013F3C () tavor ! net

On 22/11/2006, at 7:35 AM, Steve Dobbelstein wrote:
>
> Itai Tavor <itai@tavor.net> wrote on 11/20/2006 07:43:40 AM:
>
>> Hi...
>
>> I got some serious problems with my server... for some reason that I
>> can't figure out, things suddenly started going wrong a couple of
>> days ago, and since then every time I tried to change anything,
>> everything got worse...
>
>> An example - I had a /dev/evms/duchess volume of 5GB. I created a
>> new /dev/evms/nas-data volume of 200GB. I kept getting errors trying
>> to access it, so I deleted it. But now, in some way I absolutely
>> can't explain, nas-data is back and duchess is a compatibility volume
>> (/dev/evms/lvm2/main/duchess) which has the same size and content of
>> nas-data...
>
>> Also, the biggest volume, which is a 900GB lvm2 compatibility volume
>> formatted with XFS, now returns error 990 whenever I try to access it.
>
>> The full output of evms_gather_info is at
>> http://insentiv.net/evms_gather_info . Please please please could
>> someone tell me if there's anything I can do to get my data back?
>
>> TIA, Itai
>
> Hello, Itai.
>
> One hint of what is going wrong is at the beginning of the output of
> evms_gather_info:
>
> LVM2: The PV with index 23 was not found when discovering container
> lvm2/main. An "error" object will be created in its place. Any regions in
> this container that map to this PV will return I/O errors if they attempt
> to read or write to this PV. Regions that don't map to this PV will work
> normally.
>
> In case you are not up on your LVM terms, a PV (Physical Volume) is an
> object that the volume group ("container" in EVMS terminology) comprises.
> For example, in your setup sda1, sda2, sda3, sda5, sda6, sda7, sda8,
> sda9, sdb2, sdb3, sdb7, sdb8, sdb9, sdc5, sdc6, sdc7, sdc8, sdc9, sdc10,
> sdc13, sdd5, sdd6, sdd7, sdd8, sdd9, and sdd10 are all PVs that belong to
> volume group lvm2/main.
>
> I don't know which PV had index 23, but whichever one it is, it's
> missing, which would explain why you get errors reading volume
> /dev/evms/nas-data.  From your output, part of region lvm2/main/nas-data
> resides on sda1 -- segment sda1 lists region lvm2/main/nas-data as one of
> its parents.  There are no other segments that have region
> lvm2/main/nas-data as a parent.  Seeing that segment sda1 is 37.26 GB and
> region lvm2/main/nas-data is 200.00 GB, a good chunk of lvm2/main/nas-data
> is missing and is probably on the missing PV.  The missing PV was replaced
> with an "error" object -- one that returns errors whenever it is accessed.
>
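Steve's size arithmetic can be checked directly; a quick sketch using the GB figures quoted from the evms_gather_info output (illustrative only):

```python
# Sizes quoted from the evms_gather_info output, in GB.
region_size = 200.00          # lvm2/main/nas-data
present_segments = [37.26]    # only sda1 lists nas-data as a parent

missing = region_size - sum(present_segments)
print(f"{missing:.2f} GB of nas-data map to the missing PV")  # 162.74 GB
```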
> The metadata for an EVMS volume are kept at the end of the object that is
> made into the volume.  If the end of region lvm2/main/nas-data is mapped
> to the error object, then EVMS will not discover the /dev/evms/nas-data
> volume since it won't be able to find the metadata for the volume.  It
> will make a compatibility volume instead.
>
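The fallback described above can be sketched abstractly. This is illustrative Python, not the actual EVMS code; ErrorObject, Region, and discover_volume are hypothetical names standing in for the real internals:

```python
class ErrorObject:
    """Stand-in for the "error" object EVMS substitutes for a missing PV."""
    def read(self, offset, length):
        raise IOError("I/O error: extent maps to a missing PV")

class Region:
    def __init__(self, size, tail_object):
        self.size = size
        self.tail = tail_object   # object backing the *end* of the region

    def read_metadata(self):
        # EVMS volume metadata live at the end of the object.
        return self.tail.read(self.size - 512, 512)

def discover_volume(region):
    """Sketch of the decision: native EVMS volume vs. compatibility volume."""
    try:
        region.read_metadata()
        return "EVMS volume"
    except IOError:
        # Metadata unreadable -> fall back to a compatibility volume.
        return "compatibility volume"

# A 200 GB region whose tail maps to the error object is discovered
# as a compatibility volume, matching the behavior Steve describes.
print(discover_volume(Region(200 * 2**30, ErrorObject())))
```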
> A similar story could be made for volume duchess if part of region
> lvm2/main/duchess was mapped to the error object for the missing PV.
>
> On first inspection, it appears to me that your main problem is that you
> are missing a disk or segment for the container lvm2/main.  The rest of
> the problems can be explained as side effects of the missing PV.  Looking
> at the list of PVs above, do you notice one that is missing?  The list
> above was made by hand from looking at your evms_gather_info output.  For
> a more accurate list, run evmsn or evmsgui and see which objects container
> lvm2/main consumes.  (Or run the command "q:chi,lvm2/main" in the evms
> Command Line Interpreter.)  The missing object is most likely at index 23
> in the LVM2 volume group.  If that object still exists, then it's likely
> that the LVM2 metadata on it are corrupt.  You may be able to rebuild it
> by hand, but that skill is outside my area of expertise.
>
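If an LVM2 text-metadata backup is available (LVM2 typically keeps one under /etc/lvm/backup), the gap in the pvN numbering can be found mechanically; a minimal sketch, assuming the usual `pvN { ... }` blocks of the text format:

```python
import re

def missing_pv_indices(metadata_text):
    """Find gaps in the pvN numbering of an LVM2 text-metadata dump."""
    indices = sorted(int(m) for m in re.findall(r"\bpv(\d+)\s*{", metadata_text))
    if not indices:
        return []
    return [i for i in range(indices[-1] + 1) if i not in indices]

# Tiny synthetic example -- a real backup lists one pvN block per PV.
sample = """
physical_volumes {
    pv0 { device = "/dev/sda1" }
    pv1 { device = "/dev/sda2" }
    pv3 { device = "/dev/sdb2" }
}
"""
print(missing_pv_indices(sample))   # the hole at index 2 in this sample
```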
> Hope this helps.
>
> Steve D.

Hi Steve,

Thanks for the very detailed answer... however, I'm afraid you were led
down the wrong path by that missing PV 23... I should have mentioned it in
my original post. That PV has been missing forever - at least since I
converted the server from LVM2 to EVMS. I never managed to figure out why
it was there, and it wasn't used by any LV. The problems I asked about
can't be related to it; they only started last week.

Anyway... I guess it doesn't matter anymore. I gave up on getting anything
out of nas-data or duchess, and decided to try to get at least some of the
data off pluto-home to another system, so I ran xfs_repair on it, and all
the data ended up in lost+found - disordered, but it was there. Then I
needed to free up a hard disk to copy the data to... so I wrote down the
list of PVs used by lvm2/main/pluto-home, and then deleted some PVs that
weren't in that list...

So, guess what? Now there are 4 PVs missing from pluto-home. How the
$#%^ did EVMS manage to do that? And I thought it was supposed to be
smart. Now, unless there's some way to figure out where those deleted
PVs were on the disks and recreate the partitions, it's all over.
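Deleting a partition normally only removes the partition-table entry; the
on-disk data survives. LVM2 writes an ASCII "LABELONE" signature into one
of the first four 512-byte sectors of each PV (normally sector 1), so
scanning the raw disk for surviving labels can reveal where the old PVs
started. A minimal sketch, illustrative only and not a recovery tool (on a
real system you would open /dev/sdX read-only instead of the in-memory
buffer):

```python
import io

SECTOR = 512

def find_pv_labels(stream):
    """Yield byte offsets of sectors that start with the LVM2 label signature."""
    offset = 0
    while True:
        sector = stream.read(SECTOR)
        if len(sector) < SECTOR:
            return
        if sector.startswith(b"LABELONE"):
            # With the default layout the label sits in sector 1,
            # so the PV itself starts one sector before this offset.
            yield offset
        offset += SECTOR

# Synthetic "disk": a label in sector 1 of a pretend PV starting at sector 0.
disk = bytearray(8 * SECTOR)
disk[SECTOR:SECTOR + 8] = b"LABELONE"
print(list(find_pv_labels(io.BytesIO(bytes(disk)))))   # [512]
```

Offsets found this way still need the LVM2 metadata area inspected (e.g.
with pvck) before recreating any partition over them.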

Thanks anyway... Itai

_______________________________________________
Evms-devel mailing list
Evms-devel@lists.sourceforge.net
To subscribe/unsubscribe, please visit:
https://lists.sourceforge.net/lists/listinfo/evms-devel