
List:       linux-raid
Subject:    Re: RAID reconstruction
From:       Gadi Oxman <gadio () netvision ! net ! il>
Date:       1998-01-15 8:53:47

Rob,

> Hey there... We've got a 30G RAID-5 running on our Dual PPro, etc, etc,
> and running the latest RAID alpha patches (RAID-5 background
> reconstruction).  On boot of an unclean raid, the machine will do fine for
> a while, checking nice and slowly, then will stop with read error: 
> 
> raid5: raid set 09:00 not clean; re-constructing parity
> md: updating raid superblock on device 08:01, sb_offset == 6323904
> ... resync log
>  ....   mddev->nb_dev: 6
>  ....   raid array: 09:00
>  ....   max_blocks: 31619520 blocksize: 1024
> md: syncing RAID array 09:00
> md: updating raid superblock on device 08:11, sb_offset == 6323904
> md: updating raid superblock on device 08:21, sb_offset == 6323904
> .md: updating raid superblock on device 08:31, sb_offset == 6323904
> md: updating raid superblock on device 08:41, sb_offset == 6323904
> md: updating raid superblock on device 08:51, sb_offset == 6323904
> ......<6>Swansea University Computer Society IPX 0.34 for NET3.035
> IPX Portions Copyright (c) 1995 Caldera, Inc.
> Appletalk 0.17 for Linux NET3.035
> eth0: MII monitoring tick: CSR12 ffffff37, Link partner report 01e1.
> eth0: Setting full-duplex based on MII Xcvr #8 partner capability of 01e1.
> .<1>read error, stopping reconstruction.
> 
> We've had zero problems with the RAID or the drives anywhere else, could
> this be a bug in the driver? It's happened consistently since 2.0.31 and
> RedHat 4.2 up to today's 2.0.33 and RedHat 5.0...
> 								-Rob

We need to look at the kernel-based reconstruction support more closely (my
guess would be a race which happens when the current device block size is
changed from under us).

Meanwhile, perhaps the best option would be to disable kernel-based
reconstruction and use ckraid instead. We disabled it by default in
the 2.1.x version of the RAID patch and added a "SUPPORT_RECONSTRUCTION"
compile-time flag, but it is enabled unconditionally for RAID-4/RAID-5
in the latest 2.0.x ALPHA version.

In drivers/block/md.c, we have the following code in analyze_sbs():

       /*
        * We need to add this as a superblock option.
        */
       if (sb->state != (1 << MD_SB_CLEAN)) {
               if (sb->level == 1) {
                       printk (NOT_CLEAN, kdevname(MKDEV(MD_MAJOR, minor)));
                       goto abort;
               } else
                       printk (NOT_CLEAN_IGNORE, kdevname(MKDEV(MD_MAJOR, minor)));
       }

For now, we can change it so that an unclean array of any level is
refused and has to be checked with ckraid first:

	if (sb->state != (1 << MD_SB_CLEAN)) {
		printk (NOT_CLEAN, kdevname(MKDEV(MD_MAJOR, minor)));
		goto abort;
	}
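
To make the effect of that change concrete, here is a minimal user-space
sketch of the two policies. It is only a model, not the kernel code: the
MD_SB_CLEAN bit number (0), the model_sb struct and both function names
are assumptions made for illustration.

```c
#include <stdbool.h>

/* Assumption: MD_SB_CLEAN is bit 0 of the superblock state word. */
#define MD_SB_CLEAN 0

/* Hypothetical stand-in for the two superblock fields the check reads. */
struct model_sb {
	unsigned int state;	/* bitmask of MD_SB_* state flags */
	unsigned int level;	/* RAID level: 1, 4, 5, ... */
};

/* Original policy: an unclean set is refused only if it is RAID-1. */
static bool start_allowed_original(const struct model_sb *sb)
{
	if (sb->state != (1u << MD_SB_CLEAN))
		return sb->level != 1;
	return true;
}

/* Modified policy: any unclean set is refused, whatever its level. */
static bool start_allowed_modified(const struct model_sb *sb)
{
	return sb->state == (1u << MD_SB_CLEAN);
}
```

In this model, an unclean RAID-5 superblock (state 0, level 5) is accepted
by the original check and refused by the modified one, which is the point
at which ckraid would be run before starting the array again.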

Gadi
