List: freebsd-scsi
Subject: Re: Vinum 29160 detaches drives, invalidates RAID.
From: David Gilbert <dgilbert () velocet ! ca>
Date: 2000-08-25 14:42:44
>>>>> "Greg" == Greg Lehey <grog@lemis.com> writes:
>> First of all, I'm very pleased with the speed. The system easily
>> beats the AMI MegaRAID 1500 (same drives) with a whopping 35Mbyte/s
>> in RAID-5 (vs. the 1500's 14Mbyte/s) for read. (They both score a
>> dead heat of 4Mbyte/s write.)
Greg> Nice to hear :-)
In general, I'm an advocate of the vinum system. I've been hammering
it for months now on the test RAID-5 system. Besides this
disconnecting problem, the system is performing very well.
>> Now... if I reboot, and "vinum setstate up" all these drives,
Greg> They all go down, do they?
Not all... Sometimes 2, sometimes 4. I suppose I should have said that
I setstate up all the drives that are down.
>> fsck completes without any complaint. I then generally have to
>> "vinum rebuild parity" ... but I suppose that I'd expect that.
Greg> Hmm. rebuildparity is a dangerous command. Basically, a parity
Greg> error means that *one* (or more) of the drives has incorrect
Greg> data. rebuildparity simply assumes that the error is in the
Greg> data block and "corrects" it. It's a serious problem, one that
Greg> is very difficult to solve.
Well... at the point of failure, we're doing the nightly finds on the
disk. I do the fsck (usually) before I do the rebuildparity. I
suspect that the only information being written to the disk at this
point is the access time updates. I would expect, then, that corrupt
data is likely limited to an update of this nature.
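To make Greg's point about parity errors concrete, here is a toy sketch in Python. It is not vinum's code: the block contents, the helper name xor_blocks, and the idea that a rebuild simply recomputes the parity block from whatever the data blocks currently hold are all assumptions made for illustration. It shows that a RAID-5 parity check reports the same mismatch whether a data block or the parity block is the one that went bad, so the check alone cannot say which to trust.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One hypothetical stripe: three data blocks plus their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)

# Case A: a single bit flips in a data block.
bad_data = [data[0], b"\x10\x21", data[2]]
# Case B: the same bit flips in the parity block instead.
bad_parity = bytes([parity[0], parity[1] ^ 0x01])

# XOR of every block in a consistent stripe is all zeroes; anything else
# is a parity error. Both cases yield the same non-zero pattern.
syndrome_a = xor_blocks(bad_data + [parity])
syndrome_b = xor_blocks(data + [bad_parity])

# Assumed rebuild model: recompute parity from the current data blocks.
# The stripe checks clean afterwards even though the data is still wrong.
rebuilt = xor_blocks(bad_data)
clean_after_rebuild = xor_blocks(bad_data + [rebuilt]) == bytes(2)
```

With a single parity block per stripe, the mismatch says that something is wrong but not where, which is why any automatic "fix" has to guess.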
>> The problem I'm having here (and I've had it before) is that the
>> FreeBSD SCSI system seems to "give up" under conditions that others
>> would keep retrying or resetting/retrying.
>> It seems really, really, really important to me that we try harder
>> to get a drive back online. This seems as if it could affect the
>> long-term viability of a vinum-based raid server... not because
>> vinum is bad, but because the SCSI subsystem is too fragile.
Greg> Hmm. I can't really comment on that, but it would be nice if
Greg> the SCSI system could recover from these problems.
I think this is a critical thing. I can accept that it may be hard to
discern if the device has been yanked from the bus or has gone into
some other bad state --- but this is definitely not the case. The
FreeBSD SCSI subsystem as-it-stands is very fragile. I realize that
cabling must be 100% for many different reasons;
... But by the same token, we need things to keep retrying and
resetting far longer before losing all hope.
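
The kind of persistence being asked for here could look roughly like the sketch below. It is written in Python purely for illustration and does not describe how the FreeBSD SCSI layer behaves or should be structured: run_with_recovery, send, reset, and the retry counts are all hypothetical names and values. The idea is simply to retry a command a few times, escalate to a reset, and repeat that cycle several times before declaring the drive dead.

```python
import time


def run_with_recovery(send, reset, retries=3, resets=2, delay=0.0):
    """Try send() until it succeeds, escalating to reset() between rounds.

    send and reset are caller-supplied callables (hypothetical here):
    send() returns True on success, reset() attempts to revive the device.
    Returns False only after every retry in every round has failed.
    """
    for round_no in range(resets + 1):
        for attempt in range(retries):
            if send():
                return True
            # Back off a little longer after each failed attempt.
            time.sleep(delay * (2 ** attempt))
        if round_no < resets:
            reset()
    return False


class FlakyDrive:
    """Stand-in device that fails a fixed number of commands, then works."""

    def __init__(self, failures):
        self.failures = failures
        self.reset_count = 0

    def send(self):
        if self.failures > 0:
            self.failures -= 1
            return False
        return True

    def reset(self):
        self.reset_count += 1


drive = FlakyDrive(failures=4)
recovered = run_with_recovery(drive.send, drive.reset)
```

In this toy run the drive fails four commands in a row, which exhausts the first round of retries, and it comes back during the round that follows the first reset. A policy that gave up after the first round would have detached it.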
Dave.
--
============================================================================
|David Gilbert, Velocet Communications. | Two things can only be |
|Mail: dgilbert@velocet.net | equal if and only if they |
|http://www.velocet.net/~dgilbert | are precisely opposite. |
=========================================================GLO================
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message