'Re: [D-LP] how to create > 2 TB volume'


List:       linux-poweredge
Subject:    Re: [D-LP] how to create > 2 TB volume
From:       Eberhard Moenkeberg <emoenke () gwdg ! de>
Date:       2004-11-30 1:26:59
Message-ID: Pine.LNX.4.58.0411300132400.551 () gwdu05 ! gwdg ! de

Hi Jason,

On Tue, 30 Nov 2004, jason andrade wrote:
> On Mon, 29 Nov 2004, Eberhard Moenkeberg wrote:

> > SUSE-9.2 is based on kernel 2.6, but I did not start an experience
> > with volumes > 2 TB yet.
> > I decided to do some tests before planning "production" with it because
> > many layers are involved, and indeed my first hurdle was unexpectedly a
> > hardware raid firmware which does not allow a "LUN size" above 2 TB
> > (Triplestor/Transtec "Recall" IDE raid).
> 
> that's interesting.  i pretty much have the same issue here with proware
> which will 'split' autocreation into chunks of less than 2TB.  however
> there isn't anything in the manual which says whether i can't do this
> manually and i have a spare array for the moment so i might try and create
> a 3TB array and see how i go..

This is the maximum possible with the Recall (ADTX) firmware:

(scsi1:A:3): 160.000MB/s transfers (80.000MHz DT, offset 62, 16bit)
  Vendor: ADTX      Model: AXRR-LH000S-F     Rev: L67A
  Type:   Direct-Access                      ANSI SCSI revision: 03
scsi1:A:3:0: Tagged Queuing enabled.  Depth 32
SCSI device sdc: 4294967040 512-byte hdwr sectors (2199023 MB)
SCSI device sdc: drive cache: write back
 sdc: sdc1

The array had firmware L670 when I created the "LUNs". Maybe firmware
L67A can go beyond 2 TB (I don't know, the vendor doesn't say), but
now I need to keep the data and can't test it again...
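
A quick sanity check on the "sdc" line above, as a minimal Python sketch
(the variable names are illustrative, not taken from the kernel or the
firmware). It only assumes what the log states, 512-byte sectors, and
compares the reported sector count with what a 32-bit block address can
express:

```python
# Values copied from the kernel log line for sdc.
SECTOR_BYTES = 512
reported_sectors = 4294967040

# A 32-bit block address can number at most 2**32 sectors.
max_lba32_sectors = 2**32

size_bytes = reported_sectors * SECTOR_BYTES
size_mb = size_bytes // 10**6                # decimal megabytes, as in the log
gap_sectors = max_lba32_sectors - reported_sectors

print(size_bytes)    # 2199023124480
print(size_mb)       # 2199023
print(gap_sectors)   # 256
```

So the LUN sits 256 sectors below the 2**32-sector mark, i.e. just under
2 TiB, which fits the picture of a limit tied to 32-bit block addressing.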

An interesting feature of the Recall IDE raid firmware: you can set the
"write back" cache policy for the battery-backed controller cache, but
force "write through" for the disk caches. Maybe this is the secret of its
better write performance compared with the Infortrend controllers, though
the read performance also seems better.

http://ftp.gwdg.de/pub/linux/people/emoenke/ide-raid/bonnie++/bonnie.all.2.html

shows some bonnie++ data. rc15-x.x is the Recall; satalis is an AXUS-made
unit which did not pass my selection (clumsy firmware interface and low
perceived performance); all others are older and newer Triplestor/Transtec
Masscope arrays with Infortrend controllers.
The entries are sorted by perceived performance.

> > I hope to get a chance to test a different IDE raid next week (Transtec
> > 6100, aka Triplestor Masscope+, aka Infortrend EonStor).
> 
> i've had fairly good experiences with infortrend controllers in the past.

Yes, they have always had the best throughput and the most advanced
firmware, but now the Recall seems to have better throughput.
I can't say anything about the Recall's firmware quality yet, because it
would take a major incident to judge it against the Infortrend firmware...

After a 90-minute cooling outage in our server room, which raised the
air temperature to 37.3 degrees Celsius (body temperature!), I saw the
Infortrend firmware complete a raid5 disk rebuild while two other disks
were showing bad blocks - the rebuild process continued and just marked
the non-recoverable blocks as bad. When you see it happen, you just say
"OK, that's how it should be", but I guess most raid firmware would not
handle it this gracefully. So I can't really judge the Recall firmware
quality before the next accident.

> > My goal is to define the whole raid5 set of 15+1 disks of 250 GB as a
> > single volume and then use it as a single partition with ext3. This will
> > be a filesystem with about 3.3 TB "available" space.
> 
> i assume the +1 is a hot spare ?

Yes.
It may seem hazardous to configure a single raid5 over 15 active disks...
Infortrend (and probably ADTX, too) is working on raid6, which is
essentially raid5 with a second parity block per stripe, so it can
tolerate two disks failing at once.
I will try that if the new array already comes with the raid6 feature -
this would reduce the "available" space to about 3.1 TB.
I prefer Hitachi disks because they have traditionally run about 8 degrees
Celsius cooler than, say, Maxtor, and indeed all my Hitachi disks survived
our latest temperature peak; only 3 Maxtor disks failed.
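
For orientation, the raw arithmetic behind such a volume can be sketched
in Python. This is only a rough estimate under stated assumptions: the
250 GB disks are taken as nominal decimal gigabytes, and the helper
usable_bytes is a hypothetical name for illustration. The "available"
figures quoted in the text are estimates that also depend on the unit
convention and on ext3 overhead, so they will not match these raw numbers
exactly.

```python
DISK_BYTES = 250 * 10**9     # assumption: nominal 250 GB, decimal units
ACTIVE_DISKS = 15            # the hot spare (+1) holds no data

def usable_bytes(active_disks, parity_disks, disk_bytes=DISK_BYTES):
    """Raw capacity left after subtracting the parity share."""
    return (active_disks - parity_disks) * disk_bytes

raid5 = usable_bytes(ACTIVE_DISKS, 1)   # one disk's worth of parity
raid6 = usable_bytes(ACTIVE_DISKS, 2)   # two disks' worth of parity

# Largest size addressable with 32-bit block numbers and 512-byte sectors.
lba32_limit = 2**32 * 512

print(raid5, raid6)                                 # 3500000000000 3250000000000
print(raid5 > lba32_limit, raid6 > lba32_limit)     # True True
```

Either way the single volume ends up well above the 2 TB mark, so every
layer (raid firmware, SCSI driver, partition table, filesystem) has to
cope with block numbers that no longer fit into 32 bits.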

> > I am in a pretty good hope - SUSE was supporting 2 TB partitions for a
> > long time with the 2.4 kernels while RedHat and others were stuck at 1 TB.
> > They simply and traditionally have the better engineer crew...
> 
> well, they were stuck at 1TB for support but between 1-2TB actually did
> work.. (RHEL etc)

OK, that was only something like a sign-bit flaw which the RedHat
engineers did not handle correctly for more than a year. Going beyond 2 TB
now requires a step up in SCSI command class, but the SUSE engineers are
"trusted" precision workers.

Cheers -e
-- 
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq