
List:       smartmontools-support
Subject:    [smartmontools-support] Test taking a very long time to complete,
From:       "David Mathog" <mathog () caltech ! edu>
Date:       2007-08-17 19:57:25
Message-ID: E1IM7wj-0002LJ-C6 () mendel ! bio ! caltech ! edu

Solaris 5.9 (Sparc)
Smartctl 5.36

The system has 6 FC-AL disks, all like this:
FUJITSU  MAN3735F SUN72G  Version: 0704

One is the system disk, one is a scratch disk, and 4 are in a volume
set, described this way in the /etc/lvm/md.tab file:

d50     1 4 /dev/dsk/c1t2d0s0 \
            /dev/dsk/c1t3d0s0 \
            /dev/dsk/c1t4d0s0 \
            /dev/dsk/c1t5d0s0

and /etc/vfstab for d50 shows

/dev/md/dsk/d50       /dev/md/rdsk/d50      /vol01    ufs     2   yes  -

Additionally there are some SCSI disks on another controller.
Now here's the weird part: a '-t long' test completed in under an hour
for all disks except c1t4d0s0. After 100 minutes that disk was still
showing "Self test in progress ...". At that point I stopped the test
with '-X' and told the disk to do a short test. The short test had not
finished after 15 minutes. I then sent this disk '-s off' followed by
'-s on' and started another short test, with c1t3d0s0 running a short
test at the same time as a control. The c1t3d0s0 disk finished quickly
(I only looked after 5 minutes), but the slow one was still trundling.
This time I waited it out, and it finished in 30 minutes.
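
For reference, the sequence of smartctl invocations described above can be
written out as a small sketch. This is only an illustration, not a record of
the exact command lines that were typed: the raw device path
/dev/rdsk/c1t4d0s0 and the final '-l selftest' log check are assumptions, and
selftest_sequence is a hypothetical helper name. The code only builds the
command lines; it does not run them.

```python
# Sketch only: builds the smartctl command lines, it does not execute them.
def selftest_sequence(device):
    """Return the smartctl invocations, in order, as argv lists."""
    steps = [
        ["-t", "long"],      # start the extended (long) self-test
        ["-X"],              # abort the self-test still in progress
        ["-t", "short"],     # start a short self-test
        ["-s", "off"],       # disable SMART on the device
        ["-s", "on"],        # re-enable SMART
        ["-t", "short"],     # retry the short self-test
        ["-l", "selftest"],  # assumed extra step: read the self-test log
    ]
    return [["smartctl", *opts, device] for opts in steps]


if __name__ == "__main__":
    # Assumed raw device path for the slow disk; adjust for the real system.
    for argv in selftest_sequence("/dev/rdsk/c1t4d0s0"):
        print(" ".join(argv))
```

Running the same list against c1t3d0s0 gives the matching control sequence
for a side-by-side comparison.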

While d50 is mounted, it isn't at all busy right now: "iostat 1"
shows zero I/O on half the intervals. That said, there is an Oracle
database spread across that volume and another logical volume, and the
Oracle programs are all on the d50 volume. The values shown by "-a" are
roughly the same for the slow disk and the 3 others in the logical
volume.

So why is this one disk so incredibly slow completing
these tests, whereas the other identical disks, in the same logical
volume, complete quickly?  Is this just bad luck, with something in
Oracle duking it out with the self tests, or is this disk on the way
out?  (I can't shut down Oracle right now to do the obvious test, maybe
next week.)

The system has been up for 108 days.  Nothing
worrisome has appeared in /var/adm/messages.

Thanks,

David Mathog
mathog@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
