'Re: HBA Errors'


List:       aix-l
Subject:    Re: HBA Errors
From:       Jonathan Fosburgh <syjef () MAIL ! MDANDERSON ! ORG>
Date:       2006-03-23 21:14:56
Message-ID: 200603231515.02593.syjef () mail ! mdanderson ! org


On Thursday 23 March 2006 14:53, Andrew.Townsend@bisys.com wrote:
> We received adapter errors in our error report. They are all FSCSI_ERR4
> errors. At this time, it appears that the volume group is still OK, but we
> are seeing errors in the errorlog.
> I've called IBM and they are coming out to replace the adapter.
> Here is what I am proposing to do:

> [root @ uxsybrpt] /mnt/uxsybproa/exports1 >>datapath query adapter
>
> Active Adapters :2
>
> Adpt#     Adapter Name   State     Mode     Select     Errors  Paths  Active
>     0           fscsi0  NORMAL   ACTIVE 2428630040          0      4       4
>     1           fscsi1  DEGRAD   ACTIVE 2424384374          9      4       2
> [root @ uxsybrpt] /mnt/uxsybproa/exports1 >>datapath query device
>
> Total Devices : 2
>
>
> DEV#:   0  DEVICE NAME: vpath0  TYPE: 2105800         POLICY:    Optimized
> SERIAL: 20124055
> ==========================================================================
> Path#      Adapter/Hard Disk          State     Mode     Select     Errors
>     0          fscsi0/hdisk2           OPEN   NORMAL  722935964          0
>     1          fscsi0/hdisk4           OPEN   NORMAL  722887613          0
>     2          fscsi1/hdisk6           DEAD   NORMAL  708340421          5
>     3          fscsi1/hdisk8           OPEN   NORMAL  733551035          0
>

I don't think it's the card, at least not the one in the AIX server.  Is your 
setup something like this? Your AIX server has two connections into the 
fabric, and the Shark has either two connections (single fabric) or four 
(two fabrics).  It appears that the same path to each LUN is down.  I find 
the following in your sense data:

5005 0763 00D0 9B8F 5005 0763 00C0 9B8F

If memory serves, that is the WWPN of the HBA on the Shark you can't connect 
to.  Check the sense data in several of the error entries; I bet they are all 
the same.  Most likely either that card on the Shark is out or you have some 
bad fiber, etc.  If you have a fairly recent SDD (1.5.x, I think), you don't 
have to go down to remove this adapter from the config.  Run datapath remove 
adapter 1 to take it out of SDD.  From there you can run rmdev -dRl fscsi1.  
Then try rerunning cfgmgr -vl fcs1, run addpaths, and see what SDD looks 
like.  My guess is that fscsi1 will come back with half the paths it has 
now, but it will no longer be degraded.
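
The sequence above can be sketched as a small script.  Treat it as a sketch 
only: adapter number 1 and the names fscsi1/fcs1 come from the output quoted 
earlier, while the run helper, the reset_adapter function and the DRY_RUN 
variable are made up for illustration.  DRY_RUN defaults to 1, so the 
commands are only printed unless you switch it off.  For rmdev, -l names the 
device, -R recurses to its children and -d deletes the definitions.

```shell
#!/bin/sh
# Sketch: take a degraded adapter out of SDD and rediscover its paths.
# DRY_RUN, run() and reset_adapter() are illustrative helpers, not part
# of SDD or AIX.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_adapter() {
    adpt=$1      # SDD adapter number from "datapath query adapter"
    fscsi=$2     # FC protocol device, e.g. fscsi1
    fcs=$3       # parent FC adapter, e.g. fcs1

    # Stop at the first step that fails.
    run datapath remove adapter "$adpt" &&
    run rmdev -dRl "$fscsi" &&
    run cfgmgr -vl "$fcs" &&
    run addpaths &&
    run datapath query adapter
}
```

With the numbers from this thread that would be called as 
reset_adapter 1 fscsi1 fcs1; set DRY_RUN=0 only once the printed commands 
look right for your box.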

If you have to replace the HBA on the Shark, you won't have to do anything 
but run cfgmgr and addpaths again, provided the replacement goes in the same 
slot.  If it is the cable or something like that, cfgmgr and addpaths will 
also do, but only as long as you connect the cable to the same switch port 
or have the switch configured to map the FCID back to the WWPN, depending on 
how your vendor does it.  HTH.
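
After rerunning cfgmgr and addpaths, one way to confirm the paths are back 
is to filter the datapath query device output for anything that is not 
OPEN.  This is a rough sketch: bad_paths is a made-up helper name, and the 
column layout is assumed to match the output quoted earlier in this thread.

```shell
#!/bin/sh
# Sketch: list SDD paths that are not in the OPEN state.
# Reads "datapath query device" output on stdin. bad_paths is an
# illustrative helper, not an SDD command.
bad_paths() {
    awk '
        # Header line, e.g. "DEV#: 0 DEVICE NAME: vpath0 TYPE: ..."
        $1 == "DEV#:" { dev = $5 }

        # Path line: number, adapter/hdisk, state, mode, select, errors
        $1 ~ /^[0-9]+$/ && $2 ~ /\// && $3 != "OPEN" {
            print dev, $2, $3, "errors=" $NF
        }
    '
}
```

Run it as datapath query device | bad_paths.  Against the vpath0 output 
quoted above it prints "vpath0 fscsi1/hdisk6 DEAD errors=5"; no output means 
every path is OPEN.  Paths of a vpath that is not currently in use may show 
up as CLOSE, which would be listed here too.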
-- 
Jonathan Fosburgh
AIX and Storage Administrator
UT MD Anderson Cancer Center
Houston, TX 
