List: opensolaris-driver-discuss
Subject: [driver-discuss] LSI 3GB HBA SAS Errors (and other misc)
From: Ryan Wehler <wrwehler () gmail ! com>
Date: 2011-12-02 1:24:18
Message-ID: 2AC8652B-26B3-4B3E-A2D3-C49FCB49B831 () gmail ! com
While diagnosing my SAN failure last week, we thought we had found a backplane
failure because of the high error counts reported by 'lsiutil'. However, even with a
new backplane, and after ruling out failed cards (MPxIO or single) and bad cables, I'm
still seeing the error counts in lsiutil increment. I have no disks attached to the
array right now, so I've ruled those out as well.
Even with nothing connected but the HBA to the backplane expander, simply rebooting
the SAN into an OpenIndiana LiveCD or another distribution (NexentaStor) increments
the counters.
I've been as careful as I can to clear the counters between part changes, to try to
eliminate a potentially bad cable, card, etc. You can see expander phys 8-15 throw
errors regardless of MPxIO or single-card config, and regardless of which expander
port I use on the backplane.
According to my VAR, something in the mptsas code changed "recently" (I'm not sure how
recently), and they do not see the problem with 6Gb/s backplanes and adapters.
Attached is a log I took under NexentaStor 3.1.1 with my disks still attached. The
disks themselves don't seem to be throwing errors, so that's good.
Has anyone seen anything like this? I have not tried booting into an older version
of Solaris or NexentaStor yet, but booting into Scientific Linux 6.1 yields about the
same results with lsiutil.
Nothing from fmadm, /var/adm/messages, or anywhere else indicates these data errors
outside of lsiutil.
["SAS Diags.txt" (SAS Diags.txt)]
*********************************************************************
***** Right Card - Right Cable - Round Robin (Adpt #2 in lsiutil)
*********************************************************************
Adapter Phy 0: Link Up, No Errors
Adapter Phy 1: Link Up
Invalid DWord Count 1,276
Running Disparity Error Count 1,167
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 2: Link Up
Invalid DWord Count 3,779
Running Disparity Error Count 3,494
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 3: Link Up
Invalid DWord Count 3,477
Running Disparity Error Count 2,964
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 4: Link Down, No Errors
Adapter Phy 5: Link Down, No Errors
Adapter Phy 6: Link Down, No Errors
Adapter Phy 7: Link Down, No Errors
Expander (Handle 0009) Phy 0: Link Up, No Errors
Expander (Handle 0009) Phy 1: Link Up, No Errors
Expander (Handle 0009) Phy 2: Link Up, No Errors
Expander (Handle 0009) Phy 3: Link Up, No Errors
Expander (Handle 0009) Phy 4: Link Up, No Errors
Expander (Handle 0009) Phy 5: Link Up, No Errors
Expander (Handle 0009) Phy 6: Link Up, No Errors
Expander (Handle 0009) Phy 7: Link Up, No Errors
Expander (Handle 0009) Phy 8: Link Down, No Errors
Expander (Handle 0009) Phy 9: Link Down, No Errors
Expander (Handle 0009) Phy 10: Link Down, No Errors
Expander (Handle 0009) Phy 11: Link Down, No Errors
Expander (Handle 0009) Phy 12: Link Up
Invalid DWord Count 687,520
Running Disparity Error Count 651,781
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 13: Link Up
Invalid DWord Count 689,145
Running Disparity Error Count 678,705
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 14: Link Up
Invalid DWord Count 663,734
Running Disparity Error Count 622,380
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 15: Link Up
Invalid DWord Count 645,744
Running Disparity Error Count 611,468
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 16: Link Down, No Errors
Expander (Handle 0009) Phy 17: Link Down, No Errors
Expander (Handle 0009) Phy 18: Link Down, No Errors
Expander (Handle 0009) Phy 19: Link Down, No Errors
Expander (Handle 0009) Phy 20: Link Down, No Errors
Expander (Handle 0009) Phy 21: Link Down, No Errors
Expander (Handle 0009) Phy 22: Link Up, No Errors
Expander (Handle 0009) Phy 23: Link Up, No Errors
Expander (Handle 0009) Phy 24: Link Up, No Errors
Expander (Handle 0009) Phy 25: Link Up, No Errors
Expander (Handle 0009) Phy 26: Link Up, No Errors
Expander (Handle 0009) Phy 27: Link Up, No Errors
Expander (Handle 0009) Phy 28: Link Up, No Errors
Expander (Handle 0009) Phy 29: Link Up, No Errors
Expander (Handle 0009) Phy 30: Link Up, No Errors
Expander (Handle 0009) Phy 31: Link Up, No Errors
Expander (Handle 0009) Phy 32: Link Up, No Errors
Expander (Handle 0009) Phy 33: Link Up, No Errors
Expander (Handle 0009) Phy 34: Link Up, No Errors
Expander (Handle 0009) Phy 35: Link Up, No Errors
Expander (Handle 0009) Phy 36: Link Up, No Errors
Expander (Handle 0009) Phy 37: Link Down, No Errors
*********************************************************************
*****Right Card - Right Cable - Logical Block (Adpt #2 in lsiutil)
*********************************************************************
Adapter Phy 0: Link Up, No Errors
Adapter Phy 1: Link Up
Invalid DWord Count 3,085
Running Disparity Error Count 2,894
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 2: Link Up
Invalid DWord Count 2,901
Running Disparity Error Count 2,740
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 3: Link Up
Invalid DWord Count 2,886
Running Disparity Error Count 2,647
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 4: Link Down, No Errors
Adapter Phy 5: Link Down, No Errors
Adapter Phy 6: Link Down, No Errors
Adapter Phy 7: Link Down, No Errors
Expander (Handle 0009) Phy 0: Link Up, No Errors
Expander (Handle 0009) Phy 1: Link Up, No Errors
Expander (Handle 0009) Phy 2: Link Up, No Errors
Expander (Handle 0009) Phy 3: Link Up, No Errors
Expander (Handle 0009) Phy 4: Link Up, No Errors
Expander (Handle 0009) Phy 5: Link Up, No Errors
Expander (Handle 0009) Phy 6: Link Up, No Errors
Expander (Handle 0009) Phy 7: Link Up, No Errors
Expander (Handle 0009) Phy 8: Link Down, No Errors
Expander (Handle 0009) Phy 9: Link Down, No Errors
Expander (Handle 0009) Phy 10: Link Down, No Errors
Expander (Handle 0009) Phy 11: Link Down, No Errors
Expander (Handle 0009) Phy 12: Link Up
Invalid DWord Count 1,413,720
Running Disparity Error Count 1,342,133
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 13: Link Up
Invalid DWord Count 1,415,972
Running Disparity Error Count 1,394,435
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 14: Link Up
Invalid DWord Count 1,362,499
Running Disparity Error Count 1,278,166
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 15: Link Up
Invalid DWord Count 1,346,514
Running Disparity Error Count 1,267,128
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 16: Link Down, No Errors
Expander (Handle 0009) Phy 17: Link Down, No Errors
Expander (Handle 0009) Phy 18: Link Down, No Errors
Expander (Handle 0009) Phy 19: Link Down, No Errors
Expander (Handle 0009) Phy 20: Link Down, No Errors
Expander (Handle 0009) Phy 21: Link Down, No Errors
Expander (Handle 0009) Phy 22: Link Up, No Errors
Expander (Handle 0009) Phy 23: Link Up, No Errors
Expander (Handle 0009) Phy 24: Link Up, No Errors
Expander (Handle 0009) Phy 25: Link Up, No Errors
Expander (Handle 0009) Phy 26: Link Up, No Errors
Expander (Handle 0009) Phy 27: Link Up, No Errors
Expander (Handle 0009) Phy 28: Link Up, No Errors
Expander (Handle 0009) Phy 29: Link Up, No Errors
Expander (Handle 0009) Phy 30: Link Up, No Errors
Expander (Handle 0009) Phy 31: Link Up, No Errors
Expander (Handle 0009) Phy 32: Link Up, No Errors
Expander (Handle 0009) Phy 33: Link Up, No Errors
Expander (Handle 0009) Phy 34: Link Up, No Errors
Expander (Handle 0009) Phy 35: Link Up, No Errors
Expander (Handle 0009) Phy 36: Link Up, No Errors
Expander (Handle 0009) Phy 37: Link Down, No Errors
*********************************************************************
*****Right Card - Left SAS Cable - Logical Block (Adpt #2 in lsiutil)
*********************************************************************
Adapter Phy 0: Link Up, No Errors
Adapter Phy 1: Link Up
Invalid DWord Count 3,017
Running Disparity Error Count 2,863
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 2: Link Up
Invalid DWord Count 3,421
Running Disparity Error Count 3,186
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 3: Link Up
Invalid DWord Count 3,797
Running Disparity Error Count 3,488
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 4: Link Down, No Errors
Adapter Phy 5: Link Down, No Errors
Adapter Phy 6: Link Down, No Errors
Adapter Phy 7: Link Down, No Errors
Expander (Handle 0009) Phy 0: Link Up, No Errors
Expander (Handle 0009) Phy 1: Link Up, No Errors
Expander (Handle 0009) Phy 2: Link Up, No Errors
Expander (Handle 0009) Phy 3: Link Up, No Errors
Expander (Handle 0009) Phy 4: Link Up, No Errors
Expander (Handle 0009) Phy 5: Link Up, No Errors
Expander (Handle 0009) Phy 6: Link Up, No Errors
Expander (Handle 0009) Phy 7: Link Up, No Errors
Expander (Handle 0009) Phy 8: Link Up
Invalid DWord Count 1,346,559
Running Disparity Error Count 1,262,957
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 9: Link Up
Invalid DWord Count 1,306,600
Running Disparity Error Count 1,229,587
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 10: Link Up
Invalid DWord Count 1,395,518
Running Disparity Error Count 1,372,956
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 11: Link Up
Invalid DWord Count 1,349,666
Running Disparity Error Count 1,253,027
Loss of DWord Synch Count 2
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 12: Link Down, No Errors
Expander (Handle 0009) Phy 13: Link Down, No Errors
Expander (Handle 0009) Phy 14: Link Down, No Errors
Expander (Handle 0009) Phy 15: Link Down, No Errors
Expander (Handle 0009) Phy 16: Link Down, No Errors
Expander (Handle 0009) Phy 17: Link Down, No Errors
Expander (Handle 0009) Phy 18: Link Down, No Errors
Expander (Handle 0009) Phy 19: Link Down, No Errors
Expander (Handle 0009) Phy 20: Link Down, No Errors
Expander (Handle 0009) Phy 21: Link Down, No Errors
Expander (Handle 0009) Phy 22: Link Up, No Errors
Expander (Handle 0009) Phy 23: Link Up, No Errors
Expander (Handle 0009) Phy 24: Link Up, No Errors
Expander (Handle 0009) Phy 25: Link Up, No Errors
Expander (Handle 0009) Phy 26: Link Up, No Errors
Expander (Handle 0009) Phy 27: Link Up, No Errors
Expander (Handle 0009) Phy 28: Link Up, No Errors
Expander (Handle 0009) Phy 29: Link Up, No Errors
Expander (Handle 0009) Phy 30: Link Up, No Errors
Expander (Handle 0009) Phy 31: Link Up, No Errors
Expander (Handle 0009) Phy 32: Link Up, No Errors
Expander (Handle 0009) Phy 33: Link Up, No Errors
Expander (Handle 0009) Phy 34: Link Up, No Errors
Expander (Handle 0009) Phy 35: Link Up, No Errors
Expander (Handle 0009) Phy 36: Link Up, No Errors
Expander (Handle 0009) Phy 37: Link Down, No Errors
*********************************************************************
*****Left Card - Right SAS Cable - Logical Block (Adpt #1 in lsiutil)
*********************************************************************
Adapter Phy 0: Link Up, No Errors
Adapter Phy 1: Link Up
Invalid DWord Count 4,184
Running Disparity Error Count 3,837
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 2: Link Up
Invalid DWord Count 1,991
Running Disparity Error Count 1,776
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 3: Link Up
Invalid DWord Count 2,185
Running Disparity Error Count 1,987
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 4: Link Down, No Errors
Adapter Phy 5: Link Down, No Errors
Adapter Phy 6: Link Down, No Errors
Adapter Phy 7: Link Down, No Errors
Expander (Handle 0009) Phy 0: Link Up, No Errors
Expander (Handle 0009) Phy 1: Link Up, No Errors
Expander (Handle 0009) Phy 2: Link Up, No Errors
Expander (Handle 0009) Phy 3: Link Up, No Errors
Expander (Handle 0009) Phy 4: Link Up, No Errors
Expander (Handle 0009) Phy 5: Link Up, No Errors
Expander (Handle 0009) Phy 6: Link Up, No Errors
Expander (Handle 0009) Phy 7: Link Up, No Errors
Expander (Handle 0009) Phy 8: Link Down, No Errors
Expander (Handle 0009) Phy 9: Link Down, No Errors
Expander (Handle 0009) Phy 10: Link Down, No Errors
Expander (Handle 0009) Phy 11: Link Down, No Errors
Expander (Handle 0009) Phy 12: Link Up
Invalid DWord Count 676,179
Running Disparity Error Count 641,172
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 13: Link Up
Invalid DWord Count 677,299
Running Disparity Error Count 666,484
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 14: Link Up
Invalid DWord Count 653,458
Running Disparity Error Count 612,114
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 15: Link Up
Invalid DWord Count 632,894
Running Disparity Error Count 598,927
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 16: Link Down, No Errors
Expander (Handle 0009) Phy 17: Link Down, No Errors
Expander (Handle 0009) Phy 18: Link Down, No Errors
Expander (Handle 0009) Phy 19: Link Down, No Errors
Expander (Handle 0009) Phy 20: Link Down, No Errors
Expander (Handle 0009) Phy 21: Link Down, No Errors
Expander (Handle 0009) Phy 22: Link Up, No Errors
Expander (Handle 0009) Phy 23: Link Up, No Errors
Expander (Handle 0009) Phy 24: Link Up, No Errors
Expander (Handle 0009) Phy 25: Link Up, No Errors
Expander (Handle 0009) Phy 26: Link Up, No Errors
Expander (Handle 0009) Phy 27: Link Up, No Errors
Expander (Handle 0009) Phy 28: Link Up, No Errors
Expander (Handle 0009) Phy 29: Link Up, No Errors
Expander (Handle 0009) Phy 30: Link Up, No Errors
Expander (Handle 0009) Phy 31: Link Up, No Errors
Expander (Handle 0009) Phy 32: Link Up, No Errors
Expander (Handle 0009) Phy 33: Link Up, No Errors
Expander (Handle 0009) Phy 34: Link Up, No Errors
Expander (Handle 0009) Phy 35: Link Up, No Errors
Expander (Handle 0009) Phy 36: Link Up, No Errors
Expander (Handle 0009) Phy 37: Link Down, No Errors
*********************************************************************
*****Left Card - Left SAS Cable - Logical Block (Adpt #1 in lsiutil)
*********************************************************************
Adapter Phy 0: Link Up, No Errors
Adapter Phy 1: Link Up
Invalid DWord Count 2,278
Running Disparity Error Count 2,177
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 2: Link Up
Invalid DWord Count 2,933
Running Disparity Error Count 2,432
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 3: Link Up
Invalid DWord Count 3,593
Running Disparity Error Count 3,272
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
Adapter Phy 4: Link Down, No Errors
Adapter Phy 5: Link Down, No Errors
Adapter Phy 6: Link Down, No Errors
Adapter Phy 7: Link Down, No Errors
Expander (Handle 0009) Phy 0: Link Up, No Errors
Expander (Handle 0009) Phy 1: Link Up, No Errors
Expander (Handle 0009) Phy 2: Link Up, No Errors
Expander (Handle 0009) Phy 3: Link Up, No Errors
Expander (Handle 0009) Phy 4: Link Up, No Errors
Expander (Handle 0009) Phy 5: Link Up, No Errors
Expander (Handle 0009) Phy 6: Link Up, No Errors
Expander (Handle 0009) Phy 7: Link Up, No Errors
Expander (Handle 0009) Phy 8: Link Up
Invalid DWord Count 674,174
Running Disparity Error Count 632,699
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 9: Link Up
Invalid DWord Count 663,526
Running Disparity Error Count 624,135
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 10: Link Up
Invalid DWord Count 707,295
Running Disparity Error Count 694,602
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 11: Link Up
Invalid DWord Count 683,153
Running Disparity Error Count 634,954
Loss of DWord Synch Count 1
Phy Reset Problem Count 0
Expander (Handle 0009) Phy 12: Link Down, No Errors
Expander (Handle 0009) Phy 13: Link Down, No Errors
Expander (Handle 0009) Phy 14: Link Down, No Errors
Expander (Handle 0009) Phy 15: Link Down, No Errors
Expander (Handle 0009) Phy 16: Link Down, No Errors
Expander (Handle 0009) Phy 17: Link Down, No Errors
Expander (Handle 0009) Phy 18: Link Down, No Errors
Expander (Handle 0009) Phy 19: Link Down, No Errors
Expander (Handle 0009) Phy 20: Link Down, No Errors
Expander (Handle 0009) Phy 21: Link Down, No Errors
Expander (Handle 0009) Phy 22: Link Up, No Errors
Expander (Handle 0009) Phy 23: Link Up, No Errors
Expander (Handle 0009) Phy 24: Link Up, No Errors
Expander (Handle 0009) Phy 25: Link Up, No Errors
Expander (Handle 0009) Phy 26: Link Up, No Errors
Expander (Handle 0009) Phy 27: Link Up, No Errors
Expander (Handle 0009) Phy 28: Link Up, No Errors
Expander (Handle 0009) Phy 29: Link Up, No Errors
Expander (Handle 0009) Phy 30: Link Up, No Errors
Expander (Handle 0009) Phy 31: Link Up, No Errors
Expander (Handle 0009) Phy 32: Link Up, No Errors
Expander (Handle 0009) Phy 33: Link Up, No Errors
Expander (Handle 0009) Phy 34: Link Up, No Errors
Expander (Handle 0009) Phy 35: Link Up, No Errors
Expander (Handle 0009) Phy 36: Link Up, No Errors
Expander (Handle 0009) Phy 37: Link Down, No Errors
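For anyone who wants to compare the sections above without reading them line by line, here is a minimal sketch of a parser for this counter dump. It assumes the text layout is exactly what appears in this attachment (a "Phy N: Link Up" line followed by four counter lines); the function name parse_phy_counters is hypothetical and not part of lsiutil. Feed it one section at a time, since the same phy shows up in every section and later entries would overwrite earlier ones.

```python
import re

# Matches lines such as "Adapter Phy 1: Link Up" or
# "Expander (Handle 0009) Phy 12: Link Up, No Errors".
PHY_RE = re.compile(
    r"^(Adapter|Expander \(Handle [0-9A-Fa-f]+\)) Phy (\d+): "
    r"(Link Up|Link Down)(, No Errors)?$"
)

# Matches the four counter lines that follow a phy reporting errors.
COUNTER_RE = re.compile(
    r"^(Invalid DWord Count|Running Disparity Error Count|"
    r"Loss of DWord Synch Count|Phy Reset Problem Count)\s+([\d,]+)$"
)


def parse_phy_counters(text):
    """Return {(device, phy_number): {counter_name: value}} for phys with errors.

    Phys reported as "No Errors" are skipped. Parse one section of the
    log at a time; a repeated (device, phy) key overwrites the earlier one.
    """
    results = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = PHY_RE.match(line)
        if m:
            device, phy, _link, no_errors = m.groups()
            current = None if no_errors else (device, int(phy))
            if current is not None:
                results[current] = {}
            continue
        m = COUNTER_RE.match(line)
        if m and current is not None:
            # Counters are printed with thousands separators, e.g. "687,520".
            results[current][m.group(1)] = int(m.group(2).replace(",", ""))
    return results


if __name__ == "__main__":
    section = """\
Adapter Phy 0: Link Up, No Errors
Adapter Phy 1: Link Up
Invalid DWord Count 1,276
Running Disparity Error Count 1,167
Loss of DWord Synch Count 0
Phy Reset Problem Count 0
"""
    for (device, phy), counters in parse_phy_counters(section).items():
        print(device, phy, counters)
```

Running it over each section separately makes it easier to see which phys carry the counts when a cable or card is swapped.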
_______________________________________________
driver-discuss mailing list
driver-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/driver-discuss