List: zfs-discuss
Subject: Re: [zfs-discuss] Solaris Based Systems "Lock Up" - Possibly
From: Lachlan Mulcahy <lmulcahy () marinsoftware ! com>
Date: 2011-10-31 21:10:52
Message-ID: CANv=DFvSmha13xF+GqQYLRM5Shj6CrFM6AJ17XH=asxzuNjRzg () mail ! gmail ! com
Hi Marion,
Thanks for your swift reply!
Have you got the latest firmware on your LSI 1068E HBA's? These have been
> known to have lockups/timeouts when used with SAS expanders (disk
> enclosures)
> with incompatible firmware revisions, and/or with older mpt drivers.
>
I'll need to check that out -- I'm 90% sure these are fresh-out-of-box
HBAs.
I'll try a firmware upgrade there and see if we get any joy...
> The MD1220 is a 6Gbit/sec device. You may be better off with a matching
> HBA -- Dell has certainly told us the MD1200-series is not intended for
> use with the 3Gbit/sec HBA's. We're doing fine with the LSI SAS 9200-8e,
> for example, when connecting to Dell MD1200's with the 2TB "nearline SAS"
> disk drives.
>
I was aware the MD1220 is a 6G device, but I figured that since our I/O
throughput doesn't come close to saturating 3Gbit/sec, it would just
negotiate the lower speed and be OK. It's something to look at if I run out
of other options...
Last, are you sure it's memory-related? You might keep an eye on "
> arcstat.pl"
> output and see what the ARC sizes look like just prior to lockup. Also,
> maybe you can look up instructions on how to force a crash dump when the
> system hangs -- one of the experts around here could tell a lot from a
> crash dump file.
>
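For reference (my own notes, not something from this thread): on Solaris-derived systems you can usually arm the kernel's deadman timer and NMI-panic handling so a hard hang panics and produces a crash dump instead of just sitting there. A sketch of the /etc/system settings, assuming a dump device has already been configured with dumpadm:

* Panic when the clock stops advancing -- the "deadman" timer.
set snooping=1
* On x86, panic on an NMI (e.g. one injected from the DRAC),
* so a hung box still yields a crash dump.
set pcplusmp:apic_panic_on_nmi=1

These take effect on the next boot; savecore then writes the dump out to /var/crash after the panic.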
I'm starting to doubt that it is a memory issue now -- especially since I
now have some results from my latest "test"...
output of arcstat.pl looked like this just prior to the lock up:
    time  arcsz      c  mh%  mhit  hit%  hits  l2hit%  l2hits
19:57:36    24G    24G   94   161    61   194       1       1
19:57:41    24G    24G   96   174    62   213       0       0
19:57:46    23G    24G   94   161    62   192       1       1
19:57:51    24G    24G   96   169    63   205       0       0
19:57:56    24G    24G   95   169    61   206       0       0
^-- This is the very last line printed...
I actually discovered and rebooted the machine via DRAC at around 20:44, so
it had been in its bad state for around an hour.
Some snippets from the output some 20 minutes earlier show the point at
which arcsz grew to reach its maximum:
time arcsz c mh% mhit hit% hits l2hit% l2hits
19:36:45 21G 24G 95 152 58 177 0 0
19:37:00 22G 24G 95 156 57 182 0 0
19:37:15 22G 24G 95 159 59 185 0 0
19:37:30 23G 24G 94 153 58 178 0 0
19:37:45 23G 24G 95 169 59 195 0 0
19:38:00 24G 24G 95 160 59 187 0 0
19:38:25 24G 24G 96 151 58 177 0 0
So it seems that arcsz reaching the 24G maximum wasn't necessarily to
blame, since the system operated for a good 20mins in this state.
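To make the "arcsz vs. c" comparison less eyeball-driven, here is a small hypothetical helper (not part of arcstat.pl itself) that parses saved sample lines like the ones above and flags when the ARC has reached its target size:

```python
def parse_size(s):
    """Convert an arcstat size like '24G' or '512M' to bytes."""
    units = {"K": 2**10, "M": 2**20, "G": 2**30}
    if s[-1] in units:
        return int(s[:-1]) * units[s[-1]]
    return int(s)

def at_limit(line):
    """True if arcsz >= c for one arcstat data line (not the header)."""
    fields = line.split()
    arcsz, c = parse_size(fields[1]), parse_size(fields[2])
    return arcsz >= c

sample = "19:57:36 24G 24G 94 161 61 194 1 1"
print(at_limit(sample))  # prints: True -- arcsz has hit the 24G target
```

Running it over a whole saved log would show exactly when (and for how long) the ARC sat pinned at its limit before the hang.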
I was also logging "vmstat 5" prior to the crash (though I forgot to
include some timestamps in my output) and these are the final lines
recorded in that log:
 kthr      memory            page            disk          faults      cpu
 r b w   swap     free     re  mf   pi po fr de sr s0 s1 s2 s3   in    sy     cs    us sy id
 0 0 0 25885248 18012208   71 2090   0  0  0  0  0  0  0  0 22 17008 210267 30229    1  5 94
 0 0 0 25884764 18001848   71 2044   0  0  0  0  0  0  0  0 25 14846 151228 25911    1  5 94
 0 0 0 25884208 17991876   71 2053   0  0  0  0  0  0  0  0  8 16343 185416 28946    1  5 93
So it seems there was some 17-18G free in the system when the lock up
occurred. Curious...
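As a quick sanity check on that reading: vmstat reports swap and free in kilobytes, so the final three samples do work out to roughly 17G free:

```python
# vmstat's "free" column is in kilobytes; convert the last three
# samples above to gibibytes to confirm the ~17-18G reading.
free_kb = [18012208, 18001848, 17991876]
free_gib = [round(kb / 2**20, 1) for kb in free_kb]
print(free_gib)  # prints: [17.2, 17.2, 17.2]
```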
I was also capturing some arc info from mdb -k and the output prior to the
lock up was...
Monday, October 31, 2011 07:57:51 PM UTC
arc_no_grow = 0
arc_tempreserve = 0 MB
arc_meta_used = 4621 MB
arc_meta_limit = 20480 MB
arc_meta_max = 4732 MB
Monday, October 31, 2011 07:57:56 PM UTC
arc_no_grow = 0
arc_tempreserve = 0 MB
arc_meta_used = 4622 MB
arc_meta_limit = 20480 MB
arc_meta_max = 4732 MB
Looks like metadata was not primarily responsible for consuming all of that
24G of ARC shown in the arcstat.pl output...
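For anyone following along, those ARC variables can be inspected with mdb against the live kernel; a typical invocation (my reconstruction, not the exact script I was running) is:

# Print the in-kernel ARC state; "::arc" is a built-in mdb dcmd.
echo "::arc" | mdb -k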
There also seems to be nothing interesting in /var/adm/messages leading up
to my rebooting:
Oct 31 18:42:57 mslvstdp02r ntpd[368]: [ID 702911 daemon.notice] frequency
error 512 PPM exceeds tolerance 500 PPM
Oct 31 18:44:01 mslvstdp02r last message repeated 1 time
Oct 31 18:45:05 mslvstdp02r ntpd[368]: [ID 702911 daemon.notice] frequency
error 512 PPM exceeds tolerance 500 PPM
Oct 31 18:46:09 mslvstdp02r last message repeated 1 time
Oct 31 18:47:23 mslvstdp02r ntpd[368]: [ID 702911 daemon.notice] frequency
error 505 PPM exceeds tolerance 500 PPM
Oct 31 19:06:13 mslvstdp02r ntpd[368]: [ID 702911 daemon.notice] frequency
error 505 PPM exceeds tolerance 500 PPM
Oct 31 19:09:27 mslvstdp02r last message repeated 4 times
Oct 31 19:25:04 mslvstdp02r ntpd[368]: [ID 702911 daemon.notice] frequency
error 505 PPM exceeds tolerance 500 PPM
Oct 31 19:28:17 mslvstdp02r last message repeated 3 times
Oct 31 19:46:17 mslvstdp02r ntpd[368]: [ID 702911 daemon.notice] frequency
error 505 PPM exceeds tolerance 500 PPM
Oct 31 19:49:32 mslvstdp02r last message repeated 4 times
Oct 31 20:44:33 mslvstdp02r genunix: [ID 540533 kern.notice] ^MSunOS
Release 5.11 Version snv_151a 64-bit
Oct 31 20:44:33 mslvstdp02r genunix: [ID 877030 kern.notice] Copyright (c)
1983, 2010, Oracle and/or its affiliates. All rights reserved.
Just some ntpd stuff really...
I'm going to check out a firmware upgrade next... I can't believe this is
really an out-of-memory situation when vmstat reports 17-18G free...
Let's see...
Regards,
--
Lachlan Mulcahy
Senior DBA,
Marin Software Inc.
San Francisco, USA
AU Mobile: +61 458 448 721
US Mobile: +1 (415) 867 2839
Office : +1 (415) 671 6080
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss