List:       freebsd-hackers
Subject:    Re: kgzip(1) is broken
From:       Devin Teske <devin.teske () fisglobal ! com>
Date:       2013-01-16 19:04:54
Message-ID: 23AAEBCB-6438-42EB-9B2E-E657CFC3BA1B () fisglobal ! com


On Jan 15, 2013, at 5:07 PM, Steven Hartland wrote:

> 
> ----- Original Message ----- From: <dteske@freebsd.org>
> To: "'Ian Lepore'" <freebsd@damnhippie.dyndns.org>
> Cc: <freebsd-hackers@freebsd.org>; <dteske@freebsd.org>
> Sent: Wednesday, January 16, 2013 12:56 AM
> Subject: RE: kgzip(1) is broken
> 
> 
> > > -----Original Message-----
> > > From: Ian Lepore [mailto:freebsd@damnhippie.dyndns.org]
> > > Sent: Tuesday, January 15, 2013 4:43 PM
> > > To: Devin Teske
> > > Cc: dteske@freebsd.org; freebsd-hackers@freebsd.org
> > > Subject: RE: kgzip(1) is broken
> > > On Tue, 2013-01-15 at 16:10 -0800, Devin Teske wrote:
> > > > 
> > > > > -----Original Message-----
> > > > > From: Devin Teske [mailto:devin.teske@fisglobal.com] On Behalf Of
> > > > > dteske@freebsd.org
> > > > > Sent: Tuesday, January 15, 2013 3:10 PM
> > > > > To: 'Ian Lepore'
> > > > > Cc: freebsd-hackers@freebsd.org; dteske@freebsd.org
> > > > > Subject: RE: kgzip(1) is broken
> > > > > 
> > > > > 
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Ian Lepore [mailto:freebsd@damnhippie.dyndns.org]
> > > > > > Sent: Tuesday, January 15, 2013 3:05 PM
> > > > > > To: dteske@freebsd.org
> > > > > > Cc: freebsd-hackers@freebsd.org
> > > > > > Subject: Re: kgzip(1) is broken
> > > > > > 
> > > > > > On Tue, 2013-01-15 at 13:27 -0800, dteske@freebsd.org wrote:
> > > > > > > Hello,
> > > > > > > 
> > > > > > > > I have been sad of-late because kgzip(1) no longer
> > > > > > > > produces a usable kernel.
> > > > > > > 
> > > > > > > All versions of 9.x suffer this.
> > > > > > > 
> > > > > > > > And somewhere between 8.3-RELEASE-p1 and 8.3-RELEASE-p5
> > > > > > > > this recently broke in the 8.x series.
> > > > > > > 
> > > > > > > > I haven't tried the 7 series lately, but if whatever is
> > > > > > > > making the rounds gets MFC'd that far back, I expect the
> > > > > > > > problem to percolate there too.
> > > > > > > 
> > > > > > > > The symptom is that the machine reboots immediately and
> > > > > > > > unexpectedly the moment the kernel is executed by the
> > > > > > > > loader.
> > > > > > > 
> > > > > > > > This is quite troubling and I am looking for someone to
> > > > > > > > help find the culprit. I don't know where to start looking.
> > > > > > 
> > > > > > Here are some possible candidates from the things that were MFC'd to 8
> > > > > > in that timeframe.  I haven't looked at what these do, they're just
> > > > > > changes that affect files related to booting.
> > > > > > 
> > > > > > r233211
> > > > > > r233377
> > > > > > r233469
> > > > > > r234563
> > > > > > 
> > > > > 
> > > > > Thanks Ian!
> > > > > 
> > > > > > I'll test each one individually to see if regressing any one
> > > > > > (or all) addresses the problem.
> > > > 
> > > > Progress...
> > > > 
> > > > Looks like I found the culprit.
> > > > 
> > > > > Turns out it's a back-ported bxe(4) driver (back-ported from 9 --
> > > > > where kgzip seems to never work).
> > > > 
> > > > > I wonder why back-porting bxe(4) from stable/9 to releng/8.3 would
> > > > > cause kgzip to produce non-working kernels.
> > > > 
> > > Yeah, it'll be interesting to see how a device driver can lead to "the
> > > machine reboots immediately and unexpectedly the moment the kernel is
> > > executed by the loader," which I took to mean "before seeing the
> > > copyright or anything."
> >
> > Indeed... loader throws up the syms and upon execution *KABOOM* (screen goes
> > black and back to POST)
> >
> > The copyright never appears.
> >
> > > > I'm emailing the maintainers (davidch + other Broadcom folk)
> >
> > The current dossier is even more interesting... the back-ported driver (with
> > zero modifications mind you from stable/9 to stable/8) exhibits memory failures
> > (example below), and causes terminals to become wedged when attempting to (for
> > example) scp a file over an existing configured network (igb-based -- presumably
> > unrelated to bxe but in practice loading bxe causes igb to misbehave).
> >
> > $ ifconfig bxe0 inet 192.168.1.5/24
> > bxe0: ../../../dev/bxe/if_bxe.c(10939): Memory allocation failure! Cannot fill fp[00] RX chain.
> > bxe0: ../../../dev/bxe/if_bxe.c(3921): NIC initialization failed, aborting!
> >
> > $ ifconfig bxe1 inet 192.168.1.6/24
> > bxe1: ../../../dev/bxe/if_bxe.c(10939): Memory allocation failure! Cannot fill fp[00] RX chain.
> > bxe1: ../../../dev/bxe/if_bxe.c(3921): NIC initialization failed, aborting!
> >
> > (as expected, also sent mail off to maintainers w/respect to above notes/errors)
> 
> Sounds like you may be out of mbufs, which is easy: on a box with 4 igb's,
> simply booting without tuning will cause this, so if you have igb's and
> bxe's this could be your cause.
> 
> Try adding the following to loader.conf and see if it helps:-
> kern.ipc.nmbclusters=51200
> 

Sorry for delayed response -- we had to go through a power cycle.

I haven't yet tried bumping the value as suggested, but I suspect it will
indeed help greatly -- I noticed that I got 18% into the scp before things
took a dive for the worse (hanging terminals and such).
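
For anyone following along, here is a minimal sketch of how one might check
whether mbuf clusters really are the limiting factor, before and after the
loader.conf change. The function name check_mbufs is hypothetical and not
something from this thread; it assumes FreeBSD's sysctl(8) and netstat(1),
and only prints a notice on systems without that sysctl.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): show the mbuf cluster limit
# and any denied mbuf requests. Assumes FreeBSD sysctl(8) and netstat(1).
check_mbufs() {
    if limit=$(sysctl -n kern.ipc.nmbclusters 2>/dev/null); then
        echo "kern.ipc.nmbclusters: ${limit}"
        # netstat -m summarizes mbuf usage; "denied" lines count failed requests.
        netstat -m 2>/dev/null | grep -i denied ||
            echo "no denied mbuf requests reported"
    else
        echo "kern.ipc.nmbclusters is not available on this system"
    fi
}

check_mbufs
```

Non-zero "denied" counters after the scp stalls would be consistent with the
out-of-mbufs theory, though they would not by themselves explain the kgzip
failure.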

Another thing worth noting about the uplifted bxe(4) plopped into RELENG_8…
when we rebooted:

bxe0: ../../../dev/bxe/if_bxe.c(6419): Slowpath queue is full!
bxe0: ---------- Begin crash dump ----------
bxe0: ----------  End crash dump  ----------
bxe0: ../../../dev/bxe/if_bxe.c(6419): Slowpath queue is full!
bxe0: ---------- Begin crash dump ----------
bxe0: ----------  End crash dump  ----------
bxe0: ../../../dev/bxe/if_bxe.c(3262): fp[01] client ramrod halt failed!

Heh. The machine had to be hard cycled.
-- 
Devin
