List: freebsd-ppc
Subject: lr=u_trap+0x10 and ssr0=k_trap+0x28 for "stopped at 0 illegal instruction 0" before-copyright hang o
From: Mark Millard <markmi () dsl-only ! net>
Date: 2014-09-19 3:57:12
Message-ID: 72535F89-3942-45A6-B351-7F746209ED9F () dsl-only ! net
I modified DDB to automatically run "show registers" even at the early
"before Copyright" crash time. The end of this note shows the
/usr/src/sys/ddb/db_script.c diff for the hack. Although I also had DDB run
bt, bt does not actually print a backtrace in this context. (It might in
others.)
The registers give interesting context despite the lack of a backtrace. I do
not know whether the information is enough to be of much immediate help to
someone who starts looking at the problem.
I'll start with register lr: 0x1026f0 u_trap+0x10.
/usr/src/sys/powerpc/aim/trap_subr64.S has:
s_trap:
	bf	17,k_trap		/* branch if PSL_PR is false */
	GET_CPUINFO(%r1)
u_trap:
	ld	%r1,PC_CURPCB(%r1)
	mr	%r27,%r28		/* Save LR, r29 */
	mtsprg2	%r29
	bl	restore_kernsrs		/* enable kernel mapping */
	mfsprg2	%r29
	mr	%r28,%r27
/*
 * Now the common trap catching code.
 */
k_trap:
	FRAME_SETUP(PC_TEMPSAVE)
/* Call C interrupt dispatcher: */
trapagain:
and so this appears to indicate a pending return to execute the
"mfsprg2 %r29" after "bl restore_kernsrs", which indicates that
restore_kernsrs should be active.
But register srr0 indicates: 0x102720 k_trap+0x28. (So apparently somewhere
in FRAME_SETUP(PC_TEMPSAVE).)
So it appears to me that the processor got to the k_trap code during the
supposed restore_kernsrs time frame. (But I'm no expert on this sort of thing
or on the processor.)
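As a cross-check of the two symbol+offset values, here is a minimal sketch in
C of the address arithmetic. It assumes only that PowerPC instructions are a
fixed 4 bytes wide; the function names are made up for this note and are not
from the kernel sources.

```c
#include <stdint.h>

/* PowerPC instructions are a fixed 4 bytes wide. */
#define PPC_INSN_BYTES 4

/* Address of a label, given an address DDB printed as "label+offset". */
uint64_t
label_base(uint64_t addr, uint64_t offset)
{
	return (addr - offset);
}

/* Zero-based instruction slot that sits `offset` bytes past a label. */
unsigned
insn_slot(uint64_t offset)
{
	return ((unsigned)(offset / PPC_INSN_BYTES));
}

/* Number of instructions between two labels (lo must not exceed hi). */
unsigned
insns_between(uint64_t lo, uint64_t hi)
{
	return ((unsigned)((hi - lo) / PPC_INSN_BYTES));
}
```

With the reported values, label_base(0x1026f0, 0x10) puts u_trap at 0x1026e0
and label_base(0x102720, 0x28) puts k_trap at 0x1026f8. insns_between() on
those two addresses gives 6, which matches the six instructions listed from
u_trap's ld through "mr %r28,%r27", and insn_slot(0x10) is 4, the fifth
instruction, which is the "mfsprg2 %r29" right after the bl. So the two
register values are at least self-consistent with the listing.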
I'll list the other register values:
r0: 0
r1: 0
r2: 0xc1be80 M_AUDITBSM
r3: 0xb16138
r4: 0x8926e8 .ofwcall+0xa8
r5: 0
r6: 0xbb5f90
r7: 0xe3d118 ofw_real_mode
r8: 0x1
r9: 0xe0ce80 __pcpu
r10: 0x1c35ec9
r11: 0
r12: 0x10000000
r13: db890 thread0
r14-r19: all 0
r20: 0x10bc000
r21: 0x4
r22: 0x1801db4
r23: 0x1803a28
r24: 0xc000000000008760
r25: 0xcc6908 smp_no_rendevous_barrier
r26: 0xec79e0 ofw_rendezvous_dispatch (yep one has v and the other zv)
r27: 0x8926e8 .ofwcall+0xa8
r28: 0x8926e8 .ofwcall+0xa8 (yep: same value)
r29: 0x24000022
r30: 0x9000000000001032
r31: 0xc7f488 vop_unlock_desc
ctr: 0xff846d78
cr: 0x2000d7b0
xer: 0
dar: 0xfffffffffffffd50
dsisr: 0x42000000
(Hopefully this manual transcription from the screen display is complete, and
also accurate for what it does present.)
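For anyone who wants to read the dar and dsisr values, here is a small sketch
in C. The two bit masks are the DSISR meanings as I understand them from the
architecture documentation (translation not found, and store access); the
macro and function names are local to this sketch, and the values may be
stale leftovers from an earlier exception rather than from this stop.

```c
#include <stdint.h>

/* DSISR bits as I understand them; the names are local to this sketch. */
#define SK_DSISR_NOTFOUND	0x40000000u	/* no translation found */
#define SK_DSISR_STORE		0x02000000u	/* access was a store */

/* Nonzero if the recorded access found no translation. */
int
dsisr_not_found(uint32_t dsisr)
{
	return ((dsisr & SK_DSISR_NOTFOUND) != 0);
}

/* Nonzero if the recorded access was a store. */
int
dsisr_is_store(uint32_t dsisr)
{
	return ((dsisr & SK_DSISR_STORE) != 0);
}

/* DAR reinterpreted as a signed 64-bit displacement from address zero. */
int64_t
dar_as_signed(uint64_t dar)
{
	return ((int64_t)dar);
}
```

Under those assumptions, dsisr 0x42000000 has both bits set, and
dar_as_signed(0xfffffffffffffd50) is -688 (that is, -0x2b0). One possible
reading is a store a little below address zero, which would fit r1 being 0,
but nothing in the register dump confirms that.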
The personal HACK to db_script_kdbenter(...) in /usr/src/sys/ddb/db_script.c
to have it show registers and try bt:
$ cd /usr/src/sys/ddb/
$ svnlite diff .
Index: db_script.c
===================================================================
--- db_script.c (revision 271610)
+++ db_script.c (working copy)
@@ -319,10 +319,25 @@
 {
 	char scriptname[DB_MAXSCRIPTNAME];
 
+	/* HACK!!! : Additional lines to force a basic default script to exist.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	struct ddb_script *dsp = db_script_lookup(DB_SCRIPT_KDBENTER_DEFAULT);
+	if (!dsp) db_script_set(DB_SCRIPT_KDBENTER_DEFAULT, "show registers; bt");
+
 	snprintf(scriptname, sizeof(scriptname), "%s.%s",
 	    DB_SCRIPT_KDBENTER_PREFIX, eventname);
 	if (db_script_exec(scriptname, 0) == ENOENT)
 		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
+
+	/* HACK!!! : Additional lines to always use the default script,
+	 * even if scriptname existed and was executed.
+	 * Will dump information even if ddb input is not available for early crash.
+	 * Used to get more information about PowerMac G5 "before Copyright" hangs.
+	 */
+	else
+		(void)db_script_exec(DB_SCRIPT_KDBENTER_DEFAULT, 0);
 }
 
 /*-
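To make the effect of the second part of the hack easier to see, here is a
stand-alone C model of just the branch structure. Everything prefixed sk_ or
SK_ is a stub invented for this sketch: the script names are assumed to
mirror the DB_SCRIPT_KDBENTER_* values, and the stub pretends that only the
default script and a hypothetical "panic" event script exist.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumed to mirror the DB_SCRIPT_KDBENTER_* values; illustrative only. */
#define SK_PREFIX	"kdb.enter"
#define SK_DEFAULT	"kdb.enter.default"

int sk_default_runs;	/* how many times the default script ran */

/* Stub for db_script_exec(): only two scripts are defined. */
int
sk_script_exec(const char *name)
{
	if (strcmp(name, SK_DEFAULT) == 0) {
		sk_default_runs++;
		return (0);
	}
	if (strcmp(name, SK_PREFIX ".panic") == 0)
		return (0);
	return (ENOENT);
}

/* Same branch structure as the patched db_script_kdbenter(). */
void
sk_kdbenter(const char *eventname)
{
	char scriptname[64];

	snprintf(scriptname, sizeof(scriptname), "%s.%s", SK_PREFIX, eventname);
	if (sk_script_exec(scriptname) == ENOENT)
		(void)sk_script_exec(SK_DEFAULT);
	else
		(void)sk_script_exec(SK_DEFAULT);
}
```

In this model the default script runs once per call whether or not an
event-specific script exists, which is the intent of the hack: the register
dump shows up for every entry into the debugger.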
===
Mark Millard
markmi at dsl-only.net
On Sep 16, 2014, at 9:28 PM, Mark Millard <markmi at dsl-only.net> wrote:
In part I sent directly to you because of a past exchange (July 27) where you
had written:
> Nathan and I both speculate that it's
> dropping into Open Firmware (we make extensive use of OFW), and then
> messing something up, taking a page fault or something.
The specific text that I report, and its uniformity when it is produced, seems
to add a little information beyond a speculated "page fault or something" and
so might eventually help a little. As I understand the text, it is reporting
execution reaching address zero without any prior unhandled exception or
similar event that would have stopped it. A corrupted stack (pointer), so a
bad return address or some such? I'd guess there are no explicit jumps to
address zero, so I expect that indirection is likely involved, with the
content for the indirection messed up.
I really wish that I had a logic analyzer configuration for this. I've not
found a way to make the failing context visible so far, and the extra way of
looking at things might have helped.
===
Mark Millard
markmi@dsl-only.net
On Sep 16, 2014, at 8:28 PM, Justin Hibbits <chmeeedalf@gmail.com> wrote:
Hi Mark,
I see this on my G5, and I think it's due to the amount of RAM in the machine.
More than 4 GB seems to confuse Open Firmware when called by FreeBSD. There is
some effort to remove the need for the callbacks, but thus far it's not far
along. The good news is that after it boots it's solid except when switching
vtys, but earlier this year or last year I added a sysctl hack to disable the
call into Open Firmware on vty switch (I don't recall the name offhand and am
not at my computer right now, but if you grep the sysctl output for reset and
ofw you can find it).
-Justin
On Sep 16, 2014 8:01 PM, "Mark Millard" <markmi@dsl-only.net> wrote:
I've now spent time with rebooting and power-off/power-on for all 3 PowerMac
G5's (one PowerMac7,2 and two PowerMac11,2's) and all 3 get the
> GDB: no debug ports present
> KDB: debugger backends: DDB
> KDB: current backend: DDB
> [ thread pid -1 tid 1006665719 ]
> Stopped at 0: illegal instruction 0
> db>
when they fail just before the Copyright notice would normally be displayed.
None fail any earlier. At that spot none have failed any other way. It is the
same SSD in all 3. (Happens with other SSD's as well.) Overall there is a mix
of Radeon and NVIDIA display boards. Besides the SSD use and RAM upgrades the
rest is stock equipment. syscons used, not vt. (I've yet to try vt.)
Seeing a failure after the Copyright notice has been fairly rare in all my
experiments from when I started last April or so. The ones that I've noted
had Data Storage Interrupt reported. So far no examples of the above have
been reported after the Copyright notice. So I'd guess that they are separate
issues. Of course, only in the last few days would I have been able to see the
above sort of thing if it did happen after the Copyright notice: the prior
history does not count for judgements about that.
===
Mark Millard
markmi at dsl-only.net
On Sep 16, 2014, at 8:15 AM, Mark Millard <markmi@dsl-only.net> wrote:
Using 10.1-BETA1 I added "options DDB" and "options GDB" to powerpc64's
GENERIC64. (I also used WITH_DEBUG_FILES=, WITHOUT_CLANG=, and WITH_DEBUG= in
/etc/make.conf.) So the world and kernel builds were basically just set up to
have more of a debugging context around (including for any ports builds).
The result was new information about the PowerMac G5 boot hangups: The screen
is no longer blank when the G5 is hung up without there being a Copyright
notice yet. It says...
> GDB: no debug ports present
> KDB: debugger backends: DDB
> KDB: current backend: DDB
> [ thread pid -1 tid 1006665719 ]
> Stopped at 0: illegal instruction 0
> db>
(I had no ability to input at that point.) Normally the Copyright notice
would have displayed instead of "[...]" and what follows. (I do not claim to
have all the spacing, capitalization, and such correct above.)
That text is constant from hang to hang when it hangs just before it would
normally output the Copyright notice: The numbers do not vary, much less the
other text. It has never failed before the two KDB messages are present. So
far I've only tested one PowerMac G5, booting over and over for a few hours.
(I do not claim to be set up for remote kernel debugging. I just decided to
let GDB go along for the ride when I added DDB.)
===
Mark Millard
markmi at dsl-only.net