
List:       netbsd-port-i386
Subject:    Re: kern/37924: can't reboot halt(8)ed system
From:       "Greg A. Woods" <woods () planix ! com>
Date:       2008-05-02 17:06:59
Message-ID: m1Jryiv-000kn10 () most ! weird ! com

This bug is not fixed for me yet.  I still can't get my serial-console
based i386 machine(s) to reboot with a key press after the "press any
key to reboot" message.

It seems that the system gets stuck in the cngetc() call and no amount
of serial data will wake it up.  However, asserting a BREAK condition on
the line causes cngetc() to return 0, which sends the system into the
HLT loop, where it hangs hard until someone presses the reset or power
button.

Here's a recent example (from a kernel with the changes below):

[Thu May  1 12:56:01 2008]db{0}> reboot
[Thu May  1 13:14:27 2008]
[Thu May  1 13:14:27 2008]The operating system has halted.
[Thu May  1 13:14:27 2008]Please press any key to reboot.
[Thu May  1 13:14:27 2008]
[Thu May  1 13:14:44 2008]
[Thu May  1 13:14:44 2008]Cannot read from the console, calling the HLT instruction.
[Thu May  1 13:14:44 2008]
[Thu May  1 13:14:44 2008]RESET or power cycle the system to reboot.
[Thu May  1 13:14:44 2008]

... and there it sat until I hit the reset button.

(As for why a DDB "reboot" caused a halt, I don't know either!  This has
happened twice now with very recent -current kernels, but only after a
panic during boot.)

I've made the following changes for three reasons: to try to support
serial flow control properly (I initially thought it might be the
problem), to better inform the operator of what's going on (I think
these parts are suitable for import into the tree), and to help me
diagnose the problem further.
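
Before the diff itself, here is a user-space sketch of what the
flow-control handling is meant to do: keep reading until a key arrives
that is neither XON nor XOFF, and report a failed read as 0.  It is only
an illustration under stated assumptions, not kernel code.  The names
fake_console, fake_cngetc() and read_reboot_key() are hypothetical
stand-ins invented for this example; only the ASCII_XON and ASCII_XOFF
values match the patch.

```c
#include <assert.h>
#include <stddef.h>

#define ASCII_XON	0x11	/* <CTRL-Q> */
#define ASCII_XOFF	0x13	/* <CTRL-S> */

/* Hypothetical stand-in for the console: a canned sequence of input. */
struct fake_console {
	const int *keys;	/* values successive reads will return */
	size_t nkeys;		/* number of entries in keys */
	size_t pos;		/* index of the next entry to return */
};

/*
 * Hypothetical stand-in for cngetc(): returns the next canned value, or
 * 0 once the input is exhausted (modelling "cannot read from console").
 */
static int
fake_cngetc(struct fake_console *cn)
{
	assert(cn != NULL);
	if (cn->pos >= cn->nkeys)
		return 0;
	return cn->keys[cn->pos++];
}

/*
 * Return the first key that is not a flow-control character, or 0 if a
 * read fails before such a key arrives.
 */
static int
read_reboot_key(struct fake_console *cn)
{
	int keyval;

	do {
		keyval = fake_cngetc(cn);
	} while (keyval == ASCII_XON || keyval == ASCII_XOFF);

	return keyval;
}
```

Given the input XOFF, XON, 'r', read_reboot_key() skips the two
flow-control characters and returns 'r'.  Given only XOFF, it returns 0,
which is the case where the kernel code would fall into the HLT loop.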

Index: sys/arch/i386/i386/machdep.c
===================================================================
RCS file: /cvs/master/m-NetBSD/main/src/sys/arch/i386/i386/machdep.c,v
retrieving revision 1.632
diff -u -r1.632 machdep.c
--- sys/arch/i386/i386/machdep.c	29 Apr 2008 15:27:08 -0000	1.632
+++ sys/arch/i386/i386/machdep.c	30 Apr 2008 12:43:24 -0000
@@ -848,6 +848,7 @@
 void
 cpu_reboot(int howto, char *bootstr)
 {
+	int keyval = 0;
 
 	if (cold) {
 		howto |= RB_HALT;
@@ -855,6 +856,10 @@
 	}
 
 	boothowto = howto;
+	/*
+	 * XXX this bit, except for the "cold" check above, should be MI --
+	 * i.e. back in kern/kern_xxx.c:sys_reboot()
+	 */
 	if ((howto & RB_NOSYNC) == 0 && waittime < 0) {
 		waittime = 0;
 		vfs_shutdown();
@@ -914,11 +919,10 @@
 	}
 
 	if (howto & RB_HALT) {
-		printf("\n");
-		printf("The operating system has halted.\n");
-		printf("Please press any key to reboot.\n\n");
+		printf("\nThe operating system has halted.\n"
+		       "Please press any key to reboot.\n\n");
 
-#ifdef BEEP_ONHALT
+#ifdef BEEP_ONHALT /* XXX could be:  defined(BEEP_ONHALT_COUNT) && (BEEP_ONHALT_COUNT > 0) */
 		{
 			int c;
 			for (c = BEEP_ONHALT_COUNT; c > 0; c--) {
@@ -931,21 +935,58 @@
 		}
 #endif
 
-		cnpollc(1);	/* for proper keyboard command handling */
-		if (cngetc() == 0) {
-			/* no console attached, so just hlt */
-			for(;;) {
-				x86_hlt();
+		cnpollc(1);	/* for proper keyboard command handling without
+				 * interrupts */
+		/*
+		 * ACK!!!  The line discipline does _NOT_ get used from within
+		 * the kernel for console I/O (though it probably should be).
+		 *
+		 * If any output above went out too fast for the device
+		 * connected to a serial console then we'll read a <CTRL-S>
+		 * here, and/or perhaps a <CTRL-Q>, and we'll just have to
+		 * ignore them.
+		 */
+#define ASCII_XON	0x11
+#define ASCII_XOFF	0x13
+		while (keyval == 0 || keyval == ASCII_XOFF || keyval == ASCII_XON) {
+			if ((keyval = cngetc()) == 0) {
+				/*
+				 * no console attached, or perhaps a BREAK
+				 * condition caused a read error, so just
+				 * invoke the HLT instruction and wait for the
+				 * operator to push the reset (or power)
+				 * button.
+				 */
+				printf("\nCannot read from the console, calling the HLT instruction.\n\n");
+				printf("RESET or power cycle the system to reboot.\n\n");
+				for(;;) {
+					x86_hlt();
+				}
+				/*NOTREACHED*/
 			}
+#ifdef DEBUG
+			else if (keyval == ASCII_XOFF || keyval == ASCII_XON) {
+				printf("(ignoring flow control char 0x%x)\n", keyval);
+				/* XXX even this could trigger another XOFF, sigh... */
+			}
+#endif
 		}
 		cnpollc(0);
 	}
 
+#ifdef DEBUG
+	if (keyval)
+		printf("(read key value 0x%x)\n\n", keyval);
+#endif
 	printf("rebooting...\n");
 	if (cpureset_delay > 0)
 		delay(cpureset_delay * 1000);
 	cpu_reset();
-	for(;;) ;
+	printf("cpu_reset() returned, waiting for hardware to reset or be reset...\n\n");
+
+	for(;;) {
+		x86_hlt();
+	}
 	/*NOTREACHED*/
 }
 


-- 
						Greg A. Woods
						Planix, Inc.

<woods@planix.com>     +1 416 489-5852 x122     http://www.planix.com/