[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-smp
Subject:    
From:       PATROL Quality Assurance Account <patqa1 () vml3 ! bmc ! com>
Date:       2002-06-19 5:09:15
[Download RAW message or body]

As you will see in the text below (dmesg.out), there is a request to send
this to you ladies and gentlemen.  If you figure anything out, or if I can
provide you with any further information, please email me:

pstalder@bmc.com

This is the architecture of the machine:

Dual PIII - 866
1024 MB RAM
ABIT VP6II Motherboard
2 - 40GB WD Caviar 7200 RPM Drives (Software RAID 0)
3com 905b ethernet card
MATROX Millenium graphics card
Creative Labs 52X cdrom
Sony 1.44 MB floppy

Pretty basic stuff, really.  The system crashes at a command prompt with no
X running, or will run for several days with 100% CPU usage, and over 100
applications running.  We had a power outage a few weeks ago, and I suspect
that has something to do with this particular problem.  The other 3 very rarely
go down (one of them has not crashed, yet).

I hope this helps you, and I hope you will be able to help me.  I have not
found a diagnostic tool yet that can figure this one out.

Paul Stalder

0000100000 (usable)
 BIOS-e820: 000000000000d000 @ 000000003fff3000 (ACPI data)
 BIOS-e820: 0000000000003000 @ 000000003fff0000 (ACPI NVS)
127MB HIGHMEM available.
hm, page 00001000 reserved twice.
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
found SMP MP-table at 000f5770
hm, page 000f5000 reserved twice.
hm, page 000f6000 reserved twice.
hm, page 000f7000 reserved twice.
hm, page 000f1000 reserved twice.
hm, page 000f2000 reserved twice.
hm, page 000f3000 reserved twice.
On node 0 totalpages: 262128
zone(0): 4096 pages.
zone DMA has max 32 cached pages.
zone(1): 225280 pages.
zone Normal has max 1024 cached pages.
zone(2): 32752 pages.
zone HighMem has max 255 cached pages.
Intel MultiProcessor Specification v1.1
    Virtual Wire compatibility mode.
OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
Processor #0 Pentium(tm) Pro APIC version 17
    Floating point unit present.
    Machine Exception supported.
    64 bit compare & exchange supported.
    Internal APIC present.
    SEP present.
    MTRR  present.
    PGE  present.
    MCA  present.
    CMOV  present.
    Bootup CPU
Processor #1 Pentium(tm) Pro APIC version 17
    Floating point unit present.
    Machine Exception supported.
    64 bit compare & exchange supported.
    Internal APIC present.
    SEP present.
    MTRR  present.
    PGE  present.
    MCA  present.
    CMOV  present.
Bus #0 is PCI   
Bus #1 is PCI   
Bus #2 is ISA   
I/O APIC #2 Version 17 at 0xFEC00000.
Int: type 3, pol 0, trig 0, bus 2, IRQ 00, APIC ID 2, APIC INT 00
Int: type 0, pol 0, trig 0, bus 2, IRQ 01, APIC ID 2, APIC INT 01
Int: type 0, pol 0, trig 0, bus 2, IRQ 00, APIC ID 2, APIC INT 02
Int: type 0, pol 0, trig 0, bus 2, IRQ 03, APIC ID 2, APIC INT 03
Int: type 0, pol 0, trig 0, bus 2, IRQ 04, APIC ID 2, APIC INT 04
Int: type 0, pol 0, trig 0, bus 2, IRQ 05, APIC ID 2, APIC INT 05
Int: type 0, pol 0, trig 0, bus 2, IRQ 06, APIC ID 2, APIC INT 06
Int: type 0, pol 0, trig 0, bus 2, IRQ 07, APIC ID 2, APIC INT 07
Int: type 0, pol 1, trig 1, bus 2, IRQ 08, APIC ID 2, APIC INT 08
Int: type 0, pol 0, trig 0, bus 2, IRQ 09, APIC ID 2, APIC INT 09
Int: type 0, pol 0, trig 0, bus 2, IRQ 0c, APIC ID 2, APIC INT 0c
Int: type 0, pol 0, trig 0, bus 2, IRQ 0d, APIC ID 2, APIC INT 0d
Int: type 0, pol 0, trig 0, bus 2, IRQ 0e, APIC ID 2, APIC INT 0e
Int: type 0, pol 0, trig 0, bus 2, IRQ 0f, APIC ID 2, APIC INT 0f
Int: type 0, pol 3, trig 3, bus 2, IRQ 0b, APIC ID 2, APIC INT 0b
Int: type 0, pol 3, trig 3, bus 2, IRQ 0a, APIC ID 2, APIC INT 0a
Lint: type 3, pol 0, trig 0, bus 2, IRQ 00, APIC ID ff, APIC LINT 00
Lint: type 1, pol 0, trig 0, bus 2, IRQ 00, APIC ID ff, APIC LINT 01
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
hm, page 01000000 reserved twice.
Kernel command line: auto BOOT_IMAGE=linux ro root=301 \
BOOT_FILE=/boot/vmlinuz-2.4.2-2smp Initializing CPU#0
Detected 865.248 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1723.59 BogoMIPS
Memory: 1027868k/1048512k available (1500k kernel code, 20256k reserved, 103k data, \
252k init, 131008k highmem) Dentry-cache hash table entries: 131072 (order: 8, \
1048576 bytes) Buffer-cache hash table entries: 65536 (order: 6, 262144 bytes)
Page-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
VFS: Diskquotas version dquot_6.5.0 initialized
CPU: Before vendor init, caps: 0387fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0387fbff 00000000 00000000 00000000
CPU serial number disabled.
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
CPU0: Intel Pentium III (Coppermine) stepping 03
per-CPU timeslice cutoff: 730.79 usecs.
Getting VERSION: 40011
Getting VERSION: 40011
Getting ID: 0
Getting ID: f000000
Getting LVT0: 700
Getting LVT1: 400
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
CPU present map: 3
Booting processor 1/1 eip 3000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Initializing CPU#1
CPU#1 (phys ID: 1) waiting for CALLOUT
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 1.
After Callout 1.
CALLIN, before setup_local_APIC().
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 1730.15 BogoMIPS
Stack at about c212dfb8
CPU: Before vendor init, caps: 0387fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#1.
CPU: After vendor init, caps: 0387fbff 00000000 00000000 00000000
CPU serial number disabled.
CPU: After generic, caps: 0383fbff 00000000 00000000 00000000
CPU: Common caps: 0383fbff 00000000 00000000 00000000
OK.
CPU1: Intel Pentium III (Coppermine) stepping 03
CPU has booted.
Before bogomips.
Total of 2 processors activated (3453.74 BogoMIPS).
Before bogocount - setting activated=1.
Boot done.
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 2 ... ok.
Synchronizing Arb IDs.
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not \
                connected.
..TIMER: vector=49 pin1=2 pin2=0
number of MP IRQ sources: 16.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.... register #01: 00178011
.......     : max redirection entries: 0017
.......     : IO APIC version: 0011
 WARNING: unexpected IO-APIC, please mail
          to linux-smp@vger.kernel.org
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:   
 00 000 00  1    0    0   0   0    0    0    00
 01 003 03  0    0    0   0   0    1    1    39
 02 003 03  0    0    0   0   0    1    1    31
 03 003 03  0    0    0   0   0    1    1    41
 04 003 03  0    0    0   0   0    1    1    49
 05 003 03  0    0    0   0   0    1    1    51
 06 003 03  0    0    0   0   0    1    1    59
 07 003 03  0    0    0   0   0    1    1    61
 08 003 03  0    0    0   0   0    1    1    69
 09 003 03  0    0    0   0   0    1    1    71
 0a 003 03  1    1    0   1   0    1    1    79
 0b 003 03  1    1    0   1   0    1    1    81
 0c 003 03  0    0    0   0   0    1    1    89
 0d 003 03  0    0    0   0   0    1    1    91
 0e 003 03  0    0    0   0   0    1    1    99
 0f 003 03  0    0    0   0   0    1    1    A1
 10 000 00  1    0    0   0   0    0    0    00
 11 000 00  1    0    0   0   0    0    0    00
 12 000 00  1    0    0   0   0    0    0    00
 13 000 00  1    0    0   0   0    0    0    00
 14 000 00  1    0    0   0   0    0    0    00
 15 000 00  1    0    0   0   0    0    0    00
 16 000 00  1    0    0   0   0    0    0    00
 17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
.................................... done.
calibrating APIC timer ...
..... CPU clock speed is 865.2644 MHz.
..... host bus clock speed is 133.1174 MHz.
cpu: 0, clocks: 1331174, slice: 443724
CPU0<T0:1331168,T1:887440,D:4,S:443724,C:1331174>
cpu: 1, clocks: 1331174, slice: 443724
CPU1<T0:1331168,T1:443712,D:8,S:443724,C:1331174>
checking TSC synchronization across CPUs: passed.
Setting commenced=1, go go go
mtrr: your CPUs had inconsistent variable MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfb3a0, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
Unknown bridge resource 0: assuming transparent
Unknown bridge resource 1: assuming transparent
Unknown bridge resource 2: assuming transparent
PCI: Using IRQ router VIA [1106/0686] at 00:07.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.14)
apm: disabled - APM is not SMP safe.
Starting kswapd v1.8
pty: 256 Unix98 ptys configured
block: queued sectors max/low 682541kB/551469kB, 2048 slots per queue
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
    ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:pio, hdd:pio
hda: WDC WD400BB-00AUA1, ATA DISK drive
hdc: WDC WD400BB-00AUA1, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 78165360 sectors (40021 MB) w/2048KiB Cache, CHS=4865/255/63, UDMA(33)
hdc: 78165360 sectors (40021 MB) w/2048KiB Cache, CHS=77545/16/63, UDMA(33)
Partition check:
 hda: hda1 hda2 < hda5 hda6 >
 hdc: [PTBL] [4865/255/63] hdc1 hdc2 < hdc5 >
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 248k freed
Serial driver version 5.02 (2000-08-09) with MANY_PORTS MULTIPORT SHARE_IRQ \
SERIAL_PCI ISAPNP enabled ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10d
md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
(read) hda5's sb offset: 36869056 [events: 00000013]
(read) hdc1's sb offset: 36869056 [events: 00000013]
autorun ...
considering hdc1 ...
  adding hdc1 ...
  adding hda5 ...
created md0
bind<hda5,1>
bind<hdc1,2>
running: <hdc1><hda5>
hdc1's event counter: 00000013
hda5's event counter: 00000013
request_module[md-personality-2]: Root fs not mounted
md.c: personality 2 is not loaded!
do_md_run() returned -22
md0 stopped.
unbind<hdc1,1>
export_rdev(hdc1)
unbind<hda5,0>
export_rdev(hda5)
... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem).
raid0 personality registered as nr 2
autodetecting RAID arrays
(read) hdc1's sb offset: 36869056 [events: 00000013]
(read) hda5's sb offset: 36869056 [events: 00000013]
autorun ...
considering hda5 ...
  adding hda5 ...
  adding hdc1 ...
created md0
bind<hdc1,1>
bind<hda5,2>
running: <hda5><hdc1>
hda5's event counter: 00000013
hdc1's event counter: 00000013
md0: max total readahead window set to 2048k
md0: 2 data-disks, max readahead per data-disk: 1024k
raid0: looking at hdc1
raid0:   comparing hdc1(36869056) with hdc1(36869056)
raid0:   END
raid0:   ==> UNIQUE
raid0: 1 zones
raid0: looking at hda5
raid0:   comparing hda5(36869056) with hdc1(36869056)
raid0:   EQUAL
raid0: FINAL 1 zones
zone 0
 checking hdc1 ... contained as device 0
  (36869056) is smallest!.
 checking hda5 ... contained as device 1
 zone->nb_dev: 2, size: 73738112
current zone offset: 36869056
done.
raid0 : md_size is 73738112 blocks.
raid0 : conf->smallest->size is 73738112 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: updating md0 RAID superblock on device
hda5 [events: 00000014](write) hda5's sb offset: 36869056
hdc1 [events: 00000014](write) hdc1's sb offset: 36869056
.
... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
change_root: old root has d_count=3
Trying to unmount old root ... okay
Freeing unused kernel memory: 252k freed
Adding Swap: 1542200k swap-space (priority -1)
Winbond Super-IO detection, now testing ports 3F0,370,250,4E,2E ...
SMSC Super-IO detection, now testing Ports 2F0, 370 ...
parport0: PC-style at 0x378, irq 7 [PCSPP,EPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport_pc: Via 686A parallel port: io=0x378, irq=7
ip_conntrack (8191 buckets, 65528 max)
3c59x.c:LK1.1.13 27 Jan 2001  Donald Becker and others. \
http://www.scyld.com/network/vortex.html See Documentation/networking/vortex.txt
eth0: 3Com PCI 3c905C Tornado at 0xec00,  00:01:03:1f:6a:c5, IRQ 11
  product code 484e rev 00.3 date 09-16-00
  8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
  MII transceiver found at address 24, status 782d.
  Enabling bus-master transmits and whole-frame receives.
eth0: scatter/gather disabled. h/w checksums enabled
eth0: using NWAY device table, not 8
eth0: using NWAY device table, not 8
APIC error on CPU1: 00(08)
APIC error on CPU0: 00(04)
/dev/vmmon: Module vmmon: registered with major=10 minor=165 tag=$Name: build-1790 $
/dev/vmmon: Module vmmon: initialized
Winbond Super-IO detection, now testing ports 3F0,370,250,4E,2E ...
SMSC Super-IO detection, now testing Ports 2F0, 370 ...
parport0: PC-style at 0x378, irq 7 [PCSPP,EPP]
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport_pc: Via 686A parallel port: io=0x378, irq=7
/dev/vmnet: open called by PID 688 (vmnet-bridge)
/dev/vmnet: hub 0 does not exist, allocating memory.
/dev/vmnet: port on hub 0 successfully opened
bridge-eth0: up
bridge-eth0: attached
/dev/vmnet: open called by PID 706 (vmnet-natd)
/dev/vmnet: hub 8 does not exist, allocating memory.
/dev/vmnet: port on hub 8 successfully opened
APIC error on CPU0: 04(01)
APIC error on CPU0: 01(04)
APIC error on CPU1: 08(02)
APIC error on CPU0: 04(04)
eth0: Too much work in interrupt, status e401.
APIC error on CPU0: 04(01)
APIC error on CPU1: 02(02)
-
To unsubscribe from this list: send the line "unsubscribe linux-smp" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic