
List:       linux-smp
Subject:    Myrinet 2000 on SuperMicro Dual P4 Redhat 7.1 (fwd)
From:       Grahame Jones <grahame () streamline-computing ! com>
Date:       2001-09-18 16:03:31


We are having problems with Myrinet LANai9 cards on a P4 system
with RH 7.1. Our hardware is as follows:

Motherboard : Supermicro P4DC6
Processors : 2 x 1.7 GHz P4
Memory : Rambus @ 400 MHz
Switch : M3S-SW16

You will notice an "unexpected IO-APIC" warning in the dmesg output, so
I am mailing you the messages given by the Myrinet cards along with
dmesg. We are having difficulty with these cards on this particular
motherboard: the application which tests the Myrinet card reports
"interface not responding (no pause interrupt?)". Is the unexpected
IO-APIC likely to be the cause of this problem?


Software

Linux Kernel : V2.4.5

We compile and load the module, and the link is shown as up on the switch. 
When we start the mapper we get the following error:

GM: NOTICE: dirvers/gm.c:271:gm_pause_lanai():kernel:
GM: LANai[0] interface not responding (no pause interrupt?)
GM:     isr=0xc16a30c8 eimr=0x40000000
GM:     sram[0] = 0x0 0x86e48326 0x180082f7 0xb03902e0
GM:     DMA len=4096 lar=0x39650 ear=0x0174a1000
GM: NOTICE: dirvers/gm.c:289:gm_pause_lanai():kernel:
GM:      at the moment, pause_rqst = 1

At this point the link light disappears.

The output of gm_debug, gm_board_info and dmesg is as follows:


DMA rate for 4096 Byte transfers (64bit / 66MHz bus)
1st: 8 pages from bogus sdma pg, 8 to bogus rdma
        bus_read  (send) = 146 MBytes/s
        bus_write (recv) = 315 MBytes/s

DMA rate for 4096 Byte transfers (64bit / 66MHz bus)
2nd: sdma a page, rdma a page 8 times
        bus_read  (send) = 146 MBytes/s
        bus_write (recv) = 292 MBytes/s

DMA rate for 4096 Byte transfers (64bit / 66MHz bus)
3rd: sdma and rdma to/from alternating pages (coarse grain)
        bus_read  (send) = 146 MBytes/s
        bus_write (recv) = 292 MBytes/s

DMA rate for 4096 Byte transfers (64bit / 66MHz bus)
4th: sdma and rdma to/from alternating pages (fine grain)
        bus_read  (send) = 146 MBytes/s
        bus_write (recv) = 292 MBytes/s

First two words of LANai globals: 0x10000000, 0xcafebabe
Some counters from LANai

netsend_cnt      0
netrecv_cnt      2
error counters
  bad_header_cnt                                      0
  bad_length_cnt                                      0
  bad_type_cnt                                        0
  badcrc_cnt                                          0
  badroute_cnt                                        0
  bogus_header_cnt                                    0
  drop_cnt                                            2
  handle_connection_reset_request_cnt                 0
  misrouted_cnt                                       0
  nack_cnt                                            0
  nack_down_cnt                                       0
  nack_ignore_close_connection_cnt                    0
  nack_ignore_open_connection_cnt                     0
  nack_ignored_cnt                                    0
  nack_normal_cnt                                     0
  nack_receive_close_connection_cnt                   0
  nack_receive_open_connection_cnt                    0
  nack_received_cnt                                   0
  nack_reject_cnt                                     0
  nack_send_close_connection_cnt                      0
  nack_send_nothing1_cnt                              0
  nack_send_nothing2_cnt                              0
  nack_send_open_connection_cnt                       0
  no_match_for_datagram_recv_cnt                      0
  no_match_for_ether_recv_cnt                         0
  no_match_for_reliable_recv_cnt                      0
  no_match_for_raw_recv_cnt                           0
  out_of_sequence_cnt                                 0
  resend_cnt                                          0
  short_mapper_config_packet_cnt                      0
  short_mapper_packet_cnt                             0
  short_mapper_scout_packet_cnt                       0
  short_packet_cnt                                    0
  used_bogus_send_cnt                                 0
  used_bogus_recv_cnt                                 0
  zero_len_cnt                                        2

       packet_sexno 0000 0000
       ack_sexno 0000 0000


port counters
  port[2].active_subport_cnt                          0
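
As a rough cross-check on the DMA rates reported above, they can be
compared with the nominal peak of a 64-bit / 66 MHz PCI bus. The sketch
below is illustrative only: the function names are made up for this
example, the peak figure ignores all PCI protocol overhead, and it
assumes "MBytes" means 10^6 bytes.

```python
def pci_peak_mbytes_per_s(width_bits, clock_mhz):
    """Nominal peak transfer rate: one bus-width transfer per clock (assumption)."""
    return (width_bits / 8) * clock_mhz


def bus_efficiency(measured_mbytes_per_s, width_bits=64, clock_mhz=66):
    """Fraction of the nominal peak reached by a measured DMA rate."""
    return measured_mbytes_per_s / pci_peak_mbytes_per_s(width_bits, clock_mhz)


if __name__ == "__main__":
    # Rates copied from the DMA test output above.
    measured = (
        ("bus_read  (send)", 146),
        ("bus_write (recv), 1st test", 315),
        ("bus_write (recv), later tests", 292),
    )
    for label, rate in measured:
        print(f"{label}: {rate} MBytes/s = {bus_efficiency(rate):.0%} of nominal peak")
```

On these figures the read (send) direction reaches well under a third
of the nominal peak, while writes reach a little over half. Whether
that is normal for this chipset cannot be judged from the output alone.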

The output of gm_board_info is as follows:
GM build ID is "1.4 root@bootserv.streamline Fri Aug 31 05:24:34 BST 2001."


Board number 0:
  lanai_clockval    = 0x082082a0
  lanai_cpu_version = 0x0900 (LANai9.0)
  lanai_board_id    = 00:60:dd:7f:69:7e
  lanai_sram_size   = 0x00200000 (2048K bytes)
  fpga_version      = "Thu Dec  9 16:13:40 1999"
  more_version      = ""
  max_lanai_speed   = 0x0086
  board_type        = 0x0003 (GM_MYRINET_BOARD_TYPE_L5+)
  bus_type          = 0x0002 (GM_MYRINET_BUS_PCI)
  product_code      = 0x006d
  serial_number     = 85801
    (should be labeled: "M3S-PCI64B-2-85801")
LANai time is 0x001de78a3 ticks, or about 0 minutes since reset.
This is node 0 (node04)  node_type=0
   *** Node ID not set, mapper not yet run?
Board has room for 8 ports,  3000 nodes/routes,  32768 cache entries
          Port token cnt: send=29, recv=248
Port: Status  PID
   0:   BUSY   969  (this process [gm_board_info])
Route table for this node follows:
The mapper 48-bit ID was: 00:00:00:00:00:00
gmID MAC Address                               Hostname Route
---- ----------------- -------------------------------- ---------------------
   *** No routes found ***

DMESG
=====

Linux version 2.4.5-nfs (root@node04) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #8 SMP Sun Sep 16 13:33:05 BST 2001
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000020000000 (usable)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
found SMP MP-table at 000f46d0
hm, page 000f4000 reserved twice.
hm, page 000f5000 reserved twice.
hm, page 000f1000 reserved twice.
hm, page 000f2000 reserved twice.
On node 0 totalpages: 131072
zone(0): 4096 pages.
zone(1): 126976 pages.
zone(2): 0 pages.
Intel MultiProcessor Specification v1.4
    Virtual Wire compatibility mode.
OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
Processor #0 Pentium 4(tm) APIC version 17
    Floating point unit present.
    Machine Exception supported.
    64 bit compare & exchange supported.
    Internal APIC present.
    SEP present.
    MTRR  present.
    PGE  present.
    MCA  present.
    CMOV  present.
    Bootup CPU
Processor #1 Pentium 4(tm) APIC version 17
    Floating point unit present.
    Machine Exception supported.
    64 bit compare & exchange supported.
    Internal APIC present.
    SEP present.
    MTRR  present.
    PGE  present.
    MCA  present.
    CMOV  present.
Bus #0 is PCI
Bus #1 is PCI
Bus #2 is PCI
Bus #3 is PCI
Bus #4 is PCI
Bus #5 is ISA

I/O APIC #2 Version 17 at 0xFEC00000.
Int: type 0, pol 3, trig 3, bus 1, IRQ 00, APIC ID 2, APIC INT 10
Int: type 3, pol 0, trig 0, bus 5, IRQ 00, APIC ID 2, APIC INT 00
Int: type 0, pol 0, trig 0, bus 5, IRQ 01, APIC ID 2, APIC INT 01
Int: type 0, pol 0, trig 0, bus 5, IRQ 00, APIC ID 2, APIC INT 02
Int: type 0, pol 0, trig 0, bus 5, IRQ 03, APIC ID 2, APIC INT 03
Int: type 0, pol 0, trig 0, bus 5, IRQ 04, APIC ID 2, APIC INT 04
Int: type 0, pol 0, trig 0, bus 5, IRQ 06, APIC ID 2, APIC INT 06
Int: type 0, pol 0, trig 0, bus 5, IRQ 07, APIC ID 2, APIC INT 07
Int: type 0, pol 1, trig 1, bus 5, IRQ 08, APIC ID 2, APIC INT 08
Int: type 0, pol 0, trig 0, bus 5, IRQ 09, APIC ID 2, APIC INT 09
Int: type 0, pol 0, trig 0, bus 5, IRQ 0a, APIC ID 2, APIC INT 0a
Int: type 0, pol 0, trig 0, bus 5, IRQ 0b, APIC ID 2, APIC INT 0b
Int: type 0, pol 0, trig 0, bus 5, IRQ 0c, APIC ID 2, APIC INT 0c
Int: type 0, pol 0, trig 0, bus 5, IRQ 0d, APIC ID 2, APIC INT 0d
Int: type 0, pol 0, trig 0, bus 5, IRQ 0e, APIC ID 2, APIC INT 0e
Int: type 0, pol 0, trig 0, bus 5, IRQ 0f, APIC ID 2, APIC INT 0f
Lint: type 3, pol 0, trig 0, bus 0, IRQ 00, APIC ID ff, APIC LINT 00
Lint: type 1, pol 0, trig 0, bus 0, IRQ 00, APIC ID ff, APIC LINT 01
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
Kernel command line: auto BOOT_IMAGE=linux ro root=303 BOOT_FILE=/boot/vmlinuz-2.4.5-nfs
Initializing CPU#0
Detected 1685.169 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3355.44 BogoMIPS
Memory: 512648k/524288k available (1397k kernel code, 11252k reserved, 480k data, 224k init, 0k highmem)
Dentry-cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
Buffer-cache hash table entries: 32768 (order: 5, 131072 bytes)
Page-cache hash table entries: 131072 (order: 7, 524288 bytes)
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
CPU:     After generic, caps: 3febfbff 00000000 00000000 00000000
CPU:             Common caps: 3febfbff 00000000 00000000 00000000
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 256K
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 3febfbff 00000000 00000000 00000000
CPU:     After generic, caps: 3febfbff 00000000 00000000 00000000
CPU:             Common caps: 3febfbff 00000000 00000000 00000000
CPU0: Intel(R) Xeon(TM) CPU 1700MHz stepping 0a
per-CPU timeslice cutoff: 731.20 usecs.
Getting VERSION: 50014
Getting VERSION: 50014
Getting ID: 0
Getting ID: f000000
Getting LVT0: 700
Getting LVT1: 400
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
CPU present map: 3
Booting processor 1/1 eip 2000
Setting warm reset code and vector.
1.
2.
3.
Asserting INIT.
Waiting for send to finish...
+Deasserting INIT.
Waiting for send to finish...
+#startup loops: 2.
Sending STARTUP #1.
After apic_write.
Initializing CPU#1
CPU#1 (phys ID: 1) waiting for CALLOUT
Startup point 1.
Waiting for send to finish...
+Sending STARTUP #2.
After apic_write.
Startup point 1.
Waiting for send to finish...
+After Startup.
Before Callout 1.
After Callout 1.
CALLIN, before setup_local_APIC().
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 3368.55 BogoMIPS
Stack at about c189dfbc
CPU: Before vendor init, caps: 3febfbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 12K, L1 D cache: 8K
CPU: L2 cache: 256K
CPU:     After generic, caps: 3febfbff 00000000 00000000 00000000
CPU:             Common caps: 3febfbff 00000000 00000000 00000000
OK.
CPU1: Intel(R) Xeon(TM) CPU 1700MHz stepping 0a
CPU has booted.
Before bogomips.
Total of 2 processors activated (6723.99 BogoMIPS).
Before bogocount - setting activated=1.
Boot done.
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 2 ... ok.
Synchronizing Arb IDs.
init IO_APIC IRQs
IO-APIC (apicid-pin) 2-0, 2-5, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=49 pin1=2 pin2=0
number of MP IRQ sources: 16.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.... register #01: 00178020
.......     : max redirection entries: 0017
.......     : IO APIC version: 0020
WARNING: unexpected IO-APIC, please mail
          to linux-smp@vger.kernel.org
WARNING: unexpected IO-APIC, please mail
          to linux-smp@vger.kernel.org
.... register #02: 00000000
.......     : arbitration: 00
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 000 00  1    0    0   0   0    0    0    00
01 003 03  0    0    0   0   0    1    1    39
02 003 03  0    0    0   0   0    1    1    31
03 003 03  0    0    0   0   0    1    1    41
04 003 03  0    0    0   0   0    1    1    49
05 000 00  1    0    0   0   0    0    0    00
06 003 03  0    0    0   0   0    1    1    51
07 003 03  0    0    0   0   0    1    1    59
08 003 03  0    0    0   0   0    1    1    61
09 003 03  0    0    0   0   0    1    1    69
0a 003 03  0    0    0   0   0    1    1    71
0b 003 03  0    0    0   0   0    1    1    79
0c 003 03  0    0    0   0   0    1    1    81
0d 003 03  0    0    0   0   0    1    1    89
0e 003 03  0    0    0   0   0    1    1    91
0f 003 03  0    0    0   0   0    1    1    99
10 003 03  1    1    0   1   0    1    1    A1
11 000 00  1    0    0   0   0    0    0    00
12 000 00  1    0    0   0   0    0    0    00
13 000 00  1    0    0   0   0    0    0    00
14 000 00  1    0    0   0   0    0    0    00
15 000 00  1    0    0   0   0    0    0    00
16 000 00  1    0    0   0   0    0    0    00
17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 2
IRQ1 -> 1
IRQ3 -> 3
IRQ4 -> 4
IRQ6 -> 6
IRQ7 -> 7
IRQ8 -> 8
IRQ9 -> 9
IRQ10 -> 10
IRQ11 -> 11
IRQ12 -> 12
IRQ13 -> 13
IRQ14 -> 14
IRQ15 -> 15
IRQ16 -> 16
.................................... done.
calibrating APIC timer ...
..... CPU clock speed is 1685.0421 MHz.
..... host bus clock speed is 99.1200 MHz.
cpu: 0, clocks: 991200, slice: 330400
CPU0<T0:991200,T1:660800,D:0,S:330400,C:991200>
cpu: 1, clocks: 991200, slice: 330400
CPU1<T0:991200,T1:330400,D:0,S:330400,C:991200>
checking TSC synchronization across CPUs: passed.
Setting commenced=1, go go go
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfb3e0, last bus=4
PCI: Using configuration type 1
PCI: Probing PCI hardware
Unknown bridge resource 0: assuming transparent
Unknown bridge resource 0: assuming transparent
Unknown bridge resource 0: assuming transparent
Unknown bridge resource 2: assuming transparent
PCI: Using IRQ router PIIX [8086/2440] at 00:1f.0
PCI->APIC IRQ transform: (B1,I0,P0) -> 16
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Starting kswapd v1.8
VFS: Diskquotas version dquot_6.4.0 initialized
pty: 256 Unix98 ptys configured
Toshiba System Managment Mode driver v1.9 22/3/2001
Serial driver version 5.05a (2001-03-20) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
Real Time Clock Driver v1.10d
block: queued sectors max/low 340656kB/209584kB, 1024 slots per queue
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller on PCI bus 00 dev f9
PIIX4: chipset revision 4
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
hda: FUJITSU MPF3204AH, ATA DISK drive
hdc: Pioneer DVD-ROM ATAPIModel DVD-106S 011, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 40032696 sectors (20497 MB) w/2048KiB Cache, CHS=2491/255/63, UDMA(66)
hdc: ATAPI 40X DVD-ROM drive, 256kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.12
Partition check:
hda: hda1 hda2 hda3 hda4
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
loop: loaded (max 8 devices)
linear personality registered
raid0 personality registered
raid1 personality registered
raid5 personality registered
raid5: measuring checksumming speed
   8regs     :  1656.800 MB/sec
   32regs    :  1100.400 MB/sec
   pIII_sse  :  2182.800 MB/sec
   pII_mmx   :  1910.400 MB/sec
   p5_mmx    :  1941.200 MB/sec
raid5: using function: pIII_sse (2182.800 MB/sec)
md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
autorun ...
... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 131072 bind 65536)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 224k freed
Adding Swap: 2096472k swap-space (priority -1)
ip_conntrack (4096 buckets, 32768 max)
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw@saw.sw.com.sg> and others
eth0: Intel Corporation 82557 [Ethernet Pro 100], 00:30:48:11:96:87, IRQ 5.
  Receiver lock-up bug exists -- enabling work-around.
  Board assembly 000000-000, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
  Receiver lock-up workaround activated.
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
eth0: 0 multicast blocks dropped.
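
For reference, the "register #01: 00178020" value in the dmesg output
above can be split into its fields. The sketch below assumes the
classic I/O APIC version-register layout (bits 0-7 version, bits 16-23
maximum redirection entry, remaining bits reserved). The function name
and dictionary keys are made up for this example.

```python
def decode_ioapic_reg01(value):
    """Split I/O APIC register #01 into fields (assumed classic layout)."""
    return {
        "version": value & 0xFF,                          # bits 0-7
        "reserved_8_15": (value >> 8) & 0xFF,             # bits 8-15
        "max_redirection_entries": (value >> 16) & 0xFF,  # bits 16-23
        "reserved_24_31": (value >> 24) & 0xFF,           # bits 24-31
    }


if __name__ == "__main__":
    # Value copied from the dmesg output above.
    for name, field in decode_ioapic_reg01(0x00178020).items():
        print(f"{name}: 0x{field:02x}")
```

The decoded version (0x20) and maximum redirection entry (0x17) agree
with what the kernel printed. Bit 15 is also set in a range treated
here as reserved. An unrecognised version or a non-zero reserved field
are plausible triggers for the warning, but that is an assumption and
is not confirmed by the output itself.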





