
List:       gentoo-sparc
Subject:    [Fwd: More Re: Bug - nfsroot fails with 2 NICs]
From:       Kumba <kumba () gentoo ! org>
Date:       2003-06-22 17:24:53

This is interesting; it might explain the dual-NIC problem some more.
Should this go out to the other arch MLs, in case they're running dual
NICs as described in this mail as well?

--Kumba

["More Re: Bug - nfsroot fails with 2 NICs" (message/rfc822)]

Date: Sun, 22 Jun 2003 17:53:47 +0100 (BST)
From: "C.Newport" <crn@netunix.com>
Subject: More Re: Bug - nfsroot fails with 2 NICs
In-reply-to: <Pine.LNX.4.33.0306192325590.2807-100000@hek.netunix.com>
Sender: sparclinux-owner@vger.kernel.org
To: sparclinux@vger.kernel.org
Cc: lkml@vger.kernel.org
Message-id: <Pine.LNX.4.33.0306221726530.5233-100000@hek.netunix.com>


[ Cc to lkml, please include sparclinux in replies ]

I have done some more tests on this one.
The failure occurs whenever there are 2 NICs which use the same
driver. There appears to be a problem in the NIC initialisation
which ends up with part of the initialisation happening on the
wrong NIC, most likely a race of some kind.
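
The pattern across the three test cases quoted further down can be
sketched as a small check. This is only an illustration, not anything
taken from the kernel: the function name, the data layout and the
eth0/eth1 interface names are assumptions, while the driver names
(sunhme, sunlance) are the ones reported for each machine.

```python
from collections import Counter


def drivers_with_multiple_nics(nics):
    """Return the drivers that are bound to more than one NIC.

    `nics` is a list of (interface_name, driver_name) pairs.
    The interface names are hypothetical; only the driver matters here.
    """
    counts = Counter(driver for _name, driver in nics)
    return sorted(driver for driver, n in counts.items() if n > 1)


# Hardware from the three test cases, reduced to (interface, driver) pairs.
test_cases = {
    "Ultra1, onboard hme + hme on card": [("eth0", "sunhme"), ("eth1", "sunhme")],
    "SS1000E, one sunlance per board": [("eth0", "sunlance"), ("eth1", "sunlance")],
    "SS20, sunlance + hme on card": [("eth0", "sunlance"), ("eth1", "sunhme")],
}

for machine, nics in test_cases.items():
    shared = drivers_with_multiple_nics(nics)
    status = "expect nfsroot failure" if shared else "expect normal boot"
    print(f"{machine}: {status} {shared}")
```

Only the first two machines have a driver shared by two NICs, which
matches the machines where the network boot failed.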

A quick search reveals that this is a generic issue rather than
being specific to Sparc. Others in alt.os.linux.slackware have
noted that (on Intel, where each NIC has its own MAC) each NIC gets
initialised with the MAC of the other NIC, which is WEIRD.

Until this gets fixed it will be necessary to remove additional
NICs by removing boards before booting from the network. This will
not be possible on some machines with 2 on-board NICs such as
the V100 and V150.

This bug appears to have crept into recent kernels within the last
year or so, between 2.2.20 and 2.2.25. Both 2.4.20 and 2.4.21 are
affected.

On Fri, 20 Jun 2003, C.Newport wrote:

>
> Something appears to be broken in the network initialisation code
> when booting a machine which has more than one NIC from the network.
>
>  =============
> Test case 1 :-
>
> Ultra1 Creator 3D with onboard hme plus second hme on fastwideSE+FE card
> 501-2739  512MB memory.
> boot net ip=rarp root=nfs
>  Loops continually with IP-Config: Incomplete network configuration.
>                         IP-Config: Reopening network devices...
>  This fault occurs on both the 10Mb hub and the 10/100 switched network.
>
> After removing fastwideSE+FE card this machine boots correctly
> with kernel 2.4.20 and mounts the NFS root.
>
>  =============
>
> Test case 2 :-
>
> SS1000E with 2 x SM81, 512MB, 1 x sunlance on each of 2 boards.
>  ( = 4 x SM81, 1G, 2 x sunlance total)
> boot net ip=rarp root=nfs
>  Looking up port of RPC 100003/2 on 192.168.192.24
>  neighbour table overflow
>  neighbour table overflow ........
>
> boot net ip=rarp root=/dev/sda1
>  boots kernel 2.2.25
>
> After removing the 2nd system board
> boot net ip=rarp root=nfs
>  boots kernel 2.2.25 correctly.
>
> This fault does not happen with kernel 2.2.20
>
>  =========
>
> Test case 3 :-
>
> SS20 with 1 x SM71 256Mb sunlance onboard +  hme on fastwideSE+FE card
>
> boot net ip=rarp root=nfs
>  boots correctly to kernel 2.2.25
>  boots correctly to kernel 2.2.20pre2
>
> It seems that sunlance+hme works OK  (at least with 2.2.25)
> If it would help I will build a 2.4.20 tftp image for this machine
> in the morning - it is 0 dark 30 here.

The fault does not occur in this case (with 2.4.20) because we
have different NICs, sunlance and sunhme.



