
List:       linux-usb-devel
Subject:    2,2,19: Non-broadcast RX packets getting junked on Linksys USB100TX.... Pegasus & OHCI on Cyrix Medi
From:       Chris Worley <cworley () symbionsys ! com>
Date:       2001-04-26 20:02:19

Here we go again.  It's been a while since I tried a MediaGX chipset with a 
Linksys USB ethernet... still having problems.

Today I'm using the 2.2.19 kernel and having the same problem reported on 
11/03/2000 in kernels 2.4.0-test10-pre??? and fixed in 2.4.0-test11-pre1 
(pegasus.c version 0.4.16).

The problem was strange and seemingly unrelated to the fix: the interface would 
appropriately configure without a hint of errors, but could only respond to arp 
requests and broadcast pings.

The solution I found was to put the "set_intr_interval" function back in, which 
re-enabled eeprom writes.

Petkan found that the problem actually required that the adapter be reset before 
the interrupt interval could be fetched from the adapter.

All relevant email is attached...

Chris


The initial problem report:

Chris Worley wrote:

>...
> Fri, 03 Nov 2000 15:18:07 -0700
> Chris Worley wrote:
> 
>> Without this patch, my linksys USB100TX seems to work fine, yet does
>> absolutely nothing but increment the TX and RX counters (i.e., try to
>> ping out from within, or vice-versa, causes ifconfig's RX & TX
>> counters to increment, but 100% packet loss from ping).  As if the
>> packets are garbled or a bad route (the routes are correct).  Note: no
>> net errors are reported.
> 
> 
> Update:  It looks like non-broadcast received packets are bad.
> 
> Other machines requesting ARP info works (they "who has" request with
> a broadcast to the network, and the machine with the Linksys USB100TX
> replies back with its proper MAC address).  If other machines
> broadcast ping (ping 255.255.255.255), the machine with the Linksys
> USB100TX replies.  Other than that, no packets are processed.
> 
> Running tcpdump on the machine with the Linksys USB100TX shows IPX
> packets, arp requests, and broadcast pings being received, but nothing
> else.  Even though I instrumented the driver to printk at every
> received packet.  Lots of packets were being received and passed up to
> the kernel, but none made their way to tcpdump... as if the destination
> address is incorrect, and they're discarded, unless they're a
> broadcast packet.
> 
> TX packets work: the arp return and ping reply are addressed to the
> requestor, and they're getting the correct reply (in the case of arp,
> arp on the requesting machine knows the proper mac address of the
> machine with the Linksys USB100TX, in the broadcast ping, the
> requesting machine shows the correct IP address of the machine with
> the Linksys USB100TX that responded).
> 
> I've tried comparing the test9 and test10 pegasus driver... trying to
> take educated guesses at what to back out.  No luck.  The problem
> persists.
> 
> Chris
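
A minimal sketch of the symptom pattern described above, in case it helps to picture it: broadcast frames are accepted, and unicast frames are accepted only when the destination address matches. This illustrates the "as if the destination address is incorrect" guess only. It is not taken from pegasus.c or the kernel, and the names classify_dest and rx_verdict are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical verdicts for a received Ethernet frame. */
enum rx_verdict { RX_BROADCAST, RX_FOR_US, RX_DROPPED };

/* Classify a frame by its 6-byte destination address, roughly the way a
 * receiver that is not in promiscuous mode would filter it. */
enum rx_verdict classify_dest(const uint8_t dest[6], const uint8_t our_mac[6])
{
    static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    assert(dest != NULL && our_mac != NULL);

    if (memcmp(dest, bcast, 6) == 0)
        return RX_BROADCAST;   /* ARP "who has", broadcast ping */
    if (memcmp(dest, our_mac, 6) == 0)
        return RX_FOR_US;      /* ordinary unicast traffic */
    return RX_DROPPED;         /* address mismatch: silently discarded */
}
```

If the receiving side's idea of the address were off by even one byte, every unicast frame would land in the RX_DROPPED case while ARP requests and broadcast pings would still get through, which matches what was observed. Whether that is the actual cause is not established in this thread.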

Trying to fix the problem by successively applying pieces of patches:
Chris Worley wrote:
 > Chris Worley wrote:
 >
 >> The 0.4.13 dated 10/9 exhibits the problem if it's loaded after
 >> rebooting, but, it works if 0.4.11 is loaded, ifconfig'd,
 >> un-ifconfig'd, unloaded, then 0.4.13 is loaded and ifconfig'd.
 >
 >
 > I was able to successively install differences between 0.4.13 and
 > 0.4.11 pegasus.c versions to find why my Linksys, SMC, and Accton
 > pegasus based USB adapters quit working with kernel 2.4.0-test10.
 >
 > The call to "set_intr_interval" had been removed (WARNING: Petkan, I
 > recall, once told me he was removing eeprom writes because they were
 > burning up his adapters).  When I put the call back in (and set the
 > flag to allow eeprom writes), I was able to get the pegasus.c in
 > 2.4.0-test10 working (Note: I tested this from power-off to be sure
 > I'd found the culprit).
 >
 > I don't have a clue as to why this patch works.  The functional
 > difference is that it reads the interrupt interval from the pegasus
 > chip earlier than the 0.4.13 version had, then, it writes the read
 > interval back to the eeprom and sets an "EPROM_LOAD" flag for
 > 10000usecs.  Not knowing the semantics of this chip, I'm guessing
 > that's the proper way to write the eeprom.  Why we're writing what we
 > just read confuses me further.
 >
 > My guess is, the problem is probably related to the Cyrix MediaGX OHCI
 > implementation (on the CX5530 I/O chip), and the EEPROM write to the
 > pegasus chip somehow unscrambles the MediaGX's brains.
 >
 > Any other ideas on why this patch works?
 >
 > How dangerous is the EEPROM write?
 >
 > Given Petkan's earlier warning, I doubt anybody should try this patch
 > ;)
 >
 > Chris
 >
 >
 > ------------------------------------------------------------------------
 >
 > *** drivers/usb/pegasus.c.bak	Fri Nov  3 00:57:13 2000
 > --- drivers/usb/pegasus.c	Wed Nov  8 21:23:10 2000
 > ***************
 > *** 52,58 ****
 >
 > static const char *version = __FILE__ ": v0.4.13 2000/10/13 (C) 1999-2000 Petko Manolov (petkan@dce.bg)";
 >
 > !
 > #define	PEGASUS_USE_INTR
 >
 >
 > --- 52,58 ----
 >
 > static const char *version = __FILE__ ": v0.4.13 2000/10/13 (C) 1999-2000 Petko Manolov (petkan@dce.bg)";
 >
 > ! #define PEGASUS_WRITE_EEPROM
 > #define	PEGASUS_USE_INTR
 >
 >
 > ***************
 > *** 483,488 ****
 > --- 483,502 ----
 > warn( __FUNCTION__ " failed" );
 > return	-1;
 > }
 > +
 > + static void set_intr_interval( pegasus_t *pegasus )
 > + {
 > +   __u16	d;				
 > +   __u8	tmp;
 > +
 > +   get_interrupt_interval( pegasus );
 > +   ((__u8 *)&d)[1] = pegasus->intr_interval;
 > +   write_eprom_word( pegasus, 4, d );
 > +   get_registers( pegasus, EthCtrl2, 1, &tmp );
 > +   set_register( pegasus, EthCtrl2, tmp | EPROM_LOAD );
 > +   udelay( 10000 );
 > +   set_register( pegasus, EthCtrl2, tmp );
 > + }
 > #endif	/* PEGASUS_WRITE_EEPROM */
 >
 > static inline void get_node_id( pegasus_t *pegasus, __u8 *id )
 > ***************
 > *** 966,971 ****
 > --- 980,989 ----
 > warn( "can't locate MII phy, using default" );
 > pegasus->phy = 1;
 > }
 > +
 > + #ifdef	PEGASUS_WRITE_EEPROM
 > + 	set_intr_interval( pegasus );
 > + #endif
 >
 > info( "%s: %s", net->name, usb_dev_id[dev_indx].name );
 >
 > pegasus.c-0.4.13.diff

Petkan's reply to the fix:

Petko Manolov wrote:
 > Chris Worley wrote:
 >
 >> The call to "set_intr_interval" had been removed (WARNING: Petkan, I
 >> recall, once told me he was removing eeprom writes because they were
 >> burning up his adapters).  When I put the call back in (and set the
 >> flag to allow eeprom writes), I was able to get the pegasus.c in
 >> 2.4.0-test10 working (Note: I tested this from power-off to be sure
 >> I'd found the culprit).
 >>
 >> I don't have a clue as to why this patch works.  The functional
 >> difference is that it reads the interrupt interval from the pegasus
 >> chip earlier than the 0.4.13 version had, then, it writes the read
 >> interval back to the eeprom and sets an "EPROM_LOAD" flag for
 >> 10000usecs.  Not knowing the semantics of this chip, I'm guessing
 >> that's the proper way to write the eeprom.  Why we're writing what we
 >> just read confuses me further.
 >>
 >> My guess is, the problem is probably related to the Cyrix MediaGX OHCI
 >> implementation (on the CX5530 I/O chip), and the EEPROM write to the
 >> pegasus chip somehow unscrambles the MediaGX's brains.
 >
 >
 >
 > Nobody knows why most of the Pegasus problems are with this chip (Cyrix
 > MediaGX - Cyrille has the same), but this becomes symptomatic.
 >
 > Anyway, I don't see why writing to the Pegasus' MII register would kick
 > the HC's brain. ADMtek has a bug in the eeprom write procedure described
 > in their data sheet. It took me a long time first to catch it and then
 > to work around it.
 >
 > The other thing I noticed is that in 0.4.11 I set the interrupt interval
 > at the end of the probe() routine. This is wrong, as you have to reset
 > the adapter in order for the already written value to be used. Now it is
 > moved to just before reset_mac().
 >
 > And lastly: why the hell should moving get_interrupt_interval() from
 > probe() to open() make any difference, when 0.4.13 does no write to the
 > eeprom?!?
 >
 > You can test this patch which is against test11-pre1.
 >
 >
 >
 >> Any other ideas on why this patch works?
 >>
 >> How dangerous is the EEPROM write?
 >
 >
 >
 > It is still not proven that the burning was caused by eeprom writes, even
 > though the symptoms point to a broken eeprom. It happened at a time when
 > big changes were being made to the uhci drivers (IIRC around test-9).
 >
 >
 > Petkan
 >
 >


