List:       gpfsug-discuss
Subject:    Re: [gpfsug-discuss] NSD network checksums (nsdCksumTraditional)
From:       valdis.kletnieks () vt ! edu
Date:       2018-10-31 1:09:40
Message-ID: 122689.1540948180 () turing-police ! cc ! vt ! edu

On Tue, 30 Oct 2018 22:52:35 -0000, Bryan Banister said:
> Valdis will also recall how much "fun" we had with network related corruption
> due to what we surmised was a TCP offload engine FW defect in a certain 10GbE
> HCA.  Only happened sporadically every few weeks... what a nightmare that was!!

It makes for quite the bar story, as the symptoms pointed everywhere except
the network adapter.  For the purposes of this thread though, two points to note:

1) The card in question was a spectacularly good price/performer and totally
rock solid in 4 NFS servers that we had - in 6 years of trying, I never managed
to make them hiccup (the one suspected failure turned out to be a fiber cable
that had gotten crimped when the rack door was closed on a loop).

2) The TCP offload engine computed the checksum over whatever data it was about
to transmit, but it had gotten confused about which data that was. So every
single packet went out with a perfectly correct checksum, just over the wrong data.
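
To make point 2 concrete, here is a minimal Python sketch. It is not the
adapter's firmware and not how GPFS implements its NSD checksums; the function
names, the buffers, and the use of CRC32 are all assumptions made up for
illustration. It only shows why a checksum computed after the wrong data has
been picked cannot catch the error, while one computed before the data reaches
the adapter can.

```python
import zlib


def offload_send(buffers, index):
    """Hypothetical offload engine: checksums whatever buffer it picked."""
    payload = buffers[index]
    return payload, zlib.crc32(payload)


def transport_check(payload, checksum):
    """Receiver-side check of the checksum that travelled with the packet."""
    return zlib.crc32(payload) == checksum


intended = b"block 42 of file A"
stale = b"block 17 of file B"
buffers = [intended, stale]

# The sender computes its own checksum before handing data to the adapter.
app_checksum = zlib.crc32(intended)

# Simulated defect: the engine transmits buffer 1 instead of buffer 0.
wire_payload, wire_checksum = offload_send(buffers, 1)

# The packet-level checksum matches, because it was computed over the
# (wrong) data that actually went out on the wire.
transport_ok = transport_check(wire_payload, wire_checksum)

# The checksum computed above the adapter no longer matches.
end_to_end_ok = zlib.crc32(wire_payload) == app_checksum

print("transport checksum ok: ", transport_ok)
print("end-to-end checksum ok:", end_to_end_ok)
```

In this toy model the receiver sees a valid packet-level checksum and only the
sender-computed one exposes the mismatch, which is the general argument for
checksumming above the network adapter.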
