
List:       e1000-devel
Subject:    Re: [E1000-devel] ixgbe and using iptables/SYNPROXY causes random system resets
From:       Christian Ruppert <idl0r () qasl ! de>
Date:       2015-07-01 9:56:45
Message-ID: bc122870a549a0ad7c9eebd7cb2835d2 () qasl ! de

On 2015-06-30 22:33, Christian Ruppert wrote:
> On 2015-06-30 21:20, Rustad, Mark D wrote:
>> Christian,
>> 
>>> On Jun 30, 2015, at 1:58 AM, Christian Ruppert <idl0r@qasl.de> wrote:
>>> 
>>> bad news. It didn't work either. :(
>> 
>> That is too bad.
>> 
>>> The system just did a reset tonight and there's nothing useful.
>>> What I did was:
>>> I removed the console= parameter and added the earlyprintk= you 
>>> mentioned instead.
>>> I verified it's working by redirecting an "h" to the sysrq-trigger, and 
>>> that's all I got:
>>> [  308.812492] SysRq : HELP : loglevel(0-9) reboot(b) crash(c) 
>>> terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) 
>>> thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) 
>>> show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) 
>>> show-registers(p) show-all-timers(q) unraw(r) sync(s) 
>>> show-task-states(t) unmount(u) show-blocked-tasks(w) 
>>> dump-ftrace-buffer(z)
>>> [4early console in decompress_kernel
>>> 
>>> Decompressing Linux... Parsing ELF... done.
>>> Booting the kernel.
>>> ...
>>> 
>>> So basically still nothing :/
>> 
>> Could you send the full log that was captured via the earlyprintk,
>> just in case I can notice something that is reported there.
> 
> See the attached log but it's basically just from the newly booted 
> kernel
> 
>> 
>>> One mentioned netconsole but I doubt it will be any better if even 
>>> console= or earlyprintk= didn't catch anything.
>> 
>> I agree. It is incredibly unlikely that netconsole can catch anything
>> that earlyprintk can't.
>> 
>>> Do you have any more ideas by chance?
>> 
>> One thing that comes to mind is that some systems will automatically
>> reset when any unrecoverable hardware error occurs. I have had systems
>> set up that way in the past and when such an error occurs, an
>> immediate reset is the result. Have you noticed any BIOS settings
>> related to that? If so, could you change them to SMIs or something? Or
>> is there a different instance of that hardware that you can run this
>> on?
> 
> See below
> 
>> 
>>> In my last mail I summarised our setup and I'm willing to provide as 
>>> much information as I can to get this solved but right now I have no 
>>> more ideas.
>> 
>> I think detailed information on your hardware and BIOS settings, along
>> with whatever log you do get via earlyprintk might help. It may be
>> possible that a software error could trigger an uncorrectable error,
>> but it isn't real common. It sure doesn't behave like a typical kernel
>> panic kind of issue. Oh, and do check any error log that your BIOS
>> might be holding for you.
> 
> We tried a Supermicro 5018D-MTF (E3-1281v3), a 5017C-MTF (E3-1220L IIRC)
> and a workstation PC (i5-4460) with an Asus mainboard (H97M-E), and
> it's the same everywhere. All systems have 32GB RAM; the two
> Supermicros even have ECC. And we only have issues in combination with
> the mentioned X520 NIC AND the SYNPROXY iptables extension.
> mcelog is empty. The 5018D-MTF event log has nothing either. I checked
> for watchdog-related settings in the BIOS, but that looked good so far.
> Also, causing a test kernel panic resulted in a proper dump as well as
> a valid kernel dump file. I can check the BIOS tomorrow and/or even
> take some pictures of each page/tab in case it might help.

So I've got some more. I attached a tarball that contains IPMI 
screenshots of every BIOS tab/page of one of those 5018D-MTFs. It also 
contains a dmesg as well as a very verbose "lspci -nnvvvxxxx".
By the way, did I mention that we're doing bonding/LACP? But that 
shouldn't matter, as we only have those issues with X520 NICs AND 
SYNPROXY. We tried several different setups (just 1GE NICs, a different 
mainboard, completely different hardware etc.) and it really seems to be 
related to those two parts.
Please let me know if you need any more information.

> 
> 
>> 
>> --
>> Mark Rustad, Networking Division, Intel Corporation

-- 
Regards,
Christian Ruppert



