
List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR 8004124: Handle and/or warn about SI_KERNEL
From:       Mikael Gerdin <mikael.gerdin () oracle ! com>
Date:       2013-06-26 12:25:11
Message-ID: 51CADDA7.50505 () oracle ! com



On 2013-06-21 08:43, David Holmes wrote:
> On 21/06/2013 7:08 AM, Coleen Phillimore wrote:
>>
>> This wiki page refers to 64 bit addresses and we've mostly seen this
>> SI_KERNEL linux bug on 32 bit platforms.  I can't remember if we've ever
>> seen it on 64 bit platforms.  If I exclude LP64 would you review this
>> change?
>
> I don't understand what you mean by excluding LP64, the code that is
> added seems to be common to both 32-bit and 64-bit. ??
>
> I would have suggested that for 64-bit you add a check for the
> non-canonical address (it should be checkable -right?) just in case.

I've found no way to determine this from the signal info.
When an Intel general-protection exception (#GP) occurs, the CPU does not
pass the faulting address on to the OS.

The "normal" kind of fault you get when you try to read an unmapped
address is a page-fault exception (#PF). When a #PF occurs, the CPU
stores the faulting address in a special register (CR2) for the OS to
pick up.

See §6.15 in http://download.intel.com/products/processor/manual/325384.pdf
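
To make the distinction concrete, here is a minimal sketch (not the code
in the webrev) of what a Linux signal handler can and cannot learn from
si_code. The helper names are hypothetical, and the canonical-form check
assumes 48-bit virtual addresses (4-level paging). That check only helps
when the address is already known from somewhere else, because on the #GP
path there is no fault address for the kernel to report.

```cpp
#include <signal.h>   // SI_KERNEL, SEGV_MAPERR, SEGV_ACCERR (Linux/glibc)
#include <cstdint>

// Hypothetical helper: true if si_code indicates the #PF path, where the
// kernel recorded a fault address and si_addr is meaningful.
static bool segv_has_fault_address(int si_code) {
    return si_code == SEGV_MAPERR || si_code == SEGV_ACCERR;
}

// Hypothetical helper: true if the signal was raised with SI_KERNEL,
// which is what a #GP (for example a non-canonical access) looks like.
static bool segv_is_si_kernel(int si_code) {
    return si_code == SI_KERNEL;
}

// Assumption: 48-bit virtual addresses. An address is canonical when
// bits 63..47 are all zeros or all ones.
static bool is_canonical_48(std::uint64_t addr) {
    const std::uint64_t top = addr >> 47;  // the 17 bits 63..47
    return top == 0 || top == 0x1FFFF;
}
```

In a handler installed with SA_SIGINFO these would be called with
info->si_code. When segv_is_si_kernel() is true there is no address to
feed into is_canonical_48(), which is why the handler cannot tell a
non-canonical access apart from a genuine kernel problem.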

/Mikael

>
> Minor nit - I think you want a space after pc in the string:
>
> !               fatal(err_msg("exception happened outside interpreter,
> nmethods and vtable stubs at pc" INTPTR_FORMAT, pc));
>
> David
> -----
>
>   thanks,
>> Coleen
>>
>> On 06/20/2013 04:27 PM, Mikael Gerdin wrote:
>>> Coleen,
>>>
>>> On 06/20/2013 04:48 PM, Coleen Phillimore wrote:
>>>> Summary: Detect this crash in the signal handler and give a fatal error
>>>> message  instead of making us chase down bugs that don't reproduce
>>>>
>>>> This change also has more information for crash site from bug
>>>> https://jbs.oracle.com/bugs/browse/JDK-8007019
>>>>
>>>> guarantee(cb->is_adapter_blob() ||
>>>> cb->is_method_handles_adapter_blob())
>>>> failed: exception happened outside interpreter, nmethods and vtable
>>>> stubs (1) <https://jbs.oracle.com/bugs/browse/JDK-8007019>
>>>>
>>>> There used to be two places that had the same message so they were
>>>> qualified by (1) and (2).   The second one is gone.   Now this prints
>>>> the blob and pc.
>>>>
>>>> Tested with full vm.quick.testlist and the sets of jdi tests that
>>>> failed
>>>> with -client -Xcomp and specjvm98 that used to fail with this signal
>>>> code.   I got one failure two days ago before this change but now it
>>>> won't fail with my new message or at all.
>>>
>>> The error message you added for SI_KERNEL puts the blame
>>> unconditionally on the kernel.
>>> As I mentioned in the bug, it's possible to cause this signal
>>> combination by trying to access memory with an invalid address
>>> in non-canonical form:
>>> https://en.wikipedia.org/wiki/X86-64#Canonical_form_addresses
>>>
>>> (sorry for the wikipedia link, I don't have the Intel X86_64 manual
>>> page reference at hand)
>>>
>>> Basically, if we trash an object somewhere or the compiler does
>>> something strange, we may try to use a random value from memory as an
>>> address, and if that address is in non-canonical form we'll say that
>>> the OS is broken when in fact it is probably our fault.
>>>
>>> /Mikael
>>>
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8004124/
>>>> bug link at http://bugs.sun.com/view_bug.do?bug_id=8004124
>>>> local bug link https://jbs.oracle.com/bugs/browse/JDK-8004124
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>