[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR: 8283179: SA tests fail with "ERROR: catch_mach_exception_raise: Message doesn't denote a Un
From:       Chris Plummer <cjplummer () openjdk ! java ! net>
Date:       2022-04-27 18:00:39
Message-ID: LsaM0Pe9yc_Yd91rGw3kp6m0Xf1AooTGL3SgUU-0rFA=.1a8c7482-9494-4d3e-8a44-d82e298ba863 () github ! com
[Download RAW message or body]

On Tue, 19 Apr 2022 04:01:16 GMT, Chris Plummer <cjplummer@openjdk.org> wrote:

> > During the SA attach process on macOS, SA installs an exception handler and \
> > expects to get an EXC_SOFTWARE exception for the expected SIGSTOP signal. On \
> > aarch64, sometimes it instead gets an EXC_BAD_INSTRUCTION exception. I found if I \
> > changed the code to just ignore this, it eventually gets the EXC_SOFTWARE \
> > exception and everything is fine. 
> > The EXC_BAD_INSTRUCTION is from the following code in the JIT and is expected \
> > behavior: 
> > 
> > bool NativeInstruction::is_sigill_zombie_not_entrant() {
> > return uint_at(0) == 0xd4bbd5a1; // dcps1 #0xdead
> > }
> > 
> > void NativeIllegalInstruction::insert(address code_pos) {
> > *(juint*)code_pos = 0xd4bbd5a1; // dcps1 #0xdead
> > }
> > 
> > void NativeJump::patch_verified_entry(address entry, address verified_entry, \
> >                 address dest) {
> > ...
> > if (Assembler::reachable_from_branch_at(verified_entry, dest)) {
> > ...
> > } else {
> > // We use an illegal instruction for marking a method as
> > // not_entrant or zombie.
> > NativeIllegalInstruction::insert(verified_entry);
> > } 
> > 
> > 
> > I fixed SA to retry each time it gets EXC_BAD_INSTRUCTION. I also did the same \
> > for EXC_BAD_ACCESS. We saw this once on x64, and it looks like it is for a \
> > trap-based NPE, given that the bad address that was provided as part of the \
> > exception data is for address 0x8. 
> > Although the fix is somewhat straight forward, the context in which it is run is \
> > not. SA does a `ptrace_attach()`. This seems to result in a SIGSTOP being \
> > generated. SA provides an exception handler called \
> > `catch_mach_exception_raise()`. It gets called for any pending exception when SA \
> > calls `mach_msg(&exc_msg.header, MACH_RCV_MSG, ...)` from `wait_for_exception()`. \
> > So `catch_mach_exception_raise()` should see the SIGSTOP, return KERN_SUCCESS \
> > (which means to ignore the exception), and then `mach_msg()` returns to \
> > `wait_for_exception()`, which checks for success of having received the SIGSTOP, \
> > and then allows SA to continue with its attach process (like suspending all \
> > threads). 
> > The details of attaching and exception handling are actually much uglier than \
> > that, and I don't pretend to even understand half of it. \
> > https://www.spaceflint.com/?p=150 is an interesting read if you are looking for \
> > more details.
> 
> Chris Plummer has updated the pull request incrementally with one additional commit \
> since the last revision: 
> Minor comment fix.

Thanks for the reviews Alex and David!

-------------

PR: https://git.openjdk.java.net/jdk/pull/8255


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic