[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with invalid bci or invalid scope 
From:       coleen.phillimore () oracle ! com
Date:       2017-02-23 20:09:31
Message-ID: 5dd24127-fbc3-99aa-1508-152515ea775e () oracle ! com
[Download RAW message or body]



On 2/23/17 5:34 AM, Markus Gronlund wrote:
> Hi again Coleen,
> 
> I looked into changing "noreg" to "rax" for the call to \
> SharedRuntime::OSR_migration_begin(), but this is not correct. 
> By explicitly passing rax, we are indicating the interest in receiving some kind of \
> "managed" return value (like an oop in this example). 
> MacroAssembler::call_VM_base() will proceed with storing back the \
> thread->vm_result() oop into rax, instead of leaving the platform specific return \
> value holding the natively allocated buffer. 
> This is because CONSTANT_REGISTER_DECLARATION(rax) == 0 entails \
> oop_result->is_valid() while CONSTANT_REGISTER_DECLARATION(noreg) == -1 entails \
> !oop_result->is_valid(). 
> Therefore I will keep the noreg in place as-is.

Okay, thanks for the explanation!  This is reviewed by me.
Coleen
> 
> Thanks
> Markus
> 
> -----Original Message-----
> From: Markus Gronlund
> Sent: den 23 februari 2017 00:55
> To: Coleen Phillimore; Daniel Daugherty
> Cc: hotspot-runtime-dev@openjdk.java.net
> Subject: RE: RFR(S): 8175178: Stack traversal during OSR migration asserts with \
> invalid bci or invalid scope desc on x86 
> Thanks Coleen,
> 
> I will also update the "noreg" to "rax" before putback.
> 
> Thanks again
> Markus
> 
> -----Original Message-----
> From: Coleen Phillimore
> Sent: den 22 februari 2017 22:21
> To: hotspot-runtime-dev@openjdk.java.net
> Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration asserts with \
> invalid bci or invalid scope desc on x86 
> 
> This change looks good to me.  Although I think this
> 
> call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));
> 
> 
> Should be:
> 
> call_VM(rax, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin));
> 
> 
> Since this returns the OSR buffer in rax.
> 
> 
> On 2/19/17 8:07 AM, Markus Gronlund wrote:
> > Thanks a lot Dan for taking a look at this.
> > 
> > The "failure mode" in a release build is that any unresolved bci will be \
> > unconditionally re-set to 0 (see \
> > vframeStreamCommon::fill_from_interpreter_frame()). The effect of this is that \
> > stacktraces are currently reporting erroneous bci information, especially in the \
> > context of OSR frames, which would normally have a bci > 0 if the OSR is \
> > triggered by backedge counter overflow. 
> > Thanks also for pointing to the other bytecode restore location (@L2213).
> > 
> > I have updated the webrev with your inputs in light of the other redundant \
> > bytecode restore in addition to updates to the copyright header(s). 
> > In addition, I took the opportunity to turn \
> > vframeStreamCommon::found_bad_method_frame() into a DEBUG_ONLY conditional \
> > function. 
> > Updated webrev:
> > 
> > http://cr.openjdk.java.net/~mgronlun/8175178/webrev02/
> > 
> > Regarding:
> > 
> > "It feels like we're missing some infrastructure to prevent the accidental use of \
> > 'r13'."
> We use r13 (and rlocals) as a scratch register in lots of places due to not having \
> enough registers on x86.  As long as there isn't a safepoint (or call into the \
> runtime as you found), this should be perfectly safe, as long as there's a \
> restore_bcp() or restore_locals() call. 
> Coleen
> 
> > I agree with you, lets see what we can do to improve this.
> > 
> > Thanks again
> > Markus
> > 
> > 
> > -----Original Message-----
> > From: Daniel D. Daugherty
> > Sent: den 17 februari 2017 18:51
> > To: Markus Gronlund; hotspot-runtime-dev@openjdk.java.net
> > Subject: Re: RFR(S): 8175178: Stack traversal during OSR migration
> > asserts with invalid bci or invalid scope desc on x86
> > 
> > On 2/17/17 5:29 AM, Markus Gronlund wrote:
> > > Greetings,
> > > 
> > > 
> > > 
> > > Kindly asking for reviews for the following changeset:
> > > 
> > > 
> > > 
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8175178
> > > 
> > > Webrev: http://cr.openjdk.java.net/~mgronlun/8175178/webrev01/
> > src/cpu/x86/vm/templateTable_x86.cpp
> > (old) L2228:       __ load_unsigned_byte(rbx, Address(rbcp, 0));
> > // restore target bytecode
> > I was a little concerned about not restoring the target
> > bytecode into the 'rbx' register, but when I looked at
> > dispatch code:
> > 
> > L2196:     __ bind(dispatch);
> > L2197:   }
> > L2198:
> > L2199:   // Pre-load the next target bytecode into rbx
> > L2200:   __ load_unsigned_byte(rbx, Address(rbcp, 0));
> > 
> > so it looks like rbx gets the target bytecode just fine.
> > 
> > L2213:       __ load_unsigned_byte(rbx, Address(rbcp, 0));  //
> > restore target bytecode
> > L2214:       __ set_method_data_pointer_for_bcp();
> > L2215:       __ jmp(dispatch);
> > I think the "restore target bytecode" here is also redundant.
> > 
> > src/cpu/x86/vm/interp_masm_x86.cpp:
> > 
> > InterpreterMacroAssembler::set_method_data_pointer_for_bcp() {
> > <snip>
> > 
> > push(rbx);
> > 
> > get_method(rbx);
> > 
> > src/cpu/x86/vm/interp_masm_x86.hpp:
> > 
> > void get_method(Register reg) {
> > movptr(reg, Address(rbp,
> > frame::interpreter_frame_method_offset * wordSize));
> > }
> > 
> > set_method_data_pointer_for_bcp() doesn't use the rbx
> > value that we bothered to restore.
> > 
> > OK, so on 64-bit the code saved the nmethod before the call to
> > SharedRuntime::OSR_migration_begin() and it was using 'r13'
> > which is the register we use for 'bcp' in 64-bit. This use of r13/bcp was visible \
> > to stack walkers (because of save_bcp()) and caused the assert() failure. What's \
> > the failure mode in release bits? 
> > This is outstanding sleuthing!
> > 
> > You've switched the save to use 'rbx' on both 64-bit and 32-bit and you've \
> > removed stale code that was using 'rbx' for saving the target bytecode \
> > unnecessarily. 
> > Thumbs up on this change!
> > 
> > Please don't forget to update the copyright before you push.
> > 
> > Dan
> > 
> > P.S.
> > It feels like we're missing some infrastructure to prevent the accidental use of \
> > 'r13'. We have other "special" registers that we guard against being used... \
> > Perhaps we need something for 'r13'. 
> > 
> > > 
> > > 
> > > Summary:
> > > 
> > > 
> > > 
> > > vframeStream stack traversal can assert on x86 when trying to decode an \
> > > interpreter frame that is in the process of being migrated for On-Stack \
> > > Replacement (OSR). This is because the interpreter frame does not have a valid \
> > > bcp, it instead has an nmethod in the bcp slot (the OSR nmethod). 
> > > #
> > > # A fatal error has been detected by the Java Runtime Environment:
> > > #
> > > # Internal Error
> > > (distro/s/hotspot/src/share/vm/runtime/vframe.cpp:472), pid=3624,
> > > tid=3640 # assert(false) failed: invalid bci or invalid scope desc #
> > > 
> > > There is currently a save operation that uses r13 for saving the OSR nmethod \
> > > over the VM call into SharedRuntime::OSR_migration_begin(). This has the \
> > > side-effect of installing the OSR nmethod into the interpreter frame bcp slot. 
> > > 
> > > 
> > > Thanks
> > > 
> > > Markus


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic