[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR JDK-8195639: [Graal] nsk/jvmti/PopFrame/popframe009 fail with Graal in -Xcomp
From:       "serguei.spitsyn () oracle ! com" <serguei ! spitsyn () oracle ! com>
Date:       2018-11-28 21:52:06
Message-ID: 1a9f8c15-9fb3-a969-22fc-9529487ce931 () oracle ! com
[Download RAW message or body]

On 11/28/18 13:43, dean.long@oracle.com wrote:
> I guess even with -Xcomp, there are still compiles triggered by 
> profiled code, otherwise I don't see why changing policy()->event() 
> helps.
>
> Another problem I see with this change is that it will make the real 
> problem harder to reproduce.

True.
But on the other hand the tests become more stable. :)

Thanks,
Serguei

>
> dl
>
> On 11/28/18 12:36 PM, Alex Menkov wrote:
>> Hi guys,
>>
>> I was able to reproduce the issue only with Graal + -Xcomp and only 
>> with debug build (I tried Graal + -Xint, Graal without Xcomp/Xint, 
>> just -Xcomp (i.e. C1 is used)).
>>
>> --alex
>>
>> On 11/28/2018 01:27, serguei.spitsyn@oracle.com wrote:
>>> On 11/28/18 01:07, dean.long@oracle.com wrote:
>>>> OK, reconsidering solution 1, I have a couple more questions:
>>>>
>>>> The bug report says the problem is reproducible with -Xcomp, but I 
>>>> don't see the policy()->event() path taken with -Xcomp. Instead, 
>>>> -Xcomp uses CompilationPolicy::compile_if_required, so I don't see 
>>>> how Solution 1 fixes the problem with -Xcomp.
>>>>
>>>> > CompilerRuntime always calls policy()->event with CompLevel == 
>>>> CompLevel_aot.
>>>>
>>>> As far as I can tell, CompLevel_aot is only used when there is 
>>>> active AOT code.   Is this an AOT-specific problem?
>>>
>>> Tracing showed that the CompLevel_aot is always used for JVMCI which 
>>> is pretty confusing.
>>> Most likely, it is because now the AOT is enabled by default (is it 
>>> correct?).
>>> I'm not sure it actually has anything to do with the AOT mode.
>>>
>>>
>>>>    I would expect there to be a problem with c1 as well.
>>>
>>> Not sure about the c1.
>>> I hope, Alex could comment on this tomorrow.
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>> dl
>>>>
>>>> On 11/27/18 3:38 PM, serguei.spitsyn@oracle.com wrote:
>>>>> Good summary with a list of solutions below!
>>>>>
>>>>> On 11/27/18 2:50 PM, dean.long@oracle.com wrote:
>>>>>> Let me list the proposed solutions:
>>>>>>
>>>>>> 1. short-circuit compiles in CompilationPolicy::event in 
>>>>>> is_interp_only_mode
>>>>>> 2. detect is_interp_only_mode in resolve stubs and force a c2i call
>>>>>> 3. don't offer a JVMTI suspend point in resolve stubs (or 
>>>>>> JavaCallWrapper)
>>>>>>
>>>>>> Solution 1 penalizes other threads, and don't prevent other 
>>>>>> threads from compiling the method, so while it may help a little, 
>>>>>> it's incomplete.
>>>>>
>>>>> I doubt, the Solution 1 penalizes other threads.
>>>>> First, it is an important pretty frequent/common case that allows 
>>>>> to save on compilations.
>>>>> Second, there is no point to compile methods on a thread executed 
>>>>> in interp_only_mode.
>>>>> Also, there is no big advantage to compile methods even for other 
>>>>> threads
>>>>> when an active debugging (e.g. single stepping) is in progress.
>>>>> Third, this would make it consistent with what the the interpreter 
>>>>> is doing.
>>>>>
>>>>> So, I think, the Solution 1 is needed independently of other 
>>>>> solutions we take.
>>>>> My suggestion is to apply it in the JDK-8195639 bug fix.
>>>>>
>>>>>> Solutions 1 and 2 allow the thread to resume in a different 
>>>>>> method (the callee method), causing other problems.
>>>>>
>>>>> Agreed.
>>>>>
>>>>>> With Solution 3, the frame we suspend is the same as the frame we 
>>>>>> end up in after the resume, so I believe it solves all the problems.
>>>>>> IMHO this is the correct solution to pursue.
>>>>>
>>>>> I agree in general.
>>>>> This solution has some risks involved as it impacts a lot of code 
>>>>> including platform-independent.
>>>>> It is also going to fix the other bugs: JDK-8195635, JDK-8214093, 
>>>>> JDK-8207013 (not sure about all of them yet).
>>>>>
>>>>> My suggestion is to pursue it as the JDK-8195635 fix.
>>>>> Could you, please, confirm if works for you?
>>>>>
>>>>> Also, we have no consensus yet on the Solution 1.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> dl
>>>>>>
>>>>>> On 11/27/18 2:19 PM, serguei.spitsyn@oracle.com wrote:
>>>>>>> Hi Dean,
>>>>>>>
>>>>>>> You also asked the questions:
>>>>>>>
>>>>>>> > Doesn't is_interp_only_mode() only apply to the current thread?
>>>>>>>
>>>>>>> Yes it does.
>>>>>>>
>>>>>>>
>>>>>>> > I don't think it's safe to assume no compiles will happen in 
>>>>>>> other threads,
>>>>>>> > or that a method call target is not already compiled, because 
>>>>>>> as far
>>>>>>> > as I can tell, JVMTI only deoptimizes the active frames.
>>>>>>>
>>>>>>> I agree with it.
>>>>>>> However, compilations on the thread with the interp_only_mode 
>>>>>>> enabled is the most important case.
>>>>>>> Disabling compilations such thread improves testing stability.
>>>>>>>
>>>>>>> > The bug report describes a case where the caller has been 
>>>>>>> deoptimized
>>>>>>> > while resolving a call.   I don't see how this fix prevents the 
>>>>>>> target
>>>>>>> > from being a previously compiled method.
>>>>>>> > But we do need to make sure not to call into compiled code, so 
>>>>>>> I think
>>>>>>> > the fix needs to be in code like 
>>>>>>> SharedRuntime::resolve_static_call_C(),
>>>>>>> > where it returns get_c2i_entry() if is_interp_only_mode() is 
>>>>>>> true.
>>>>>>>
>>>>>>> Thank you for making this point.
>>>>>>>
>>>>>>> It would be nice to attack this as well.
>>>>>>> We can try to investigate this approach further.
>>>>>>> One problem is that there are more cases like 
>>>>>>> resolve_static_call_C,
>>>>>>> for instance, resolve_virtual_call_C and 
>>>>>>> resolve_opt_virtual_call_C.
>>>>>>> We may need to fix the same in other places, like 
>>>>>>> JavaCallWrapper::JavaCallWrapper.
>>>>>>>
>>>>>>> We can start from fixing it in the resolve_static_call_C.
>>>>>>> If it is successful then continue this effort for other cases as 
>>>>>>> well.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 11/27/18 1:01 PM, Alex Menkov wrote:
>>>>>>>> Hi Dean,
>>>>>>>>
>>>>>>>> Thank you for the feedback and for the comments in Jira.
>>>>>>>>
>>>>>>>> >> How does this solve the problem of
>>>>>>>> >> HotSpotJVMCIRuntime.adjustCompilationLevel being called?
>>>>>>>>
>>>>>>>> It doesn't. It fixes another issue.
>>>>>>>> The test suspend test thread and ensures that top frame is a 
>>>>>>>> "fibonachi" method. Then it turns SingleStep on, does PopFrame 
>>>>>>>> and resumes the thread.
>>>>>>>> The problem is the thread continues execution of compiled code 
>>>>>>>> (ignoring SingleStep) and we get SindleStep event only when 
>>>>>>>> adjustCompilationLevel method is called (it's interpreted code).
>>>>>>>>
>>>>>>>> There are several other bugs which are (most likely) caused by 
>>>>>>>> suspend during call setup (JDK-8195635, JDK-8214093, JDK-8207013)
>>>>>>>>
>>>>>>>> --alex
>>>>>>>>
>>>>>>>> On 11/27/2018 01:39, serguei.spitsyn@oracle.com wrote:
>>>>>>>>> Hi Dean,
>>>>>>>>>
>>>>>>>>> Thank you a lot for looking at this!
>>>>>>>>> Just a couple of points from me (it is up to Alex to provide a 
>>>>>>>>> full answer).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I think, Alex in this RFR missed to tell that we knew about 
>>>>>>>>> this issue that an incorrect frame will be popped.
>>>>>>>>> But after some discussion we decided to file a separate issue 
>>>>>>>>> on this.
>>>>>>>>> Alex wanted to create a good stand-alone test case 
>>>>>>>>> demonstrating this problem before filing it.
>>>>>>>>>
>>>>>>>>> Now, as I can see, the JDK-8195635 
>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8195635> looks very 
>>>>>>>>> close to a bug that we wanted to file.
>>>>>>>>> The only difference is that our scenario includes the 
>>>>>>>>> SharedRuntime::resolve_static_call_C instead of the 
>>>>>>>>> JavaCallWrapper::JavaCallWrapper as a helper for Java method 
>>>>>>>>> invocation.
>>>>>>>>> If I understand corrctly (Alex will fix me if needed), the 
>>>>>>>>> jtreg test we used to chase this issue did not catch a problem 
>>>>>>>>> with the HotSpotJVMCIRuntime.adjustCompilationLevel.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The suggested fix was to fix the mismatch in the 
>>>>>>>>> TieredThresholdPolicy::even with the comment in the 
>>>>>>>>> interpreter code:
>>>>>>>>> nmethod* 
>>>>>>>>> InterpreterRuntime::frequency_counter_overflow(JavaThread* 
>>>>>>>>> thread, address branch_bcp) {
>>>>>>>>>           . . .
>>>>>>>>>       if (nm != NULL && thread->is_interp_only_mode()) {
>>>>>>>>>           // Normally we never get an nm if is_interp_only_mode() 
>>>>>>>>> is true, because
>>>>>>>>>           // policy()->event has a check for this and won't compile 
>>>>>>>>> the method when
>>>>>>>>>           // true. However, it's possible for is_interp_only_mode() 
>>>>>>>>> to become true
>>>>>>>>>           // during the compilation. We don't want to return the nm 
>>>>>>>>> in that case
>>>>>>>>>           // because we want to continue to execute interpreted.
>>>>>>>>>           nm = NULL;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>>   > So I think the fix needs to be in code like 
>>>>>>>>> SharedRuntime::resolve_static_call_C(),
>>>>>>>>>   > where it returns get_c2i_entry() if is_interp_only_mode() 
>>>>>>>>> is true.
>>>>>>>>>
>>>>>>>>> I'm not sure, the adding this condition and returning the 
>>>>>>>>> get_c2i_entry() is always correct.
>>>>>>>>> We need some help from the Compiler team to make it right.
>>>>>>>>>
>>>>>>>>> BTW, the interp_only_mode has been enabled only when some 
>>>>>>>>> interp_only events are enabled.
>>>>>>>>> It is not set by either PopFrame or ForceEarlyReturn.
>>>>>>>>> But the popframe009 test enables single stepping, so we wanted 
>>>>>>>>> to make this part right.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/26/18 16:05, dean.long@oracle.com wrote:
>>>>>>>>>> How does this solve the problem of 
>>>>>>>>>> HotSpotJVMCIRuntime.adjustCompilationLevel being called?
>>>>>>>>>>
>>>>>>>>>> I don't think this fix is the right approach. Doesn't 
>>>>>>>>>> is_interp_only_mode() only apply to the current thread? I 
>>>>>>>>>> don't think it's safe to assume no compiles will happen in 
>>>>>>>>>> other threads, or that a method call target is not already 
>>>>>>>>>> compiled, because as far as I can tell, JVMTI only 
>>>>>>>>>> deoptimizes the active frames. The bug report describes a 
>>>>>>>>>> case where the caller has been deoptimized while resolving a 
>>>>>>>>>> call.   I don't see how this fix prevents the target from 
>>>>>>>>>> being a previously compiled method.   But we do need to make 
>>>>>>>>>> sure not to call into compiled code, so I think the fix needs 
>>>>>>>>>> to be in code like SharedRuntime::resolve_static_call_C(), 
>>>>>>>>>> where it returns get_c2i_entry() if is_interp_only_mode() is 
>>>>>>>>>> true. However, there is still another problem. By allowing 
>>>>>>>>>> JVMTI to suspend the thread during call setup, but reporting 
>>>>>>>>>> the frame as still in the caller instead of the callee, we 
>>>>>>>>>> confuse JVMTI into thinking that execution will resume in the 
>>>>>>>>>> caller instead of the callee. We may want to restrict where 
>>>>>>>>>> we offer JVMTI suspend points, and not offer a JVMTI suspend 
>>>>>>>>>> point in SharedRuntime::resolve_static_call_C and friends at 
>>>>>>>>>> all.
>>>>>>>>>>
>>>>>>>>>> dl
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11/26/18 11:14 AM, Alex Menkov wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Please review the fix for
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8195639
>>>>>>>>>>> webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~amenkov/popframe009/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Description:
>>>>>>>>>>> The test suspends a thread, turns on single stepping and 
>>>>>>>>>>> then calls PopFrame. SingleStep event is expected as soon as 
>>>>>>>>>>> the thread is resumed and PopFrame is processed (so we have 
>>>>>>>>>>> call stack with the depth 1 less than it was before 
>>>>>>>>>>> PopFrame). Instead SingleStep event is received with much 
>>>>>>>>>>> deeper call stack (and PopFrame processes wrong frame).
>>>>>>>>>>> Research shown that this is caused by missed deoptimization 
>>>>>>>>>>> of the current frame.
>>>>>>>>>>> As far as I understand CompilationPolicy::event should 
>>>>>>>>>>> handle the case when the thread has is_interp_only_mode() 
>>>>>>>>>>> set, but TieredThresholdPolicy::event checks this only then 
>>>>>>>>>>> CompLevel is CompLevel_none.
>>>>>>>>>>> CompilerRuntime always calls policy()->event with CompLevel 
>>>>>>>>>>> == CompLevel_aot.
>>>>>>>>>>>
>>>>>>>>>>> The fix looks quite simple, but I'd appreciate feedback from 
>>>>>>>>>>> runtime and compiler teams as I'm not sure I completely 
>>>>>>>>>>> understand all the details of the "PopFrame dance".
>>>>>>>>>>>
>>>>>>>>>>> --alex
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic