List: openjdk-serviceability-dev
Subject: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
From: "serguei.spitsyn () oracle ! com" <serguei ! spitsyn () oracle ! com>
Date: 2014-10-31 21:07:00
Message-ID: 5453F9F4.20309 () oracle ! com
This is the third round of review for:
https://bugs.openjdk.java.net/browse/JDK-6988950
New webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/
Summary
For the failing scenario, please refer to the first-round RFR below.
I've found what was missing in the jdwp agent shutdown and decided to
switch from a workaround to a real fix.
The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1.
The agent debugLoop_run() has a guard against the VM shutdown:
    } else if (gdata->vmDead &&
               ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
        /* Protect the VM from calls while dead.
         * VirtualMachine cmdSet quietly ignores some cmds
         * after VM death, so, it sends it's own errors.
         */
        outStream_setError(&out, JDWP_ERROR(VM_DEAD));
However, the guard above does not help if the VM_DEATH event
arrives in the middle of a command execution: the flag is checked
only once, before the command starts.
There is no synchronization between the two.
The fix introduces a new lock (vmDeathLock) that prevents command
execution and the VM_DEATH event callback from running concurrently.
It should work well for any function that is used in the implementation
of the JDWP_COMMAND_SET(VirtualMachine).
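
To illustrate the idea only (this is not the code in the webrev), here is a
minimal sketch of a lock that serializes command execution against the
VM_DEATH callback. It uses a pthread mutex as a stand-in for the agent's own
monitor primitives, and the names vm_death_lock, vm_dead, on_vm_death(),
run_command() and count_handler() are hypothetical. It also ignores the
exemption for the VirtualMachine command set shown in the guard above.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the agent state; not the real gdata layout. */
static pthread_mutex_t vm_death_lock = PTHREAD_MUTEX_INITIALIZER;
static bool vm_dead = false;

enum cmd_result { CMD_EXECUTED, CMD_REJECTED_VM_DEAD };

/* Stand-in for the VM_DEATH callback. It blocks until any command that
 * currently holds the lock has finished, then marks the VM as dead. */
void on_vm_death(void)
{
    pthread_mutex_lock(&vm_death_lock);
    vm_dead = true;
    pthread_mutex_unlock(&vm_death_lock);
}

/* Stand-in for one iteration of the command loop. The flag is checked
 * and the handler is run under the same lock, so VM death cannot be
 * recorded between the check and the end of the handler. */
enum cmd_result run_command(void (*handler)(void))
{
    enum cmd_result result;

    pthread_mutex_lock(&vm_death_lock);
    if (vm_dead) {
        result = CMD_REJECTED_VM_DEAD;   /* would map to JDWP_ERROR(VM_DEAD) */
    } else {
        handler();
        result = CMD_EXECUTED;
    }
    pthread_mutex_unlock(&vm_death_lock);
    return result;
}

/* Trivial example handler that only counts its invocations. */
static int handled_count = 0;
void count_handler(void)
{
    handled_count++;
}
```

Calling run_command(count_handler) before on_vm_death() runs the handler;
calling it afterwards returns CMD_REJECTED_VM_DEAD without touching the VM.
The cost of this design is that the VM_DEATH callback may wait for a
long-running command to finish.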
Testing:
Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
Thanks,
Serguei
On 10/29/14 6:05 PM, serguei.spitsyn@oracle.com wrote:
> The updated webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/
>
>
> The changes are:
> - added a comment recommended by Staffan
> - removed the ignore_wrong_phase() call from function classSignature()
>
> The classSignature() function is called in 16 places.
> Most of them do not tolerate a NULL in place of the returned signature
> and will crash.
> I'm not comfortable fixing all the occurrences now and suggest
> returning to this
> issue after gaining experience with more failure cases that are still
> expected.
> The failure involving classSignature() was observed only once
> in the nightly
> and should be extremely rare to reproduce.
> I'll file a placeholder bug if necessary.
>
> Thanks,
> Serguei
>
> On 10/28/14 6:11 PM, serguei.spitsyn@oracle.com wrote:
>> Please review the fix for:
>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>
>>
>> Open webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/
>>
>>
>>
>> Summary:
>>
>> The failing scenario:
>> The debugger and the debuggee are well aware that a VM shutdown has
>> been started in the target process.
>> The debugger at this point is not expected to send any commands
>> to the JDWP agent.
>> However, the JDI layer (debugger side) and the jdwp agent
>> (debuggee side)
>> are not in sync with the consumer layers.
>>
>> One reason is that the test debugger does not invoke the JDI
>> method VirtualMachine.dispose().
>> Another reason is that the debugger and the debuggee processes
>> are hard to synchronize in general.
>>
>> As a result, the following steps are possible:
>> - The test debugger sends a 'quit' command to the test debuggee
>> - The debuggee is exiting normally
>> - The jdwp backend reports (over the jdwp protocol) an
>> anonymous class unload event
>> - The JDI InternalEventHandler thread handles the
>> ClassUnloadEvent event
>> - The InternalEventHandler wants to uncache the matching
>> reference type.
>> If there is more than one class with the same host class
>> signature, it can't distinguish them,
>> and so deletes all references and retrieves them again
>> (see tracing below):
>> MY_TRACE: JDI:
>> VirtualMachineImpl.retrieveClassesBySignature:
>> sig=Ljava/lang/invoke/LambdaForm$DMH;
>> - The jdwp backend debugLoop_run() gets the command from JDI
>> and calls the functions
>> classesForSignature() and classStatus() recursively.
>> - The classStatus() function calls the JVMTI GetClassStatus()
>> and gets JVMTI_ERROR_WRONG_PHASE
>> - As a result, the jdwp backend reports the JVMTI error to
>> JDI, and so the test fails
>>
>> For details, see the analysis in the bug report that was closed as a
>> duplicate of bug 6988950:
>> https://bugs.openjdk.java.net/browse/JDK-8024865
>>
>> Some similar cases can be found in the two bug reports (6988950
>> and 8024865) describing this issue.
>>
>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error,
>> as it is normal during VM shutdown.
>> The original jdwp backend implementation had a similar approach
>> for the raw monitor functions.
>> They use ignore_vm_death() to work around the
>> JVMTI_ERROR_WRONG_PHASE errors.
>> For reference, please see the file: src/share/back/util.c
>>
>>
>> Testing:
>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>>
>>
>> Thanks,
>> Serguei
>>
>