[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-serviceability-dev
Subject: Re: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: fals
From: "serguei.spitsyn () oracle ! com" <serguei ! spitsyn () oracle ! com>
Date: 2018-07-31 8:32:34
Message-ID: f0a3509a-8c37-7d8a-a6f1-d3a73c85b336 () oracle ! com
[Download RAW message or body]
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 7/30/18 09:46, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com">
<div class="moz-cite-prefix">On 7/30/18 12:47 AM, <a
class="moz-txt-link-abbreviated"
href="mailto:serguei.spitsyn@oracle.com"
moz-do-not-send="true">serguei.spitsyn@oracle.com</a> wrote:<br>
</div>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com">
<div class="moz-cite-prefix">Hi Chris,<br>
<br>
Just one quick simple comment below.<br>
<br>
On 7/29/18 22:05, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
I updated my changes to do the wait() for step, stepi, cont,
and next. Some issues turned up when testing. A number of
tests were timing out occasionally, I believe because
sometimes the event was handled (and the notify() call
already completed) before we ever got to the handler.wait()
call.</div>
</blockquote>
<br>
I agree as I had this exact concern when was looking at the
webrev.0.<br>
Need more time to understand all the details.<br>
</blockquote>
Hopefully it's a moot point since my 2nd version doesn't rely on
any synchronization.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
I think a properly working synchronized version would require a
large synchronized block in TTY::executeCommand() to block out any
event handling until the prompt has been printed. I think this is
safe even when you have a monitor command in place. Although that
results in TTY::executeCommand() being called from the event
handler, and it may temporarily block if the command thread is
already in TTY::executeCommand(), eventually the command thread
will exit the synchronized block allowing the event thread to make
progress. Having said that, I just assume stick with the simpler
webrev above if no one see issues with it.<br>
</blockquote>
<br>
Thank you for this explanation.<br>
It looks reasonable but I feel more meditation is needed.<br>
<br>
Thanks,<br>
Serguei<br>
<br>
<blockquote type="cite"
cite="mid:8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com">
thanks,<br>
<br>
Chris<br>
<blockquote type="cite"
cite="mid:973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com"> <br>
Thanks,<br>
Serguei<br>
<br>
<br>
<blockquote type="cite"
cite="mid:352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com">
<div class="moz-cite-prefix"> Although I never confirmed this
was the cause, it certainly seems feasible.<br>
<br>
There were also issues with the prompt printed by
TTY:executeCommand(). I've come to realize that it should
really always be printed before any event handling.
Currently that seems to be the case, probably due to the
Thread.yield() I mention below in event handling code like
stepEvent(), although I don't think it's necessarily
guaranteed. However, with my changes prompt was always
printed after the event was done being handled, and this
proved to be a problem for some tests.<br>
<br>
When executing the "cont" command, it seems this prompt is
expected to be "> ", which is the prompt printed if the
VM is not currently suspended. Some tests count on this,
mostly because they call jdb.receiveReplyFor(), and pass in
the number of prompts that are expected.
jdb.receiveReplyFor() does not count "> ". My change
ended up making the printPrompt() in TTY:executeCommand()
print as "main[1]" after a "cont" command, because the added
synchronizing was making it not print until the thread was
suspended. This caused some tests to fail because the
expected reply was not yet received, even after
jdb.receiveReplyFor() had returned (it was basically
returning one prompt too soon).<br>
<br>
I started thinking that the synchronized approach probably
needs a large synchronized block in TTY:executeCommand().
But then I figured out a much easier solution. Thinking
more about the "prompt after cont" issue, it occurred to me
that maybe we just need to print out the prompt before
executing the command, so it's not possible for events to
come in before the prompt is printed (or worse yet, in the
middle of event handling output). Doing this ends up looking
much like your original suggestion to set printPrompt =
false for "cont", "step", "stepi", and "next", except I also
added a call printPrompt() before doing this (Dan an I had
rejected your proposal to simply remove the prompt, since it
did server a purpose). For example:<br>
<br>
} else if (cmd.equals("cont")) {<br>
MessageOutput.printPrompt();<br>
showPrompt = false;<br>
evaluator.commandCont();<br>
<br>
But this does cause the issue I brought up above that the
prompt from "cont" is expected (by a number of tests) to be
"> ", but since the above code is executed while the vm
is suspended, it ends up being main[1]. I fixed this by
adding a "simple" argument to printPrompt() so we can force
it to be "> " when needed. This will maintain
compatibility with anyone making assumptions about what the
prompt should look like.<br>
<br>
Here's the webrev, which seems to be working. I've run about
1000 iterations on all platforms. So far it seems to be
working. There have been some failures, but for the most
part I've also seen those with clean repos, and the ones
that I haven't I don't believe are related to my changes.<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.01/</a><br>
<br>
One other comment about your initial fix to set showPrompt =
false:<br>
<br>
+ showPrompt = false; // asynchronous command <br>
<br>
I'm not sure if the "asynchronous command" comment was
intentional or just the result of the copy-and-paste. I just
want to clarify what is meant by "asynchronous command" in
the other places where this comment appears. There are a few
commands that are executed async on a different thread. If
you look in Commands.java, you'll see how these commands use
the AsyncExecution class to handle execution of the command.
I'm not sure why this is done, because all these commands
look like ones that are inherently synchronous, yet they are
always run on a different thread, and as a result may not be
complete when execution returns to TTY::executeCommand().
These commands also all set showPrompt = false, but the
prompt is printed by AsyncExecution when complete. So, the
point is the commands which are inherently async, like
"step" and "cont", are not async w.r.t. these "asynchronous
command" comments or the AsyncExecution class.<br>
<br>
Chris<br>
<br>
<br>
On 7/27/18 4:27 PM, Chris Plummer wrote:<br>
</div>
<blockquote type="cite"
cite="mid:853aba55-fafc-2797-ed44-818760bd5571@oracle.com">
<div class="moz-cite-prefix">Hi Gary,<br>
<br>
Sorry about the delay in the review. Been trying to fully
understand all the various code flows, which is tricky due
to having the separate event thread whose output needs to
be co-ordinated with command thread. Also made more
complicated by the multiple entrypoints into
executeCommand(), and there being various places where
output produced and we care about the order. I think I
have it all mapped out now and a pretty good understanding
of how it all works.<br>
<br>
My first impression is that it seems like a large number
of changes and a lot of passing around of a StringBuffer
object to fix a problem which I think could be resolved
with some strategic wait/notify code. For example, it
might be as simple as doing something like this:<br>
<br>
<a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Ecjplummer/8169718/webrev.00/"
moz-do-not-send="true">http://cr.openjdk.java.net/~cjplummer/8169718/webrev.00/</a><br>
<br>
This only covers the step command. You'd have to figure
out which other commands need to ensure an event has been
processed before printing the prompt, and set
waitForEventCompletion = true for them. Is there a reason
not to go with something like this?<br>
<br>
As for you current changes, just a couple of things I've
noticed, but I have not given it a thorough review (for
example, making sure that the StringBuffer is always
eventually printed, and done so in the right order).<br>
<br>
In EventHandler.run(), it's unclear who is handling the
printing of sb when handleEvent() or
handleDisconnectedException() are called.<br>
<br>
In the following EventHandler code:<br>
<br>
123 @Override<br>
124 public void stepEvent(StepEvent se, StringBuilder
sb) {<br>
125 Thread.yield(); // fetch output<br>
126 sb.append(MessageOutput.format("Step
completed:"));<br>
127 }<br>
<br>
Do you understand why the Thread.yield() is there? What
does the comment mean by "fetch output"? Just seems like
it has something to do with the ordering of output, but it
seems bad that the code ever relied on a yield to make
sure the output was ordered properly.<br>
<br>
Also in the above code snippet, you need to be careful
when replacing MessageOutput.lnprint() with
MessageOutput.format(). lnprint() adds a newline to the
start of the line. That appears to be missing in your code
above.<br>
<br>
thanks,<br>
<br>
Chris<br>
<br>
On 7/20/18 12:11 PM, Gary Adams wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5B5233DC.5040003@oracle.com"> Here's another
attempt to clear up the overlapping output from <br>
the command processing and event handler in the jdb tests.<br>
<br>
The fundamental problem is observed when "prompts"<br>
are produced interleaved with command and event output.<br>
<br>
This attempts to fix the issue by buffering the output and<br>
printing it fully assembled.<br>
<br>
Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev.01/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev.01/</a><br>
<br>
On 5/26/18, 6:50 AM, <a class="moz-txt-link-abbreviated"
href="mailto:gary.adams@oracle.com"
moz-do-not-send="true">gary.adams@oracle.com</a> wrote:
<blockquote
cite="mid:5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com"
type="cite"> This is a review request for a previously
closed test bug.<br>
The test was recently moved to the open repos, and the <br>
proposed fix is in the open code.<br>
<p> Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8169718/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8169718/webrev/</a><br>
</p>
<div class="moz-forward-container"><br>
-------- Forwarded Message --------
<table class="moz-email-headers-table">
<tbody>
<tr>
<th nowrap="nowrap">Subject: </th>
<td>RFR: JDK-8169718: nsk/jdb/locals/locals002:
ERROR: Cannot find boolVar with expected
value: false</td>
</tr>
<tr>
<th nowrap="nowrap">Date: </th>
<td>Fri, 25 May 2018 11:35:10 -0400</td>
</tr>
<tr>
<th nowrap="nowrap">From: </th>
<td>Gary Adams <a class="moz-txt-link-rfc2396E"
href="mailto:gary.adams@oracle.com"
moz-do-not-send="true"><gary.adams@oracle.com></a></td>
</tr>
<tr>
<th nowrap="nowrap">Reply-To: </th>
<td><a class="moz-txt-link-abbreviated"
href="mailto:gary.adams@oracle.com"
moz-do-not-send="true">gary.adams@oracle.com</a></td>
</tr>
<tr>
<th nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<br>
<pre>The jdb tests use stdin to send commands to a jdb process
and parses the stdout to determine if a command was
successful and when the process is prompting for new commands
to be sent.
Some commands are synchronous, so when the command is completed
a new prompt is sent back immediately.
Some commands are asynchronous, so there could be a delay
until a breakpoint is reached. The event handler then sends a prompt
when the application thread is stopped and new jdb commands can be sent.
The problem causing the intermittent failures was a corruption in the
output stream when prompts were being sent at the wrong times.
Instead of receiving
"Breakpoint hit:" <location>
<prompt>
the log contained
"Breakpoint hit:" <prompt> <location>
Once out of sync, jdb commands were being sent prematurely
and the wrong values were being compared against expected behavior.
The simple fix proposed here recognizes that commands like "cont",
"step" and "next" are asynchronous commands and should not send back
a prompt immediately. Instead. the event handler will deliver the next prompt
when the next "Breakpoint hit:" or "Step completed:" state change occurs.
The bulk of the testing was done on windows-x64-debug builds where the
intermittent failures were observed in ~5 in 1000 testruns. The fix has
also been tested on linux-x64-debug, solaris-sparcv9-debug,
and macosx-x64-debug, even though the failures have never been reported
against those platforms.
Failures have been observed in many of the nsk/jdb tests with similar corrupted
output streams, but never directly associated with this issue before.
redefine001, caught_exception002, locals002, eval001, next001,
stop_at003, step002, print002, trace001, step_up001, read001,
clear004, kill001, set001
</pre>
</div>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</body>
</html>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic