[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-serviceability-dev
Subject: Re: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values
From: Chris Plummer <chris.plummer () oracle ! com>
Date: 2018-07-25 18:13:09
Message-ID: 7dec2692-c3d0-4903-b9ae-8a22528539a7 () oracle ! com
[Download RAW message or body]
Thanks!
On 7/25/18 11:00 AM, Alex Menkov wrote:
> Looks good to me
>
> --alex
>
> On 07/24/2018 16:23, Chris Plummer wrote:
> > Thanks, Serguei.
> >
> > I could use one more reviewer.
> >
> > thanks,
> >
> > Chris
> >
> > On 7/24/18 3:00 PM, serguei.spitsyn@oracle.com wrote:
> > > Chris,
> > >
> > > Thank you for the explanations.
> > > I'm Okay with this webrev as it is.
> > >
> > > Thanks,
> > > Serguei
> > >
> > >
> > > On 7/24/18 13:55, Chris Plummer wrote:
> > > > On 7/24/18 1:46 PM, serguei.spitsyn@oracle.com wrote:
> > > > >
> > > > > http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html \
> > > > >
> > > > > - log.complain("Redefinition not started. Maybe running with
> > > > > -Xcomp. Test ignored.");
> > > > > + log.complain("Redefinition not started. May need more time for
> > > > > -Xcomp.");
> > > > > + status = Consts.TEST_FAILED;
> > > > > return false;
> > > > > }
> > > > > . . .
> > > > > - log.complain("Redefinition not completed.");
> > > > > + log.complain("Redefinition not completed. May need more time for
> > > > > -Xcomp.");
> > > > > + status = Consts.TEST_FAILED;
> > > > > + return false; The complain is not fully correct if this can
> > > > > happen not only with the -Xcomp. Could this message be relaxed a
> > > > > little bit?
> > > > I think it is relaxed. It says *may* need more time for -Xcomp. I'm
> > > > not sure how else to word it unless you want me to just say
> > > > "Redefinition not completed".
> > > > > Also, just a side comment: The changes above are not that
> > > > > harmless. As the status now is set to TEST_FAILED there is a
> > > > > potential for the tests to start failing where they were passed
> > > > > before.
> > > > Yes, that was intentional. It's still the case that you only need
> > > > the fail = 0 change to fix the bug, but having these methods
> > > > properly cause the test to fail is necessary if something were to
> > > > ever go wrong and the redef was not started or completed. Otherwise
> > > > the test would either silently pass (if redef was not started) or
> > > > just produce error messages like it has been when it checks for the
> > > > proper redef (if the redef never completed).
> > > >
> > > > thanks,
> > > >
> > > > Chris
> > > > > Otherwise, looks good.
> > > > >
> > > > > Thanks,
> > > > > Serguei
> > > > >
> > > > >
> > > > > On 7/24/18 13:22, Chris Plummer wrote:
> > > > > > http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01
> > > > > >
> > > > > > Since I was removed the "else", there was no need for the "if",
> > > > > > so I removed it also. I had to re-indent the body of the "if"
> > > > > > section because of that. The webrev seems to not call out the
> > > > > > whitespace changes, although I also did a couple of other minor
> > > > > > formatting changes in the code that do show up.
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > > On 7/24/18 12:42 PM, Chris Plummer wrote:
> > > > > > > Yes. I'm just retesting first.
> > > > > > >
> > > > > > > thanks,
> > > > > > >
> > > > > > > Chris
> > > > > > >
> > > > > > > On 7/24/18 12:18 PM, serguei.spitsyn@oracle.com wrote:
> > > > > > > > Hi Chris,
> > > > > > > >
> > > > > > > > You have my all my comments and I leave it up to you to decide
> > > > > > > > what approach to pick.
> > > > > > > > Could you send an updated webrev, please?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Serguei
> > > > > > > >
> > > > > > > >
> > > > > > > > On 7/24/18 09:27, Chris Plummer wrote:
> > > > > > > > > On 7/24/18 12:25 AM, serguei.spitsyn@oracle.com wrote:
> > > > > > > > > > Hi Chris,
> > > > > > > > > >
> > > > > > > > > > I still feel, this fix adds more confusion and complexity.
> > > > > > > > > > Let's look at some fragments.
> > > > > > > > > >
> > > > > > > > > > http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotsp \
> > > > > > > > > > ot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html \
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 116 if ((strcmp(name, expHSMethod) == 0) &&
> > > > > > > > > > 117 (strcmp(sig, expHSSignature) == 0)) {
> > > > > > > > > > 118 NSK_DISPLAY0("CompiledMethodLoad: a tested
> > > > > > > > > > hotspot method found\n");
> > > > > > > > > > 119
> > > > > > > > > > 120 // CR 6604375: check whether "hot" method was \
> > > > > > > > > > entered
> > > > > > > > > > 121 if (enteredHotMethod) {
> > > > > > > > > > 122 hsMethodID = method;
> > > > > > > > > > 123 fire = 1;
> > > > > > > > > > 124 } else {
> > > > > > > > > > 125 NSK_DISPLAY0("Compilation occured before method
> > > > > > > > > > execution\n");
> > > > > > > > > > 126 fire = 0; // Ignore this compilation. Wait for next one.
> > > > > > > > > > 127 }
> > > > > > > > > > 128 }
> > > > > > > > > >
> > > > > > > > > > I think, the line #126 is not needed.
> > > > > > > > > > It just creates a confusion.
> > > > > > > > > > The fire == 0 from beginning.
> > > > > > > > > > Why do we need it to set to 0 again?
> > > > > > > > > Yes, it can be removed. I just didn't give it much thought
> > > > > > > > > when changing the code from -1 to 0.
> > > > > > > > > > Is it because it can be already set to 1?
> > > > > > > > > > Id so, I'm not sure I understand this code then.
> > > > > > > > > >
> > > > > > > > > > 187 } while(fire == 0);
> > > > > > > > > > 188
> > > > > > > > > > 189 NSK_DISPLAY0("agentProc: hotspot method
> > > > > > > > > > compiled\n\n");
> > > > > > > > > > 190
> > > > > > > > > > 192 if (fire == 1) {
> > > > > > > > > > . . .
> > > > > > > > > > 224 } else {
> > > > > > > > > > 225 // fire == -1
> > > > > > > > > > 226 // NOTE: This isn't suppose to happen anymore. Hot method
> > > > > > > > > > should always end up being entered.
> > > > > > > > > > 227 NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed.
> > > > > > > > > > Don't perform redefinition\n");
> > > > > > > > > > 228 }
> > > > > > > > > > I don't understand why do we need the check at the line #192.
> > > > > > > > > > The variable fire can be only equal to 0 or 1.
> > > > > > > > > > The only way out of the loop at the line #187 is if fire == 1.
> > > > > > > > > >
> > > > > > > > > > Then the else statement at the lines 224-228 confuses even more.
> > > > > > > > > The else section can be removed. I left it in as sort of an
> > > > > > > > > assert, but I see now that it just cause confusion.
> > > > > > > > >
> > > > > > > > > thanks,
> > > > > > > > >
> > > > > > > > > Chris
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On 7/23/18 20:19, Chris Plummer wrote:
> > > > > > > > > > > On 7/23/18 5:22 PM, serguei.spitsyn@oracle.com wrote:
> > > > > > > > > > > > Hi Chris,
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On 7/23/18 11:40, Chris Plummer wrote:
> > > > > > > > > > > > > Hi Serguei,
> > > > > > > > > > > > >
> > > > > > > > > > > > > If the fix was complicated I would agree, but it really
> > > > > > > > > > > > > just boils down to this one line change:
> > > > > > > > > > > > >
> > > > > > > > > > > > > - fire = -1;
> > > > > > > > > > > > > + fire = 0; // Ignore this \
> > > > > > > > > > > > > compilation. Wait for next one.
> > > > > > > > > > > >
> > > > > > > > > > > > It is not obvious that this will completely fix the problem.
> > > > > > > > > > > > Is it possible that there will not be next compilation with
> > > > > > > > > > > > the -Xcomp?
> > > > > > > > > > > It's only one method that we check for. I don't see why
> > > > > > > > > > > there would be 2nd -Xcomp compilation for it, but even if
> > > > > > > > > > > there was, the test will ignore it just like the first one.
> > > > > > > > > > > It will ignore compilations of the method until the flag has
> > > > > > > > > > > been set indicating the method has been executed once.
> > > > > > > > > >
> > > > > > > > > > > If for some reason the method is never compiled after being
> > > > > > > > > > > executed once, the test will give up waiting for it (I think
> > > > > > > > > > > after 30 seconds) and produce an error.
> > > > > > > > > >
> > > > > > > > > > I'm afraid that it is what will always happen with the -Xcomp.
> > > > > > > > > > Then there is no point to waist this by waiting for timeout
> > > > > > > > > > as the test will successfully complete without testing anything.
> > > > > > > > > > It seems to be not worth this complexity.
> > > > > > > > > >
> > > > > > > > > > I guess, you would want some extra tracing though. :)
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Serguei
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > If it is possible then it is better to explicitly exclude
> > > > > > > > > > > > these tests for -Xcomp.
> > > > > > > > > > > > Otherwise, consider this reviewed.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Given that, I see no reason not to increase our test
> > > > > > > > > > > > > coverage by supporting this test during -Xcomp runs.
> > > > > > > > > > > >
> > > > > > > > > > > > I'd agree if it is going to be stable.
> > > > > > > > > > > >
> > > > > > > > > > > If problems turn up in the future, we can reconsider
> > > > > > > > > > > disabling it.
> > > > > > > > > > >
> > > > > > > > > > > thanks,
> > > > > > > > > > >
> > > > > > > > > > > Chris
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Serguei
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Chris
> > > > > > > > > > > > >
> > > > > > > > > > > > > On 7/23/18 9:44 AM, serguei.spitsyn@oracle.com wrote:
> > > > > > > > > > > > > > Hi Chris,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Would it be more simple to avoid running these tests with \
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -Xcomp?
> > > > > > > > > > > > > > I guess, this would work: @requires vm.compMode != \
> > > > > > > > > > > > > > "Xcomp"
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Serguei
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On 7/23/18 00:42, Chris Plummer wrote:
> > > > > > > > > > > > > > > Hello,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Please review the following fix for JDK11:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > https://bugs.openjdk.java.net/browse/JDK-8151259
> > > > > > > > > > > > > > > http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It fixes the following 3 tests:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
> > > > > > > > > > > > > > > vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
> > > > > > > > > > > > > > > vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Any of which could fail when run with -Xcomp with
> > > > > > > > > > > > > > > (followed by a bunch more errors):
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > # ERROR: Redefinition not started. Maybe running with
> > > > > > > > > > > > > > > -Xcomp. Test ignored.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Although lately we've only seen this with
> > > > > > > > > > > > > > > redefclass030.java on macosx.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > These 3 tests do redefinition of a "hot" method after
> > > > > > > > > > > > > > > triggering compilation for it. After the redef some
> > > > > > > > > > > > > > > testing is done to ensure that the redef was done
> > > > > > > > > > > > > > > correctly, but the issue these test have actually comes \
> > > > > > > > > > > > > > > before any redef is done.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The test attempts to trigger compilation by calling a
> > > > > > > > > > > > > > > hot method a lot. The agent detects compilation by
> > > > > > > > > > > > > > > receiving a CompiledMethodLoad event. There was an \
> > > > > > > > > > > > > > > issue discovered long ago that when -Xcomp is used, \
> > > > > > > > > > > > > > > the compilation happens before the "hot" method is \
> > > > > > > > > > > > > > > ever called. Then the redef would happen before \
> > > > > > > > > > > > > > > compilation, and this somehow messed up the test (I'm \
> > > > > > > > > > > > > > > not exactly sure how). The fix was to basically \
> > > > > > > > > > > > > > > abandon the redef attempt when this problem is \
> > > > > > > > > > > > > > > detected, and then supposedly just let the test run to \
> > > > > > > > > > > > > > > completion (skipping the actual testing of the redef). \
> > > > > > > > > > > > > > > After this change, if you ran with -Xcomp it would \
> > > > > > > > > > > > > > > pass, but if you looked in the log you would see:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > # ERROR: Redefinition not started. Maybe running with
> > > > > > > > > > > > > > > -Xcomp. Test ignored.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > However, there was a bug in the logic to make the test
> > > > > > > > > > > > > > > run to completion, and also causes the above message to \
> > > > > > > > > > > > > > > not appear. Instead the test would fail with:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > # ERROR: Redefinition not completed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Followed by a bunch more error message during the part
> > > > > > > > > > > > > > > of the test that checks if the redef was done properly.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If the CompiledMethodLoad event comes in before the hot \
> > > > > > > > > > > > > > > method is ever called (which it does with -Xcomp), the \
> > > > > > > > > > > > > > > test sets fire = -1. If the hot method was called, it \
> > > > > > > > > > > > > > > is set to 1. The setting of fire = -1 was added to fix \
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > -Xcomp problem mentioned above. The jvmti agent does \
> > > > > > > > > > > > > > > the following:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > do {
> > > > > > > > > > > > > > > THREAD_sleep(1);
> > > > > > > > > > > > > > > /* wait for compilation to happen */
> > > > > > > > > > > > > > > } while(fire == 0);
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > if (fire == 1) {
> > > > > > > > > > > > > > > /* do the redef here */
> > > > > > > > > > > > > > > NSK_DISPLAY0("agentProc: <<<<<<<<
> > > > > > > > > > > > > > > RedefineClasses() is successfully done\n");
> > > > > > > > > > > > > > > } else {
> > > > > > > > > > > > > > > // fire == -1
> > > > > > > > > > > > > > > NSK_DISPLAY0("agentProc: \"hot\" method wasn't
> > > > > > > > > > > > > > > executed. Don't perform redefinition\n");
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The agent then syncs with the debuggee, waiting for it
> > > > > > > > > > > > > > > finish up. What the test expects is that
> > > > > > > > > > > > > > > waitForRedefinitionStarted() in the debuggee will time
> > > > > > > > > > > > > > > out after two seconds while waiting for fire == 1 \
> > > > > > > > > > > > > > > (which it thinks will will always happen because it \
> > > > > > > > > > > > > > > was set to
> > > > > > > > > > > > > > > -1). When it times out, the test does appear to exit
> > > > > > > > > > > > > > > properly with, but with the following in the log, which \
> > > > > > > > > > > > > > > is intended:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > # ERROR: Redefinition not started. Maybe running with
> > > > > > > > > > > > > > > -Xcomp. Test ignored.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > However, sometimes before waitForRedefinitionStarted()
> > > > > > > > > > > > > > > times out, the hot method is called enough times to
> > > > > > > > > > > > > > > trigger compilation. So another CompiledMethodLoad \
> > > > > > > > > > > > > > > event arrives, and this time fire is set to 1. Because \
> > > > > > > > > > > > > > > of this, waitForRedefinitionStarted() doesn't time out \
> > > > > > > > > > > > > > > and returns with an indication that the redef has \
> > > > > > > > > > > > > > > started. After this waitForRedefinitionCompleted() is \
> > > > > > > > > > > > > > > executed. It waits for the redef to complete, but it \
> > > > > > > > > > > > > > > never does since the agent decided not to do the redef \
> > > > > > > > > > > > > > > when it saw fire == -1. So \
> > > > > > > > > > > > > > > waitForRedefinitionCompleted() times out after 10 \
> > > > > > > > > > > > > > > seconds and the test fails, with:
> > > > > > > > > > > > > > > # ERROR: Redefinition not completed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Actually the above error is not really what causes the
> > > > > > > > > > > > > > > failure. When the above error is detected, no error
> > > > > > > > > > > > > > > status is set and the test continues as if the redef \
> > > > > > > > > > > > > > > had been done. So then the logic that detects if the \
> > > > > > > > > > > > > > > redef was done properly ends up failing, and that's \
> > > > > > > > > > > > > > > where the test actually indicates a failure status. \
> > > > > > > > > > > > > > > You see a whole bunch of other errors in the log \
> > > > > > > > > > > > > > > because of all the checks that fail.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The fix is to not abandon the test when the first
> > > > > > > > > > > > > > > CompiledMethodLoad event is before the hot method was
> > > > > > > > > > > > > > > called. Instead just leave fire==0 and wait for the \
> > > > > > > > > > > > > > > next CompiledMethodLoad event that is triggered after \
> > > > > > > > > > > > > > > the method is called enough times to be recompiled. \
> > > > > > > > > > > > > > > I'm not sure why it was not originally done this way. \
> > > > > > > > > > > > > > > Possibly the recompilation did not happen reliably, \
> > > > > > > > > > > > > > > but I have not run into this problem. The other \
> > > > > > > > > > > > > > > changes in redefclass030.c are just cleaning up debug \
> > > > > > > > > > > > > > > tracing.
> > > > > > > > > > > > > > > Another fix was to properly set the error status when
> > > > > > > > > > > > > > > waitForRedefinitionStarted() or
> > > > > > > > > > > > > > > waitForRedefinitionCompleted() times out, although this \
> > > > > > > > > > > > > > > is just a safety net and I didn't run into any cases
> > > > > > > > > > > > > > > where this happened after fixing the CompiledMethodLoad \
> > > > > > > > > > > > > > > event handling. So in general the changes in
> > > > > > > > > > > > > > > redefclass030.java were not needed, but provide better
> > > > > > > > > > > > > > > error handling.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Chris
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic