[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-serviceability-dev
Subject: Re: RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code
From: "Daniel D. Daugherty" <daniel.daugherty () oracle ! com>
Date: 2015-10-22 15:35:43
Message-ID: 5629024F.1040602 () oracle ! com
[Download RAW message or body]
Thanks for the re-review!
Current stress test results:
$ grep -v PASS doit_loop.fast_?.log
doit_loop.fast_0.log:Copy fast_0: loop #271077...
doit_loop.fast_1.log:Copy fast_1: loop #271178...
doit_loop.fast_2.log:Copy fast_2: loop #271217...
doit_loop.fast_3.log:Copy fast_3: loop #271223...
$ elapsed_times mark.start_test_run doit_loop.fast_0.log
mark.start_test_run 0 seconds
doit_loop.fast_0.log 1 days 19 hours 30 minutes 39 seconds
The typical failure rate for this bug is 1-3 failures in 3 days
with some other failure modes (in C2 or G1) popping in for a visit.
So far... no failures...
Dan
On 10/22/15, 9:30 AM, Carsten Varming wrote:
> Dear Dan,
>
> I reviewed round 1. Looks good to me.
>
> Thank you for the updated webbrew.
>
> Carsten
>
> On Tue, Oct 20, 2015 at 2:15 PM, Daniel D. Daugherty
> <daniel.daugherty@oracle.com <mailto:daniel.daugherty@oracle.com>> wrote:
>
> Greetings,
>
> I've updated the fix based on feedback from Carsten V and David H.
>
> Webrev URL:
> http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/
> <http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/1-jdk9-hs-rt/>
>
> Changes relative to round 0:
>
> - only src/share/vm/runtime/synchronizer.cpp has changed
> - reads of gBlockList now use OrderAccess::load_ptr_acquire()
>
> code style cleanups:
>
> - only cleaned up the functions that I touched to make the
> OrderAccess::load_ptr_acquire() changes
> - changed implied booleans into real boolean expressions
> - moved some locals to narrower context
> - added/removed some blank lines
> - made casts consistent with the majority style in this file
>
> I'm repeating all of the same testing that I did for round 0. The
> round 1 bits have not yet made it through JPRT-west, but the jobs
> are mostly done.
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
>
>
>
> On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:
>
> Greetings,
>
> I have a fix for a long standing race between the lock-free
> ObjectMonitor
> verification code and the normal (locked) ObjectMonitor block
> allocation
> code path. For this fix, I would like at least a Runtime team
> reviewer
> and a Serviceability team reviewer. Thanks!
>
> JDK-8047212
> runtime/ParallelClassLoading/bootstrap/random/inner-complex
>
> assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed:
> monitor is invalid
> https://bugs.openjdk.java.net/browse/JDK-8047212
>
> Webrev URL:
> http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/ \
> <http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/0-jdk9-hs-rt/>
> Testing: Aurora Adhoc RT-SVC nightly batch
> 4 inner-complex fastdebug parallel runs for 4+ days and
> 600K iterations without seeing this failure; the
> experiment
> is still running; final results to be reported at
> the end
> of the review cycle
> JPRT -testset hotspot
>
> This fix:
>
> - makes ObjectMonitor::gBlockList volatile
> - uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to
> make sure the new block updates _happen before_ gBlockList is
> changed to refer to the new block
> - add SA support for a "static pointer volatile" field like:
>
> static ObjectMonitor * volatile gBlockList;
>
> See the following link for a nice description of what "volatile"
> means in the different positions on a variable/parameter decl
> line:
>
> http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword
>
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
>
[Attachment #3 (text/html)]
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<tt>Thanks for the re-review!<br>
<br>
Current stress test results:<br>
<br>
$ grep -v PASS doit_loop.fast_?.log<br>
doit_loop.fast_0.log:Copy fast_0: loop #271077...<br>
doit_loop.fast_1.log:Copy fast_1: loop #271178...<br>
doit_loop.fast_2.log:Copy fast_2: loop #271217...<br>
doit_loop.fast_3.log:Copy fast_3: loop #271223...<br>
<br>
$ elapsed_times mark.start_test_run doit_loop.fast_0.log<br>
mark.start_test_run \
0 seconds<br> doit_loop.fast_0.log 1 days 19 hours 30 minutes 39 seconds<br>
<br>
The typical failure rate for this bug is 1-3 failures in 3 days<br>
with some other failure modes (in C2 or G1) popping in for a
visit.<br>
<br>
So far... no failures...<br>
<br>
Dan<br>
</tt><br>
On 10/22/15, 9:30 AM, Carsten Varming wrote:
<blockquote
cite="mid:CAP_pwnX5Vz1nBjVYQ1KttnCN6QDqtLROJDyyNbUnC1MgpoLh2w@mail.gmail.com"
type="cite">
<div dir="ltr">Dear Dan,
<div><br>
</div>
<div>I reviewed round 1. Looks good to me.</div>
<div><br>
</div>
<div>Thank you for the updated webbrew.</div>
<div><br>
</div>
<div>Carsten</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Oct 20, 2015 at 2:15 PM, Daniel
D. Daugherty <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:daniel.daugherty@oracle.com" \
target="_blank">daniel.daugherty@oracle.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Greetings,<br>
<br>
I've updated the fix based on feedback from Carsten V and
David H.<br>
<br>
Webrev URL: <a moz-do-not-send="true"
href="http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/1-jdk9-hs-rt/"
rel="noreferrer" \
target="_blank">http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/</a><br>
<br>
Changes relative to round 0:<br>
<br>
- only src/share/vm/runtime/synchronizer.cpp has changed<br>
- reads of gBlockList now use
OrderAccess::load_ptr_acquire()<br>
<br>
code style cleanups:<br>
<br>
- only cleaned up the functions that I touched to make the<br>
OrderAccess::load_ptr_acquire() changes<br>
- changed implied booleans into real boolean expressions<br>
- moved some locals to narrower context<br>
- added/removed some blank lines<br>
- made casts consistent with the majority style in this file<br>
<br>
I'm repeating all of the same testing that I did for round
0. The<br>
round 1 bits have not yet made it through JPRT-west, but the
jobs<br>
are mostly done.<span class="im HOEnZb"><br>
<br>
Thanks, in advance, for any comments, questions or
suggestions.<br>
<br>
Dan<br>
<br>
<br>
<br>
<br>
<br>
</span>
<div class="HOEnZb">
<div class="h5">
On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
Greetings,<br>
<br>
I have a fix for a long standing race between the
lock-free ObjectMonitor<br>
verification code and the normal (locked)
ObjectMonitor block allocation<br>
code path. For this fix, I would like at least a
Runtime team reviewer<br>
and a Serviceability team reviewer. Thanks!<br>
<br>
JDK-8047212
runtime/ParallelClassLoading/bootstrap/random/inner-complex<br>
assert(ObjectSynchronizer::verify_objmon_isinpool(inf))
failed:<br>
monitor is invalid<br>
<a moz-do-not-send="true"
href="https://bugs.openjdk.java.net/browse/JDK-8047212"
rel="noreferrer" \
target="_blank">https://bugs.openjdk.java.net/browse/JDK-8047212</a><br> <br>
Webrev URL: <a moz-do-not-send="true"
href="http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/0-jdk9-hs-rt/"
rel="noreferrer" \
target="_blank">http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/</a><br>
<br>
Testing: Aurora Adhoc RT-SVC nightly batch<br>
4 inner-complex fastdebug parallel runs for
4+ days and<br>
600K iterations without seeing this
failure; the experiment<br>
is still running; final results to be
reported at the end<br>
of the review cycle<br>
JPRT -testset hotspot<br>
<br>
This fix:<br>
<br>
- makes ObjectMonitor::gBlockList volatile<br>
- uses
"OrderAccess::release_store_ptr(&gBlockList,
temp)" to<br>
make sure the new block updates _happen before_
gBlockList is<br>
changed to refer to the new block<br>
- add SA support for a "static pointer volatile" field
like:<br>
<br>
static ObjectMonitor * volatile gBlockList;<br>
<br>
See the following link for a nice description of what
"volatile"<br>
means in the different positions on a
variable/parameter decl line:<br>
<br>
<a moz-do-not-send="true"
href="http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword"
rel="noreferrer" \
target="_blank">http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword</a>
<br>
<br>
Thanks, in advance, for any comments, questions or
suggestions.<br>
<br>
Dan<br>
<br>
<br>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
</body>
</html>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic