[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR (XS) 8047212: fix race between ObjectMonitor alloc and verification code
From:       "Daniel D. Daugherty" <daniel.daugherty () oracle ! com>
Date:       2015-10-22 15:35:43
Message-ID: 5629024F.1040602 () oracle ! com
[Download RAW message or body]

Thanks for the re-review!

Current stress test results:

$ grep -v PASS doit_loop.fast_?.log
doit_loop.fast_0.log:Copy fast_0: loop #271077...
doit_loop.fast_1.log:Copy fast_1: loop #271178...
doit_loop.fast_2.log:Copy fast_2: loop #271217...
doit_loop.fast_3.log:Copy fast_3: loop #271223...

$ elapsed_times mark.start_test_run doit_loop.fast_0.log
mark.start_test_run                                0 seconds
doit_loop.fast_0.log   1 days 19 hours 30 minutes 39 seconds

The typical failure rate for this bug is 1-3 failures in 3 days
with some other failure modes (in C2 or G1) popping in for a visit.

So far... no failures...

Dan

On 10/22/15, 9:30 AM, Carsten Varming wrote:
> Dear Dan,
> 
> I reviewed round 1. Looks good to me.
> 
> Thank you for the updated webbrew.
> 
> Carsten
> 
> On Tue, Oct 20, 2015 at 2:15 PM, Daniel D. Daugherty 
> <daniel.daugherty@oracle.com <mailto:daniel.daugherty@oracle.com>> wrote:
> 
> Greetings,
> 
> I've updated the fix based on feedback from Carsten V and David H.
> 
> Webrev URL:
> http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/
> <http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/1-jdk9-hs-rt/>
> 
> Changes relative to round 0:
> 
> - only src/share/vm/runtime/synchronizer.cpp has changed
> - reads of gBlockList now use OrderAccess::load_ptr_acquire()
> 
> code style cleanups:
> 
> - only cleaned up the functions that I touched to make the
> OrderAccess::load_ptr_acquire() changes
> - changed implied booleans into real boolean expressions
> - moved some locals to narrower context
> - added/removed some blank lines
> - made casts consistent with the majority style in this file
> 
> I'm repeating all of the same testing that I did for round 0. The
> round 1 bits have not yet made it through JPRT-west, but the jobs
> are mostly done.
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan
> 
> 
> 
> 
> 
> On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:
> 
> Greetings,
> 
> I have a fix for a long standing race between the lock-free
> ObjectMonitor
> verification code and the normal (locked) ObjectMonitor block
> allocation
> code path. For this fix, I would like at least a Runtime team
> reviewer
> and a Serviceability team reviewer. Thanks!
> 
> JDK-8047212
> runtime/ParallelClassLoading/bootstrap/random/inner-complex
> 
> assert(ObjectSynchronizer::verify_objmon_isinpool(inf)) failed:
> monitor is invalid
> https://bugs.openjdk.java.net/browse/JDK-8047212
> 
> Webrev URL:
> http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/ \
> <http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/0-jdk9-hs-rt/> 
> Testing: Aurora Adhoc RT-SVC nightly batch
> 4 inner-complex fastdebug parallel runs for 4+ days and
> 600K iterations without seeing this failure; the
> experiment
> is still running; final results to be reported at
> the end
> of the review cycle
> JPRT -testset hotspot
> 
> This fix:
> 
> - makes ObjectMonitor::gBlockList volatile
> - uses "OrderAccess::release_store_ptr(&gBlockList, temp)" to
> make sure the new block updates _happen before_ gBlockList is
> changed to refer to the new block
> - add SA support for a "static pointer volatile" field like:
> 
> static ObjectMonitor * volatile gBlockList;
> 
> See the following link for a nice description of what "volatile"
> means in the different positions on a variable/parameter decl
> line:
> 
> http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword
>  
> 
> Thanks, in advance, for any comments, questions or suggestions.
> 
> Dan
> 
> 
> 


[Attachment #3 (text/html)]

<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <tt>Thanks for the re-review!<br>
      <br>
      Current stress test results:<br>
      <br>
      $ grep -v PASS doit_loop.fast_?.log<br>
      doit_loop.fast_0.log:Copy fast_0: loop #271077...<br>
      doit_loop.fast_1.log:Copy fast_1: loop #271178...<br>
      doit_loop.fast_2.log:Copy fast_2: loop #271217...<br>
      doit_loop.fast_3.log:Copy fast_3: loop #271223...<br>
      <br>
      $ elapsed_times mark.start_test_run doit_loop.fast_0.log<br>
      mark.start_test_run                                                             \
0 seconds<br>  doit_loop.fast_0.log     1 days 19 hours 30 minutes 39 seconds<br>
      <br>
      The typical failure rate for this bug is 1-3 failures in 3 days<br>
      with some other failure modes (in C2 or G1) popping in for a
      visit.<br>
      <br>
      So far... no failures...<br>
      <br>
      Dan<br>
    </tt><br>
    On 10/22/15, 9:30 AM, Carsten Varming wrote:
    <blockquote
cite="mid:CAP_pwnX5Vz1nBjVYQ1KttnCN6QDqtLROJDyyNbUnC1MgpoLh2w@mail.gmail.com"
      type="cite">
      <div dir="ltr">Dear Dan,
        <div><br>
        </div>
        <div>I reviewed round 1. Looks good to me.</div>
        <div><br>
        </div>
        <div>Thank you for the updated webbrew.</div>
        <div><br>
        </div>
        <div>Carsten</div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Tue, Oct 20, 2015 at 2:15 PM, Daniel
          D. Daugherty <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:daniel.daugherty@oracle.com" \
target="_blank">daniel.daugherty@oracle.com</a>&gt;</span>  wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Greetings,<br>
            <br>
            I've updated the fix based on feedback from Carsten V and
            David H.<br>
            <br>
            Webrev URL: <a moz-do-not-send="true"
              href="http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/1-jdk9-hs-rt/"
                
              rel="noreferrer" \
target="_blank">http://cr.openjdk.java.net/~dcubed/8047212-webrev/1-jdk9-hs-rt/</a><br>
  <br>
            Changes relative to round 0:<br>
            <br>
            - only src/share/vm/runtime/synchronizer.cpp has changed<br>
            - reads of gBlockList now use
            OrderAccess::load_ptr_acquire()<br>
            <br>
            code style cleanups:<br>
            <br>
            - only cleaned up the functions that I touched to make the<br>
               OrderAccess::load_ptr_acquire() changes<br>
            - changed implied booleans into real boolean expressions<br>
            - moved some locals to narrower context<br>
            - added/removed some blank lines<br>
            - made casts consistent with the majority style in this file<br>
            <br>
            I'm repeating all of the same testing that I did for round
            0. The<br>
            round 1 bits have not yet made it through JPRT-west, but the
            jobs<br>
            are mostly done.<span class="im HOEnZb"><br>
              <br>
              Thanks, in advance, for any comments, questions or
              suggestions.<br>
              <br>
              Dan<br>
              <br>
              <br>
              <br>
              <br>
              <br>
            </span>
            <div class="HOEnZb">
              <div class="h5">
                On 10/19/15, 9:02 PM, Daniel D. Daugherty wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  Greetings,<br>
                  <br>
                  I have a fix for a long standing race between the
                  lock-free ObjectMonitor<br>
                  verification code and the normal (locked)
                  ObjectMonitor block allocation<br>
                  code path. For this fix, I would like at least a
                  Runtime team reviewer<br>
                  and a Serviceability team reviewer. Thanks!<br>
                  <br>
                  JDK-8047212
                  runtime/ParallelClassLoading/bootstrap/random/inner-complex<br>
                                   
                  assert(ObjectSynchronizer::verify_objmon_isinpool(inf))
                  failed:<br>
                                    monitor is invalid<br>
                  <a moz-do-not-send="true"
                    href="https://bugs.openjdk.java.net/browse/JDK-8047212"
                    rel="noreferrer" \
target="_blank">https://bugs.openjdk.java.net/browse/JDK-8047212</a><br>  <br>
                  Webrev URL: <a moz-do-not-send="true"
                    href="http://cr.openjdk.java.net/%7Edcubed/8047212-webrev/0-jdk9-hs-rt/"
                
                    rel="noreferrer" \
target="_blank">http://cr.openjdk.java.net/~dcubed/8047212-webrev/0-jdk9-hs-rt/</a><br>
  <br>
                  Testing: Aurora Adhoc RT-SVC nightly batch<br>
                                4 inner-complex fastdebug parallel runs for
                  4+ days and<br>
                                   600K iterations without seeing this
                  failure; the experiment<br>
                                   is still running; final results to be
                  reported at the end<br>
                                   of the review cycle<br>
                                JPRT -testset hotspot<br>
                  <br>
                  This fix:<br>
                  <br>
                  - makes ObjectMonitor::gBlockList volatile<br>
                  - uses
                  "OrderAccess::release_store_ptr(&amp;gBlockList,
                  temp)" to<br>
                     make sure the new block updates _happen before_
                  gBlockList is<br>
                     changed to refer to the new block<br>
                  - add SA support for a "static pointer volatile" field
                  like:<br>
                  <br>
                        static ObjectMonitor * volatile gBlockList;<br>
                  <br>
                  See the following link for a nice description of what
                  "volatile"<br>
                  means in the different positions on a
                  variable/parameter decl line:<br>
                  <br>
                  <a moz-do-not-send="true"
href="http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword"
                
                    rel="noreferrer" \
target="_blank">http://www.embedded.com/electronics-blogs/beginner-s-corner/4023801/Introduction-to-the-Volatile-Keyword</a>
  <br>
                  <br>
                  Thanks, in advance, for any comments, questions or
                  suggestions.<br>
                  <br>
                  Dan<br>
                  <br>
                  <br>
                </blockquote>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
  </body>
</html>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic