[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE m
From:       "serguei.spitsyn () oracle ! com" <serguei ! spitsyn () oracle ! com>
Date:       2020-05-28 23:16:43
Message-ID: 3a497901-7a05-e87a-33e6-6f1011c32b8b () oracle ! com
[Download RAW message or body]

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">Hi Coleen,<br>
      <br>
      The updated webrev version is:<br>
        <a class="moz-txt-link-freetext" \
href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.3/">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.3/</a><br>
  <br>
      It has your suggestions addressed:<br>
       - <span class="new">remove log_is_enabled conditions<br>
         - move ResourceMark's out of loops<br>
      </span><br>
      Thanks,<br>
      Serguei<br>
      <br>
      <span class="new"></span><br>
      On 5/28/20 14:44, <a class="moz-txt-link-abbreviated" \
href="mailto:serguei.spitsyn@oracle.com">serguei.spitsyn@oracle.com</a> wrote:<br>  \
</div>  <blockquote type="cite"
      cite="mid:9b75fa4e-f579-e4a7-7996-bc307d001972@oracle.com">
      <div class="moz-cite-prefix">Hi Coleen,<br>
        <br>
        Thank you a lot for reviewing this!<br>
        <br>
        <br>
        On 5/28/20 12:48, <a class="moz-txt-link-abbreviated"
          href="mailto:coleen.phillimore@oracle.com"
          moz-do-not-send="true">coleen.phillimore@oracle.com</a> wrote:<br>
      </div>
      <blockquote type="cite"
        cite="mid:31ca58d7-99ac-c53d-461f-680461fb5698@oracle.com"> Hi
        Serguei,<br>
        Sorry for the delay reviewing this again.<br>
        <br>
        <div class="moz-cite-prefix">On 5/18/20 3:30 AM, <a
            class="moz-txt-link-abbreviated"
            href="mailto:serguei.spitsyn@oracle.com"
            moz-do-not-send="true">serguei.spitsyn@oracle.com</a> wrote:<br>
        </div>
        <blockquote type="cite"
          cite="mid:f97d9630-df42-e874-46f6-e33ac34fad1a@oracle.com">
          <div class="moz-cite-prefix">Hi Coleen and potential
            reviewers,<br>
            <br>
            Now, the webrev:<br>
              <a class="moz-txt-link-freetext"
              href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/"
              moz-do-not-send="true">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/</a><br>
  <br>
            has a complete fix for all three failure modes related to
            the guarantee about OLD and OBSOLETE methods.<br>
            <br>
            The root cause are the following optimizations:<br>
            <br>
             1) Optimization based on the flag
            ik-&gt;is_being_redefined():<br>
                The problem is that the cpcache method entries of such
            classes are not being adjusted.<br>
                It is explained below in the initial RFR summary.<br>
                The fix is to get rid of this optimization.<br>
          </div>
        </blockquote>
        <br>
        This seems like a good thing to do even though (actually
        especially because) I can't re-imagine the logic that went into
        this optimization.<br>
      </blockquote>
      <br>
      Probably, I've not explained it well enough.<br>
      The logic was that the class marked as is_being_redefined was
      considered as being redefined in the current redefinition
      operation.<br>
      For classes redefined in current redefinition the cpcache is
      empty, so there is  nothing to adjust.<br>
      The problem is that classes can be marked as is_being_redefined by
      doit_prologue of one of the following redefinition operations.<br>
      In such a case, the VM_RedefineClasses::CheckClass::do_klass fails
      with this guarantee.<br>
      It is because the VM_RedefineClasses::CheckClass::do_klass does
      not have this optimization<br>
      and does not skip such classes as the
      VM_RedefineClasses::AdjustAndCleanMetadata::do_class.<br>
      Without this catch this issue could have unknown consequences in
      the future execution far away from the root cause.<br>
      <br>
      <blockquote type="cite"
        cite="mid:31ca58d7-99ac-c53d-461f-680461fb5698@oracle.com">
        <blockquote type="cite"
          cite="mid:f97d9630-df42-e874-46f6-e33ac34fad1a@oracle.com">
          <div class="moz-cite-prefix"> <br>
             2) Optimization for array classes based on the flag
            _has_redefined_Object.<br>
                The problem is that the vtable method entries are not
            adjusted for array classes.<br>
                The array classes have to be adjusted even if the
            java.lang.Object was redefined<br>
                by one of previous VM_RedefineClasses operation, not
            only if it was redefined in<br>
                the current VM_RedefineClasses operation. The fix is is
            follow this requirement.<br>
          </div>
        </blockquote>
        <br>
        This I can't understand.  The redefinitions are serialized in
        safepoints, so why would you need to replace vtable entries for
        arrays if java.lang.Object isn't redefined in this safepoint?<br>
      </blockquote>
      The VM_RedefineClasses::CheckClass::do_klass fails with the same
      guarantee because of this.<br>
      It never fails this way with this optimization relaxed.<br>
      I've already broke my head trying to understand it.<br>
      It can be because of another bug we don't know yet.<br>
      <br>
      <blockquote type="cite"
        cite="mid:31ca58d7-99ac-c53d-461f-680461fb5698@oracle.com">
        <blockquote type="cite"
          cite="mid:f97d9630-df42-e874-46f6-e33ac34fad1a@oracle.com">
          <div class="moz-cite-prefix"> <br>
             3) Optimization based on the flag <span \
class="removed">_has_null_class_loader  which assumes that the Hotspot<br>
                  does not support delegation </span><span
              class="removed">from the bootstrap class loader to a</span><span
              class="removed"> user-defined class<br>
                  loader.</span><span class="removed"> The assumption is
              that if the current class being redefined has a
              user-defined<br>
                  class</span><span class="removed"> loader as its
              defining class loader, then all</span><span
              class="removed"> classes loaded by the bootstrap<br>
                  class loader can be skipped for vtable/itable method
              entries adjustment.<br>
                  The problem is that this assumption is not really
              correct. There are classes that<br>
                  still need the adjustment. For instance, the class
              java.util.IdentityHashMap$KeyIterator<br>
                  loaded by the bootstrap class loader has the
              vtable/itable references to the method:<br>
                     </span><span class="removed"><span \
class="removed"></span>java.util.Iterator.forEachRemaining(java.util.function.Consumer)<br>
  The class </span><span class="removed"></span><span
              class="removed"><span class="removed"></span>java.util.Iterator
              is defined by a user-defined class loader.<br>
                  The fix is to get rid of this optimization.<br>
            </span></div>
        </blockquote>
        <br>
        Also with this optimization, I'm not sure what the logic was
        that determined that this was safe, so it's best to remove it. 
        Above makes sense.<br>
      </blockquote>
      <br>
      I don't know the full theory behind this optimization. We only
      have a comment.<br>
      <br>
      <span class="removed"></span><br>
      <blockquote type="cite"
        cite="mid:31ca58d7-99ac-c53d-461f-680461fb5698@oracle.com">
        <blockquote type="cite"
          cite="mid:f97d9630-df42-e874-46f6-e33ac34fad1a@oracle.com"><span
            class="removed"></span>
          <div class="moz-cite-prefix"><span class="removed"> All three
              failure modes are observed with the -Xcomp flag.<br>
              With all three fixes above in place, the Kitchensink does
              not fail with this guarantee anymore.<br>
            </span></div>
        </blockquote>
        <br>
        <br>
        <a
href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/cpCache.cpp.udiff.html"
                
          moz-do-not-send="true">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/cpCache.cpp.udiff.html</a><br>
  <br>
        For logging, the log_trace function will also repeat the 'if'
        statement and not allocate the external_name() if logging isn't
        specified, so you don't need the 'if' statement above.<br>
        <br>
        <pre><span class="new">+  if (log_is_enabled(Trace, redefine, class, update)) \
{</span> <span class="new">+    log_trace(redefine, class, update, \
constantpool)</span> <span class="new">+          ("cpc %s entry update: %s", \
entry_type, new_method-&gt;external_name());

</span><span class="new"></span>
</pre>
        <a
href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/klassVtable.cpp.udiff.html"
                
          moz-do-not-send="true">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/klassVtable.cpp.udiff.html</a><br>
  <br>
        Same in two cases here, and you could move the ResourceMark
        outside the loop at the top.<br>
      </blockquote>
      <br>
      Good suggestions, taken.<br>
      <br>
      Thanks!<br>
      Serguei<br>
      <br>
      <blockquote type="cite"
        cite="mid:31ca58d7-99ac-c53d-461f-680461fb5698@oracle.com"> <br>
        Thanks,<br>
        Coleen<br>
        <pre><span class="new"></span></pre>
        <blockquote type="cite"
          cite="mid:f97d9630-df42-e874-46f6-e33ac34fad1a@oracle.com">
          <div class="moz-cite-prefix"><span class="removed"> </span><span
              class="removed"></span><span class="removed"></span><span
              class="removed"></span><br>
            There is still a JIT compiler relted failure:<br>
              <a class="moz-txt-link-freetext"
              href="https://bugs.openjdk.java.net/browse/JDK-8245128"
              moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8245128</a><br>
  Kitchensink fails with: assert(destination ==
            (address)-1 || destination == entry) failed: b) MT-unsafe
            modification of inline cache<br>
            <br>
            I also saw this failure but just once:<br>
              <a class="moz-txt-link-freetext"
              href="https://bugs.openjdk.java.net/browse/JDK-8245126"
              moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8245126</a><br>
  Kitchensink fails with: assert(!method-&gt;is_old())
            failed: Should not be installing old methods<br>
            <br>
            Thanks,<br>
            Serguei<br>
            <br>
            <br>
            On 5/15/20 15:14, <a class="moz-txt-link-abbreviated"
              href="mailto:serguei.spitsyn@oracle.com"
              moz-do-not-send="true">serguei.spitsyn@oracle.com</a>
            wrote:<br>
          </div>
          <blockquote type="cite"
            cite="mid:52ba0f0f-a705-2043-1c1d-15ba4a441aba@oracle.com">
            <div class="moz-cite-prefix">Hi Coleen,<br>
              <br>
              Thanks a lot for review!<br>
              Good suggestion, will use it.<br>
              <br>
              In fact, I've found two more related problems with the
              same guarantee.<br>
              One is with vtable method entries adjustment and another
              with itable.<br>
              This webrev version includes a fix for the vtable related
              issue:<br>
                <a class="moz-txt-link-freetext"
                href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/"
                
                moz-do-not-send="true">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/</a><br>
  <br>
              I'm still investigating the itable related issue.<br>
              <br>
              It is interesting that the Kitchensink with
              Instrumentation modules enabled is like a Pandora box full
              of surprises.<br>
              New problems are getting discovered after some road blocks
              are removed.<br>
              I've just filed a couple of compiler bugs discovered in
              this mode of testing:<br>
                <a class="moz-txt-link-freetext"
                href="https://bugs.openjdk.java.net/browse/JDK-8245126"
                moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8245126</a><br>
  Kitchensink fails with: assert(!method-&gt;is_old())
              failed: Should not be installing old methods<br>
              <br>
                <a class="moz-txt-link-freetext"
                href="https://bugs.openjdk.java.net/browse/JDK-8245128"
                moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8245128</a><br>
  Kitchensink fails with: assert(destination ==
              (address)-1 || destination == entry) failed: b) MT-unsafe
              modification of inline cache<br>
              <br>
              <br>
              Thanks,<br>
              Serguei   <br>
              <br>
              <br>
              On 5/15/20 05:12, <a class="moz-txt-link-abbreviated"
                href="mailto:coleen.phillimore@oracle.com"
                moz-do-not-send="true">coleen.phillimore@oracle.com</a>
              wrote:<br>
            </div>
            <blockquote type="cite"
              cite="mid:2f9aa92c-18f5-1203-1523-3c1fd9ba9ad1@oracle.com">
              <br>
              Serguei,<br>
              <br>
              Good find!!  The fix looks good.  I'm sure the
              optimization wasn't noticeable and thank you for the
              additional comments.<br>
              <br>
              There is a Method::external_name() function that I believe
              prints all the things you want in the logging here:<br>
              <br>
              <a
href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/src/hotspot/share/oops/cpCache.cpp.udiff.html"
                
                moz-do-not-send="true">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/src/hotspot/share/oops/cpCache.cpp.udiff.html</a><br>
  <br>
              I don't need to see another webrev if you make this
              change.<br>
              <br>
              Thanks,<br>
              Coleen<br>
              <br>
              <div class="moz-cite-prefix">On 5/14/20 12:26 PM, <a
                  class="moz-txt-link-abbreviated"
                  href="mailto:serguei.spitsyn@oracle.com"
                  moz-do-not-send="true">serguei.spitsyn@oracle.com</a>
                wrote:<br>
              </div>
              <blockquote type="cite"
                cite="mid:5942b42c-b9b3-f1d4-6c13-774649fca32b@oracle.com">
                Please, review a fix for The Kitchensink bug:<br>
                  <a class="moz-txt-link-freetext"
                  href="https://bugs.openjdk.java.net/browse/JDK-8222005"
                  moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8222005</a><br>
  <br>
                Webrev:<br>
                  <a class="moz-txt-link-freetext"
                  href="http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/"
                
                  moz-do-not-send="true">http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/</a><br>
  <br>
                Summary:<br>
                  The VM_RedefineClasses::doit() uses two helper classes
                to walk all VM classes.<br>
                  First is AdjustAndCleanMetadata to adjust method
                entries in the vtables/itables/cpcaches.<br>
                  Second is CheckClass to check that adjustments for all
                method entries are correct.<br>
                  The Kitchensink test is failing with two modes:<br>
                    - guarantee(false) failed: OLD and/or OBSOLETE
                method(s) found in the<br>
                      VM_RedefineClasses::CheckClass::do_klass()<br>
                    - SIGSEGV in the
                ConstantPoolCacheEntry::get_interesting_method_entry()
                in context<br>
                      of VM_RedefineClasses::CheckClass::do_klass()
                execution<br>
                <br>
                  The second failure mode is rare. In is before the
                first one in the code path.<br>
                  The root cause of both is that the
                VM_RedefineClasses::AdjustAndCleanMetadata::do_klass()<br>
                  is skipping the cpcache update for classes that are
                being redefined assuming they are<br>
                  being redefined by the current VM_RedefineClasses
                operation. In such cases, the adjustment<br>
                  is not needed as the cpcache is empty. The problem is
                that the assumption above is wrong.<br>
                  The class can also be redefined by another
                VM_RedefineClasses operation <span class="changed">which
                  has already<br>
                    executed its doit_prologue</span>. The cpcache
                djustment for such class is necessary.<br>
                  The fix is to always call the
                cp_cache-&gt;adjust_method_entries() even if the class
                is<br>
                  being redefined by the current VM_RedefineClasses
                operation. It is possible to skip it<br>
                  but it will add extra complexity to the code.<br>
                  The fix also includes minor tweak in the cpCache.cpp
                to include method's class name to<br>
                  the redefinition cpcache log.<br>
                <br>
                Testing:<br>
                  Ran Kitchensink test locally on a Linux server with
                the Instrumentation module enabled.<br>
                  The test does not fail anymore.<br>
                  In progress, a mach5 tiers 1-5 and runs and separate
                mach5 Kitchensink run.<br>
                <br>
                Thanks,<br>
                Serguei<br>
              </blockquote>
              <br>
            </blockquote>
            <br>
          </blockquote>
          <br>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic