[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR: 8227269: Slow class loading when running JVM in debug mode
From:       Roman Kennke <rkennke () redhat ! com>
Date:       2020-01-22 16:05:17
Message-ID: 13f27bc5-8f69-7b11-830c-fb08eaa80528 () redhat ! com
[Download RAW message or body]

[Attachment #2 (multipart/mixed)]


Hi Serguei,

Thanks for reviewing!

I updated the patch to reflect your suggestions, very good!
It also includes a fix to allow re-connecting an agent after disconnect,
namely move setup of the trackingEnv and deletedSignatureBag to
_activate() to ensure have those structures after re-connect.

http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.05/

Let me know what you think!
Roman

> Hi Roman,
> 
> Thank you for taking care about this scalability issue!
> 
> I have a couple of quick comments.
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c.frames.html
>  
> 72 /*
> 73 * Lock to protect deletedSignatureBag
> 74 */
> 75 static jrawMonitorID deletedSignatureLock; 76 77 /*
> 78 * A bag containing all the deleted classes' signatures. Must be
> accessed under
> 79 * deletedTagLock,
> 80  */
> 81 struct bag* deletedSignatureBag;
> 
> The comments contradict to each other.
> I guess, the lock name at line 79 has to be deletedSignatureLock
> instead of deletedTagLock.
> Also, comma at the end must be replaced with dot.
> 
> 
> 101 // Tag not found? Ignore.
> 102 if (klass == NULL) {
> 103 debugMonitorExit(deletedSignatureLock);
> 104 return;
> 105 }
> 106 
> 107 // Scan linked-list.
> 108 jlong found_tag = klass->klass_tag;
> 109 while (klass != NULL && found_tag != tag) {
> 110 klass_ptr = &klass->next;
> 111 klass = *klass_ptr;
> 112 found_tag = klass->klass_tag;
> 113     }
> 114
> 115 // Tag not found? Ignore.
> 116 if (found_tag != tag) {
> 117 debugMonitorExit(deletedSignatureLock);
> 118 return;
> 119     }
> 
> 
> The code above can be simplified, so that the lines 101-105 are not
> needed anymore.
> It can be something like this:
> 
> // Scan linked-list.
> while (klass != NULL && klass->klass_tag != tag) {
> klass_ptr = &klass->next;
> klass = *klass_ptr;
> }
> if (klass == NULL || klass->klass_tag != tag) { // klass not found - ignore.
> debugMonitorExit(deletedSignatureLock);
> return;
> }
> 
> It will take more time when I get a chance to look at the rest.
> 
> 
> Thanks,
> Serguei
> 
> 
> 
> 
> On 12/21/19 13:24, Roman Kennke wrote:
> > Here comes an update that resolves some races that happen when
> > disconnecting an agent. In particular, we need to take the lock on
> > basically every operation, and also need to check whether or not
> > class-tracking is active and return an appropriate result (e.g. an empty
> > list) when we're not.
> > 
> > Updated webrev:
> > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.04/
> > 
> > Thanks,
> > Roman
> > 
> > 
> > > So, here comes the O(1) implementation:
> > > 
> > > - Whenever a class is 'prepared', it is registered with a tag, and we
> > > set-up a listener to get notified when it is unloaded.
> > > - Prepared classes are kept in a datastructure that is a table, which
> > > each entry being the head of a linked-list of KlassNode*. The table is
> > > indexed by tag % slot-count, and then simply prepend the new KlassNode*.
> > > This is O(1) operation.
> > > - When we get notified of unloading a class, we look up the signature of
> > > the reported tag in that table, and remember it in a bag. The KlassNode*
> > > is then unlinked from the table and deallocated. This is ~O(1) operation
> > > too, depending on the depth of the table. In my testcase which hammered
> > > the code with class-loads and unloads, I usually see depths of like 2-3,
> > > but not usually more. It should be ok.
> > > - when processUnloads() gets called, we simply hand out that bag, and
> > > allocate a new one.
> > > - I also added cleanup-code in classTrack_reset() to avoid leaking the
> > > signatures and KlassNode* etc when debug agent gets detached and/or
> > > re-attached (was missing before).
> > > - I also added locks around data-structure-manipulation (was missing
> > > before).
> > > - Also, I only activate this whole process when an actual listener gets
> > > registered on EI_GC_FINISH. This seems to happen right when attaching a
> > > jdb, not sure why jdb does that though. This may be something to improve
> > > in the future?
> > > 
> > > In my tests, the performance of class-tracking itself looks really good.
> > > The bottleneck now is clearly actual synthesizing the class-unload
> > > events. I don't see how this can be helped when the debug agent asks for it?
> > > 
> > > Updated webrev:
> > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.03/
> > > 
> > > Please let me know what you think of it.
> > > 
> > > Thanks,
> > > Roman
> > > 
> > > 
> > > > Alright, the perfectionist in me got me. I am implementing the even more
> > > > efficient ~O(1) class tracking. Please hold off reviewing for now.
> > > > 
> > > > Thanks,Roman
> > > > 
> > > > Hi Chris,
> > > > > > I'll have a look at this, although it might not be for a few days. In
> > > > > > the meantime, maybe you can describe your new implementation in
> > > > > > classTrack.c so it's easier to look through the changes.
> > > > > Sure.
> > > > > 
> > > > > The purpose of this class-tracking is to be able to determine the
> > > > > signatures of unloaded classes when GC/class-unloading happened, so that
> > > > > we can generate the appropriate JDWP event.
> > > > > 
> > > > > The current implementation does so by maintaining a table of currently
> > > > > prepared classes by building that table when classTrack is initialized,
> > > > > and then add new classes whenever a class gets loaded. When unloading
> > > > > occurs, that cache is rebuilt into a new table, and compared with the
> > > > > old table, and whatever is in the old, but not in the new table gets
> > > > > returned. The problem is that when GCs happen frequently and/or many
> > > > > classes get loaded+unloaded, this amounts to O(classCount*gcCount)
> > > > > complexity.
> > > > > 
> > > > > The new implementation keeps a linked-list of prepared classes, and also
> > > > > tracks unloads via the listener cbTrackingObjectFree(). Whenever an
> > > > > unload/GC occurs, the list of prepared classes is scanned, and classes
> > > > > that are also in the deletedTagBag are unlinked (thus maintaining the
> > > > > prepared-classes-list) and its signature put in the list that gets \
> > > > > returned. 
> > > > > The implementation is not perfect. In order to determine whether or not
> > > > > a class is unloaded, it needs to scan the deletedTagBag. That process is
> > > > > therefore still O(unloadedClassCount). The assumption here is that
> > > > > unloadedClassCount << classCount. In my experiments this seems to be
> > > > > true, and also reasonable to expect.
> > > > > 
> > > > > (I have some ideas how to improve the implementation to ~O(1) but it
> > > > > would be considerably more complex: have to maintain a (hash)table that
> > > > > maps tags -> KlassNode*, unlink them directly upon unload, and build the
> > > > > unloaded-signatures list there, but I don't currently see that it's
> > > > > worth the effort).
> > > > > 
> > > > > In addition to all that, this process is only activated when there's an
> > > > > actual listener registered for EI_GC_FINISH.
> > > > > 
> > > > > Thanks,
> > > > > Roman
> > > > > 
> > > > > 
> > > > > > Chris
> > > > > > 
> > > > > > On 12/18/19 5:05 AM, Roman Kennke wrote:
> > > > > > > Hello all,
> > > > > > > 
> > > > > > > Issue:
> > > > > > > https://bugs.openjdk.java.net/browse/JDK-8227269
> > > > > > > 
> > > > > > > I am proposing what amounts to a rewrite of classTrack.c. It avoids
> > > > > > > throwing away the class cache on GC, and instead keeps track of
> > > > > > > loaded/unloaded classes one-by-one.
> > > > > > > 
> > > > > > > In addition to that, it avoids this whole dance until an agent
> > > > > > > registers interest in EI_GC_FINISH.
> > > > > > > 
> > > > > > > Webrev:
> > > > > > > http://cr.openjdk.java.net/~rkennke/JDK-8227269/webrev.01/
> > > > > > > 
> > > > > > > Testing: manual testing of provided test scenarios and timing.
> > > > > > > 
> > > > > > > Eg with the testcase provided here:
> > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1751985
> > > > > > > 
> > > > > > > I am getting those numbers:
> > > > > > > unpatched: no debug: 84s with debug: 225s
> > > > > > > patched:     no debug: 85s with debug: 95s
> > > > > > > 
> > > > > > > I also tested successfully through jdk/submit repo
> > > > > > > 
> > > > > > > Can I please get a review?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Roman
> > > > > > > 
> 


["signature.asc" (application/pgp-signature)]

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic