List:       openjdk-serviceability-dev
Subject:    Re: RFR(S) 8021335: Missing synchronization when reading counters for live threads and peak thread count
From:       dean.long () oracle ! com
Date:       2018-10-24 21:18:00
Message-ID: 57b7cbe0-78ed-4a27-b1b0-3f733321ecee () oracle ! com

On 10/23/18 11:04 PM, Mandy Chung wrote:
>
>
> On 10/23/18 10:34 PM, David Holmes wrote:
>> On 24/10/2018 8:30 AM, dean.long@oracle.com wrote:
>>> On 10/23/18 2:51 PM, David Holmes wrote:
>>>> Hi Dean,
>>>>
>>>> On 24/10/2018 4:05 AM, dean.long@oracle.com wrote:
>>>>> On 10/23/18 9:46 AM, dean.long@oracle.com wrote:
>>>>>> On 10/22/18 3:31 PM, David Holmes wrote:
>>>>>>> Sorry Dean, I'm concerned about a thread termination bottleneck 
>>>>>>> with this. A simple microbenchmark that creates 500,000 threads 
>>>>>>> that have to run and terminate shows a 15+% slowdown on my 
>>>>>>> Linux box. I tried to find some kind of real benchmark that 
>>>>>>> covers thread termination but couldn't see anything specific.
>>>>>>>
>>>>>>> Can you at least run this through our performance system to see 
>>>>>>> if any of the regular benchmarks are affected?
>>>>>>>
>>>>>>
>>>>>> OK, but even if the regular benchmarks don't show a difference, 
>>>>>> I'd feel better if microbenchmarks were not affected. What if I 
>>>>>> go back to the original approach and add locking:
>>>>>>
>>>>>>   static jlong get_live_thread_count() {
>>>>>>     MutexLocker mu(Threads_lock);
>>>>>>     return _live_threads_count->get_value() - _exiting_threads_count;
>>>>>>   }
>>>>>>   static jlong get_daemon_thread_count() {
>>>>>>     MutexLocker mu(Threads_lock);
>>>>>>     return _daemon_threads_count->get_value() - _exiting_daemon_threads_count;
>>>>>>   }
>>>>>>
>>>>>> along with the other cleanups around is_daemon and is_hidden_thread?
>>>>>>
>>>>>
>>>>> Some micro-benchmarks like SecureRandomBench show a regression 
>>>>> with webrev.7.   I could go back to webrev.5 and then we shouldn't 
>>>>> need any locking in the get_*() functions.
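
To spell out what "no locking in the get_*() functions" means: the getters
would simply read plain atomic counters that the thread add/remove/exit paths
keep up to date. This is only a rough sketch of the idea, not the actual
webrev.5 code, and the _atomic_* names are just my shorthand:

    // maintained with Atomic::inc/dec on the thread add/remove/exit paths
    // (names illustrative)
    static volatile int _atomic_threads_count;
    static volatile int _atomic_daemon_threads_count;

    // readers no longer need Threads_lock
    static jlong get_live_thread_count()   { return _atomic_threads_count; }
    static jlong get_daemon_thread_count() { return _atomic_daemon_threads_count; }

That trades the Threads_lock acquisition in the getters for having to keep a
second set of counters in step with the perf counters.
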
>>>>
>>>> I don't see version 5 discussed but I took a look and it seems okay. 
>>>
>>> Mandy had questions about the asserts in .5, and it seemed like we 
>>> could just set the perf counters to the same value as the atomic 
>>> counters, which resulted in .6.   I think the only problem with .6 is 
>>> that I set the perf counters in decrement_thread_counts without the 
>>> lock.   If I "sync" the perf counters to the atomic counters only in 
>>> add_thread and remove_thread, with the lock, then it's about the 
>>> same as .5, but without the asserts and parallel inc/dec.   If anyone 
>>> likes the sound of that, I can send out a new webrev.   Or we can go 
>>> with webrev.5.
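
In other words: the atomic counters stay the single source of truth,
decrement_thread_counts only touches the atomics, and the perf counters are
re-synced to the atomic values while Threads_lock is held in
add_thread/remove_thread. A sketch of what I mean, illustrative only (the
_atomic_* names are my shorthand, and it assumes the perf counters accept a
set_value()-style update, as in .6):

    void ThreadService::add_thread(JavaThread* thread, bool daemon) {
      assert(Threads_lock->owned_by_self(), "must hold Threads_lock");
      Atomic::inc(&_atomic_threads_count);
      if (daemon) {
        Atomic::inc(&_atomic_daemon_threads_count);
      }
      // re-sync the perf counters to the atomics while the lock is held
      _live_threads_count->set_value(_atomic_threads_count);
      _daemon_threads_count->set_value(_atomic_daemon_threads_count);
    }

    void ThreadService::decrement_thread_counts(JavaThread* jt, bool daemon) {
      // thread-exit path, possibly without Threads_lock: only the atomics
      // are updated here; the perf counters catch up on the next add_thread
      // or remove_thread under the lock
      Atomic::dec(&_atomic_threads_count);
      if (daemon) {
        Atomic::dec(&_atomic_daemon_threads_count);
      }
    }

remove_thread would do the same re-sync as add_thread after its decrements.
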
>>
>> I'm not sure what the concern was with the asserts - if they misfire 
>> we'll know soon enough. So I'm okay with .5
>>
>>>> My only query with that version is that it appears the actual 
>>>> perfCounters no longer get read by anything - is that the case?
>>>>
>>>
>>> There does seem to be code that references them, through their name 
>>> string.   That's a different interface that I'm not familiar with, 
>>> so I didn't want to break it.
>>
>> Right - they can be accessed directly through other means. I was 
>> concerned that the perfCounter was out of sync with 
>> get_live_thread_count() but that's already the case so not an issue.
>>
>> If all tests and benchmarks are happy I say go with version .5
>>
>
> I have no objection to version .5 if most people prefer that.   My 
> comment was that I don't think the asserts are necessary.
>

OK, I'll rerun some performance benchmarks and push .5 if the results 
look OK.   David, can you send me your micro-benchmark?
Thanks for the reviews!

dl
> Mandy

