[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR 6712222: 6712222: Race condition in java/lang/management/ThreadMXBean/AllThreadIds.java
From:       David Holmes <david.holmes () oracle ! com>
Date:       2015-03-10 11:09:20
Message-ID: 54FED0E0.2010004 () oracle ! com
[Download RAW message or body]

On 10/03/2015 6:10 PM, Jaroslav Bachorik wrote:
> On 10.3.2015 00:35, David Holmes wrote:
>> On 9/03/2015 11:26 PM, Jaroslav Bachorik wrote:
>>> On 9.3.2015 13:41, Jaroslav Bachorik wrote:
>>>> Hi David,
>>>>
>>>> On 9.3.2015 12:15, David Holmes wrote:
>>>>> On 9/03/2015 5:37 PM, Jaroslav Bachorik wrote:
>>>>>> On 8.3.2015 14:32, David Holmes wrote:
>>>>>>> Hi Jaroslav,
>>>>>>>
>>>>>>> A couple of quick comments ...
>>>>>>>
>>>>>>> On 7/03/2015 2:08 AM, Jaroslav Bachorik wrote:
>>>>>>>> Please, review the following test change
>>>>>>>>
>>>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-6712222
>>>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/6712222/webrev.00
>>>>>>>>
>>>>>>>> As the issue subject states this test is susceptible to race
>>>>>>>> condition.
>>>>>>>> Firstly, it uses an arbitrary time period to wait after a thread
>>>>>>>> has
>>>>>>>> signalled its exit vie the associated barrier to 'make sure' the
>>>>>>>> thread
>>>>>>>> is actually terminated when the check for the number of live
>>>>>>>> threads is
>>>>>>>> performed. This is quite race prone and should be replaces by
>>>>>>>> Thread.join() performed on the specified threads before moving
>>>>>>>> forward
>>>>>>>> and checking the reported thread counts.
>>>>>>>
>>>>>>> So now you do the join() why do you still also have an acquire()
>>>>>>> preceding the join(s) ?
>>>>>>
>>>>>> Touché - I guess I just didn't want to mess with the test too much. I
>>>>>> suppose joining the threads that are expected to die should suffice.
>>>>>
>>>>> Yes.
>>>
>>> Here is the simplified webrev -
>>> http://cr.openjdk.java.net/~jbachorik/6712222/webrev.01
>>>
>>> Now, when it is sufficient to synchronize only on the threads startup I
>>> can use Phaser instead of Semaphore and make the code a further bit more
>>> readable.
>>
>> CyclicBarrier is even simpler :)
>
> Yes, but the test threads don't need to wait for each other -
> Phaser.arrive() maps better here.

Sorry I meant CountDownLatch :) But never mind.

David


>>
>> Thanks for checking the "alive" status.
>>
>> The setLive/isLive synchronization is going to cause a bit of a
>> bottleneck, but that mainly means threads will be blocked on the monitor
>> instead of sleeping, I think.
>
> Exactly, the bottleneck is not a problem here.
>
>>
>> Minor style nits: for(int  -> for (int
>
> Will fix before push.
>
> Thanks,
>
> -JB-
>
>>
>> Reviewed.
>>
>> Thanks,
>> David
>>
>>> -JB-
>>>
>>>>>
>>>>>>>
>>>>>>> Also what exactly do those counters check? Even after join() can
>>>>>>> return
>>>>>>> the thread may still be in the Threads_list.
>>>>>>
>>>>>> The test is supposed to test whether ThreadMXBean reports correct
>>>>>> numbers for the total number of started/finished threads and the
>>>>>> number
>>>>>> of threads being currently alive.
>>>>>>
>>>>>> Thread#join() should return only when the thread is not alive any
>>>>>> more.
>>>>>> The ThreadMXBean#getAllThreadIds() should return the IDs of only the
>>>>>> threads being alive at the time the method was invoked. I didn't dig
>>>>>> into the implementation - but according to the information available
>>>>>> through javadocs this should work.
>>>>>
>>>>> It is the actual implementation I'm concerned about - to ensure the
>>>>> notions of "alive" actually match.
>>>>
>>>> Invoking ThreadMXBean#getAllThreadIds() will eventually mean creating a
>>>> ThreadsListEnumerator (in threadService.cpp#941) and this enumerator
>>>> will skip any threads where `Thread.isAlive() == false`. The
>>>> Thread.join() method is using Thread.isAlive() to assert the thread
>>>> state and exits only when `Thread.isAlive() == false`. Therefore, a
>>>> thread T can't be in the list of live threads generated after T.join()
>>>> has returned.
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> Secondly, it is using a volatile array of boolean flags which are
>>>>>>>> used
>>>>>>>> for communication between the main thread and the test threads
>>>>>>>> (setting
>>>>>>>> an item of this array true will indicate the appropriate test
>>>>>>>> thread to
>>>>>>>> exit). Declaring the array as volatile does nothing for the thread
>>>>>>>> safety and reads and writes to this array must be properly
>>>>>>>> synchronized.
>>>>>>>>
>>>>>>>> I also took the liberty of replacing the arbitrary Barrier
>>>>>>>> synchronization class with the standard Semaphore.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -JB-
>>>>>>
>>>>
>>>
>
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic