[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state
From:       David Holmes <david.holmes () oracle ! com>
Date:       2019-11-15 2:32:04
Message-ID: aaa27416-e39a-81c7-9d85-979f5dacb0ac () oracle ! com
[Download RAW message or body]

Thanks again Serguei.

David

On 15/11/2019 12:14 pm, serguei.spitsyn@oracle.com wrote:
> Hi David,
> 
> Thank you for the update!
> It looks good to me.
> 
> You are right about my first suggestion.
> The lines need to stay where they are, or additional curly brackets
> are needed to force the ThreadBlockInVM destructor earlier.
> 
> Thanks,
> Serguei
> 
> 
> On 11/14/19 2:21 PM, David Holmes wrote:
> > Hi Serguei,
> > 
> > Thanks for taking a look.
> > 
> > On 15/11/2019 4:04 am, serguei.spitsyn@oracle.com wrote:
> > > Hi David,
> > > 
> > > It looks good to me.
> > > A couple of nits below.
> > > 
> > > http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html \
> > >  
> > > 
> > > 236 if (self->is_Java_thread()) {
> > > 237 JavaThread* jt = (JavaThread*) self;
> > > 238 // Transition to VM so we can check interrupt state
> > > 239 ThreadInVMfromNative tivm(jt);
> > > 240 if (jt->is_interrupted(true)) {
> > > 241 ret = M_INTERRUPTED;
> > > 242 } else {
> > > 243 ThreadBlockInVM tbivm(jt);
> > > 244 jt->set_suspend_equivalent();
> > > 245 if (millis <= 0) {
> > > 246 self->_ParkEvent->park();
> > > 247 } else {
> > > 248 self->_ParkEvent->park(millis);
> > > 249 }
> > > 250 }
> > > 251 // Return to VM before post-check of interrupt state
> > > 252 if (jt->is_interrupted(true)) {
> > > 253 ret = M_INTERRUPTED;
> > > 254 }
> > > 255 } else {
> > > 
> > > 
> > > It seems, the fragment at lines 251-254 needs to bebefore the line 250.
> > > It will add more clarity to this code.
> > 
> > No, it has to be after line 250 as that is when we will hit the TBIVM 
> > destructor and so return to _thread_in_vm which is the state needed to 
> > read the interrupted field. Dan commented on the above and I changed 
> > it slightly by moving the comment:
> > 
> > > 250   // Return to VM before post-check of interrupt state
> > > 251 }
> > > 252 if (jt->is_interrupted(true)) {
> > > 253   ret = M_INTERRUPTED;
> > > 254 }
> > 
> > 
> > > 412   if (self->is_Java_thread()) {
> > > 413 JavaThread* jt = (JavaThread*)self;
> > > 414 jt->set_suspend_equivalent();
> > > 415     for (;;) {
> > > 416       if (!jt->handle_special_suspend_equivalent_condition()) {
> > > 417         break;
> > > 418 } else {
> > > 419 // We've been suspended whilst waiting and so we have to
> > > 420 // relinquish the raw monitor until we are resumed. Of course
> > > 421 // after reacquiring we have to re-check for suspension again.
> > > 422 // Suspension requires we are _thread_blocked, and we also have to
> > > 423 // recheck for being interrupted.
> > > 424         simple_exit(jt);
> > > 425 {
> > > 426 ThreadInVMfromNative tivm(jt);
> > > 427 {
> > > 428 ThreadBlockInVM tbivm(jt);
> > > 429             jt->java_suspend_self();
> > > 430 }
> > > 431 if (jt->is_interrupted(true)) {
> > > 432 ret = M_INTERRUPTED;
> > > 433 }
> > > 434 }
> > > 435         simple_enter(jt);
> > > 436         jt->set_suspend_equivalent();
> > > 437       }
> > > ...
> > > 
> > > This code can be simplified a little bit.
> > > The line:
> > > 
> > > 414 jt->set_suspend_equivalent();
> > > 
> > > can be placed before line 416.
> > > Then this line can be removed:
> > > 
> > > 436         jt->set_suspend_equivalent();
> > 
> > Yes you're right. I was trying to preserve the original loop 
> > structure, but then had to add the additional set_suspend_equivalent 
> > for the first iteration. But I can instead just move the existing one 
> > to the top of the loop.
> > 
> > Webrev updated in place.
> > 
> > Thanks,
> > David
> > -----
> > 
> > > 
> > > Thanks,
> > > Serguei
> > > 
> > > 
> > > On 11/11/19 20:52, David Holmes wrote:
> > > > webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
> > > > bug: https://bugs.openjdk.java.net/browse/JDK-8233549
> > > > 
> > > > In JDK-8229516 I moved the interrupted state of a thread from the 
> > > > osThread in the VM to the java.lang.Thread instance. In doing that I 
> > > > overlooked a critical aspect, which is that to access the field of a 
> > > > Java object the JavaThread must not be in a safepoint-safe state** - 
> > > > otherwise the oop, and anything referenced there from could be 
> > > > relocated by the GC whilst the JavaThread is accessing it. This 
> > > > manifested in a number of tests using JVM TI Agent threads and JVM 
> > > > TI RawMonitors because the JavaThread's were marked _thread_blocked 
> > > > and hence safepoint-safe, and we read a non-zero value for the 
> > > > interrupted field even though we had never been interrupted.
> > > > 
> > > > This problem existed in all the code that checks for interruption 
> > > > when "waiting":
> > > > 
> > > > - Parker::park (the code underpinning 
> > > > java.util.concurrent.LockSupport.park())
> > > > 
> > > > To fix this code I simply deleted a late check of the interrupted 
> > > > field. The check was not needed because if an interrupt has occurred 
> > > > then we will find the ParkEvent in a signalled state.
> > > > 
> > > > - ObjectMonitor::wait
> > > > 
> > > > Here the late check of the interrupted state is essential as we 
> > > > reset the ParkEvent after an earlier check of the interrupted state. 
> > > > But the fix was simply achieved by moving the check slightly earlier 
> > > > before we use ThreadBlockInVm to become _thread_blocked.
> > > > 
> > > > - RawMonitor::wait
> > > > 
> > > > This fix was much more involved. The RawMonitor code directly 
> > > > transitions the JavaThread from _thread_in_Native to 
> > > > _thread_blocked. This is safe from a safepoint perspective because 
> > > > they are equivalent safepoint-safe states. To allow access to the 
> > > > interrupted field I have to transition from native to _thread_in_vm, 
> > > > and that has to be done by proper thread-state transitions to ensure 
> > > > correct access to the oop and its fields. Having done that I can 
> > > > then use ThreadBlockInVM for the transitions to blocked. However, as 
> > > > the old code noted it can't use proper thread-state transitions as 
> > > > this will lead to deadlocks with the VMThread that can also use 
> > > > RawMonitors when executing various event callbacks. To deal with 
> > > > that we have to note that the real constraint is that the JavaThread 
> > > > cannot block at a safepoint whilst it holds the RawMonitor. Hence 
> > > > the fix was push all the interrupt checking code and the 
> > > > thread-state transitions to the lowest level of RawMonitorWait, 
> > > > around the final park() call, after we have enqueued the waiter and 
> > > > released the monitor. That avoids any deadlock possibility.
> > > > 
> > > > I also added checks to is_interrupted/interrupted to ensure they are 
> > > > only called by a thread in a suitable state. This should only be the 
> > > > VMThread (as a consequence of the Thread.stop implementation 
> > > > occurring at a safepoint and issuing a JavaThread::interrupt() call 
> > > > to unblock the target); or a JavaThread that is not 
> > > > _thread_in_native or _thread_blocked.
> > > > 
> > > > Testing: (still finalizing)
> > > > - tiers 1 - 6 (Oracle platforms)
> > > > - Local Linux testing
> > > > - vmTestbase/nsk/monitoring/
> > > > - vmTestbase/nsk/jdwp
> > > > - vmTestbase/nsk/jdb/
> > > > - vmTestbase/nsk/jdi/
> > > > - vmTestbase/nsk/jvmti/
> > > > - serviceability/jvmti/
> > > > - serviceability/jdwp
> > > > - JDK: java/lang/management
> > > > com/sun/management
> > > > 
> > > > ** Note that this applies to all accesses we make via code in 
> > > > javaClasses.*. For this particular code I thought about adding a 
> > > > guard in JavaThread::threadObj() but it turns out when we generate a 
> > > > crash report we access the Thread's name() field and that can happen 
> > > > when in any state, so we'd always trigger a secondary assertion 
> > > > failure during error reporting if we did that. Note that accessing 
> > > > name() can still easily lead to secondary assertions failures as I 
> > > > discovered when trying to debug this and print the thread name out - 
> > > > I would see an is_instance assertion fail checking that the Thread 
> > > > name() is an instance of java.lang.String!
> > > > 
> > > > Thanks,
> > > > David
> > > > -----
> > > 
> 


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic