[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-serviceability-dev
Subject: Re: Fwd: Patch: Clean up fix for JDK-8047720
From: "Daniel D. Daugherty" <daniel.daugherty () oracle ! com>
Date: 2015-02-17 18:34:31
Message-ID: 54E389B7.1000909 () oracle ! com
[Download RAW message or body]
Carsten,
I believe I have all the changes that you recommended in your
better.patch file. Not 100% byte-for-byte, but I think we're
covered.
Dan
On 2/17/15 11:09 AM, Carsten Varming wrote:
> Dear Daniel,
>
> Great news.
>
> I was reading your comments on JDK-8072439 and just want to point out
> the other race that my patch fixed:
>
> WatcherThread::run sees !_should_terminate, and calls sleep.
> WatcherThread::stop takes PeriodicTask_lock, sets _should_terminate,
> notify PeriodicTask_lock, and release the lock. Sleep wait forever on
> PeriodicTask_lock. Notice that the wait forever happens only if there
> are no periodic tasks scheduled, so this might not be reproducible
> (even with suitable sleeps), yet it is a lurking bug.
>
> Let me know if there is anything I can do to help,
> Carsten
>
> On Tue, Feb 17, 2015 at 12:43 PM, Daniel D. Daugherty
> <daniel.daugherty@oracle.com <mailto:daniel.daugherty@oracle.com>> wrote:
>
> Carsten.
>
> I've started diving into this issue. Please see the updates
> that I've been making to:
>
> JDK-8072439 fix for 8047720 may need more work
> https://bugs.openjdk.java.net/browse/JDK-8072439
>
> Dan
>
>
> On 2/10/15 3:18 PM, Carsten Varming wrote:
>> Dear David,
>>
>> Any news regarding this fix? Anything I can do to help?
>>
>> Best wishes,
>> Carsten
>>
>> On Wed, Feb 4, 2015 at 12:54 PM, Carsten Varming
>> <varming@gmail.com <mailto:varming@gmail.com>> wrote:
>>
>> Dear David,
>>
>> I took a look at all occurrences of PeriodicTask_lock. We are
>> in luck:
>>
>> PeriodicTask::enroll and PeriodicTask::disenroll check for
>> safepoints when acquiring PeriodicTask_lock.
>>
>> PeriodicTask::time_to_wait and PeriodicTask::real_time_tick
>> are called only by the watcher thread.
>>
>> WatcherThread::unpark is used only in contexts where
>> PeriodicTask_lock is owned by the caller.
>>
>> WatcherThread::sleep is called only by the watcher thread.
>>
>> I took another look at WatcherThread::sleep and realized that
>> there is a path from acquiring PeriodicTask_lock to waiting
>> on the lock where _should_terminate is not checked. I added
>> that to the minimal fix attached.
>>
>> Looking at these methods made me want to clean up a little
>> more. See better.patch attached.
>>
>> I feel pretty good about the patch with the limited usage of
>> no_safepoint_check_flag with PeriodicTask_lock.
>>
>> Carsten
>>
>>
>> On Tue, Feb 3, 2015 at 11:28 PM, David Holmes
>> <david.holmes@oracle.com <mailto:david.holmes@oracle.com>> wrote:
>>
>> On 4/02/2015 1:28 PM, Carsten Varming wrote:
>>
>> Dear David Holmes,
>>
>> Please see inline response,
>>
>> On Tue, Feb 3, 2015 at 9:38 PM, David Holmes
>> <david.holmes@oracle.com <mailto:david.holmes@oracle.com>
>> <mailto:david.holmes@oracle.com
>> <mailto:david.holmes@oracle.com>>> wrote:
>>
>> On 4/02/2015 5:00 AM, Carsten Varming wrote:
>>
>> Greetings all,
>>
>> I was recently introduced to the deadlock
>> described in
>> JDK-8047720 and
>> fixed in JDK9. The fix seems a little messy
>> to me, and it looks
>> like it
>> left some very short races in the code. So I
>> decided to clean up the
>> code. See attached diff.
>>
>> The change takes a step back and acquires
>> PeriodicTask_lock in
>> WatcherThread::stop (like before the fix in
>> JDK9), but this time
>> safepoints are allowed to proceed when
>> acquiring PeriodicTask_lock,
>> preventing the deadlock.
>>
>>
>> It isn't obvious that blocking for a safepoint
>> whilst in this method
>> will always be a safe thing to do. That would
>> need to be examined in
>> detail - potential interactions can be very subtle.
>>
>>
>> Absolutely true. For reference, the pattern is
>> repeated with the
>> Terminator_lock a few lines below. The pattern is
>> also used in
>> Threads::destroy_vm before and after calling
>> before_exit, and the java
>> shutdown hooks are called right before the call to
>> before_exit. So there
>> is a lot of evidence that safepoints are allowed to
>> happen in this context.
>>
>>
>> The thread calling before_exit is a JavaThread so of
>> course it participates in safepoints. The issue is
>> whether the interaction between that thread and the
>> WatcherThread, via the PeriodicTask_lock, can allow for
>> the JavaThread blocking at a safepoint whilst owning that
>> lock. If another JavaThread can try to lock it without a
>> safepoint check then we can get a deadlock.
>>
>> Cheers,
>> David
>>
>>
>> Thank you for taking your time to look at this,
>> Carsten
>>
>>
>> David
>>
>>
>> Let me know what you think,
>> Carsten
>>
>>
>>
>>
>
>
[Attachment #3 (text/html)]
<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<tt>Carsten,<br>
<br>
I believe I have all the changes that you recommended in your<br>
better.patch file. Not 100% byte-for-byte, but I think we're<br>
covered.<br>
<br>
Dan<br>
<br>
<br>
</tt>
<div class="moz-cite-prefix">On 2/17/15 11:09 AM, Carsten Varming
wrote:<br>
</div>
<blockquote
cite="mid:CAP_pwnUe8Fd4yWjtJ-JeA70fynW_nNsks7Yqur4kP6JAuriRgg@mail.gmail.com"
type="cite">
<div dir="ltr">Dear Daniel,
<div><br>
</div>
<div>Great news.</div>
<div><br>
</div>
<div>I was reading your comments on JDK-8072439 and just want to
point out the other race that my patch fixed:</div>
<div><br>
</div>
<div>WatcherThread::run sees !_should_terminate, and calls
sleep. WatcherThread::stop takes PeriodicTask_lock, sets
_should_terminate, notify PeriodicTask_lock, and release the
lock. Sleep wait forever on PeriodicTask_lock. Notice that the
wait forever happens only if there are no periodic tasks
scheduled, so this might not be reproducible (even with
suitable sleeps), yet it is a lurking bug.</div>
<div><br>
</div>
<div>Let me know if there is anything I can do to help,</div>
<div>Carsten</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Feb 17, 2015 at 12:43 PM,
Daniel D. Daugherty <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:daniel.daugherty@oracle.com" \
target="_blank">daniel.daugherty@oracle.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> <tt>Carsten.<br>
<br>
I've started diving into this issue. Please see the
updates<br>
that I've been making to:<span class=""><br>
<br>
JDK-8072439 fix for 8047720 may need more work<br>
<a moz-do-not-send="true"
href="https://bugs.openjdk.java.net/browse/JDK-8072439"
target="_blank">https://bugs.openjdk.java.net/browse/JDK-8072439</a><br>
<br>
Dan<br>
<br>
<br>
</span></tt>
<div>
<div class="h5">
<div>On 2/10/15 3:18 PM, Carsten Varming wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Dear David,
<div><br>
</div>
<div>Any news regarding this fix? Anything I can
do to help?</div>
<div><br>
</div>
<div>Best wishes,</div>
<div>Carsten</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Feb 4, 2015 at
12:54 PM, Carsten Varming <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:varming@gmail.com"
target="_blank">varming@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="ltr">Dear David,
<div><br>
</div>
<div>I took a look at all occurrences of
PeriodicTask_lock. We are in luck:</div>
<div><br>
</div>
<div>PeriodicTask::enroll and
PeriodicTask::disenroll check for
safepoints when acquiring
PeriodicTask_lock.</div>
<div><br>
</div>
<div>PeriodicTask::time_to_wait
and PeriodicTask::real_time_tick are
called only by the watcher thread.<br>
</div>
<div><br>
</div>
<div>WatcherThread::unpark is used only in
contexts where PeriodicTask_lock is owned
by the caller.<br>
</div>
<div><br>
</div>
<div>WatcherThread::sleep is called only by
the watcher thread.</div>
<div><br>
</div>
<div>I took another look at
WatcherThread::sleep and realized that
there is a path from acquiring
PeriodicTask_lock to waiting on the lock
where _should_terminate is not checked. I
added that to the minimal fix attached.</div>
<div><br>
</div>
<div>Looking at these methods made me want
to clean up a little more. See
better.patch attached.</div>
<div><br>
</div>
<div>I feel pretty good about the patch with
the limited usage of
no_safepoint_check_flag with
PeriodicTask_lock.</div>
<span><font color="#888888">
<div><br>
</div>
<div>Carsten</div>
<div><br>
</div>
</font></span></div>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Feb 3,
2015 at 11:28 PM, David Holmes <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:david.holmes@oracle.com"
\
target="_blank">david.holmes@oracle.com</a>></span> wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex"><span>On
4/02/2015 1:28 PM, Carsten Varming
wrote:<br>
</span>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex"><span>
Dear David Holmes,<br>
<br>
Please see inline response,<br>
<br>
On Tue, Feb 3, 2015 at 9:38 PM,
David Holmes <<a
moz-do-not-send="true"
href="mailto:david.holmes@oracle.com"
\
target="_blank">david.holmes@oracle.com</a><br> </span><span> <mailto:<a
moz-do-not-send="true"
href="mailto:david.holmes@oracle.com"
\
target="_blank">david.holmes@oracle.com</a>>>
wrote:<br>
<br>
On 4/02/2015 5:00 AM,
Carsten Varming wrote:<br>
<br>
Greetings all,<br>
<br>
I was recently
introduced to the deadlock
described in<br>
JDK-8047720 and<br>
fixed in JDK9. The fix
seems a little messy to me, and
it looks<br>
like it<br>
left some very short
races in the code. So I decided
to clean up the<br>
code. See attached diff.<br>
<br>
The change takes a step
back and acquires
PeriodicTask_lock in<br>
WatcherThread::stop
(like before the fix in JDK9),
but this time<br>
safepoints are allowed
to proceed when acquiring
PeriodicTask_lock,<br>
preventing the deadlock.<br>
<br>
<br>
It isn't obvious that
blocking for a safepoint whilst
in this method<br>
will always be a safe thing
to do. That would need to be
examined in<br>
detail - potential
interactions can be very subtle.<br>
<br>
<br>
Absolutely true. For reference,
the pattern is repeated with the<br>
Terminator_lock a few lines
below. The pattern is also used
in<br>
Threads::destroy_vm before and
after calling before_exit, and
the java<br>
shutdown hooks are called right
before the call to before_exit.
So there<br>
is a lot of evidence that
safepoints are allowed to happen
in this context.<br>
</span></blockquote>
<br>
The thread calling before_exit is a
JavaThread so of course it
participates in safepoints. The
issue is whether the interaction
between that thread and the
WatcherThread, via the
PeriodicTask_lock, can allow for the
JavaThread blocking at a safepoint
whilst owning that lock. If another
JavaThread can try to lock it
without a safepoint check then we
can get a deadlock.<br>
<br>
Cheers,<br>
David
<div>
<div><br>
<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex"> Thank
you for taking your time to
look at this,<br>
Carsten<br>
<br>
<br>
David<br>
<br>
<br>
Let me know what you
think,<br>
Carsten<br>
<br>
<br>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic