[prev in list] [next in list] [prev in thread] [next in thread]
List: jcifs
Subject: [jcifs] Transport thread dead lock
From: Adam Morgan <adam.morgan () Q1Labs ! com>
Date: 2011-01-18 15:13:57
Message-ID: C09B411FAE194445AC54E519DBC431BB08497F () Q1EXCH07 ! Q1LABS ! INC
[Download RAW message or body]
Hi Mike
Ok so we had a customer site with roughly 80 separate transport threads operating at \
the same time. It appears there is an effective deadlock existing between four \
separate threads, due to the synchronization strategy around using the static byte \
array BUF and the SmbTransport object. I'll try to explain below using snippets from \
a stack dump we grabbed at the time.
As you can see below, Thread A is awaiting a lock on the static byte array \
(0x00002aaac7470140), Thread B is blocked on the socket read, and has a lock on the \
byte array, Thread C is waiting for a lock on an SmbTransport instance \
(0x00002aaac84b2e80), and Thread D has the lock on that instance of SmbTransport, and \
is awaiting a lock on the static byte array that Thread B has locked.
Now, we got a couple thread dumps a couple minutes apart, and the threads were still \
in this state. All this time they were receiving no data from the code using jcifs. \
I realize it isn't conclusively a deadlock based on this output (since we don't know \
for sure that the read() call in Thread B is not going to return) but given that the \
system seemed to remain in this state until a jvm restart, it appears to me that it \
IS a deadlock. Further, I patched their system with a change to make the byte array \
BUF to be non-static, and they've yet to hit the issue again (they WERE hitting it \
within a day or so, and they've now been up for 4 days with no issue).
So, to resolve this I'd recommend that the byte array be changed from static to \
non-static. The caveat with that is that each thread will then need 65K memory. \
Preferably, I'd implement a buffer pool that all threads can use to minimize the \
impact on memory requirements.
Thoughts/Comments?
Thread A
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary "pool-1-thread-415" prio=10 \
tid=0x0000000052f4b800 nid=0x1d95 waiting for monitor entry \
[0x00002aab27c7f000..0x00002aab27c80a10]
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary java.lang.Thread.State: BLOCKED (on \
object monitor)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - waiting to lock <0x00002aaac7470140> \
(a [B)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - locked <0x00002aaac853d3e8> (a \
jcifs.smb.SmbTransport)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.smb.SmbTransport.send(SmbTransport.java:655)
...
Thread B
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2530" daemon prio=10 \
tid=0x00002aab343fb800 nid=0x1f74 runnable [0x000000007466d000..0x000000007466db10]
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary java.lang.Thread.State: RUNNABLE
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
java.net.SocketInputStream.socketRead0(Native Method)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
java.net.SocketInputStream.read(SocketInputStream.java:129)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.readn(Transport.java:31)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - locked <0x00002aaac7470140> (a [B)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.loop(Transport.java:108)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - locked <0x00002aaac84ab020> (a \
jcifs.smb.SmbTransport)
...
Thread C
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport14-2547" daemon prio=10 \
tid=0x0000000050e9f800 nid=0x4588 waiting for monitor entry \
[0x00000000594bd000..0x00000000594bda90]
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary java.lang.Thread.State: BLOCKED (on \
object monitor)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.disconnect(Transport.java:204)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - waiting to lock <0x00002aaac84b2e80> \
(a jcifs.smb.SmbTransport)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.loop(Transport.java:124)
...
Thread D
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary "Folder Monitor \
[10.1.228.200][smb://10.1.228.200/E$/IISlogs/]" daemon prio=10 tid=0x0000000051cd1400 \
nid=0x4587 waiting for monitor entry [0x00002aab27f83000..0x00002aab27f83d10]
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary java.lang.Thread.State: BLOCKED (on \
object monitor)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - waiting to lock <0x00002aaac7470140> \
(a [B)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary - locked <0x00002aaac84b2e80> (a \
jcifs.smb.SmbTransport)
Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.smb.SmbTransport.send(SmbTransport.java:655)
...
[Attachment #3 (text/html)]
<html xmlns:o="urn:schemas-microsoft-com:office:office" \
xmlns:w="urn:schemas-microsoft-com:office:word" \
xmlns="http://www.w3.org/TR/REC-html40"> <head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 11 (filtered medium)">
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
pre
{margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:Arial;
color:windowtext;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="Section1">
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Hi Mike<o:p></o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Ok so we had a customer site with roughly 80 separate transport \
threads operating at the same time. It appears there is an effective deadlock \
existing between four separate threads, due to the synchronization strategy around \
using the static byte array BUF and the SmbTransport object. I’ll try to \
explain below using snippets from a stack dump we grabbed at the time. \
<o:p></o:p></span></font></p> <p class="MsoNormal"><font size="2" face="Arial"><span \
style="font-size:10.0pt; font-family:Arial"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">As you can see below, Thread A is awaiting a lock on the static \
byte array (0x00002aaac7470140), Thread B is blocked on the socket read, and has a \
lock on the byte array, Thread C is waiting for a lock on an SmbTransport instance \
(0x00002aaac84b2e80), and Thread D has the lock on that instance of SmbTransport, and \
is awaiting a lock on the static byte array that Thread B has \
locked.<o:p></o:p></span></font></p> <p class="MsoNormal"><font size="2" \
face="Arial"><span style="font-size:10.0pt; \
font-family:Arial"><o:p> </o:p></span></font></p> <p class="MsoNormal"><font \
size="2" face="Arial"><span style="font-size:10.0pt; font-family:Arial">Now, we got a \
couple thread dumps a couple minutes apart, and the threads were still in this \
state. All this time they were receiving no data from the code using \
jcifs. I realize it isn’t conclusively a deadlock based on this output \
(since we don’t know for sure that the read() call in Thread B is not going to \
return) but given that the system seemed to remain in this state until a jvm restart, \
it appears to me that it IS a deadlock. Further, I patched their system with a \
change to make the byte array BUF to be non-static, and they’ve yet to hit the \
issue again (they WERE hitting it within a day or so, and they’ve now been up \
for 4 days with no issue).<o:p></o:p></span></font></p> <p class="MsoNormal"><font \
size="2" face="Arial"><span style="font-size:10.0pt; \
font-family:Arial"><o:p> </o:p></span></font></p> <p class="MsoNormal"><font \
size="2" face="Arial"><span style="font-size:10.0pt; font-family:Arial">So, to \
resolve this I’d recommend that the byte array be changed from static to \
non-static. The caveat with that is that each thread will then need 65K \
memory. Preferably, I’d implement a buffer pool that all threads can use \
to minimize the impact on memory requirements. <o:p></o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Thoughts/Comments?<o:p></o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p> </o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Thread A<o:p></o:p></span></font></p>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
"pool-1-thread-415" prio=10 tid=0x0000000052f4b800 nid=0x1d95 waiting for \
monitor entry [0x00002aab27c7f000..0x00002aab27c80a10]<o:p></o:p></span></font></pre> \
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 \
s5kep-tcc-an02-apd1-primary java.lang.Thread.State: BLOCKED (on \
object monitor)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- waiting to lock <0x00002aaac7470140> (a \
[B)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan \
8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- locked <0x00002aaac853d3e8> (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.smb.SmbTransport.send(SmbTransport.java:655)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">...<o:p></o:p></span></font></pre> <pre><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"><o:p> </o:p></span></font></pre> <p \
class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt; \
font-family:Arial"></span>Thread B<o:p></o:p></font></p> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
"jcifs-transport19-2530" daemon prio=10 tid=0x00002aab343fb800 nid=0x1f74 \
runnable [0x000000007466d000..0x000000007466db10]<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 \
s5kep-tcc-an02-apd1-primary java.lang.Thread.State: \
RUNNABLE<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan \
8 13:37:26 s5kep-tcc-an02-apd1-primary at \
java.net.SocketInputStream.socketRead0(Native Method)<o:p></o:p></span></font></pre> \
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at java.net.SocketInputStream.read(SocketInputStream.java:129)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.util.transport.Transport.readn(Transport.java:31)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- locked <0x00002aaac7470140> (a \
[B)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan \
8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.loop(Transport.java:108)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- locked <0x00002aaac84ab020> (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"></span>...<o:p></o:p></font></pre> <p \
class="MsoNormal"><font size="3" face="Times New Roman"><span style="font-size: \
12.0pt"><o:p> </o:p></span></font></p> <p class="MsoNormal"><font size="3" \
face="Times New Roman"><span style="font-size: 12.0pt">Thread \
C<o:p></o:p></span></font></p> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan \
8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport14-2547" daemon \
prio=10 tid=0x0000000050e9f800 nid=0x4588 waiting for monitor entry \
[0x00000000594bd000..0x00000000594bda90]<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 \
s5kep-tcc-an02-apd1-primary java.lang.Thread.State: BLOCKED (on \
object monitor)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.util.transport.Transport.disconnect(Transport.java:204)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- waiting to lock <0x00002aaac84b2e80> (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.util.transport.Transport.loop(Transport.java:124)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"></span>...<o:p></o:p></font></pre> <p \
class="MsoNormal"><font size="3" face="Times New Roman"><span style="font-size: \
12.0pt"><o:p> </o:p></span></font></p> <p class="MsoNormal"><font size="3" \
face="Times New Roman"><span style="font-size: 12.0pt">Thread \
D<o:p></o:p></span></font></p> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan \
8 13:37:26 s5kep-tcc-an02-apd1-primary "Folder Monitor \
[10.1.228.200][smb://10.1.228.200/E$/IISlogs/]" daemon prio=10 \
tid=0x0000000051cd1400 nid=0x4587 waiting for monitor entry \
[0x00002aab27f83000..0x00002aab27f83d10]<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 \
s5kep-tcc-an02-apd1-primary java.lang.Thread.State: BLOCKED (on \
object monitor)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- waiting to lock <0x00002aaac7470140> (a \
[B)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan \
8 13:37:26 s5kep-tcc-an02-apd1-primary at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
- locked <0x00002aaac84b2e80> (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan 8 13:37:26 s5kep-tcc-an02-apd1-primary \
at jcifs.smb.SmbTransport.send(SmbTransport.java:655)<o:p></o:p></span></font></pre>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"></span>...<o:p></o:p></font></pre> <p \
class="MsoNormal"><font size="3" face="Times New Roman"><span style="font-size: \
12.0pt"><o:p> </o:p></span></font></p> <p class="MsoNormal"><font size="3" \
face="Times New Roman"><span style="font-size: \
12.0pt"><o:p> </o:p></span></font></p> </div>
</body>
</html>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic