[prev in list] [next in list] [prev in thread] [next in thread] 

List:       jcifs
Subject:    [jcifs] Transport thread dead lock
From:       Adam Morgan <adam.morgan () Q1Labs ! com>
Date:       2011-01-18 15:13:57
Message-ID: C09B411FAE194445AC54E519DBC431BB08497F () Q1EXCH07 ! Q1LABS ! INC
[Download RAW message or body]

Hi Mike

Ok so we had a customer site with roughly 80 separate transport threads operating at \
the same time.  It appears there is an effective deadlock existing between four \
separate threads, due to the synchronization strategy around using the static byte \
array BUF and the SmbTransport object.  I'll try to explain below using snippets from \
a stack dump we grabbed at the time.

As you can see below, Thread A is awaiting a lock on the static byte array \
(0x00002aaac7470140), Thread B is blocked on the socket read, and has a lock on the \
byte array, Thread C is waiting for a lock on an SmbTransport instance \
(0x00002aaac84b2e80), and Thread D has the lock on that instance of SmbTransport, and \
is awaiting a lock on the static byte array that Thread B has locked.

Now, we got a couple thread dumps a couple minutes apart, and the threads were still \
in this state.  All this time they were receiving no data from the code using jcifs.  \
I realize it isn't conclusively a deadlock based on this output (since we don't know \
for sure that the read() call in Thread B is not going to return) but given that the \
system seemed to remain in this state until a jvm restart, it appears to me that it \
IS a deadlock.  Further, I patched their system with a change to make the byte array \
BUF to be non-static, and they've yet to hit the issue again (they WERE hitting it \
within a day or so, and they've now been up for 4 days with no issue).

So, to resolve this I'd recommend that the byte array be changed from static to \
non-static.  The caveat with that is that each thread will then need 65K memory.  \
Preferably, I'd implement a buffer pool that all threads can use to minimize the \
impact on memory requirements.

Thoughts/Comments?



Thread A

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "pool-1-thread-415" prio=10 \
tid=0x0000000052f4b800 nid=0x1d95 waiting for monitor entry \
[0x00002aab27c7f000..0x00002aab27c80a10]

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State: BLOCKED (on \
object monitor)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - waiting to lock <0x00002aaac7470140> \
(a [B)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - locked <0x00002aaac853d3e8> (a \
jcifs.smb.SmbTransport)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.smb.SmbTransport.send(SmbTransport.java:655)

...


Thread B

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport19-2530" daemon prio=10 \
tid=0x00002aab343fb800 nid=0x1f74 runnable [0x000000007466d000..0x000000007466db10]

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State: RUNNABLE

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
java.net.SocketInputStream.socketRead0(Native Method)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
java.net.SocketInputStream.read(SocketInputStream.java:129)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.util.transport.Transport.readn(Transport.java:31)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - locked <0x00002aaac7470140> (a [B)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.util.transport.Transport.loop(Transport.java:108)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - locked <0x00002aaac84ab020> (a \
jcifs.smb.SmbTransport)

...

Thread C

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "jcifs-transport14-2547" daemon prio=10 \
tid=0x0000000050e9f800 nid=0x4588 waiting for monitor entry \
[0x00000000594bd000..0x00000000594bda90]

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State: BLOCKED (on \
object monitor)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.util.transport.Transport.disconnect(Transport.java:204)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - waiting to lock <0x00002aaac84b2e80> \
(a jcifs.smb.SmbTransport)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.util.transport.Transport.loop(Transport.java:124)

...

Thread D

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary "Folder Monitor \
[10.1.228.200][smb://10.1.228.200/E$/IISlogs/]" daemon prio=10 tid=0x0000000051cd1400 \
nid=0x4587 waiting for monitor entry [0x00002aab27f83000..0x00002aab27f83d10]

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    java.lang.Thread.State: BLOCKED (on \
object monitor)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - waiting to lock <0x00002aaac7470140> \
(a [B)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    - locked <0x00002aaac84b2e80> (a \
jcifs.smb.SmbTransport)

Jan  8 13:37:26 s5kep-tcc-an02-apd1-primary    at \
jcifs.smb.SmbTransport.send(SmbTransport.java:655)

...


[Attachment #3 (text/html)]

<html xmlns:o="urn:schemas-microsoft-com:office:office" \
xmlns:w="urn:schemas-microsoft-com:office:word" \
xmlns="http://www.w3.org/TR/REC-html40"> <head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 11 (filtered medium)">
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman";}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;}
pre
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:10.0pt;
	font-family:"Courier New";}
span.EmailStyle17
	{mso-style-type:personal-compose;
	font-family:Arial;
	color:windowtext;}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
	{page:Section1;}
-->
</style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="Section1">
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Hi Mike<o:p></o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Ok so we had a customer site with roughly 80 separate transport \
threads operating at the same time.&nbsp; It appears there is an effective deadlock \
existing between  four separate threads, due to the synchronization strategy around \
using the static byte array BUF and the SmbTransport object.&nbsp; I&#8217;ll try to \
explain below using snippets from a stack dump we grabbed at the time.&nbsp; \
<o:p></o:p></span></font></p> <p class="MsoNormal"><font size="2" face="Arial"><span \
style="font-size:10.0pt; font-family:Arial"><o:p>&nbsp;</o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">As you can see below, Thread A is awaiting a lock on the static \
byte array (0x00002aaac7470140), Thread B is blocked on the socket read, and has a \
lock on the  byte array, Thread C is waiting for a lock on an SmbTransport instance \
(0x00002aaac84b2e80), and Thread D has the lock on that instance of SmbTransport, and \
is awaiting a lock on the static byte array that Thread B has \
locked.<o:p></o:p></span></font></p> <p class="MsoNormal"><font size="2" \
face="Arial"><span style="font-size:10.0pt; \
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p> <p class="MsoNormal"><font \
size="2" face="Arial"><span style="font-size:10.0pt; font-family:Arial">Now, we got a \
couple thread dumps a couple minutes apart, and the threads were still in this \
state.&nbsp; All this time they were receiving no data from the code  using \
jcifs.&nbsp; I realize it isn&#8217;t conclusively a deadlock based on this output \
(since we don&#8217;t know for sure that the read() call in Thread B is not going to \
return) but given that the system seemed to remain in this state until a jvm restart, \
it appears to  me that it IS a deadlock.&nbsp; Further, I patched their system with a \
change to make the byte array BUF to be non-static, and they&#8217;ve yet to hit the \
issue again (they WERE hitting it within a day or so, and they&#8217;ve now been up \
for 4 days with no issue).<o:p></o:p></span></font></p> <p class="MsoNormal"><font \
size="2" face="Arial"><span style="font-size:10.0pt; \
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p> <p class="MsoNormal"><font \
size="2" face="Arial"><span style="font-size:10.0pt; font-family:Arial">So, to \
resolve this I&#8217;d recommend that the byte array be changed from static to \
non-static.&nbsp; The caveat with that is that each thread will then need 65K \
memory.&nbsp;  Preferably, I&#8217;d implement a buffer pool that all threads can use \
to minimize the impact on memory requirements.&nbsp; <o:p></o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Thoughts/Comments?<o:p></o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial"><o:p>&nbsp;</o:p></span></font></p>
<p class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt;
font-family:Arial">Thread A<o:p></o:p></span></font></p>
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&quot;pool-1-thread-415&quot; prio=10 tid=0x0000000052f4b800 nid=0x1d95 waiting for \
monitor entry [0x00002aab27c7f000..0x00002aab27c80a10]<o:p></o:p></span></font></pre> \
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 \
s5kep-tcc-an02-apd1-primary&nbsp;&nbsp;&nbsp; java.lang.Thread.State: BLOCKED (on \
object monitor)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - waiting to lock &lt;0x00002aaac7470140&gt; (a \
[B)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan&nbsp; \
8 13:37:26 s5kep-tcc-an02-apd1-primary &nbsp;&nbsp; at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - locked &lt;0x00002aaac853d3e8&gt; (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.smb.SmbTransport.send(SmbTransport.java:655)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">...<o:p></o:p></span></font></pre> <pre><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"><o:p>&nbsp;</o:p></span></font></pre> <p \
class="MsoNormal"><font size="2" face="Arial"><span style="font-size:10.0pt; \
font-family:Arial"></span>Thread B<o:p></o:p></font></p> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&quot;jcifs-transport19-2530&quot; daemon prio=10 tid=0x00002aab343fb800 nid=0x1f74 \
runnable [0x000000007466d000..0x000000007466db10]<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 \
s5kep-tcc-an02-apd1-primary&nbsp;&nbsp;&nbsp; java.lang.Thread.State: \
RUNNABLE<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan&nbsp; \
8 13:37:26 s5kep-tcc-an02-apd1-primary &nbsp;&nbsp; at \
java.net.SocketInputStream.socketRead0(Native Method)<o:p></o:p></span></font></pre> \
<pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at java.net.SocketInputStream.read(SocketInputStream.java:129)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.util.transport.Transport.readn(Transport.java:31)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.smb.SmbTransport.doRecv(SmbTransport.java:499)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - locked &lt;0x00002aaac7470140&gt; (a \
[B)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan&nbsp; \
8 13:37:26 s5kep-tcc-an02-apd1-primary &nbsp;&nbsp; at \
jcifs.util.transport.Transport.loop(Transport.java:108)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - locked &lt;0x00002aaac84ab020&gt; (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"></span>...<o:p></o:p></font></pre> <p \
class="MsoNormal"><font size="3" face="Times New Roman"><span style="font-size: \
12.0pt"><o:p>&nbsp;</o:p></span></font></p> <p class="MsoNormal"><font size="3" \
face="Times New Roman"><span style="font-size: 12.0pt">Thread \
C<o:p></o:p></span></font></p> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan&nbsp; \
8 13:37:26 s5kep-tcc-an02-apd1-primary &quot;jcifs-transport14-2547&quot; daemon \
prio=10 tid=0x0000000050e9f800 nid=0x4588 waiting for monitor entry \
[0x00000000594bd000..0x00000000594bda90]<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 \
s5kep-tcc-an02-apd1-primary&nbsp;&nbsp;&nbsp; java.lang.Thread.State: BLOCKED (on \
object monitor)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.util.transport.Transport.disconnect(Transport.java:204)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - waiting to lock &lt;0x00002aaac84b2e80&gt; (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.util.transport.Transport.loop(Transport.java:124)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"></span>...<o:p></o:p></font></pre> <p \
class="MsoNormal"><font size="3" face="Times New Roman"><span style="font-size: \
12.0pt"><o:p>&nbsp;</o:p></span></font></p> <p class="MsoNormal"><font size="3" \
face="Times New Roman"><span style="font-size: 12.0pt">Thread \
D<o:p></o:p></span></font></p> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan&nbsp; \
8 13:37:26 s5kep-tcc-an02-apd1-primary &quot;Folder Monitor \
[10.1.228.200][smb://10.1.228.200/E$/IISlogs/]&quot; daemon prio=10 \
tid=0x0000000051cd1400 nid=0x4587 waiting for monitor entry \
[0x00002aab27f83000..0x00002aab27f83d10]<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 \
s5kep-tcc-an02-apd1-primary&nbsp;&nbsp;&nbsp; java.lang.Thread.State: BLOCKED (on \
object monitor)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font \
size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.smb.SmbTransport.doSend(SmbTransport.java:437)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - waiting to lock &lt;0x00002aaac7470140&gt; (a \
[B)<o:p></o:p></span></font></pre> <pre style="margin-left:.5in"><font size="2" \
color="black" face="Courier New"><span style="font-size:10.0pt;color:black">Jan&nbsp; \
8 13:37:26 s5kep-tcc-an02-apd1-primary &nbsp;&nbsp; at \
jcifs.util.transport.Transport.sendrecv(Transport.java:69)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; - locked &lt;0x00002aaac84b2e80&gt; (a \
jcifs.smb.SmbTransport)<o:p></o:p></span></font></pre> <pre \
style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black">Jan&nbsp; 8 13:37:26 s5kep-tcc-an02-apd1-primary \
&nbsp;&nbsp; at jcifs.smb.SmbTransport.send(SmbTransport.java:655)<o:p></o:p></span></font></pre>
 <pre style="margin-left:.5in"><font size="2" color="black" face="Courier New"><span \
style="font-size:10.0pt;color:black"></span>...<o:p></o:p></font></pre> <p \
class="MsoNormal"><font size="3" face="Times New Roman"><span style="font-size: \
12.0pt"><o:p>&nbsp;</o:p></span></font></p> <p class="MsoNormal"><font size="3" \
face="Times New Roman"><span style="font-size: \
12.0pt"><o:p>&nbsp;</o:p></span></font></p> </div>
</body>
</html>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic