
List:       asterisk-dev
Subject:    Re: [asterisk-dev] Deadlock in chan_sip
From:       Nir Simionovich <nir.simionovich () gmail ! com>
Date:       2016-04-05 6:26:54
Message-ID: CAE+pvDpMwTiy3JMYO-0pOMAM_SgmaL4SKoiRRBX_c4xvpjgcPw () mail ! gmail ! com

Hey Ryan,

  Well, I'm not using Realtime here and my setup is much simpler, but the
overall deadlock pathology is very similar.
Currently, I've managed to mitigate the issue on my systems by doing the
following:

1. Migrating my inbound SIP channel from chan_sip to chan_pjsip
2. Forcing Asterisk to "transcode" between ulaw and alaw and back, so that
re-invites don't work on the problematic path.
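
As a rough illustration of item 2 (a sketch under assumptions, not the
actual configuration from these systems): in chan_sip, giving the two legs
of the call non-overlapping codec lists forces Asterisk to stay in the media
path and transcode. The peer names below are hypothetical; disallow and
allow are the standard sip.conf codec options.

```ini
; sip.conf on the transcoding server (peer names are hypothetical)
[leg-toward-asterisk-a]
type=peer
disallow=all        ; start from an empty codec list
allow=ulaw          ; this leg only negotiates ulaw

[leg-toward-carrier]
type=peer
disallow=all
allow=alaw          ; this leg only negotiates alaw, so media must be transcoded
```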

  I've tested with versions going as far back as 13.0 and 12.4 - all
manifested the same scenario.

  This is not version specific or setup specific; it is something a bit
lower level than it looks.

On Mon, Apr 4, 2016 at 7:11 PM, Ryan Rittgarn <rrittgarn@techpro.com> wrote:

> Nir, is your bug possibly related to:
> https://issues.asterisk.org/jira/browse/ASTERISK-25468
>
> I've been experiencing the bug referenced and have had to roll back to
> 13.4 as the issue seems to have been introduced going into 13.5
>
>
> -----Original Message-----
> From: asterisk-dev-bounces@lists.digium.com [mailto:
> asterisk-dev-bounces@lists.digium.com] On Behalf Of
> asterisk-dev-request@lists.digium.com
> Sent: Sunday, April 03, 2016 12:00 PM
> To: asterisk-dev@lists.digium.com
> Subject: asterisk-dev Digest, Vol 141, Issue 4
>
> Today's Topics:
>
>    1. Deadlock in chan_sip, caused by weird media re-invite from
>       remote side (Nir Simionovich)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 3 Apr 2016 12:03:19 +0300
> From: Nir Simionovich <nir.simionovich@gmail.com>
> To: Asterisk Developers Mailing List <asterisk-dev@lists.digium.com>
> Subject: [asterisk-dev] Deadlock in chan_sip, caused by weird media
>         re-invite from remote side
> Message-ID:
>         <CAE+pvDpZBzJ0b4aq3kyedGbcgzPXev2zqrdhrhqQya4StzEQow@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi All,
>
>   We have several systems, some running Asterisk 13 some 12. We have
> recently discovered a possible dead-lock scenario in chan_sip. The
> dead-lock seems to manifest as the below:
>
> LCR-AMS-01*CLI> core show locks
>
> =======================================================================
> === 12.8.2
> === Currently Held Locks
> =======================================================================
> ===
> === <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
> ===
> === Thread ID: 0x7f922f88b700 (do_monitor           started at [29073] chan_sip.c restart_monitor())
> === ---> Lock #0 (astobj2_container.c): MUTEX 333 internal_ao2_traverse self 0x36a9880 (1)
>         /usr/sbin/asterisk(__ast_bt_get_addresses+0x1d) [0x46556d]
>         /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xc9) [0x5317d5]
>         /usr/sbin/asterisk(__ao2_lock+0x96) [0x45a21f]
>         /usr/sbin/asterisk() [0x45b9b1]
>         /usr/sbin/asterisk(__ao2_callback+0x5f) [0x45bd83]
>         /usr/lib/asterisk/modules/chan_sip.so(+0x96a7d) [0x7f9249c2fa7d]
>         /usr/sbin/asterisk() [0x5edbd1]
>         /lib64/libpthread.so.0(+0x7a51) [0x7f92da37aa51]
>         /lib64/libc.so.6(clone+0x6d) [0x7f92dbf8d93d]
> === -------------------------------------------------------------------
> ===
> =======================================================================
>
>   Now, the funny bit is how it happens. This is the scenario:
>
> Soft Phone -> Asterisk A -> Asterisk B -> Carrier
>
>   Soft phone is behind a NAT. Asterisk servers are not, same as the
> carrier.
>
>   We've noticed that the carrier tries to run a media re-invite, after the
> call had basically dropped from Asterisk B, and tries to do it over and
> over again, without stopping. Eventually, that would dead-lock chan_sip
> completely, requiring a full blown asterisk restart.
>
>   Any of you ever encountered anything like this?
>
>   I've mitigated the issue by forcing two different codecs on the two
> sides of Asterisk B, basically, preventing the media re-invite - but it
> isn't the proper solution.
>
> _______________________________________________
> --Bandwidth and Colocation Provided by http://www.api-digital.com--
>
> asterisk-dev mailing list
> To UNSUBSCRIBE or update options visit:
>    http://lists.digium.com/mailman/listinfo/asterisk-dev




