List: asterisk-dev
Subject: Re: [asterisk-dev] Deadlock in chan_sip
From: Nir Simionovich <nir.simionovich () gmail ! com>
Date: 2016-04-05 6:26:54
Message-ID: CAE+pvDpMwTiy3JMYO-0pOMAM_SgmaL4SKoiRRBX_c4xvpjgcPw () mail ! gmail ! com
Hey Ryan,
Well, I'm not using Realtime here, and my setup is much simpler, but the
overall deadlock pathology is very similar.
Currently, I've managed to mitigate the issue on my systems by doing the
following:
1. Migrating my inbound SIP channel from chan_sip to chan_pjsip.
2. Forcing Asterisk to "transcode" between ulaw and alaw and back, so that
media re-invites cannot take place on the problematic path.
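
As a rough illustration of item 2, the fragment below is a minimal sketch of how two legs could be pinned to different codecs in sip.conf on the box in the middle. It is not the configuration actually used here: the peer names and host addresses are hypothetical placeholders, and only the standard chan_sip `disallow`/`allow` options are assumed.

```ini
; sip.conf sketch (hypothetical peer names and placeholder addresses)

[asterisk-a]          ; leg towards the upstream Asterisk (name is an assumption)
type=peer
host=192.0.2.10       ; placeholder documentation address
disallow=all
allow=ulaw            ; offer only ulaw on this leg

[carrier]             ; leg towards the carrier (name is an assumption)
type=peer
host=192.0.2.20       ; placeholder documentation address
disallow=all
allow=alaw            ; offer only alaw on this leg
```

With no codec common to both legs, Asterisk has to stay in the media path to transcode, which is the effect the workaround relies on.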
I've tested with versions going as far back as 13.0 and 12.4, and all of
them showed the same behavior.
This is not version-specific or setup-specific; it is something
lower-level than it looks.
On Mon, Apr 4, 2016 at 7:11 PM, Ryan Rittgarn <rrittgarn@techpro.com> wrote:
> Nir, is your bug possibly related to:
> https://issues.asterisk.org/jira/browse/ASTERISK-25468
>
> I've been experiencing the bug referenced and have had to roll back to
> 13.4 as the issue seems to have been introduced going into 13.5
>
>
> -----Original Message-----
> From: asterisk-dev-bounces@lists.digium.com [mailto:
> asterisk-dev-bounces@lists.digium.com] On Behalf Of
> asterisk-dev-request@lists.digium.com
> Sent: Sunday, April 03, 2016 12:00 PM
> To: asterisk-dev@lists.digium.com
> Subject: asterisk-dev Digest, Vol 141, Issue 4
>
> Today's Topics:
>
> 1. Deadlock in chan_sip, caused by weird media re-invite from
> remote side (Nir Simionovich)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 3 Apr 2016 12:03:19 +0300
> From: Nir Simionovich <nir.simionovich@gmail.com>
> To: Asterisk Developers Mailing List <asterisk-dev@lists.digium.com>
> Subject: [asterisk-dev] Deadlock in chan_sip, caused by weird media
> re-invite from remote side
> Message-ID: <CAE+pvDpZBzJ0b4aq3kyedGbcgzPXev2zqrdhrhqQya4StzEQow@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi All,
>
> We have several systems, some running Asterisk 13 some 12. We have
> recently discovered a possible dead-lock scenario in chan_sip. The
> dead-lock seems to manifest as the below:
>
> LCR-AMS-01*CLI> core show locks
>
> =======================================================================
> === 12.8.2
> === Currently Held Locks
> =======================================================================
> ===
> === <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
> ===
> === Thread ID: 0x7f922f88b700 (do_monitor started at [29073] chan_sip.c restart_monitor())
> === ---> Lock #0 (astobj2_container.c): MUTEX 333 internal_ao2_traverse self 0x36a9880 (1)
> /usr/sbin/asterisk(__ast_bt_get_addresses+0x1d) [0x46556d]
> /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xc9) [0x5317d5]
> /usr/sbin/asterisk(__ao2_lock+0x96) [0x45a21f]
> /usr/sbin/asterisk() [0x45b9b1]
> /usr/sbin/asterisk(__ao2_callback+0x5f) [0x45bd83]
> /usr/lib/asterisk/modules/chan_sip.so(+0x96a7d) [0x7f9249c2fa7d]
> /usr/sbin/asterisk() [0x5edbd1]
> /lib64/libpthread.so.0(+0x7a51) [0x7f92da37aa51]
> /lib64/libc.so.6(clone+0x6d) [0x7f92dbf8d93d]
> === -------------------------------------------------------------------
> ===
> =======================================================================
>
> Now, the funny bit is how it happens. This is the scenario:
>
> Soft Phone -> Asterisk A -> Asterisk B -> Carrier
>
> Soft phone is behind a NAT. Asterisk servers are not, same as the
> carrier.
>
> We've noticed that the carrier tries to run a media re-invite, after the
> call had basically dropped from Asterisk B, and tries to do it over and
> over again, without stopping. Eventually, that would dead-lock chan_sip
> completely, requiring a full blown asterisk restart.
>
> Any of you ever encountered anything like this?
>
> I've mitigated the issue by forcing two different codecs on the two
> sides of Asterisk B, basically, preventing the media re-invite - but it
> isn't the proper solution.
>
> ------------------------------
>
> _______________________________________________
> --Bandwidth and Colocation Provided by http://www.api-digital.com--
>
> asterisk-dev mailing list
> To UNSUBSCRIBE or update options visit:
> http://lists.digium.com/mailman/listinfo/asterisk-dev
>
> End of asterisk-dev Digest, Vol 141, Issue 4
> ********************************************