List:       ipfire-development
Subject:    VoIP connection tracking oddities
From:       Peter_Müller <peter.mueller () ipfire ! org>
Date:       2021-03-28 6:14:50
Message-ID: 54b81a5b-439e-e5c6-7df2-15a9c974de1d () ipfire ! org

Hello development folks,

Broken VoIP calls involving VoIP telephone equipment behind an IPFire machine
have been an ongoing nuisance for me for years now. While I cannot pinpoint
their first occurrence anymore, I recall them happening ever since we moved to
Linux 4.14.x. Since VoIP is the only technology requiring advanced connection
tracking I have in use, there might be more related bugs.

While VoIP calls to my ISP using SIP over UDP and RTP with opportunistic SRTP
support enabled worked in most (but not all) cases, using the same equipment to
make a phone call via an IPsec VPN between two IPFire machines failed in 30 to
50 percent of calls. The failure mode has always been the same: at least one
participant could not hear the other after picking up the phone. Sometimes,
both callers could not hear each other.

Initially, I blamed the netfilter ALGs we ship, as they were error-prone and
tampered with traffic they should not have tampered with (Arne mentioned the
SIP ALG interfered with IPsec traffic as well - for whatever reason it does).
Since ALGs do not work on encrypted traffic, switching to SIP over TLS and
mandatory SRTP should do the trick, I assumed.

It did not. After running Core Update 155 (where we disabled all ALGs), I
recently experienced a broken call again, with SIP over TLS and SRTP in place.

Since I am able to rule out a faulty configuration of the VoIP equipment with a
high level of confidence, this leaves me with the suspicion that there is a
more fundamental flaw in the Linux 4.14.x connection tracking, causing the
establishment of RTP streams to fail sometimes.

Worse, this is not reproducible at all - at least, none of my attempts to
provoke this failure accomplished anything. (For the sake of completeness, I
should mention that all needed firewall rules are present and no dropped
packets were logged. The IPS is not triggering, either; at least there are no
corresponding log messages in /var/log/suricata/fast.log.) Since the IPFire
machines involved handle between 1k and 5k connections at any time, increasing
the size of the connection tracking table by running

> sysctl net.netfilter.nf_conntrack_max=655360;

seemed useful to me. It did not, however, improve the reliability of VoIP call
establishment.
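
To judge whether the table size could have been a bottleneck in the first
place, the current fill level can be compared against the limit. The following
is only a minimal sketch, assuming the nf_conntrack module is loaded and
exposes nf_conntrack_count and nf_conntrack_max below /proc/sys/net/netfilter;
the function names are made up for illustration.

```python
from pathlib import Path

# Assumption: standard procfs location while nf_conntrack is loaded.
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(count: int, maximum: int) -> float:
    """Return the fill level of the connection tracking table in percent."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * count / maximum


def read_usage(proc_dir: Path = PROC_DIR) -> float:
    """Read the current entry count and the limit, return the fill level."""
    count = int((proc_dir / "nf_conntrack_count").read_text())
    maximum = int((proc_dir / "nf_conntrack_max").read_text())
    return conntrack_usage(count, maximum)


# Usage on the firewall itself (hypothetical invocation):
#   print(f"conntrack table usage: {read_usage():.2f}%")
```

With 5k connections and a limit of 655360, the table is filled to less than one
percent, which would be consistent with the larger table making no difference.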

All in all, this situation is quite unsatisfying. _Something_ in IPFire
sometimes messes with RTP streams, without doing so reproducibly, logging
anything, or being otherwise reasonably debuggable. After Core Update 155, we
can strike ALGs off the list of potential failure sources.
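
One place that might still yield a hint is the per-CPU connection tracking
statistics. The sketch below sums a few error counters from text in the format
of /proc/net/stat/nf_conntrack. It assumes that file consists of a header row
of column names followed by one row of hexadecimal values per CPU; the function
name and the selection of counters are illustrative, not an established
debugging procedure.

```python
from typing import Dict

# Counters that hint at packets or entries being rejected by conntrack.
# Assumption: these column names are present in the header row.
COUNTERS_OF_INTEREST = ("invalid", "insert_failed", "drop", "early_drop")


def sum_conntrack_errors(text: str) -> Dict[str, int]:
    """Sum selected per-CPU counters over all CPU rows."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return {}
    header, cpu_rows = rows[0], rows[1:]
    totals = {name: 0 for name in COUNTERS_OF_INTEREST if name in header}
    for row in cpu_rows:
        for name, value in zip(header, row):
            if name in totals:
                totals[name] += int(value, 16)  # values are hexadecimal
    return totals


# Usage on the firewall itself (hypothetical invocation):
#   with open("/proc/net/stat/nf_conntrack") as f:
#       print(sum_conntrack_errors(f.read()))
```

Comparing the totals before and after a failed call would at least show whether
connection tracking rejected anything during that time.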

I have no idea where - or even how - to look further.

Hopefully Linux 5.x will improve our connection tracking reliability. I am
pretty much out of ideas for Linux 4.14.x, though.

Thanks, and best regards,
Peter Müller

