[prev in list] [next in list] [prev in thread] [next in thread] 

List:       libreswan-dev
Subject:    [Swan-dev] pluto crashes after ikelifetime expires
From:       Oleg Rosowiecki <orosowiecki () gmail ! com>
Date:       2016-10-05 16:03:46
Message-ID: CAHq0LqUqPPkGyiHR-Jpnh+1Y8YuwDA3pdh0JAbamxdzReWcfgg () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


Dear All,

Libreswan 3.18 runs in my embedded Linux x86_64 environment (kernel version
3.14 with NETKEY). Libreswan was built with the default set of build
options. pluto is able to establish a tunnel to my remote peer just fine.
However, pluto crashes with SIGSEGV after the "ikelifetime" period expires.

I have to mention that my configuration uses XAUTH to assign an IP address
to my eth0 interface, so at the time when the tunnel is up eth0 has two
adresses. 10.3.111.136 is the XAUTH-assigned address, 76.60.87.181 is the
local address. And 10.253.254.117 is the remote peer:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen
1000
    link/ether 00:18:7d:96:11:7e brd ff:ff:ff:ff:ff:ff
    inet 76.60.87.181/24 brd 76.60.87.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.3.111.136/32 scope global eth0
       valid_lft forever preferred_lft forever

I was able to identify the immediate cause of the crash. While
EVENT_SA_REPLACE was being processed for state object #2 (see logs below),
a new state object #6 got created. As far as I understand, state #6 is the
intended replacement for state #2. During the creation of state #6 the
orient() function gets called for the related connection. orient() outputs
a message that is worrying me:

Oct  4 02:01:30: "my-vpn" #2: both sides of "my-vpn" are our interface eth0!

I don't quite understand the purpose of orient(); I believe it is used to
determine which end of the connection is local and which is the remote
peer. My impression is that orient() got confused by multiple IP addresses
assigned to eth0.

When orient() detects this condition, it decides to terminate the
connection (and clean up all related states), i.e. it calls
teminate_connection(). This, in turn, leads to state #2 being wiped out
(memset() with poison value 0xFB). When processing ultimately returns to
EVENT_SA_REPLACE portion in timer_event_cb(), there is no valid state #2
object anymore and this leads to SIGSEGV in the subsequent call to
delete_liveness_event() -- called from timer.c:736.

There was another worrying output along the way:

Oct  4 02:19:00: "my-vpn" #6: Cannot send packet - interface vanished!

But I believe it's caused by the eth0-related condition, too.

One final note: the same configuration used to work with Libreswan 3.12 in
my environment. I'm including the ipsec.conf file, the snippet of the log
with relevant output and the stack trace from gdb.
This has already taken me a considerable amount of time to investigate, so
any suggestions will be greatly appreciated.
Thanks,
Oleg Rosowiecki

--- CONFIG ---
config setup
    protostack=netkey
    interfaces=%none
    nat_traversal=yes
    virtual_private=%v4:10.0.0.0/8,%v4:192.168.0.0/16,%v4:172.16.0.0/12
    plutodebug=all
conn %default
    type=tunnel
    auto=add
    authby=secret
    ike=aes128-sha1;modp1536
    phase2alg=aes128-sha1;modp1536
    ikev2=never
    leftxauthclient=yes
    leftmodecfgclient=yes
    modecfgpull=yes
    dpddelay=30
    dpdtimeout=120
    dpdaction=restart
    pfs=no
    aggrmode=no
    salifetime=28800s
    rekey=no
    keyingtries=%forever
    ikelifetime=300s
# Add connections here
conn my-vpn
    left=76.60.87.181
    leftid=@my-vpn
    leftnexthop=76.60.87.1
    right=10.253.254.117
    rightid=10.253.254.117
--- LOG ---
Oct  3 23:50:38: | handling event EVENT_SA_REPLACE for child state #2
Oct  3 23:50:38: | processing connection "my-vpn"
Oct  3 23:50:38: | EVENT_SA_REPLACE{IF_USED} picked newest_ipsec_sa #2
... Snip ...
Oct  4 01:57:47: | creating state object #6 at 0x7f0c1203ceb0
Oct  4 01:57:47: | parent state #6: new > STATE_UNDEFINED(ignore)
Oct  4 02:01:30: "my-vpn" #2: both sides of "my-vpn" are our interface
eth0!  <------------ WHY IS THIS?
Oct  4 02:01:30: | processing connection "my-vpn"
Oct  4 02:01:30: "my-vpn" #2: terminating SAs using this connection
Oct  4 02:01:30: | connection not shared pkilling phase1 and phase2
Oct  4 02:01:30: | Deleting states for connection - not including other
IPsec SA's
Oct  4 02:01:30: | pass 0
Oct  4 02:01:30: | index 16 state #2
Oct  4 02:01:30: | processing connection "my-vpn"
Oct  4 02:01:30: "my-vpn" #2: deleting state (STATE_QUICK_I2)
Oct  4 02:01:30: | child state #2: STATE_QUICK_I2(authenticated-ipsec) >
delete
Oct  4 02:01:30: "my-vpn" #2: ESP traffic information: in=0B out=0B
XAUTHuser=oleg33
Oct  4 02:01:30: | no Phase 1 state for Delete
Oct  4 02:01:30: | state: #2 requesting to delete non existing event
Oct  4 02:01:30: | unhashing state object #2
Oct  4 02:01:30: | removing state 0x7f0c12042280 entry 0x7f0c120428f0 next
(nil) prev-next 0x7f0c10d05408 from list
Oct  4 02:01:30: | updated next entry is (nil)
Oct  4 02:01:30: | removing state 0x7f0c12042280 entry 0x7f0c12042908 next
(nil) prev-next 0x7f0c10d05538 from list
Oct  4 02:01:30: | updated next entry is (nil)
Oct  4 02:01:30: | command executing down-client
Oct  4 02:01:30: | get esp.93ac0344@10.253.254.117
Oct  4 02:01:30: | get esp.e2937957@76.60.87.181
Oct  4 02:01:30: | executing down-client:
... Snip ...
Oct  4 02:01:30: | in connection_discard for connection my-vpn
Oct  4 02:01:30: | child state #2: STATE_QUICK_I2(authenticated-ipsec) >
STATE_UNDEFINED(ignore)
... Snip ...
Oct  4 02:19:00: "my-vpn" #6: Cannot send packet - interface
vanished!   <--- THIS IS WORRYING, TOO!
... Snip ....
--- GDB backtrace ---
(gdb) bt
#0  0x00007f6e12c802e8 in memset (__len=2072, __ch=251,
__dest=0x7f6e13576470)
    at /usr/include/bits/string3.h:84
#1  delete_state (st=st@entry=0x7f6e13576470) at programs/pluto/state.c:975
#2  0x00007f6e12c80973 in foreach_state_by_connection_func_delete (c=c@entry
=0x7f6e1356f6b0,
    comparefunc=comparefunc@entry=0x7f6e12c7cd40 <same_phase1_sa>)
    at programs/pluto/state.c:1045
#3  0x00007f6e12c80a71 in delete_states_by_connection
(c=c@entry=0x7f6e1356f6b0,
relations=relations@entry=0)
    at programs/pluto/state.c:1140
#4  0x00007f6e12c78129 in terminate_a_connection (c=c@entry=0x7f6e1356f6b0,
arg=arg@entry=0x0)
    at programs/pluto/terminate.c:83
#5  0x00007f6e12c781e7 in terminate_connection (name=0x7f6e13568160
"my-vpn")
    at programs/pluto/terminate.c:106
#6  0x00007f6e12c76397 in orient (c=c@entry=0x7f6e1356f6b0)
    at programs/pluto/initiate.c:122
#7  0x00007f6e12c7f04f in set_state_ike_endpoints (st=st@entry=0x7f6e135710a0,
c=c@entry=0x7f6e1356f6b0)
    at programs/pluto/state.c:2292
#8  0x00007f6e12c87a90 in initialize_new_state (st=st@entry=0x7f6e135710a0,
c=c@entry=0x7f6e1356f6b0, policy=policy@entry=19884687433,
    try=try@entry=1, whack_sock=whack_sock@entry=-1,
importance=importance@entry=pcim_demand_crypto)
    at programs/pluto/ipsec_doi.c:497
#9  0x00007f6e12c8c39b in main_outI1 (whack_sock=-1, c=0x7f6e1356f6b0,
predecessor=0x0, policy=19884687433, try=1,
    importance=pcim_demand_crypto) at programs/pluto/ikev1_main.c:145
#10 0x00007f6e12c85aca in timer_event_cb (fd=<optimized out>,
event=<optimized out>, arg=0x7f6e135718c0)
    at programs/pluto/timer.c:732
#11 0x00007f6e116feb53 in event_process_active_single_queue () from
/lib64/libevent-2.0.so.5
#12 0x00007f6e116fedc5 in event_process_active () from
/lib64/libevent-2.0.so.5
#13 0x00007f6e116ff42d in event_base_loop () from /lib64/libevent-2.0.so.5
#14 0x00007f6e12c83592 in main_loop () at programs/pluto/server.c:628
#15 call_server () at programs/pluto/server.c:742
#16 0x00007f6e12c6cfa5 in main (argc=<optimized out>, argv=<optimized out>)
    at programs/pluto/plutomain.c:1642

[Attachment #5 (text/html)]

<div dir="ltr"><p>Dear All,</p><p>Libreswan 3.18 runs in my embedded Linux x86_64 \
environment (kernel version 3.14 with NETKEY). Libreswan was built with the default \
set of build options. pluto is able to establish a tunnel to my remote peer just \
fine. However, pluto crashes with SIGSEGV after the &quot;ikelifetime&quot; period \
expires.</p><p>I have to mention that my configuration uses XAUTH to assign an IP \
address to my eth0 interface, so at the time when the tunnel is up eth0 has two \
adresses. 10.3.111.136 is the XAUTH-assigned address, 76.60.87.181 is the local \
address. And 10.253.254.117 is the remote peer:</p><p>2: eth0: \
&lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc pfifo_fast qlen 1000<br>       \
link/ether 00:18:7d:96:11:7e brd ff:ff:ff:ff:ff:ff<br>       inet <a \
href="http://76.60.87.181/24">76.60.87.181/24</a> brd 76.60.87.255 scope global \
eth0<br>             valid_lft forever preferred_lft forever<br>       inet <a \
href="http://10.3.111.136/32">10.3.111.136/32</a> scope global eth0<br>             \
valid_lft forever preferred_lft forever</p><p>I was able to identify the immediate \
cause of the crash. While EVENT_SA_REPLACE was being processed for state object #2 \
(see logs below), a new state object #6 got created. As far as I understand, state #6 \
is the intended replacement for state #2. During the creation of state #6 the \
orient() function gets called for the related connection. orient() outputs a message \
that is worrying me:</p><p>Oct   4 02:01:30: &quot;my-vpn&quot; #2: both sides of \
&quot;my-vpn&quot; are our interface eth0!</p><p>I don&#39;t quite understand the \
purpose of orient(); I believe it is used to determine which end of the connection is \
local and which is the remote peer. My impression is that orient() got confused by \
multiple IP addresses assigned to eth0.</p><p>When orient() detects this condition, \
it decides to terminate the connection (and clean up all related states), i.e. it \
calls teminate_connection(). This, in turn, leads to state #2 being wiped out \
(memset() with poison value 0xFB). When processing ultimately returns to \
EVENT_SA_REPLACE portion in timer_event_cb(), there is no valid state #2 object \
anymore and this leads to SIGSEGV in the subsequent call to delete_liveness_event() \
-- called from timer.c:736.</p><p>There was another worrying output along the \
way:</p><p>Oct   4 02:19:00: &quot;my-vpn&quot; #6: Cannot send packet - interface \
vanished!</p><p>But I believe it&#39;s caused by the eth0-related condition, \
too.</p><p>One final note: the same configuration used to work with Libreswan 3.12 in \
my environment. I&#39;m including the ipsec.conf file, the snippet of the log with \
relevant output and the stack trace from gdb.</p><div>This has already taken me a \
considerable amount of time to investigate, so any suggestions will be greatly \
appreciated.<br>Thanks,<br>Oleg Rosowiecki</div><div><br></div><div>--- CONFIG \
---</div><div>config setup<br>       protostack=netkey<br>       interfaces=%none<br> \
nat_traversal=yes<br>       virtual_private=%v4:<a \
href="http://10.0.0.0/8,%v4:192.168.0.0/16,%v4:172.16.0.0/12">10.0.0.0/8,%v4:192.168.0.0/16,%v4:172.16.0.0/12</a><br> \
plutodebug=all</div><div>conn %default<br>       type=tunnel<br>       auto=add<br>   \
authby=secret<br>       ike=aes128-sha1;modp1536<br>       \
phase2alg=aes128-sha1;modp1536<br>       ikev2=never<br>       \
leftxauthclient=yes<br>       leftmodecfgclient=yes<br>       modecfgpull=yes<br>     \
dpddelay=30<br>       dpdtimeout=120<br>       dpdaction=restart<br>       pfs=no<br> \
aggrmode=no<br>       salifetime=28800s<br>       rekey=no<br>       \
keyingtries=%forever<br>       ikelifetime=300s</div><div># Add connections \
here</div><div>conn my-vpn<br>       left=76.60.87.181<br>       leftid=@my-vpn<br>   \
leftnexthop=76.60.87.1<br>       right=10.253.254.117<br>       \
rightid=10.253.254.117</div><div>--- LOG ---</div><div>Oct   3 23:50:38: | handling \
event EVENT_SA_REPLACE for child state #2<br>Oct   3 23:50:38: | processing \
connection &quot;my-vpn&quot;<br>Oct   3 23:50:38: | EVENT_SA_REPLACE{IF_USED} picked \
newest_ipsec_sa #2</div><div>... Snip ...</div><div>Oct   4 01:57:47: | creating \
state object #6 at 0x7f0c1203ceb0<br>Oct   4 01:57:47: | parent state #6: new &gt; \
STATE_UNDEFINED(ignore)<br>Oct   4 02:01:30: &quot;my-vpn&quot; #2: both sides of \
&quot;my-vpn&quot; are our interface eth0!    &lt;------------ WHY IS THIS?<br>Oct   \
4 02:01:30: | processing connection &quot;my-vpn&quot;<br>Oct   4 02:01:30: \
&quot;my-vpn&quot; #2: terminating SAs using this connection<br>Oct   4 02:01:30: | \
connection not shared pkilling phase1 and phase2<br>Oct   4 02:01:30: | Deleting \
states for connection - not including other IPsec SA&#39;s<br>Oct   4 02:01:30: | \
pass 0<br>Oct   4 02:01:30: | index 16 state #2<br>Oct   4 02:01:30: | processing \
connection &quot;my-vpn&quot;<br>Oct   4 02:01:30: &quot;my-vpn&quot; #2: deleting \
state (STATE_QUICK_I2)<br>Oct   4 02:01:30: | child state #2: \
STATE_QUICK_I2(authenticated-ipsec) &gt; delete<br>Oct   4 02:01:30: \
&quot;my-vpn&quot; #2: ESP traffic information: in=0B out=0B XAUTHuser=oleg33<br>Oct  \
4 02:01:30: | no Phase 1 state for Delete<br>Oct   4 02:01:30: | state: #2 requesting \
to delete non existing event<br>Oct   4 02:01:30: | unhashing state object #2<br>Oct  \
4 02:01:30: | removing state 0x7f0c12042280 entry 0x7f0c120428f0 next (nil) prev-next \
0x7f0c10d05408 from list<br>Oct   4 02:01:30: | updated next entry is (nil)<br>Oct   \
4 02:01:30: | removing state 0x7f0c12042280 entry 0x7f0c12042908 next (nil) prev-next \
0x7f0c10d05538 from list<br>Oct   4 02:01:30: | updated next entry is (nil)<br>Oct   \
4 02:01:30: | command executing down-client<br>Oct   4 02:01:30: | get <a \
href="mailto:esp.93ac0344@10.253.254.117">esp.93ac0344@10.253.254.117</a><br>Oct   4 \
02:01:30: | get <a href="mailto:esp.e2937957@76.60.87.181">esp.e2937957@76.60.87.181</a><br>Oct \
4 02:01:30: | executing down-client:</div><div>... Snip ...</div><div>Oct   4 \
02:01:30: | in connection_discard for connection my-vpn<br>Oct   4 02:01:30: | child \
state #2: STATE_QUICK_I2(authenticated-ipsec) &gt; \
STATE_UNDEFINED(ignore)</div><div>... Snip ...</div><div>Oct   4 02:19:00: \
&quot;my-vpn&quot; #6: Cannot send packet - interface vanished!      &lt;--- THIS IS \
WORRYING, TOO!</div><div>... Snip ....</div><div>--- GDB backtrace \
---</div><div>(gdb) bt<br>#0   0x00007f6e12c802e8 in memset (__len=2072, __ch=251, \
__dest=0x7f6e13576470)<br>       at /usr/include/bits/string3.h:84<br>#1   \
delete_state (st=st@entry=0x7f6e13576470) at programs/pluto/state.c:975<br>#2   \
0x00007f6e12c80973 in foreach_state_by_connection_func_delete \
(c=c@entry=0x7f6e1356f6b0,<br>       comparefunc=comparefunc@entry=0x7f6e12c7cd40 \
&lt;same_phase1_sa&gt;)<br>       at programs/pluto/state.c:1045<br>#3   \
0x00007f6e12c80a71 in delete_states_by_connection (c=c@entry=0x7f6e1356f6b0, \
relations=relations@entry=0)<br>       at programs/pluto/state.c:1140<br>#4   \
0x00007f6e12c78129 in terminate_a_connection (c=c@entry=0x7f6e1356f6b0, \
arg=arg@entry=0x0)<br>       at programs/pluto/terminate.c:83<br>#5   \
0x00007f6e12c781e7 in terminate_connection (name=0x7f6e13568160 \
&quot;my-vpn&quot;)<br>       at programs/pluto/terminate.c:106<br>#6   \
0x00007f6e12c76397 in orient (c=c@entry=0x7f6e1356f6b0)<br>       at \
programs/pluto/initiate.c:122<br>#7   0x00007f6e12c7f04f in set_state_ike_endpoints \
(st=st@entry=0x7f6e135710a0, c=c@entry=0x7f6e1356f6b0)<br>       at \
programs/pluto/state.c:2292<br>#8   0x00007f6e12c87a90 in initialize_new_state \
(st=st@entry=0x7f6e135710a0, c=c@entry=0x7f6e1356f6b0, \
policy=policy@entry=19884687433,<br>       try=try@entry=1, \
whack_sock=whack_sock@entry=-1, importance=importance@entry=pcim_demand_crypto)<br>   \
at programs/pluto/ipsec_doi.c:497<br>#9   0x00007f6e12c8c39b in main_outI1 \
(whack_sock=-1, c=0x7f6e1356f6b0, predecessor=0x0, policy=19884687433, try=1,<br>     \
importance=pcim_demand_crypto) at programs/pluto/ikev1_main.c:145<br>#10 \
0x00007f6e12c85aca in timer_event_cb (fd=&lt;optimized out&gt;, event=&lt;optimized \
out&gt;, arg=0x7f6e135718c0)<br>       at programs/pluto/timer.c:732<br>#11 \
0x00007f6e116feb53 in event_process_active_single_queue () from \
/lib64/libevent-2.0.so.5<br>#12 0x00007f6e116fedc5 in event_process_active () from \
/lib64/libevent-2.0.so.5<br>#13 0x00007f6e116ff42d in event_base_loop () from \
/lib64/libevent-2.0.so.5<br>#14 0x00007f6e12c83592 in main_loop () at \
programs/pluto/server.c:628<br>#15 call_server () at \
programs/pluto/server.c:742<br>#16 0x00007f6e12c6cfa5 in main (argc=&lt;optimized \
out&gt;, argv=&lt;optimized out&gt;)<br>       at \
programs/pluto/plutomain.c:1642<span><span><span></span></span></span></div></div>



_______________________________________________
Swan-dev mailing list
Swan-dev@lists.libreswan.org
https://lists.libreswan.org/mailman/listinfo/swan-dev


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic