[prev in list] [next in list] [prev in thread] [next in thread]
List: freebsd-bugs
Subject: kern/57380: Connections are reset by the kernel without any
From: Jan Kneschke <jan () kneschke ! de>
Date: 2003-09-30 0:30:30
[Download RAW message or body]
> Number: 57380
> Category: kern
> Synopsis: Connections are reset by the kernel without any application \
> intervention.
> Confidential: no
> Severity: serious
> Priority: high
> Responsible: freebsd-bugs
> State: open
> Quarter:
> Keywords:
> Date-Required:
> Class: sw-bug
> Submitter-Id: current-users
> Arrival-Date: Mon Sep 29 17:40:14 PDT 2003
> Closed-Date:
> Last-Modified:
> Originator: Jan Kneschke
> Release: FreeBSD 5.1-RELEASE i386
> Organization:
kneschke.de
> Environment:
System: FreeBSD doubleheart.home.kneschke.de 5.1-RELEASE FreeBSD 5.1-RELEASE #0: Sun \
Sep 28 17:09:52 CEST 2003 \
jan@doubleheart.home.kneschke.de:/usr/src/sys/i386/compile/MYKERNEL i386
MYKERNEL is GENERIC + SMP enabled
FreeBSD-5.1-RELEASE-SMP
2 * Pentium Pro 200 MHz
192.168.2.38 (doubleheart)
(webserver [lighttpd at port 1025, thttpd at port 1027])
Linux 2.4.20
1 * AMD 2000+
192.168.2.10 (grisu)
(ab)
> Description:
the webservers use poll() as there fd-event-source and are designed as a
non-blocking, single process webserver.
'ab' is called on a remote machine the test the performance of the servers
under some load. grisu is used as the test machine. doubleheart is running
a thttpd 2.20c at port 1027 and a lighttpd
(http://jan.kneschke.de/projects/lighttpd/) at port 1025.
After handling about 4000 connections the webserver doesn't receive
any new POLLIN events at all for the next 10-15 seconds. Any
connection-attempt within the period is, as you can see in the tcpdump
output, accepted by the kernel and the received data is thrown away as
the kernel sends a RST.
After those 10-15 seconds the application gets a POLLIN event for the
server socket and the normal data-transfer takes place for the next 4000
requests.
This behaviour is reproducable with thttpd 2.20c and the current lighttpd.
> How-To-Repeat:
Calling 'ab' (apachebench) at grisu with the following paramters:
/usr/sbin/ab -n 10000 -c 10 http://192.168.2.38:1025/index.html
results in the following output:
...
Time taken for tests: 40.610 seconds
Complete requests: 10000
Failed requests: 5980
(Connect: 0, Length: 5980, Exceptions: 0)
...
This is reproducable and the number of failed requests is always
5980 +/- 50 requests. In other words: after 4000 requests tcpdump shows
the following output:
00:53:48.923029 192.168.2.10.39774 > 192.168.2.38.1025: S [tcp sum ok]
1013737315:1013737315(0) win 5840 <mss 1460,sackOK,timestamp 5208461
0,nop,wscale0> (DF) (ttl 64, id 7918, len 60)
0x0000 4500 003c 1eee 4000 4006 964d c0a8 020a E..<..@.@..M....
0x0010 c0a8 0226 9b5e 0401 3c6c 6763 0000 0000 ...&.^..<lgc....
0x0020 a002 16d0 eeaa 0000 0204 05b4 0402 080a ................
0x0030 004f 798d 0000 0000 0103 0300 .Oy.........
00:53:48.923330 192.168.2.38.1025 > 192.168.2.10.39774: S [tcp sum ok]
1803860672:1803860672(0) ack 1013737316 win 65535 <mss 1460,nop,wscale
1,nop,nop,timestamp 4459794 5208461> (DF) (ttl 64, id 6821, len 60)
0x0000 4500 003c 1aa5 4000 4006 9a96 c0a8 0226 E..<..@.@......&
0x0010 c0a8 020a 0401 9b5e 6b84 bac0 3c6c 6764 .......^k...<lgd
0x0020 a012 ffff d4ce 0000 0204 05b4 0103 0301 ................
0x0030 0101 080a 0044 0d12 004f 798d .....D...Oy.
00:53:48.924009 192.168.2.10.39774 > 192.168.2.38.1025: . [tcp sum ok] ack 1
win 5840 <nop,nop,timestamp 5208461 4459794> (DF)
(ttl 64, id 7919, len 52)
0x0000 4500 0034 1eef 4000 4006 9654 c0a8 020a E..4..@.@..T....
0x0010 c0a8 0226 9b5e 0401 3c6c 6764 6b84 bac1 ...&.^..<lgdk...
0x0020 8010 16d0 e9c3 0000 0101 080a 004f 798d .............Oy.
0x0030 0044 0d12
00:53:48.924150 192.168.2.10.39774 > 192.168.2.38.1025: P [tcp sum ok]
1:29(28) ack 1 win 5840 <nop,nop,timestamp 5208461 4459794> (DF)
(ttl 64, id 7920, len 80)
0x0000 4500 0050 1ef0 4000 4006 9637 c0a8 020a E..P..@.@..7....
0x0010 c0a8 0226 9b5e 0401 3c6c 6764 6b84 bac1 ...&.^..<lgdk...
0x0020 8018 16d0 27e4 0000 0101 080a 004f 798d ....'........Oy.
0x0030 0044 0d12 4745 5420 2f69 6e64 6578 2e68 .D..GET./index.h
0x0040 746d 6c20 4854 5450 2f31 2e30 0d0a 0d0a tml.HTTP/1.0....
00:53:48.924728 192.168.2.38.1025 > 192.168.2.10.39774: R [tcp sum ok]
1803860673:1803860673(0) win 0 (ttl 64, id 6831, len 40)
0x0000 4500 0028 1aaf 0000 4006 daa0 c0a8 0226 E..(....@......&
0x0010 c0a8 020a 0401 9b5e 6b84 bac1 0000 0000 .......^k.......
0x0020 5004 0000 64ba 0000
This is SYN, SYN+ACK, ACK, DATA, RST.
strace shows that no connection attempt has been reported to the
application which is poll()'ing the server socket.
/* the common loop without any block attempts */
accept(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, [0]) = 5
fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(5, "GET /index.html HTTP/1.0\r\n\r\n", 4095) = 28
stat("/home/jan/lighttpd-0.1.0/servers/", {st_mode=S_IFDIR|0755, st_size=512, ...}) \
= 0 fstat(6, {st_mode=S_IFREG|0644, st_size=4348, ...}) = 0
write(5, "HTTP/1.0 200 OK\r\nConnection: clo"..., 235) = 235
write(2, "1064875136: (network.c.210) 235 "..., 33) = 33
syscall_393(0x6, 0x5, 0, 0, 0x10fc, 0, 0xbfbff2c0, 0) = 0
close(5) = 0
/* no futher waiting connections */
accept(3, 0xbfbff700, [1852702730]) = -1 EAGAIN (Resource temporarily \
unavailable)
/* enter the main-loop */
gettimeofday({1769235301, 1663069807}, NULL) = 0
poll([{fd=3, events=POLLIN}], 1, 1000) = 0
gettimeofday({4294967295, 65537}, NULL) = 0
poll([{fd=3, events=POLLIN}], 1, 1000) = 0
gettimeofday({4294967295, 65537}, NULL) = 0
poll([{fd=3, events=POLLIN}], 1, 1000) = 0
...
(strace is broken for accept() and gettimeofday(), syscall_393() is sendfile())
> Fix:
no fix nor workaround is known yet.
Jan
--
Jan Kneschke http://jan.kneschke.de/
Perhaps you want to say 'thank you, jan': http://jk.123.org/wishlist/
> Release-Note:
> Audit-Trail:
> Unformatted:
_______________________________________________
freebsd-bugs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscribe@freebsd.org"
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic