
List:       haproxy
Subject:    Re: Feature Requests: march native and cwnd setting param
From:       Willy Tarreau <w () 1wt ! eu>
Date:       2010-11-28 11:33:43
Message-ID: 20101128113343.GB30402 () 1wt ! eu

Hi Hank,

On Sat, Nov 27, 2010 at 06:56:34AM -0800, Hank A. Paulson wrote:
> 1 - With recent CPUs Intel 5300/5400/5500/5600 and AMD 6100 the set of 
> optimal compiler settings for optimizations :) is not something anyone can 
> keep up with - not to mention different versions of gcc that understand 
> none, some or all of the features of these CPUs. march native allows gcc to 
> take on the burden of optimizing the compile time settings, so if that 
> could be added as one of the options in the makefile, it would be helpful 
> because then I could use the same "make..." line on every machine but it 
> would self-adjust for that machine.
(...)

That's a good idea, I have implemented it and even ported it to 1.4.
I have also added ARCH=32 and ARCH=64 to be used in combination with
CPU=native, so that you can select whether you explicitly want a 32
or 64-bit executable.

> 2 - Google has pushed via both tcp related RFCs and patches to the 
> networking code for the linux kernel to allow the initial cwnd to be set as 
> a socket option - this would be a huge help to sites that communicate with 
> the same clients over and over and/or with many small requests allowing a 
> full response in one (or at least fewer) round trips. For one site that I 
> work on that is over 250 ms away with a very reliable gateway on the other 
> end, I burn through several round trips to deliver an icon/small gif/etc - 
> an icon that could have all the necessary packets in flight before the 
> first ack. It turns out the small initial cwnd creates more traffic across 
> the under sea cables than an initial cwnd of 8 or 10 or 12.
> 
> http://www.amailbox.org/mailarchive/linux-netdev/2010/5/26/6278007

Indeed it can be nice in mobile environments for instance, where the
RTT is quite high. It does not seem too hard to add, I'm adding this
to the 1.5 TODO list.
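
To put rough numbers on the round-trip argument, here is a minimal sketch
in C. It assumes an idealized slow start: no loss, the window doubles every
RTT, and the receiver window never limits the sender. The function name
slow_start_rtts() and the 1460-byte MSS used in the note below are
illustrative assumptions, not haproxy or kernel code.

```c
#include <stddef.h>

/*
 * Number of round trips needed to push `bytes` of payload under an
 * idealized slow start (no loss, cwnd doubles each RTT, receiver
 * window never the bottleneck). Hypothetical helper for illustration.
 */
static unsigned slow_start_rtts(size_t bytes, unsigned init_cwnd, unsigned mss)
{
    size_t segments, cwnd;
    unsigned rtts = 0;

    if (init_cwnd == 0 || mss == 0)
        return 0;                           /* meaningless input */

    segments = (bytes + mss - 1) / mss;     /* round up to whole segments */
    cwnd = init_cwnd;

    while (segments > 0) {
        size_t sent = segments < cwnd ? segments : cwnd;

        segments -= sent;                   /* one window per round trip */
        cwnd *= 2;                          /* slow start doubling */
        rtts++;
    }
    return rtts;
}
```

Under these assumptions, a 12000-byte object with a 1460-byte MSS needs
2 round trips with an initial cwnd of 3 and 1 round trip with an initial
cwnd of 10. On a path with a high RTT, each saved round trip is directly
visible to the client.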

> I also wanted to see if you were aware of two other recent kernel changes 
> that could be helpful to haproxy performance, the first could be helpful 
> for the new UNIX socket connections in recent haproxy versions:
> 
> Implementation of recvmmsg:
> recvmmsg() is a new syscall that allows to receive with a single syscall 
> multiple messages that would require multiple calls to recvmsg(). For 
> high-bandwidth, small packet applications, throughput and latency are 
> improved greatly.

Unfortunately, this will have no effect here because recvmmsg()'s goal is
to receive multiple datagrams at once, but we're not working with datagrams
but with streams, and segments are already combined to return as many of
them as possible.
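
The stream versus datagram difference can be observed with a small test
program. This is only a sketch: it uses an AF_UNIX socketpair so that it
is self-contained, and the helper name bytes_from_one_recv() is made up
for the example. It sends two 5-byte messages, then does a single recv().

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send two 5-byte messages over a socketpair of the given type
 * (SOCK_STREAM or SOCK_DGRAM), then perform ONE recv() on the other
 * end and return how many bytes that single call delivered.
 * Returns -1 on error. Hypothetical helper for illustration.
 */
static ssize_t bytes_from_one_recv(int sock_type)
{
    int sv[2];
    char buf[64];
    ssize_t n;

    if (socketpair(AF_UNIX, sock_type, 0, sv) < 0)
        return -1;

    if (send(sv[0], "hello", 5, 0) != 5 ||
        send(sv[0], "world", 5, 0) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Both messages are already queued when we read. */
    n = recv(sv[1], buf, sizeof(buf), 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

On Linux, the SOCK_STREAM case should return both messages in one call,
while the SOCK_DGRAM case returns only the first one. The second
situation is the one recvmmsg() is designed to improve.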

A small improvement we can work on is to use accept4() instead of accept()
to save one setsockopt().

> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a2e2725541fad72416326798c2d7fa4dafb7d337
>  
> The second is "RPS" from google to improve network processing performance 
> with multiple CPUs - similar to MSI-X but google found that both together 
> had even more performance than just MSI-X:
> 
> http://kernelnewbies.org/Linux_2_6_35#head-94daf753b96280181e79a71ca4bb7f7a423e302a
> 
> http://lwn.net/Articles/362339/

Yes, I've followed that. There is nothing to do to make use of that,
you just need to upgrade your kernel :-)

Cheers,
Willy

