List: linux-smp
Subject: IRQ affinity for network IRQs on x86-64 (and IA64) SMP platforms
From: John Lumby <johnlumby () hotmail ! com>
Date: 2008-09-18 2:53:14
Message-ID: BAY137-W42C60EAE538F55F45CD5D4A34F0 () phx ! gbl
I am interested in investigating how to distribute softirq work from a network \
NIC across multiple processor cores on an x86-64 machine - this particular one has \
two dual-core AMD Opteron 275 processors and two Broadcom gigabit NICs. More \
generally, where the number of cores is a multiple of the number of NICs, I'd like to \
be able to distribute the IRQs of each NIC over that multiple of cores.
The background is that I am running a network-intensive bidirectional workload on two \
of these machines, using a single bonded IP interface on each machine interconnected \
by a switch, each bond consisting of the two gigabit interfaces running full-duplex, \
with multiple sessions each establishing connections between these two IP endpoints; \
and I am seeing that:
. total network throughput of around 2660 Megabits/sec through each bond
(aggregated over send and receive)
is rather less than the network is capable of (nearer 3950 Megabits/sec, \
CPU power permitting)
. overall CPU utilization is only around 85%, so there is some to spare ...
. ... but /proc/stat shows that the CPU utilization is very uneven over the 4 \
cores, with all the softirq processing confined to two cores.
I believe that for this workload, the network throughput would increase to around \
3000 Megabits/sec if the softirq load could be spread evenly over all 4 cores.
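The per-core softirq split from /proc/stat can be read directly; a minimal sketch, assuming the standard /proc/stat layout (per-CPU softirq time is the 7th value after the cpu label):

```shell
# Print per-core softirq jiffies from /proc/stat.
# Per-CPU line layout: cpuN user nice system idle iowait irq softirq ...
awk '/^cpu[0-9]/ { print $1, $8 }' /proc/stat
```

Comparing these counts before and after a run shows which cores are absorbing the softirq work.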
I switched off the irqbalance daemon and then tried manually altering the \
/proc/irq/<irq>/smp_affinity files for the two IRQs (one per NIC), specifying \
two cores for each, e.g. 05 for irq 225 and 0a for irq 201.
At the time, the machine was running a 2.6.16 kernel. The result was - no \
distribution at all. That is, for each NIC, as reported in /proc/interrupts, all \
interrupts were being directed to a single core - which was the "first" (in the \
little-endian sense) of the bits set in my smp_affinity mask. The second bit was ignored.
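For reference, the smp_affinity value is a hex bitmask in which bit N selects CPU N; a sketch of how the masks were built and applied (the IRQ numbers are from my box and will differ elsewhere):

```shell
# Bit N of the mask selects CPU N (little-endian).
mask_a=$(printf '%x' $(( (1 << 0) | (1 << 2) )))   # 5 -> CPUs 0 and 2
mask_b=$(printf '%x' $(( (1 << 1) | (1 << 3) )))   # a -> CPUs 1 and 3
echo "$mask_a $mask_b"

# Then, as root (IRQ numbers vary per machine):
#   echo $mask_a > /proc/irq/225/smp_affinity
#   echo $mask_b > /proc/irq/201/smp_affinity
#   cat /proc/irq/225/smp_affinity          # read back to confirm
```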
I then came across /Documentation/ia64/IRQ-redir.txt, which documents this \
behaviour for ia64 (but I don't see anything saying this is also the case on \
x86-64). It says:
"Because of the usage of SAPIC mode and physical destination mode the IRQ target \
is one particular CPU and cannot be a mask of several CPUs. Only the first non-zero \
bit is taken into account."
Ok - so that is exactly what I saw (on 2.6.16).
Here is a clip of /proc/interrupts showing my two NICs after a run on 2.6.16
CPU0 CPU1 CPU2 CPU3
217: 2828591 551570 14406281 2734679 IO-APIC-level eth5
225: 18986626 0 2643626 14 IO-APIC-level eth3
(Note - I know the ratios are not all:0 - I had been experimenting with different \
masks - and I don't see any way of resetting the counters.)
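Since the counters can't be reset, one workaround is to snapshot the /proc/interrupts line before and after a run and compare the per-CPU counts; a rough sketch, assuming a 4-core box and the IRQ numbers from my setup:

```shell
# Print the four per-CPU counts for a given IRQ number.
snap() { awk -v irq="$1:" '$1 == irq { print $2, $3, $4, $5 }' /proc/interrupts; }

before=$(snap 25)
# ... run the workload here ...
after=$(snap 25)
echo "before: $before"
echo "after:  $after"
```

Subtracting the two snapshots gives the interrupt counts attributable to the run alone.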
I then upgraded the kernel to 2.6.26.5 and tried again, and now I see something \
different. With the same masks (05, 0a) I see that, for each NIC, IRQs are now \
distributed over the two cores I specify in the mask - but not evenly. The ratio \
is around 7:1. This is better than all:0 and raises the throughput from 2660 \
Mbits/sec to over 2810 Mbits/sec with no other changes.
Here is a clip of /proc/interrupts showing my two NICs after a run on 2.6.26
CPU0 CPU1 CPU2 CPU3
24: 144 1145810 0 612858 IO-APIC-fasteoi eth5
25: 83517 7 575415 849336 IO-APIC-fasteoi eth3
again, the ratios are from several runs with different masks, but the counts for CPUs \
0 and 2 for IRQ 25 are representative.
A couple of obvious changes from 2.6.16 -
IRQ numbers are smaller
IRQ method has changed from IO-APIC-level to IO-APIC-fasteoi
I see better CPU utilization over the 4 cores from /proc/stat - in particular, softirq \
work spread in that 7:1 ratio. So it seems that smp_affinity does partially work \
for a network device and several cores. I am happier, but left with a number \
of questions and hoping someone can answer:
1) As far as I can tell, SAPIC, aka IOSAPIC, is specific to Itanium, but in the \
literature I see something apparently similar called x2APIC on other Intel \
64-bit architectures. Does x2APIC have the same behaviour as regards IRQ \
balancing and smp_affinity? And does the AMD Opteron 275 also use \
x2APIC or an AMD equivalent?
2) Is it expected that something changed in this area between 2.6.16 and 2.6.26, \
and if so, what? (Maybe related to the external changes in the \
/proc/interrupts output I noted?)
3) Is it now possible, on this current kernel and with my hardware (or any \
gigabit NIC), to distribute softirq work approximately 50:50 over two cores? If so, how?
I can supply more information about the runs and config etc. if needed.
John
--
To unsubscribe from this list: send the line "unsubscribe linux-smp" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html