
List:       freebsd-hackers
Subject:    Re: ipfw pipe config bw tun0
From:       Luigi Rizzo <rizzo () iet ! unipi ! it>
Date:       2014-06-29 20:17:21
Message-ID: CA+hQ2+gtRK1tAGMiyr6KU0F8S=Sm5J4Djc-uTaVdinHAVp9xXw () mail ! gmail ! com

On Sun, Jun 29, 2014 at 6:29 PM, Adrian Chadd <adrian@freebsd.org> wrote:

> We can start adding that. How should it behave for multi-queue devices?
>

Long reply, sorry about that:

If I remember correctly, this feature was implemented assuming
that at most one packet was outstanding, so multiqueue
was not really an issue: any time you get a completion
interrupt from any queue, you push out the next packet
from the pipe.
The goal was to provide weighted fair queueing using
the actual NIC's bandwidth to clock packets out.

Remember, this was done in '99 and on hardware that
did not have queues or interrupt moderation.

These days, between deep NIC queues, interrupt moderation,
multiqueue and very high bandwidths, the assumption of one
outstanding packet is a bad one for performance.
You'd also have the option to tie a pipe to an individual queue or
to the entire NIC (the user API change to do this
is trivial, e.g. you can append a :queue_number to the
interface name as i did in netmap).
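
To make the ":queue_number" suffix concrete, here is a minimal
sketch of how such a name could be split. The function name
split_ifname and the exact accepted syntax are assumptions made
for illustration only; this is not the parser used by ipfw or
netmap.

```python
def split_ifname(spec):
    """Split 'name' or 'name:queue' into (name, queue_or_None).

    Hypothetical helper: a missing suffix means "the entire NIC".
    """
    name, sep, queue = spec.partition(":")
    if not name:
        raise ValueError("missing interface name in %r" % spec)
    if not sep:
        return name, None          # no suffix: pipe tied to the whole NIC
    if not queue.isdecimal():
        raise ValueError("bad queue number in %r" % spec)
    return name, int(queue)        # suffix: pipe tied to one queue
```

With this sketch, "ix0:3" would select queue 3 of ix0, while a
plain "tun0" would refer to the whole interface.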

That said:
1. if you don't mind the fact that
the interface has a deep queue, you could just push
packets from a pipe to an interface until if_transmit
returns an error (make sure the packet is not lost
by adding a reference to the mbuf or something),
and then any completion interrupt from any queue would
be used to 'clock' packets out.

2. if the NIC's queue bothers you (it might, because
its depth adds an equivalent error to the nice properties
of the scheduler), then the pipe could track
how many bytes are queued, stop after a given threshold,
and then, when a completion interrupt is received,
decrease the 'outstanding' counter by the actual
number of bytes sent. Essentially what ALTQ does,
but with the classification flexibility of ipfw/dummynet.
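
The byte-tracking idea in #2 can be sketched as a small
user-space model. This is only an illustration under stated
assumptions, not dummynet code: the class BytePacedPipe, its
threshold parameter and the tx_complete() hook are hypothetical
names, and a real implementation would hand mbufs to if_transmit
instead of appending lengths to a list.

```python
from collections import deque


class BytePacedPipe:
    """Toy model: limit the bytes outstanding in the NIC queue."""

    def __init__(self, threshold_bytes):
        self.threshold = threshold_bytes  # hypothetical tunable
        self.outstanding = 0              # bytes given to the NIC, not yet completed
        self.backlog = deque()            # packet lengths still held by the scheduler
        self.sent = []                    # packet lengths given to the NIC, in order

    def enqueue(self, length):
        """A packet of `length` bytes leaves the scheduler."""
        self.backlog.append(length)
        self._push()

    def tx_complete(self, length):
        """Completion from any queue: `length` bytes actually went out."""
        self.outstanding -= length
        self._push()

    def _push(self):
        # Keep feeding the NIC while we are below the threshold; the
        # last packet may overshoot it by up to one packet length.
        while self.backlog and self.outstanding < self.threshold:
            length = self.backlog.popleft()
            self.outstanding += length
            self.sent.append(length)
```

With a 3000-byte threshold and five 1500-byte packets, two packets
go to the NIC right away and the other three wait; each call to
tx_complete(1500) then releases one more.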

Surely, keeping only one outstanding packet is too
expensive and would kill throughput. But a modern
interface with 256..1024 buffers of 1.5K each can hold
up to 3..12 Mbits of queued data, which is way too much.
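
The 3..12 Mbit figure follows from the ring size times the buffer
size. A quick check, assuming 1500-byte buffers; the 1 Gbit/s link
rate used for the drain time is just an example value, not
something from the original discussion:

```python
BUFFER_BYTES = 1500  # "1.5K" per buffer, taken as 1500 bytes


def queued_bits(buffers, bytes_per_buffer=BUFFER_BYTES):
    """Bits sitting in a completely full transmit ring."""
    return buffers * bytes_per_buffer * 8


def drain_time_ms(buffers, link_bps, bytes_per_buffer=BUFFER_BYTES):
    """Milliseconds needed to drain a full ring at link_bps."""
    return queued_bits(buffers, bytes_per_buffer) * 1000 / link_bps


for buffers in (256, 1024):
    print("%4d buffers: %.2f Mbit queued, %.2f ms at 1 Gbit/s"
          % (buffers, queued_bits(buffers) / 1e6,
             drain_time_ms(buffers, 1e9)))
```

That delay sits on top of whatever the scheduler decided, and it
grows in proportion as the link gets slower.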

If we want to (re)implement this feature, we should
first introduce some way to control the outstanding
traffic on an interface -- can be done in dummynet
as #2 above, or within the NIC's driver if we
eventually build something like ethtool/bql.


cheers
luigi