[prev in list] [next in list] [prev in thread] [next in thread] 

List:       freebsd-hackers
Subject:    Re: Permit init(8) use its own cpuset group.
From:       "Alexander V. Chernikov" <melifaro () FreeBSD ! org>
Date:       2014-06-12 9:23:27
Message-ID: 5399718F.9090700 () FreeBSD ! org
[Download RAW message or body]

On 12.06.2014 07:57, Julian Elischer wrote:
> On 6/12/14, 12:52 AM, Alexander V. Chernikov wrote:
>> On 09.06.2014 21:43, John Baldwin wrote:
>>> On Friday, June 06, 2014 6:09:13 pm Alexander V. Chernikov wrote:
>>>> On 06.06.2014 01:04, Alexander V. Chernikov wrote:
>>>>> On 05.06.2014 23:59, John Baldwin wrote:
>>>>>> On Thursday, June 05, 2014 3:46:15 pm Alexander V. Chernikov wrote:
>>>>>>> On 05.06.2014 18:09, John Baldwin wrote:
>>>>>>>> On Wednesday, June 04, 2014 3:16:59 pm Alexander V. Chernikov 
>>>>>>>> wrote:
>>>>>>>>> On 04.06.2014 19:06, John Baldwin wrote:
>>>>>>>>>> On Monday, June 02, 2014 12:48:50 pm Konstantin Belousov wrote:
>>>>>>>>>>> On Mon, Jun 02, 2014 at 06:52:10PM +0400, Alexander V. 
>>>>>>>>>>> Chernikov wrote:
>>>>>>>>>>>> Hello list!
>>>>>>>>>>>>
>>>>>>>>>>>> Currently init(8) uses group 1 which is root group.
>>>>>>>>>>>> Modifications of this group affects both kernel and 
>>>>>>>>>>>> userland threads.
>>>>>>>>>>>> Additionally, such modifications are impossible, for 
>>>>>>>>>>>> example, in
>>>>>>>> presence
>>>>>>>>>>>> of multi-queue NIC drivers (like igb or ixgbe) which binds 
>>>>>>>>>>>> their
>>>>>> threads
>>>>>>>>>> to
>>>>>>>>>>>> particular cpus.
>>>>>>>>>>>>
>>>>>>>>>>>> Proposed change ("init_cpuset" loader tunable) permits 
>>>>>>>>>>>> changing cpu
>>>>>>>>>>>> masks for
>>>>>>>>>>>> userland more easily. Restricting user processes to migrate 
>>>>>>>>>>>> to/from
>>>>>> CPU
>>>>>>>>>>>> cores
>>>>>>>>>>>> used for network traffic processing is one of the cases.
>>>>>>>>>>>>
>>>>>>>>>>>> Phabricator: https://phabric.freebsd.org/D141 (the same 
>>>>>>>>>>>> version
>>>>>> attached
>>>>>>>>>>>> inline)
>>>>>>>>>>>>
>>>>>>>>>>>> If there are no objections, I'll commit this next week.
>>>>>>>>>>> Why is the tunable needed ?
>>>>>>>>>> Because some people already depend on doing 'cpuset -l 0 -s 
>>>>>>>>>> 1'.  It is
>>>>>>>> also
>>>>>>>>>> documented in our manpages that processes start in cpuset 1 
>>>>>>>>>> by default
>>>>>> so
>>>>>>>>>> that you can use 'cpuset -l 0 -s 1' to move all processes, etc.
>>>>>>>>>>
>>>>>>>>>> For the stated problem (bound ithreads in drivers), I would 
>>>>>>>>>> actually
>>>>>> like
>>>>>>>> to
>>>>>>>>>> fix ithreads that are bound to a specific CPU to create a 
>>>>>>>>>> different
>>>>>> cpuset
>>>>>>>>>> instead so they don't conflict with set 1.
>>>>>>>>> Yes, this seem to be much better approach.
>>>>>>>>> Please take a look on the new patch (here or in the phabricator).
>>>>>>>>> Comments:
>>>>>>>>>
>>>>>>>>> Use different approach for modifyable root set:
>>>>>>>>>
>>>>>>>>>    * Make sets in 0..15 internal
>>>>>>>>>    * Define CPUSET_SET1 & CPUSET_ITHREAD in that range
>>>>>>>>>    * Add cpuset_lookup_builtin() to retrieve such cpu sets by id
>>>>>>>>>    * Create additional root set for ithreads
>>>>>>>>>    * Use this set in ithread_create()
>>>>>>>>>    * Change intr_setaffinity() to use cpuset_iroot (do we 
>>>>>>>>> really need
>>>>>> this)?
>>>>>>>>> We can probably do the same for kprocs, but I'm unsure if we 
>>>>>>>>> really need
>>>>>> it.
>>>>>>>> I imagined something a bit simpler.  Just create a new set in
>>>>>> intr_event_bind
>>>>>>>> and bind the ithread to the new set.  No need to have more 
>>>>>>>> magic set ids,
>>>>>> etc.
>>>>>>> Well, we also have userland which can modify given changes via 
>>>>>>> `cpuset
>>>>>>> -x', so we need to be able to add some more logic on set
>>>>>>> allocation/keeping. Additionally, we can try to do the same via 
>>>>>>> `cpuset
>>>>>>> -t', so introducing something like cpuset_setIthread() and 
>>>>>>> hooking into
>>>>>>> intr_event_bind() won't probably be enough. At least I can't 
>>>>>>> think out a
>>>>>>> quick and easy way to do this.
>>>>>> cpuset -x calls intr_event_bind().  If you just do it there you 
>>>>>> fix both
>>>>>> places.
>>>> Yup. I was wrong. This version seems to be simpler while doing the 
>>>> right
>>>> thing.
>>> Hmm, I do think this patch is doing the right thing, but I'm worried 
>>> you
>>> might have weird effects, e.g. I worry that because the 'intr' proc 
>>> is in
>>> set 1 still, the thread masks will be (incorrectly) checked when 
>>> changing
>> Well, as far as I understand we don't have any cpu sets bound to 
>> program ("pid" search returns set of first thread).
>> Anyway, each original ithread mask is released via cpuset_rel() so 
>> I'm a bit unsure about remaining thread masks.
>> Playing with cpuset and ixgbe driver, does not reveal given problem:
>>
>> 12:59 [0] test15# cpuset -g -s 1
>> cpuset 1 mask: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 
>> 16, 17, 18, 19, 20, 21, 22, 23
>> 13:00 [0] test15# kldload ixgbe
>> 13:00 [0] test15# vmstat -i | grep ix
>> irq263: ix0:que 0                    116          0
>> irq264: ix0:que 1                    108          0
>> irq265: ix0:que 2                    108          0
>> irq266: ix0:que 3                    108          0
>> irq267: ix0:que 4                    108          0
>> irq268: ix0:que 5                    108          0
>> irq269: ix0:que 6                    108          0
>> irq270: ix0:que 7                    108          0
>> ...
>> 13:00 [0] test15# cpuset -g -x 263
>> irq 263 mask: 0
>> 13:00 [0] test15# cpuset -g -x 264
>> irq 264 mask: 1
>> 13:00 [0] test15# cpuset -l 21,22,23 -s 1
>> 13:01 [0] test15# cpuset -g -s 1
>> cpuset 1 mask: 21, 22, 23
>> 13:01 [0] test15# cpuset -g -x 263
>> irq 263 mask: 0
>> 13:01 [0] test15# cpuset -g -x 264
>> irq 264 mask: 1
>> 13:01 [0] test15# cpuset -l 12 -x 264
>> 13:01 [0] test15# cpuset -g -x 264
>> irq 264 mask: 12
>> 13:01 [0] test15# cpuset -l 22,23 -s 1
>> 13:01 [0] test15# echo $?
>> 0
>> 13:01 [0] test15# cpuset -g -x 264
>> irq 264 mask: 12
>> 13:01 [0] test15# cpuset -g -s 1
>> cpuset 1 mask: 22, 23
>>
>>
>>> set 1 and you still won't be able to 'cpuset -l 0 -s 1'.
>>>
>>> Also, strictly speaking, it would probably be best to return threads to
>>> set 1 when NOCPU is passed to intr_event_bind() to unbind an interrupt.
>> Yup.
>
> in a quick look I didn't see anywhere that checked if the old set and 
> new set are the same set..
Well, checking if old cpu _mask_ is the same as old one hasn't been 
implemented for any other object type inside cpuset.
Additionally, intr_event_bind() is usually called once per irq, so the 
benefit for that particular object type is questionable.

>>
>>
>> _______________________________________________
>> freebsd-hackers@freebsd.org  mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to"freebsd-hackers-unsubscribe@freebsd.org"
>
>
>

_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic