List: linux-api
Subject: Re: [RFC PATCH v2 09/11] sched: Introduce per memory space current virtual cpu id
From: Mathieu Desnoyers <mathieu.desnoyers () efficios ! com>
Date: 2022-02-25 21:21:02
Message-ID: 1136157594.109786.1645824062005.JavaMail.zimbra () efficios ! com
----- On Feb 25, 2022, at 12:56 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:
> ----- On Feb 25, 2022, at 12:35 PM, Jonathan Corbet corbet@lwn.net wrote:
>
>> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:
>>
>>> This feature allows the scheduler to expose a current virtual cpu id
>>> to user-space. This virtual cpu id is within the possible cpus range,
>>> and is temporarily (and uniquely) assigned while threads are actively
>>> running within a memory space. If a memory space has fewer threads than
>>> cores, or is limited to run on few cores concurrently through sched
>>> affinity or cgroup cpusets, the virtual cpu ids will be values close
>>> to 0, thus allowing efficient use of user-space memory for per-cpu
>>> data structures.
>>
>> So I have one possibly (probably) dumb question: if I'm writing a
>> program to make use of virtual CPU IDs, how do I know what the maximum
>> ID will be? It seems like one of the advantages of this mechanism would
>> be not having to be prepared for anything in the physical ID space, but
>> is there any guarantee that the virtual-ID space will be smaller?
>> Something like "no larger than the number of threads", say?
>
> Hi Jonathan,
>
> This is a very relevant question. Let me quote what I answered to Florian
> on the last round of review for this series:
>
> Some effective upper bounds for the number of vcpu ids observable in a process:
>
> - sysconf(3) _SC_NPROCESSORS_CONF,
> - the number of threads which exist concurrently in the process,

One small detail I forgot to mention: on a NUMA system, a single-threaded
process will typically observe vcpu_id=numa_node_id, so its vcpu_id can change
depending on which NUMA node it is currently running on. The vcpu_id is
therefore not strictly bounded by the number of concurrently running threads.

Thanks,

Mathieu

> - the number of cpus in the cpu affinity mask applied by sched_setaffinity,
> except in corner-case situations such as cpu hotplug removing all cpus from
> the affinity set,
> - cgroup cpuset "partition" limits,
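
As a rough illustration of how the bounds listed above could be combined into
a pre-allocation hint, here is a minimal sketch in C. The helper name
vcpu_prealloc_hint() and its nr_threads parameter are hypothetical and not
part of the patch series. The result is only a hint: as noted for NUMA
systems, an observed vcpu id may exceed it, so it must not be used as a hard
array bound. cgroup cpuset limits are not queried here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for sched_getaffinity() and CPU_COUNT() */
#endif
#include <assert.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration): returns a hint for
 * how many per-vcpu slots to pre-allocate. nr_threads is the number of
 * threads the caller expects to run concurrently, or 0 if unknown.
 */
static long vcpu_prealloc_hint(long nr_threads)
{
	long hint = sysconf(_SC_NPROCESSORS_CONF);
	cpu_set_t mask;

	if (hint < 1)
		hint = 1;	/* sysconf failed: fall back to one slot */

	/* Narrow by the calling thread's affinity mask, when it can be read. */
	if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
		long allowed = CPU_COUNT(&mask);

		if (allowed > 0 && allowed < hint)
			hint = allowed;
	}

	/* Narrow by the expected number of concurrent threads. */
	if (nr_threads > 0 && nr_threads < hint)
		hint = nr_threads;

	assert(hint >= 1);
	return hint;
}
```

A caller would typically compute this once at start-up, for example
vcpu_prealloc_hint(4) for a four-thread worker pool, and still be prepared
for larger ids at run time.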
>
> Note that AFAIR non-partition cgroup cpusets allow a cgroup to "borrow"
> additional cores from the rest of the system if they are idle, therefore
> allowing the number of concurrent threads to go beyond the specified limit.
>
> AFAIR the sched affinity mask is tweaked independently of the cgroup cpuset.
> Those are two mechanisms both affecting the scheduler task placement.
>
> I would expect the user-space code to use some sensible upper bound as a
> hint about how many per-vcpu data structure elements to expect (and how many
> to pre-allocate), but have a "lazy initialization" fall-back in case the
> vcpu id goes up to the number of configured processors - 1. And I suspect
> that even the number of configured processors may change with CRIU.
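
The "lazy initialization" fall-back described above could look roughly like
the following sketch. All names here (struct vcpu_slot, struct vcpu_table,
and the vcpu_table_*() functions) are hypothetical, chosen for illustration.
The sketch assumes the caller already has the current vcpu id; how that id is
read from the kernel is outside its scope. It is single-threaded and does not
handle the number of configured processors changing at run time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-vcpu data; the counter is only a placeholder. */
struct vcpu_slot {
	long counter;
};

struct vcpu_table {
	struct vcpu_slot **slots;	/* one pointer per possible vcpu id */
	long nr_slots;			/* e.g. number of configured processors */
};

/* Release every allocated slot and the pointer array itself. */
static void vcpu_table_destroy(struct vcpu_table *t)
{
	long i;

	for (i = 0; i < t->nr_slots; i++)
		free(t->slots[i]);
	free(t->slots);
	t->slots = NULL;
	t->nr_slots = 0;
}

/* Allocate the pointer array and pre-allocate the first `hint` slots. */
static int vcpu_table_init(struct vcpu_table *t, long nr_slots, long hint)
{
	long i;

	t->slots = calloc((size_t)nr_slots, sizeof(*t->slots));
	if (!t->slots)
		return -1;
	t->nr_slots = nr_slots;
	for (i = 0; i < hint && i < nr_slots; i++) {
		t->slots[i] = calloc(1, sizeof(**t->slots));
		if (!t->slots[i]) {
			vcpu_table_destroy(t);
			return -1;
		}
	}
	return 0;
}

/*
 * Return the slot for vcpu_id, allocating it on first use. Returns NULL on
 * allocation failure. Concurrent callers would need an atomic
 * compare-and-swap (or similar) to publish a newly allocated slot.
 */
static struct vcpu_slot *vcpu_table_get(struct vcpu_table *t, long vcpu_id)
{
	assert(vcpu_id >= 0 && vcpu_id < t->nr_slots);
	if (!t->slots[vcpu_id])
		t->slots[vcpu_id] = calloc(1, sizeof(struct vcpu_slot));
	return t->slots[vcpu_id];
}
```

The pointer array is sized for the worst case, which costs one pointer per
possible id, while the larger per-vcpu structures are only allocated for ids
that are actually observed.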
>
> If the above explanation makes sense (please let me know if I am wrong
> or missed something), I suspect I should add it to the commit message.
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com