'Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-kernel
Subject:    Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling
From:       Shrikanth Hegde <sshegde () linux ! ibm ! com>
Date:       2024-05-15 14:43:44
Message-ID: 17cc54c4-2e75-4964-9155-84db081ce209 () linux ! ibm ! com
[Download RAW message or body]



On 5/8/24 10:48 AM, Ankur Arora wrote:
> 
> Shrikanth Hegde <sshegde@linux.ibm.com> writes:
> 
>> On 4/27/24 12:30 AM, Ankur Arora wrote:
>>>
>>
>> Hi Ankur,
>>
>> Sorry for the delay, I was on leave last week.
>> There might be a delay in response as I am recovering from fever.
> 
> Please take your your time.
> 
>>> Great. I'm guessing these tests are when running in voluntary preemption
>>> mode (under PREEMPT_AUTO).
>>>
>>
>> It was run under preempt=none.
>>
>>> If you haven't, could you also try full preemption? There you should see
>>> identical results unless something is horribly wrong.
>>
>> I tried preempt=full with patch you provided below. ran the hackbench for much longer
>> with 100000 loops. I don't see any regression on the larger system.
>> I see slight improvement in some cases.  I dont see any major regression with 10k ops
>> which was tried earlier as well.
> 
> Great, so no surprises with preempt=full.

Right. 

> 
>> ==========================================================
>> 1L ops.
>> ==========================================================
>> Process 10 groups          :       9.85,       9.87(-0.20)
>> Process 20 groups          :      17.69,      17.32(2.09)
>> Process 30 groups          :      25.89,      25.96(-0.27)
>> Process 40 groups          :      34.70,      34.61(0.26)
>> Process 50 groups          :      44.02,      43.79(0.52)
>> Process 60 groups          :      52.72,      52.10(1.18)
>> Thread  10 groups          :      10.50,      10.52(-0.19)
>> Thread  20 groups          :      18.79,      18.60(1.01)
>> Process(Pipe) 10 groups    :      10.39,      10.37(0.19)
>> Process(Pipe) 20 groups    :      18.45,      18.54(-0.49)
>> Process(Pipe) 30 groups    :      25.63,      25.92(-1.13)
>> Process(Pipe) 40 groups    :      33.79,      33.49(0.89)
>> Process(Pipe) 50 groups    :      43.15,      41.83(3.06)
>> Process(Pipe) 60 groups    :      51.94,      50.32(3.12)
>> Thread(Pipe)  10 groups    :      10.73,      10.85(-1.12)
>> Thread(Pipe)  20 groups    :      19.24,      19.35(-0.57)
> 
> I presume the ones on the left are the baseline numbers with
> PREEMPT_AUTO on the right?
> 
> Also, these are with preempt=none/voluntary/full?
> 
One on the left is 6.9 baseline (preempt=full) and one on the right is preempt auto. (preempt=full)
These numbers are with preempt=full. 


>>>> However, I still see 20-50%
>>>> regression on the larger system(320 CPUS). I will continue to debug why.
>>>
>>> Could you try this patch? This is needed because PREEMPT_AUTO turns on
>>> CONFIG_PREEMPTION, but not CONFIG_PREEMPT:
> 
> Just wanted to check if the regression you were seeing with preempt=none
> is fixed?

I ran the tests again today with this patch for preempt=none. 
Unfortunately, the numbers indicate that there is regression with preempt=none on the 
large system. I will continue to debug why.

				6.9 		preempt_auto
				preempt=none	preempt=none

Process 10 groups          :       9.66,       9.75(-0.93)
Process 20 groups          :      16.82,      17.21(-2.32)
Process 30 groups          :      24.56,      25.51(-3.87)
Process 40 groups          :      31.79,      33.99(-6.92)
Process 50 groups          :      39.55,      42.85(-8.34)
Process 60 groups          :      47.39,      51.34(-8.34)
thread 10 groups           :      10.21,      10.36(-1.47)
thread 20 groups           :      18.00,      18.39(-2.17)
Process(Pipe) 10 groups    :       8.78,       8.71(0.80)
Process(Pipe) 20 groups    :      12.19,      13.42(-10.09)
Process(Pipe) 30 groups    :      15.82,      18.32(-15.80)
Process(Pipe) 40 groups    :      19.15,      22.04(-15.09)
Process(Pipe) 50 groups    :      22.23,      27.40(-23.26)
Process(Pipe) 60 groups    :      25.35,      31.37(-23.75)
thread(Pipe) 10 groups     :       8.84,       8.72(1.36)
thread(Pipe) 20 groups     :      12.90,      13.45(-4.26)



> 
>> This patch can be considered as the enablement patch for Powerpc for preempt_auto.
>> Michael, Nick, Do you see any concerns?
>>
>> Ankur, Could you please add this patch, if there are no objections.
> 
> Of course. Thanks for the patch.
> 
> The patch overall looks good. A minor comment below.
> 
>  Will add it to v2.

FWIW, we can add this support for PowerPC and fix the issues along the way. 

> 
>> ---
>> From 878a5a7c990e3459758a5d19d7697b07d8d27d0f Mon Sep 17 00:00:00 2001
>> From: Shrikanth Hegde <sshegde@linux.ibm.com>
>> Date: Tue, 7 May 2024 04:42:04 -0500
>> Subject: [PATCH] powerpc: add support for preempt_auto
>>
>> Add PowerPC arch support for PREEMPT_AUTO by defining LAZY bits.
>>
>> Since PowerPC doesn't use generic exit to functions, Add
>> NR_LAZY check in exit to user and exit to kernel from interrupt
>> routines.
>>
>> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
>> ---
>>  arch/powerpc/Kconfig                   |  1 +
>>  arch/powerpc/include/asm/thread_info.h | 11 ++++++++++-
>>  arch/powerpc/kernel/interrupt.c        |  6 ++++--
>>  3 files changed, 15 insertions(+), 3 deletions(-)
> 
> ...
> 
>> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
>> index eca293794a1e..0c0b7010995a 100644
>> --- a/arch/powerpc/kernel/interrupt.c
>> +++ b/arch/powerpc/kernel/interrupt.c
>> @@ -185,7 +185,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
>>  	ti_flags = read_thread_flags();
>>  	while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
>>  		local_irq_enable();
>> -		if (ti_flags & _TIF_NEED_RESCHED) {
>> +		if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
>>  			schedule();
>>  		} else {
>>  			/*
>> @@ -396,7 +396,9 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>>  		/* Returning to a kernel context with local irqs enabled. */
>>  		WARN_ON_ONCE(!(regs->msr & MSR_EE));
>>  again:
>> -		if (IS_ENABLED(CONFIG_PREEMPT)) {
>> +
>> +		if ((IS_ENABLED(CONFIG_PREEMPT_AUTO) && IS_ENABLED(CONFIG_PREEMPTION)) ||
>> +		    (!IS_ENABLED(CONFIG_PREEMPT_AUTO) && IS_ENABLED(CONFIG_PREEMPT))) {
> 
> Seems to me, this could just be:
> 
>         if (IS_EANBLED(CONFIG_PREEMPTION))

You are right, that should be fine. Since CONFIG_PREEMPTION is enabled 
iff preempt=full or preempt_auto. So its the same thing. 

Maybe putting a comment around it to just describe is good enough. 

> 
> But, maybe the intent is to exclude support for, say
> (PREEMPT_DYNAMIC && PREEMPT_NONE), then maybe it is best to explicitly
> encode that?
> 
> Alternately, a comment on the disparate treatment for PREEMPT_AUTO and
> !PREEMPT_AUTO might be helpful.
> 
> 
> Thanks
> Ankur
> 
>>  			/* Return to preemptible kernel context */
>>  			if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
>>  				if (preempt_count() == 0)

[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic