[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-rt-users
Subject:    Re: [PATCH v4.4.y] mm, vmstat: make quiet_vmstat lighter
From:       Greg KH <gregkh () linuxfoundation ! org>
Date:       2019-05-13 7:02:14
Message-ID: 20190513070214.GA26553 () kroah ! com
[Download RAW message or body]

On Mon, May 13, 2019 at 08:12:37AM +0200, Daniel Wagner wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> [ Upstream commit f01f17d3705bb6081c9e5728078f64067982be36 ]
> 
> Mike has reported a considerable overhead of refresh_cpu_vm_stats from
> the idle entry during pipe test:
> 
>     12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
>      4.75%  [kernel]       [k] __schedule
>      4.70%  [kernel]       [k] mutex_unlock
>      3.14%  [kernel]       [k] __switch_to
> 
> This is caused by commit 0eb77e988032 ("vmstat: make vmstat_updater
> deferrable again and shut down on idle") which has placed quiet_vmstat
> into cpu_idle_loop.  The main reason here seems to be that the idle
> entry has to get over all zones and perform atomic operations for each
> vmstat entry even though there might be no per cpu diffs.  This is a
> pointless overhead for _each_ idle entry.
> 
> Make sure that quiet_vmstat is as light as possible.
> 
> First of all it doesn't make any sense to do any local sync if the
> current cpu is already set in oncpu_stat_off because vmstat_update puts
> itself there only if there is nothing to do.
> 
> Then we can check need_update which should be a cheap way to check for
> potential per-cpu diffs and only then do refresh_cpu_vm_stats.
> 
> The original patch also did cancel_delayed_work which we are not doing
> here.  There are two reasons for that.  Firstly cancel_delayed_work from
> idle context will blow up on RT kernels (reported by Mike):
> 
>   CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
>   Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
>   Call Trace:
>     dump_stack+0x49/0x67
>     ___might_sleep+0xf5/0x180
>     rt_spin_lock+0x20/0x50
>     try_to_grab_pending+0x69/0x240
>     cancel_delayed_work+0x26/0xe0
>     quiet_vmstat+0x75/0xa0
>     cpu_idle_loop+0x38/0x3e0
>     cpu_startup_entry+0x13/0x20
>     start_secondary+0x114/0x140
> 
> And secondly, even on !RT kernels it might add some non trivial overhead
> which is not necessary.  Even if the vmstat worker wakes up and preempts
> idle then it will be most likely a single shot noop because the stats
> were already synced and so it would end up on the oncpu_stat_off anyway.
> We just need to teach both vmstat_shepherd and vmstat_update to stop
> scheduling the worker if there is nothing to do.
> 
> [mgalbraith@suse.de: cancel pending work of the cpu_stat_off CPU]
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Acked-by: Christoph Lameter <cl@linux.com>
> Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Daniel Wagner <wagi@monom.org>
> ---
> Hi Greg,
> 
> Upstream commmit 0eb77e988032 ("vmstat: make vmstat_updater deferrable
> again and shut down on idle") was back ported in v4.4.178
> (bdf3c006b9a2). For -rt we definitely need the bugfix f01f17d3705b
> ("mm, vmstat: make quiet_vmstat lighter") as well.
> 
> Since the offending patch was back ported to v4.4 stable only, the
> other stable branches don't need an update (offending patch and bug
> fix are already in).
> 
> Could you please queue the above patch for v4.4.y?

Now queued up, thanks.

greg k-h
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic