List:       linux-edac
Subject:    Re: [PATCH 1/2] x86/mce: Remove old CMCI storm mitigation code
From:       Borislav Petkov <bp () alien8 ! de>
Date:       2022-02-24 15:14:05
Message-ID: YhegvWKq913TEd0M () zn ! tnic

On Thu, Feb 17, 2022 at 09:35:52AM -0800, Luck, Tony wrote:
> When a "storm" of CMCI is detected this code mitigates by
> disabling CMCI interrupt signalling from all of the banks
> owned by the CPU that saw the storm.
> 
> There are problems with this approach:
> 
> 1) It is very coarse grained. In all liklihood only one of the

Unknown word [liklihood] in commit message.
Suggestions: ['likelihood', 'livelihood']

> banks was geenrating the interrupts, but CMCI is disabled for all.

Unknown word [geenrating] in commit message.
Suggestions: ['generating', 'penetrating', 'germinating', 'entreating', 'ingratiating']

Do I need to give you the whole spiel about using a spellchecker?

:)

> This means Linux may delay seeing and processing errors logged
> from other banks.
> 
> 2) Although CMCI stands for Corrected Machine Check Interrupt, it
> is also used to signal when an uncorrected error is logged. This
> is a problem because these errors should be handled in a timely
> manner.
> 
> Delete all this code in preparation for a finer grained solution.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c     |  20 +---
>  arch/x86/kernel/cpu/mce/intel.c    | 145 -----------------------------
>  arch/x86/kernel/cpu/mce/internal.h |   6 --
>  3 files changed, 1 insertion(+), 170 deletions(-)

Yah, can't complain about diffstats like that.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette