[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-edac
Subject:    [PATCH v2] x86/mce: Do not use bank 1 for APEI generated error logs.
From:       Tony Luck <tony.luck () intel ! com>
Date:       2016-05-27 21:11:06
Message-ID: b7fffb2b326bc1dd150ffceb9919a803f9496e0e.1464805958.git.tony.luck () intel ! com
[Download RAW message or body]

BIOS can report a memory error to Linux using ACPI/APEI mechanism.
When it does this, we create a fictitious machine check error record and
feed it into the standard mce_Log() function. The error record needs a
machine check bank number, and for some reason we chose "1" for this.

But "1" is a valid bank number, and this causes confusion and heartburn
among h/w folks who are concerned that a memory error signature was
somehow logged in bank 1.

Change to use "-1" (field is a "u8" so will typically print as 255).
This should make it clearer that this error did not originate in a
machine check bank.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
V1-V2: Boris: Use -1 as bank number as more definitive indicator.

 arch/x86/kernel/cpu/mcheck/mce-apei.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
index 34c89a3e8260..83f1a98d37db 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
@@ -46,7 +46,7 @@ void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
 		return;
 
 	mce_setup(&m);
-	m.bank = 1;
+	m.bank = -1;
 	/* Fake a memory read error with unknown channel */
 	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
 
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic