[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-edac
Subject:    [PATCH v2 1/1] EDAC, sb_edac: Assign EDAC memory controller per h/w controller
From:       Qiuxu Zhuo <qiuxu.zhuo () intel ! com>
Date:       2016-07-18 15:30:38
Message-ID: 20160718153038.51978-1-qiuxu.zhuo () intel ! com
[Download RAW message or body]

1) Tony pointed out that "currently the driver pretends there is
   one big 8-channel memory controller per socket instead of 2
   4-channel controllers. This is fine with all memory controller
   populated with symmetrical DIMM configurations, but runs into
   difficulties on asymmetric setups". Restrcture the driver to
   assign an EDAC memory controller to each actual h/w memory
   controller to resolve the issue.

2) This patch includes the code to make a better mci->ctl_name from:
   https://lkml.kernel.org/r/17395491201298d994a18247ff1561b2e69fb32c.1467412094.git.tony.luck@intel.com


3) This patch supports for unallocated cpu/ha which was supplied
   by Patrick and the bug fix of home agent channel offset for
   "edac_mc_handle_error()".

4) Compare the console output log between insmod old driver and
   insmod new driver on Broadwell (4 sockets, 2 MC per socket):

   old mc0..new mc0
   < edac_mc_alloc: allocating 2872 bytes for mci data\
		(24 dimms, 24 csrows/channels)
   > edac_mc_alloc: allocating 2664 bytes for mci data\
		(12 dimms, 12 csrows/channels)
   < edac_create_sysfs_mci_device: creating dimm12, located\
		at channel 4 slot 0
   < edac_create_dimm_object: creating rank/dimm device dimm12
   < edac_create_sysfs_mci_device: creating dimm15, located at\
		channel 5 slot 0
   < edac_create_dimm_object: creating rank/dimm device dimm15
   < edac_create_sysfs_mci_device: creating dimm18, located at\
		channel 6 slot 0
   < edac_create_dimm_object: creating rank/dimm device dimm18
   < edac_create_sysfs_mci_device: creating dimm21, located at\
		channel 7 slot 0
   < edac_create_dimm_object: creating rank/dimm device dimm21
   < MC0: Giving out device to module sbridge_edac.c controller\
		Broadwell Socket#0: DEV 0000:ff:12.0 (INTERRUPT)
   > MC0: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#3_Ha#0: DEV 0000:ff:12.0 (INTERRUPT)

   old mc1..new mc1
   ...
   < MC1: Giving out device to module sbridge_edac.c controller\
		Broadwell Socket#1: DEV 0000:bf:12.0 (INTERRUPT)
   > MC1: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#2_Ha#0: DEV 0000:bf:12.0 (INTERRUPT)

   old mc2..new mc2
   ...
   < MC2: Giving out device to module sbridge_edac.c controller\
		Broadwell Socket#2: DEV 0000:7f:12.0 (INTERRUPT)
   > MC2: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#1_Ha#0: DEV 0000:7f:12.0 (INTERRUPT)

   old mc3..new mc3
   ...
   < MC3: Giving out device to module sbridge_edac.c controller\
		Broadwell Socket#3: DEV 0000:3f:12.0 (INTERRUPT)
   > MC3: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#0_Ha#0: DEV 0000:3f:12.0 (INTERRUPT)
   < sbridge:  Ver: 1.1.1

   add mc4
   > edac_mc_alloc: allocating 2664 bytes for mci data\
		(12 dimms, 12 csrows/channels)
   > edac_mc_add_mc_with_groups:
   > edac_create_sysfs_mci_device: creating bus mc4
   > edac_create_sysfs_mci_device: creating device mc4
   > edac_create_sysfs_mci_device: creating dimm0, located\
		at channel 0 slot 0
   > edac_create_dimm_object: creating rank/dimm device dimm0
   > edac_create_sysfs_mci_device: creating dimm3, located\
		at channel 1 slot 0
   > edac_create_dimm_object: creating rank/dimm device dimm3
   > edac_create_sysfs_mci_device: creating dimm6, located\
		at channel 2 slot 0
   > edac_create_dimm_object: creating rank/dimm device dimm6
   > edac_create_sysfs_mci_device: creating dimm9, located\
		at channel 3 slot 0
   > edac_create_dimm_object: creating rank/dimm device dimm9
   > MC4: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#3_Ha#1: DEV 0000:ff:12.4 (INTERRUPT)

   add mc5
   ...
   > MC5: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#2_Ha#1: DEV 0000:bf:12.4 (INTERRUPT)

   add mc6
   ...
   > MC6: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#1_Ha#1: DEV 0000:7f:12.4 (INTERRUPT)

   add mc7
   ...
   > MC7: Giving out device to module sbridge_edac.c controller\
		Broadwell SrcID#0_Ha#1: DEV 0000:3f:12.4 (INTERRUPT)
   > sbridge:  Ver: 1.1.2

5) Compare the console output log between insmod old driver and
   insmod new driver on Sandy Bridge (2 sockets, 1 MC per socket):

   old mc0..new mc0
   < edac_mc_alloc: allocating 2872 bytes for mci data\
		(24 dimms, 24 csrows/channels)
   > edac_mc_alloc: allocating 2664 bytes for mci data\
		(12 dimms, 12 csrows/channels)
   < MC0: Giving out device to module sbridge_edac.c controller\
		Sandy Bridge Socket#0: DEV 0000:7f:0e.0 (INTERRUPT)
   > MC0: Giving out device to module sbridge_edac.c controller\
		Sandy Bridge SrcID#0_Ha#0: DEV 0000:7f:0e.0 (INTERRUPT)

   old mc1..new mc1
   < edac_mc_alloc: allocating 2872 bytes for mci data\
		(24 dimms, 24 csrows/channels)
   > edac_mc_alloc: allocating 2664 bytes for mci data\
		(12 dimms, 12 csrows/channels)
   < MC1: Giving out device to module sbridge_edac.c controller\
		Sandy Bridge Socket#1: DEV 0000:ff:0e.0 (INTERRUPT)
   < sbridge:  Ver: 1.1.1
   > MC1: Giving out device to module sbridge_edac.c controller\
		Sandy Bridge SrcID#1_Ha#0: DEV 0000:ff:0e.0 (INTERRUPT)
   > sbridge:  Ver: 1.1.2

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
 drivers/edac/sb_edac.c | 603 ++++++++++++++++++++++++-------------------------
 1 file changed, 296 insertions(+), 307 deletions(-)
---
Change log:

v1->v2:
   - Add Patrick's bug fix of offseting of the home agent channel
     in edac_mc_handle_error() function.
   - Add Patrick's unallocated cpu/ha support.
   - Add "allow_dups" param in sbridge_get_onedevice() function,
     which allows for multiple devices with the same devid.

diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 6744d88..6505919 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -34,7 +34,7 @@ static LIST_HEAD(sbridge_edac_list);
 /*
  * Alter this version for the module when modifications are made
  */
-#define SBRIDGE_REVISION    " Ver: 1.1.1 "
+#define SBRIDGE_REVISION    " Ver: 1.1.2 "
 #define EDAC_MOD_STR      "sbridge_edac"
 
 /*
@@ -278,7 +278,7 @@ static const u32 correrrthrsld[] = {
  * sbridge structs
  */
 
-#define NUM_CHANNELS		8	/* 2MC per socket, four chan per MC */
+#define NUM_CHANNELS		4	/* Max channles per MC */
 #define MAX_DIMMS		3	/* Max DIMMS per channel */
 #define KNL_MAX_CHAS		38	/* KNL max num. of Cache Home Agents */
 #define KNL_MAX_CHANNELS	6	/* KNL max num. of PCI channels */
@@ -293,6 +293,12 @@ enum type {
 	KNIGHTS_LANDING,
 };
 
+enum domain {
+	IMC0 = 0, /* Must be 0 */
+	IMC1,
+	SOCKET,
+};
+
 struct sbridge_pvt;
 struct sbridge_info {
 	enum type	type;
@@ -324,6 +330,7 @@ struct sbridge_channel {
 struct pci_id_descr {
 	int			dev_id;
 	int			optional;
+	enum domain domain;
 };
 
 struct pci_id_table {
@@ -337,6 +344,7 @@ struct sbridge_dev {
 	u8			bus, mc;
 	u8			node_id, source_id;
 	struct pci_dev		**pdev;
+	enum domain domain; /* can only be IMC[0-9] */
 	int			n_devs;
 	struct mem_ctl_info	*mci;
 };
@@ -352,11 +360,14 @@ struct knl_pvt {
 };
 
 struct sbridge_pvt {
-	struct pci_dev		*pci_ta, *pci_ddrio, *pci_ras;
-	struct pci_dev		*pci_sad0, *pci_sad1;
-	struct pci_dev		*pci_ha0, *pci_ha1;
+	/* Devices per socket */
+	struct pci_dev		*pci_ddrio;
 	struct pci_dev		*pci_br0, *pci_br1;
-	struct pci_dev		*pci_ha1_ta;
+	struct pci_dev		*pci_sad0, *pci_sad1;
+	/* Devices per memory controller */
+	struct pci_dev		*pci_ha;
+	struct pci_dev		*pci_ta;
+	struct pci_dev		*pci_ras;
 	struct pci_dev		*pci_tad[NUM_CHANNELS];
 
 	struct sbridge_dev	*sbridge_dev;
@@ -373,29 +384,31 @@ struct sbridge_pvt {
 	struct knl_pvt knl;
 };
 
-#define PCI_DESCR(device_id, opt)	\
+#define PCI_DESCR(device_id, opt, dom)	\
 	.dev_id = (device_id),		\
-	.optional = opt
+	.optional = opt,	\
+	.domain = dom
 
 static const struct pci_id_descr pci_dev_descr_sbridge[] = {
 		/* Processor Home Agent */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0, 0)	},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0, 0, IMC0) },
 
 		/* Memory controller */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_RAS, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD1, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD2, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD3, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_DDRIO, 1)	},
-
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_RAS, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD0, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD1, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD2, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TAD3, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_DDRIO, 1, SOCKET) },
+
+		/* Items of SOCKET type must be put at last */
 		/* System Address Decoder */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_SAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_SAD1, 0)	},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_SAD0, 0, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_SAD1, 0, SOCKET) },
 
 		/* Broadcast Registers */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_BR, 0)		},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_SBRIDGE_BR, 0, SOCKET) },
 };
 
 #define PCI_ID_TABLE_ENTRY(A, T) {	\
@@ -439,36 +452,36 @@ static const struct pci_id_table pci_dev_descr_sbridge_table[] \
= {  
 static const struct pci_id_descr pci_dev_descr_ibridge[] = {
 		/* Processor Home Agent */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0, 0)		},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0, 0, IMC0) },
 
 		/* Memory controller */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA, 0)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_RAS, 0)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD1, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD2, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD3, 0)	},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_RAS, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD0, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD1, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD2, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD3, 0, IMC0) },
+
+		/* Optional, mode 2HA */
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD1, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD2, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD3, 1, IMC1) },
+
+		/* Items of SOCKET type must be put at last */
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_1HA_DDRIO0, 1, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_2HA_DDRIO0, 1, SOCKET) },
 
 		/* System Address Decoder */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_SAD, 0)			},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_SAD, 0, SOCKET) },
 
 		/* Broadcast Registers */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_BR0, 1)			},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_BR1, 0)			},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_BR0, 1, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_BR1, 0, SOCKET) },
 
-		/* Optional, mode 2HA */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1, 1)		},
-#if 0
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS, 1)	},
-#endif
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD1, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD2, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD3, 1)	},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_1HA_DDRIO0, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_2HA_DDRIO0, 1)	},
 };
 
 static const struct pci_id_table pci_dev_descr_ibridge_table[] = {
@@ -517,31 +530,31 @@ static const struct pci_id_table pci_dev_descr_ibridge_table[] \
= {  #define PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO3 0x2fbb
 static const struct pci_id_descr pci_dev_descr_haswell[] = {
 	/* first item must be the HA */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0, 0)		},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_CBO_SAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_CBO_SAD1, 0)	},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1, 1)		},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TA, 0)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_THERMAL, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD1, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD2, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD3, 1)	},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO0, 1)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO1, 1)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO2, 1)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO3, 1)		},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TA, 1)		},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_THERMAL, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD0, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD1, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD2, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD3, 1)	},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0, 0, IMC0) },
+
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1, 1, IMC1) },
+
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TA, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_THERMAL, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD0, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD1, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD2, 1, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD3, 1, IMC0) },
+
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TA, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_THERMAL, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD0, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD1, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD2, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD3, 1, IMC1) },
+
+		/* Items of SOCKET type must be put at last */
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_CBO_SAD0, 0, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_CBO_SAD1, 0, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO0, 1, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO1, 1, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO2, 1, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_HASWELL_IMC_DDRIO3, 1, SOCKET) },
 };
 
 static const struct pci_id_table pci_dev_descr_haswell_table[] = {
@@ -579,13 +592,14 @@ static const struct pci_id_table pci_dev_descr_haswell_table[] \
                = {
  */
 
 static const struct pci_id_descr pci_dev_descr_knl[] = {
-	[0]         = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_SAD0, 0) },
-	[1]         = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_SAD1, 0) },
-	[2 ... 3]   = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_MC, 0)},
-	[4 ... 41]  = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_CHA, 0) },
-	[42 ... 47] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_CHANNEL, 0) },
-	[48]        = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_TA, 0) },
-	[49]        = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_TOLHM, 0) },
+	[0 ... 1] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_MC, 0, IMC0)},
+	[2 ... 7] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_CHANNEL, 0, IMC0) },
+	[8] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_TA, 0, IMC0) },
+	[9] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_TOLHM, 0, IMC0) },
+	/* Items of SOCKET type must be put at last */
+	[10] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_SAD0, 0, SOCKET) },
+	[11] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_SAD1, 0, SOCKET) },
+	[12 ... 49] = { PCI_DESCR(PCI_DEVICE_ID_INTEL_KNL_IMC_CHA, 0, SOCKET) },
 };
 
 static const struct pci_id_table pci_dev_descr_knl_table[] = {
@@ -632,28 +646,27 @@ static const struct pci_id_table pci_dev_descr_knl_table[] = {
 
 static const struct pci_id_descr pci_dev_descr_broadwell[] = {
 	/* first item must be the HA */
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0, 0)		},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_CBO_SAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_CBO_SAD1, 0)	},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1, 1)		},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TA, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_THERMAL, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD0, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD1, 0)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD2, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD3, 1)	},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_DDRIO0, 1)	},
-
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TA, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_THERMAL, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD0, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD1, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD2, 1)	},
-	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD3, 1)	},
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1, 1, IMC1) },
+
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TA, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_THERMAL, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD0, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD1, 0, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD2, 1, IMC0) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD3, 1, IMC0) },
+
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TA, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_THERMAL, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD0, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD1, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD2, 1, IMC1) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD3, 1, IMC1) },
+
+	/* Items of SOCKET type must be put at last */
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_CBO_SAD0, 0, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_CBO_SAD1, 0, SOCKET) },
+	{ PCI_DESCR(PCI_DEVICE_ID_INTEL_BROADWELL_IMC_DDRIO0, 1, SOCKET) },
 };
 
 static const struct pci_id_table pci_dev_descr_broadwell_table[] = {
@@ -709,29 +722,62 @@ static inline int numcol(u32 mtr)
 	return 1 << cols;
 }
 
-static struct sbridge_dev *get_sbridge_dev(u8 bus, int multi_bus)
+/**
+ * The returned sbridge_dev pointer array is terminated with NULL
+ */
+static struct sbridge_dev **get_sbridge_dev(u8 bus, int multi_bus,
+					enum domain domain)
 {
-	struct sbridge_dev *sbridge_dev;
+	struct sbridge_dev *sbridge_dev, **devs, **tmp;
+	int n = 2;
+
+	devs = kcalloc(n, sizeof(*devs), GFP_KERNEL);
+	if (!devs)
+		return NULL;
 
 	/*
 	 * If we have devices scattered across several busses that pertain
 	 * to the same memory controller, we'll lump them all together.
 	 */
 	if (multi_bus) {
-		return list_first_entry_or_null(&sbridge_edac_list,
+		sbridge_dev = list_first_entry_or_null(&sbridge_edac_list,
 				struct sbridge_dev, list);
+		devs[0] = sbridge_dev;
+		return devs;
 	}
 
 	list_for_each_entry(sbridge_dev, &sbridge_edac_list, list) {
-		if (sbridge_dev->bus == bus)
-			return sbridge_dev;
+		if (sbridge_dev->bus == bus) {
+			/*
+			 * The device is per socket, find all memory
+			 * controllers in that socket.
+			 */
+			if (domain == SOCKET) {
+				tmp = krealloc(devs, sizeof(*devs) * n, GFP_KERNEL);
+				if (!tmp) {
+					kfree(devs);
+					return NULL;
+				}
+				devs = tmp;
+				devs[n-2] = sbridge_dev;
+				devs[n-1] = NULL;
+				n++;
+			/*
+			 * The device is per memory controller, find
+			 * that memory controller only in that socket.
+			 */
+			} else if (sbridge_dev->domain == domain) {
+				devs[0]	= sbridge_dev;
+				return devs;
+			}
+		}
 	}
 
-	return NULL;
+	return devs;
 }
 
 static struct sbridge_dev *alloc_sbridge_dev(u8 bus,
-					   const struct pci_id_table *table)
+					   enum domain domain)
 {
 	struct sbridge_dev *sbridge_dev;
 
@@ -739,15 +785,8 @@ static struct sbridge_dev *alloc_sbridge_dev(u8 bus,
 	if (!sbridge_dev)
 		return NULL;
 
-	sbridge_dev->pdev = kzalloc(sizeof(*sbridge_dev->pdev) * table->n_devs,
-				   GFP_KERNEL);
-	if (!sbridge_dev->pdev) {
-		kfree(sbridge_dev);
-		return NULL;
-	}
-
 	sbridge_dev->bus = bus;
-	sbridge_dev->n_devs = table->n_devs;
+	sbridge_dev->domain = domain;
 	list_add_tail(&sbridge_dev->list, &sbridge_edac_list);
 
 	return sbridge_dev;
@@ -1065,76 +1104,6 @@ static int haswell_chan_hash(int idx, u64 addr)
 /****************************************************************************
 			Memory check routines
  ****************************************************************************/
-static struct pci_dev *get_pdev_same_bus(u8 bus, u32 id)
-{
-	struct pci_dev *pdev = NULL;
-
-	do {
-		pdev = pci_get_device(PCI_VENDOR_ID_INTEL, id, pdev);
-		if (pdev && pdev->bus->number == bus)
-			break;
-	} while (pdev);
-
-	return pdev;
-}
-
-/**
- * check_if_ecc_is_active() - Checks if ECC is active
- * @bus:	Device bus
- * @type:	Memory controller type
- * returns: 0 in case ECC is active, -ENODEV if it can't be determined or
- *	    disabled
- */
-static int check_if_ecc_is_active(const u8 bus, enum type type)
-{
-	struct pci_dev *pdev = NULL;
-	u32 mcmtr, id;
-
-	switch (type) {
-	case IVY_BRIDGE:
-		id = PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA;
-		break;
-	case HASWELL:
-		id = PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TA;
-		break;
-	case SANDY_BRIDGE:
-		id = PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA;
-		break;
-	case BROADWELL:
-		id = PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TA;
-		break;
-	case KNIGHTS_LANDING:
-		/*
-		 * KNL doesn't group things by bus the same way
-		 * SB/IB/Haswell does.
-		 */
-		id = PCI_DEVICE_ID_INTEL_KNL_IMC_TA;
-		break;
-	default:
-		return -ENODEV;
-	}
-
-	if (type != KNIGHTS_LANDING)
-		pdev = get_pdev_same_bus(bus, id);
-	else
-		pdev = pci_get_device(PCI_VENDOR_ID_INTEL, id, 0);
-
-	if (!pdev) {
-		sbridge_printk(KERN_ERR, "Couldn't find PCI device "
-					"%04x:%04x! on bus %02d\n",
-					PCI_VENDOR_ID_INTEL, id, bus);
-		return -ENODEV;
-	}
-
-	pci_read_config_dword(pdev,
-			type == KNIGHTS_LANDING ? KNL_MCMTR : MCMTR, &mcmtr);
-	if (!IS_ECC_ENABLED(mcmtr)) {
-		sbridge_printk(KERN_ERR, "ECC is disabled. Aborting\n");
-		return -ENODEV;
-	}
-	return 0;
-}
-
 /* Low bits of TAD limit, and some metadata. */
 static const u32 knl_tad_dram_limit_lo[] = {
 	0x400, 0x500, 0x600, 0x700,
@@ -1605,6 +1574,23 @@ static int knl_get_dimm_capacity(struct sbridge_pvt *pvt, u64 \
*mc_sizes)  return 0;
 }
 
+static void get_source_id(struct mem_ctl_info *mci)
+{
+	struct sbridge_pvt *pvt = mci->pvt_info;
+	u32 reg;
+
+	if (pvt->info.type == HASWELL || pvt->info.type == BROADWELL ||
+			pvt->info.type == KNIGHTS_LANDING)
+		pci_read_config_dword(pvt->pci_sad1, SAD_TARGET, &reg);
+	else
+		pci_read_config_dword(pvt->pci_br0, SAD_TARGET, &reg);
+
+	if (pvt->info.type == KNIGHTS_LANDING)
+		pvt->sbridge_dev->source_id = SOURCE_ID_KNL(reg);
+	else
+		pvt->sbridge_dev->source_id = SOURCE_ID(reg);
+}
+
 static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct sbridge_pvt *pvt = mci->pvt_info;
@@ -1619,20 +1605,9 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	u64 knl_mc_sizes[KNL_MAX_CHANNELS];
 
 	if (pvt->info.type == HASWELL || pvt->info.type == BROADWELL) {
-		pci_read_config_dword(pvt->pci_ha0, HASWELL_HASYSDEFEATURE2, &reg);
+		pci_read_config_dword(pvt->pci_ha, HASWELL_HASYSDEFEATURE2, &reg);
 		pvt->is_chan_hash = GET_BITFIELD(reg, 21, 21);
 	}
-	if (pvt->info.type == HASWELL || pvt->info.type == BROADWELL ||
-			pvt->info.type == KNIGHTS_LANDING)
-		pci_read_config_dword(pvt->pci_sad1, SAD_TARGET, &reg);
-	else
-		pci_read_config_dword(pvt->pci_br0, SAD_TARGET, &reg);
-
-	if (pvt->info.type == KNIGHTS_LANDING)
-		pvt->sbridge_dev->source_id = SOURCE_ID_KNL(reg);
-	else
-		pvt->sbridge_dev->source_id = SOURCE_ID(reg);
-
 	pvt->sbridge_dev->node_id = pvt->info.get_node_id(pvt);
 	edac_dbg(0, "mc#%d: Node ID: %d, source ID: %d\n",
 		 pvt->sbridge_dev->mc,
@@ -1648,6 +1623,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 
 		if (knl_get_dimm_capacity(pvt, knl_mc_sizes) != 0)
 			return -1;
+		pci_read_config_dword(pvt->pci_ta, KNL_MCMTR, &pvt->info.mcmtr);
 	} else {
 		pci_read_config_dword(pvt->pci_ras, RASENABLES, &reg);
 		if (IS_MIRROR_ENABLED(reg)) {
@@ -1717,6 +1693,13 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			}
 			edac_dbg(4, "Channel #%d  MTR%d = %x\n", i, j, mtr);
 			if (IS_DIMM_PRESENT(mtr)) {
+				if (!IS_ECC_ENABLED(pvt->info.mcmtr)) {
+					sbridge_printk(KERN_ERR,
+						"CPU SrcID #%d, Ha #%d, Channel #%d has dimms, but ecc is disabled\n",
+						pvt->sbridge_dev->source_id,
+						pvt->sbridge_dev->domain, i);
+					return -ENODEV;
+				}
 				pvt->channel[i].dimms++;
 
 				ranks = numrank(pvt->info.type, mtr);
@@ -1735,7 +1718,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				npages = MiB_TO_PAGES(size);
 
 				edac_dbg(0, "mc#%d: ha %d channel %d, dimm %d, %lld Mb (%d pages) bank: %d, \
                rank: %d, row: %#x, col: %#x\n",
-					 pvt->sbridge_dev->mc, i/4, i%4, j,
+					 pvt->sbridge_dev->mc, pvt->sbridge_dev->domain, i, j,
 					 size, npages,
 					 banks, ranks, rows, cols);
 
@@ -1746,7 +1729,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				dimm->edac_mode = mode;
 				snprintf(dimm->label, sizeof(dimm->label),
 					 "CPU_SrcID#%u_Ha#%u_Chan#%u_DIMM#%u",
-					 pvt->sbridge_dev->source_id, i/4, i%4, j);
+					 pvt->sbridge_dev->source_id, pvt->sbridge_dev->domain, i, j);
 			}
 		}
 	}
@@ -1834,7 +1817,7 @@ static void get_memory_layout(const struct mem_ctl_info *mci)
 	 */
 	prv = 0;
 	for (n_tads = 0; n_tads < MAX_TAD; n_tads++) {
-		pci_read_config_dword(pvt->pci_ha0, tad_dram_rule[n_tads],
+		pci_read_config_dword(pvt->pci_ha, tad_dram_rule[n_tads],
 				      &reg);
 		limit = TAD_LIMIT(reg);
 		if (limit <= prv)
@@ -1917,12 +1900,12 @@ static void get_memory_layout(const struct mem_ctl_info *mci)
 	}
 }
 
-static struct mem_ctl_info *get_mci_for_node_id(u8 node_id)
+static struct mem_ctl_info *get_mci_for_node_id(u8 node_id, u8 ha)
 {
 	struct sbridge_dev *sbridge_dev;
 
 	list_for_each_entry(sbridge_dev, &sbridge_edac_list, list) {
-		if (sbridge_dev->node_id == node_id)
+		if (sbridge_dev->node_id == node_id && sbridge_dev->domain == ha)
 			return sbridge_dev->mci;
 	}
 	return NULL;
@@ -1943,7 +1926,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	int			interleave_mode, shiftup = 0;
 	unsigned		sad_interleave[pvt->info.max_interleave];
 	u32			reg, dram_rule;
-	u8			ch_way, sck_way, pkg, sad_ha = 0, ch_add = 0;
+	u8			ch_way, sck_way, pkg, sad_ha = 0;
 	u32			tad_offset;
 	u32			rir_way;
 	u32			mb, gb;
@@ -2056,12 +2039,10 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		pkg = sad_pkg(pvt->info.interleave_pkg, reg, idx);
 		*socket = sad_pkg_socket(pkg);
 		sad_ha = sad_pkg_ha(pkg);
-		if (sad_ha)
-			ch_add = 4;
 
 		if (a7mode) {
 			/* MCChanShiftUpEnable */
-			pci_read_config_dword(pvt->pci_ha0,
+			pci_read_config_dword(pvt->pci_ha,
 					      HASWELL_HASYSDEFEATURE2, &reg);
 			shiftup = GET_BITFIELD(reg, 22, 22);
 		}
@@ -2074,8 +2055,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		pkg = sad_pkg(pvt->info.interleave_pkg, reg, idx);
 		*socket = sad_pkg_socket(pkg);
 		sad_ha = sad_pkg_ha(pkg);
-		if (sad_ha)
-			ch_add = 4;
 		edac_dbg(0, "SAD interleave package: %d = CPU socket %d, HA %d\n",
 			 idx, *socket, sad_ha);
 	}
@@ -2086,7 +2065,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	 * Move to the proper node structure, in order to access the
 	 * right PCI registers
 	 */
-	new_mci = get_mci_for_node_id(*socket);
+	new_mci = get_mci_for_node_id(*socket, sad_ha);
 	if (!new_mci) {
 		sprintf(msg, "Struct for socket #%u wasn't initialized",
 			*socket);
@@ -2099,14 +2078,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	 * Step 2) Get memory channel
 	 */
 	prv = 0;
-	if (pvt->info.type == SANDY_BRIDGE)
-		pci_ha = pvt->pci_ha0;
-	else {
-		if (sad_ha)
-			pci_ha = pvt->pci_ha1;
-		else
-			pci_ha = pvt->pci_ha0;
-	}
+	pci_ha = pvt->pci_ha;
 	for (n_tads = 0; n_tads < MAX_TAD; n_tads++) {
 		pci_read_config_dword(pci_ha, tad_dram_rule[n_tads], &reg);
 		limit = TAD_LIMIT(reg);
@@ -2157,7 +2129,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	}
 	*channel_mask = 1 << base_ch;
 
-	pci_read_config_dword(pvt->pci_tad[ch_add + base_ch],
+	pci_read_config_dword(pvt->pci_tad[base_ch],
 				tad_ch_nilv_offset[n_tads],
 				&tad_offset);
 
@@ -2210,7 +2182,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	 * Step 3) Decode rank
 	 */
 	for (n_rir = 0; n_rir < MAX_RIR_RANGES; n_rir++) {
-		pci_read_config_dword(pvt->pci_tad[ch_add + base_ch],
+		pci_read_config_dword(pvt->pci_tad[base_ch],
 				      rir_way_limit[n_rir],
 				      &reg);
 
@@ -2240,7 +2212,7 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		idx = (ch_addr >> 13);	/* FIXME: Datasheet says to shift by 15 */
 	idx %= 1 << rir_way;
 
-	pci_read_config_dword(pvt->pci_tad[ch_add + base_ch],
+	pci_read_config_dword(pvt->pci_tad[base_ch],
 			      rir_offset[n_rir][idx],
 			      &reg);
 	*rank = RIR_RNK_TGT(pvt->info.type, reg);
@@ -2293,12 +2265,14 @@ static int sbridge_get_onedevice(struct pci_dev **prev,
 				 u8 *num_mc,
 				 const struct pci_id_table *table,
 				 const unsigned devno,
+				 const int allow_dups,
 				 const int multi_bus)
 {
-	struct sbridge_dev *sbridge_dev;
+	struct sbridge_dev **sbridge_dev, *d;
 	const struct pci_id_descr *dev_descr = &table->descr[devno];
-	struct pci_dev *pdev = NULL;
+	struct pci_dev *pdev = NULL, **tmp;
 	u8 bus = 0;
+	int i = 0;
 
 	sbridge_printk(KERN_DEBUG,
 		"Seeking for: PCI ID %04x:%04x\n",
@@ -2329,25 +2303,51 @@ static int sbridge_get_onedevice(struct pci_dev **prev,
 	}
 	bus = pdev->bus->number;
 
-	sbridge_dev = get_sbridge_dev(bus, multi_bus);
+	sbridge_dev = get_sbridge_dev(bus, multi_bus, dev_descr->domain);
 	if (!sbridge_dev) {
-		sbridge_dev = alloc_sbridge_dev(bus, table);
-		if (!sbridge_dev) {
+		pci_dev_put(pdev);
+		return -ENOMEM;
+	}
+	if (!sbridge_dev[0]) {
+		sbridge_dev[0] = alloc_sbridge_dev(bus, dev_descr->domain);
+		if (!sbridge_dev[0]) {
 			pci_dev_put(pdev);
+			kfree(sbridge_dev);
 			return -ENOMEM;
 		}
 		(*num_mc)++;
 	}
 
-	if (sbridge_dev->pdev[devno]) {
-		sbridge_printk(KERN_ERR,
-			"Duplicated device for %04x:%04x\n",
-			PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
-		pci_dev_put(pdev);
-		return -ENODEV;
+	while ((d = sbridge_dev[i])) {
+		if (!allow_dups && *prev && d->n_devs > 0) {
+			if (d->pdev[d->n_devs-1]->device == dev_descr->dev_id) {
+				sbridge_printk(KERN_ERR,
+					"Duplicated device for %04x:%04x\n",
+					PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+				pci_dev_put(pdev);
+				kfree(sbridge_dev);
+				return -ENODEV;
+			}
+		}
+		tmp = krealloc(d->pdev, sizeof(*d->pdev) * (++d->n_devs),
+				GFP_KERNEL);
+		if (!tmp) {
+			pci_dev_put(pdev);
+			kfree(sbridge_dev);
+			return -ENOMEM;
+		}
+		d->pdev = tmp;
+		d->pdev[d->n_devs-1] = pdev;
+		i++;
+		/*
+		 * If the pdev device in socket domain(besides being
+		 * shared by imc0) is also shared by imc1, imc2, ...
+		 * we need to do extra gets for it
+		 */
+		if (i > 1)
+			pci_dev_get(pdev);
 	}
-
-	sbridge_dev->pdev[devno] = pdev;
+	kfree(sbridge_dev);
 
 	/* Be sure that the device is enabled */
 	if (unlikely(pci_enable_device(pdev) < 0)) {
@@ -2403,7 +2403,7 @@ static int sbridge_get_all_devices_full(u8 *num_mc,
 			}
 			do {
 				rc = sbridge_get_onedevice(&pdev, num_mc,
-							   table, i, multi_bus);
+							   table, i, allow_dups, multi_bus);
 				if (rc < 0) {
 					if (i == 0) {
 						i = table->n_devs;
@@ -2425,6 +2425,13 @@ static int sbridge_get_all_devices_full(u8 *num_mc,
 #define sbridge_get_all_devices_knl(num_mc, table) \
 		sbridge_get_all_devices_full(num_mc, table, 1, 1)
 
+/*
+ * Device IDs for {SBRIDGE,IBRIDGE,HASWELL,BROADWELL}_IMC_HA0_TAD0 are in
+ * the format: XXXa.  So we can convert from a device to the corresponding
+ * channel like this
+*/
+#define TAD_DEV_TO_CHAN(dev) (((dev) & 0xf) - 0xa)
+
 static int sbridge_mci_bind_devs(struct mem_ctl_info *mci,
 				 struct sbridge_dev *sbridge_dev)
 {
@@ -2449,7 +2456,7 @@ static int sbridge_mci_bind_devs(struct mem_ctl_info *mci,
 			pvt->pci_br0 = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0:
-			pvt->pci_ha0 = pdev;
+			pvt->pci_ha = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA:
 			pvt->pci_ta = pdev;
@@ -2481,7 +2488,7 @@ static int sbridge_mci_bind_devs(struct mem_ctl_info *mci,
 	}
 
 	/* Check if everything were registered */
-	if (!pvt->pci_sad0 || !pvt->pci_sad1 || !pvt->pci_ha0 ||
+	if (!pvt->pci_sad0 || !pvt->pci_sad1 || !pvt->pci_ha ||
 	    !pvt-> pci_tad || !pvt->pci_ras  || !pvt->pci_ta)
 		goto enodev;
 
@@ -2514,19 +2521,26 @@ static int ibridge_mci_bind_devs(struct mem_ctl_info *mci,
 
 		switch (pdev->device) {
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0:
-			pvt->pci_ha0 = pdev;
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1:
+			pvt->pci_ha = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TA:
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TA:
 			pvt->pci_ta = pdev;
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_RAS:
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_RAS:
 			pvt->pci_ras = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD0:
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD1:
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD2:
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD3:
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0:
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD1:
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD2:
+		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD3:
 		{
-			int id = pdev->device - PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA0_TAD0;
+			int id = TAD_DEV_TO_CHAN(pdev->device);
 			pvt->pci_tad[id] = pdev;
 			saw_chan_mask |= 1 << id;
 		}
@@ -2546,19 +2560,6 @@ static int ibridge_mci_bind_devs(struct mem_ctl_info *mci,
 		case PCI_DEVICE_ID_INTEL_IBRIDGE_BR1:
 			pvt->pci_br1 = pdev;
 			break;
-		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1:
-			pvt->pci_ha1 = pdev;
-			break;
-		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0:
-		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD1:
-		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD2:
-		case PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD3:
-		{
-			int id = pdev->device - PCI_DEVICE_ID_INTEL_IBRIDGE_IMC_HA1_TAD0 + 4;
-			pvt->pci_tad[id] = pdev;
-			saw_chan_mask |= 1 << id;
-		}
-			break;
 		default:
 			goto error;
 		}
@@ -2570,14 +2571,13 @@ static int ibridge_mci_bind_devs(struct mem_ctl_info *mci,
 	}
 
 	/* Check if everything were registered */
-	if (!pvt->pci_sad0 || !pvt->pci_ha0 || !pvt->pci_br0 ||
+	if (!pvt->pci_sad0 || !pvt->pci_ha || !pvt->pci_br0 ||
 	    !pvt->pci_br1 || !pvt->pci_tad || !pvt->pci_ras  ||
 	    !pvt->pci_ta)
 		goto enodev;
 
-	if (saw_chan_mask != 0x0f && /* -EN */
-	    saw_chan_mask != 0x33 && /* -EP */
-	    saw_chan_mask != 0xff)   /* -EX */
+	if (saw_chan_mask != 0x0f && /* -EN/-EX */
+	    saw_chan_mask != 0x03)   /* -EP */
 		goto enodev;
 	return 0;
 
@@ -2620,32 +2620,27 @@ static int haswell_mci_bind_devs(struct mem_ctl_info *mci,
 			pvt->pci_sad1 = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0:
-			pvt->pci_ha0 = pdev;
+		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1:
+			pvt->pci_ha = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TA:
+		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TA:
 			pvt->pci_ta = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_THERMAL:
+		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_THERMAL:
 			pvt->pci_ras = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD0:
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD1:
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD2:
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD3:
-		{
-			int id = pdev->device - PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA0_TAD0;
-
-			pvt->pci_tad[id] = pdev;
-			saw_chan_mask |= 1 << id;
-		}
-			break;
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD0:
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD1:
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD2:
 		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD3:
 		{
-			int id = pdev->device - PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TAD0 + 4;
-
+			int id = TAD_DEV_TO_CHAN(pdev->device);
 			pvt->pci_tad[id] = pdev;
 			saw_chan_mask |= 1 << id;
 		}
@@ -2657,12 +2652,6 @@ static int haswell_mci_bind_devs(struct mem_ctl_info *mci,
 			if (!pvt->pci_ddrio)
 				pvt->pci_ddrio = pdev;
 			break;
-		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1:
-			pvt->pci_ha1 = pdev;
-			break;
-		case PCI_DEVICE_ID_INTEL_HASWELL_IMC_HA1_TA:
-			pvt->pci_ha1_ta = pdev;
-			break;
 		default:
 			break;
 		}
@@ -2674,13 +2663,12 @@ static int haswell_mci_bind_devs(struct mem_ctl_info *mci,
 	}
 
 	/* Check if everything were registered */
-	if (!pvt->pci_sad0 || !pvt->pci_ha0 || !pvt->pci_sad1 ||
+	if (!pvt->pci_sad0 || !pvt->pci_ha || !pvt->pci_sad1 ||
 	    !pvt->pci_ras  || !pvt->pci_ta || !pvt->info.pci_vtd)
 		goto enodev;
 
-	if (saw_chan_mask != 0x0f && /* -EN */
-	    saw_chan_mask != 0x33 && /* -EP */
-	    saw_chan_mask != 0xff)   /* -EX */
+	if (saw_chan_mask != 0x0f && /* -EN/-EX */
+	    saw_chan_mask != 0x03)   /* -EP */
 		goto enodev;
 	return 0;
 
@@ -2717,30 +2705,27 @@ static int broadwell_mci_bind_devs(struct mem_ctl_info *mci,
 			pvt->pci_sad1 = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0:
-			pvt->pci_ha0 = pdev;
+		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1:
+			pvt->pci_ha = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TA:
+		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TA:
 			pvt->pci_ta = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_THERMAL:
+		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_THERMAL:
 			pvt->pci_ras = pdev;
 			break;
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD0:
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD1:
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD2:
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD3:
-		{
-			int id = pdev->device - PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA0_TAD0;
-			pvt->pci_tad[id] = pdev;
-			saw_chan_mask |= 1 << id;
-		}
-			break;
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD0:
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD1:
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD2:
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD3:
 		{
-			int id = pdev->device - PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TAD0 + 4;
+			int id = TAD_DEV_TO_CHAN(pdev->device);
 			pvt->pci_tad[id] = pdev;
 			saw_chan_mask |= 1 << id;
 		}
@@ -2748,12 +2733,6 @@ static int broadwell_mci_bind_devs(struct mem_ctl_info *mci,
 		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_DDRIO0:
 			pvt->pci_ddrio = pdev;
 			break;
-		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1:
-			pvt->pci_ha1 = pdev;
-			break;
-		case PCI_DEVICE_ID_INTEL_BROADWELL_IMC_HA1_TA:
-			pvt->pci_ha1_ta = pdev;
-			break;
 		default:
 			break;
 		}
@@ -2765,13 +2744,12 @@ static int broadwell_mci_bind_devs(struct mem_ctl_info *mci,
 	}
 
 	/* Check if everything were registered */
-	if (!pvt->pci_sad0 || !pvt->pci_ha0 || !pvt->pci_sad1 ||
+	if (!pvt->pci_sad0 || !pvt->pci_ha || !pvt->pci_sad1 ||
 	    !pvt->pci_ras  || !pvt->pci_ta || !pvt->info.pci_vtd)
 		goto enodev;
 
-	if (saw_chan_mask != 0x0f && /* -EN */
-	    saw_chan_mask != 0x33 && /* -EP */
-	    saw_chan_mask != 0xff)   /* -EX */
+	if (saw_chan_mask != 0x0f && /* -EN/-EX */
+	    saw_chan_mask != 0x03)   /* -EP */
 		goto enodev;
 	return 0;
 
@@ -3026,7 +3004,7 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 
 	if (rc < 0)
 		goto err_parsing;
-	new_mci = get_mci_for_node_id(socket);
+	new_mci = get_mci_for_node_id(socket, ha);
 	if (!new_mci) {
 		strcpy(msg, "Error: socket got corrupted!");
 		goto err_parsing;
@@ -3073,7 +3051,7 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	/* Call the helper to output message */
 	edac_mc_handle_error(tp_event, mci, core_err_cnt,
 			     m->addr >> PAGE_SHIFT, m->addr & ~PAGE_MASK, 0,
-			     4*ha+channel, dimm, -1,
+			     channel, dimm, -1,
 			     optype, msg);
 	return;
 err_parsing:
@@ -3098,7 +3076,7 @@ static int sbridge_mce_check_error(struct notifier_block *nb, \
unsigned long val,  if (get_edac_report_status() == EDAC_REPORTING_DISABLED)
 		return NOTIFY_DONE;
 
-	mci = get_mci_for_node_id(mce->socketid);
+	mci = get_mci_for_node_id(mce->socketid, IMC0);
 	if (!mci)
 		return NOTIFY_DONE;
 	pvt = mci->pvt_info;
@@ -3178,11 +3156,6 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  struct pci_dev *pdev = sbridge_dev->pdev[0];
 	int rc;
 
-	/* Check the number of active and not disabled channels */
-	rc = check_if_ecc_is_active(sbridge_dev->bus, type);
-	if (unlikely(rc < 0))
-		return rc;
-
 	/* allocate a new MC control structure */
 	layers[0].type = EDAC_MC_LAYER_CHANNEL;
 	layers[0].size = type == KNIGHTS_LANDING ?
@@ -3235,12 +3208,15 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  pvt->info.max_interleave = \
ARRAY_SIZE(ibridge_interleave_list);  pvt->info.interleave_pkg = \
ibridge_interleave_pkg;  pvt->info.get_width = ibridge_get_width;
-		mci->ctl_name = kasprintf(GFP_KERNEL, "Ivy Bridge Socket#%d", mci->mc_idx);
 
 		/* Store pci devices at mci for faster access */
 		rc = ibridge_mci_bind_devs(mci, sbridge_dev);
 		if (unlikely(rc < 0))
 			goto fail0;
+		get_source_id(mci);
+		mci->ctl_name = kasprintf(GFP_KERNEL, "Ivy Bridge SrcID#%d_Ha#%d",
+			pvt->sbridge_dev->source_id, pvt->sbridge_dev->domain);
+
 		break;
 	case SANDY_BRIDGE:
 		pvt->info.rankcfgr = SB_RANK_CFG_A;
@@ -3259,12 +3235,14 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  pvt->info.max_interleave = \
ARRAY_SIZE(sbridge_interleave_list);  pvt->info.interleave_pkg = \
sbridge_interleave_pkg;  pvt->info.get_width = sbridge_get_width;
-		mci->ctl_name = kasprintf(GFP_KERNEL, "Sandy Bridge Socket#%d", mci->mc_idx);
 
 		/* Store pci devices at mci for faster access */
 		rc = sbridge_mci_bind_devs(mci, sbridge_dev);
 		if (unlikely(rc < 0))
 			goto fail0;
+		get_source_id(mci);
+		mci->ctl_name = kasprintf(GFP_KERNEL, "Sandy Bridge SrcID#%d_Ha#%d",
+			pvt->sbridge_dev->source_id, pvt->sbridge_dev->domain);
 		break;
 	case HASWELL:
 		/* rankcfgr isn't used */
@@ -3283,12 +3261,14 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  pvt->info.max_interleave = \
ARRAY_SIZE(ibridge_interleave_list);  pvt->info.interleave_pkg = \
ibridge_interleave_pkg;  pvt->info.get_width = ibridge_get_width;
-		mci->ctl_name = kasprintf(GFP_KERNEL, "Haswell Socket#%d", mci->mc_idx);
 
 		/* Store pci devices at mci for faster access */
 		rc = haswell_mci_bind_devs(mci, sbridge_dev);
 		if (unlikely(rc < 0))
 			goto fail0;
+		get_source_id(mci);
+		mci->ctl_name = kasprintf(GFP_KERNEL, "Haswell SrcID#%d_Ha#%d",
+			pvt->sbridge_dev->source_id, pvt->sbridge_dev->domain);
 		break;
 	case BROADWELL:
 		/* rankcfgr isn't used */
@@ -3307,12 +3287,14 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  pvt->info.max_interleave = \
ARRAY_SIZE(ibridge_interleave_list);  pvt->info.interleave_pkg = \
ibridge_interleave_pkg;  pvt->info.get_width = broadwell_get_width;
-		mci->ctl_name = kasprintf(GFP_KERNEL, "Broadwell Socket#%d", mci->mc_idx);
 
 		/* Store pci devices at mci for faster access */
 		rc = broadwell_mci_bind_devs(mci, sbridge_dev);
 		if (unlikely(rc < 0))
 			goto fail0;
+		get_source_id(mci);
+		mci->ctl_name = kasprintf(GFP_KERNEL, "Broadwell SrcID#%d_Ha#%d",
+			pvt->sbridge_dev->source_id, pvt->sbridge_dev->domain);
 		break;
 	case KNIGHTS_LANDING:
 		/* pvt->info.rankcfgr == ??? */
@@ -3331,17 +3313,23 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  pvt->info.max_interleave = \
ARRAY_SIZE(knl_interleave_list);  pvt->info.interleave_pkg = ibridge_interleave_pkg;
 		pvt->info.get_width = knl_get_width;
-		mci->ctl_name = kasprintf(GFP_KERNEL,
-			"Knights Landing Socket#%d", mci->mc_idx);
 
 		rc = knl_mci_bind_devs(mci, sbridge_dev);
 		if (unlikely(rc < 0))
 			goto fail0;
+		get_source_id(mci);
+		mci->ctl_name = kasprintf(GFP_KERNEL,
+			"Knights Landing SrcID#%d_Ha#%d",
+			pvt->sbridge_dev->source_id, pvt->sbridge_dev->domain);
 		break;
 	}
 
 	/* Get dimm basic config and the memory layout */
-	get_dimm_config(mci);
+	rc = get_dimm_config(mci);
+	if (rc < 0) {
+		edac_dbg(0, "MC: failed to get_dimm_config()\n");
+		goto fail;
+	}
 	get_memory_layout(mci);
 
 	/* record ptr to the generic device */
@@ -3351,13 +3339,14 @@ static int sbridge_register_mci(struct sbridge_dev \
*sbridge_dev, enum type type)  if (unlikely(edac_mc_add_mc(mci))) {
 		edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
 		rc = -EINVAL;
-		goto fail0;
+		goto fail;
 	}
 
 	return 0;
 
-fail0:
+fail:
 	kfree(mci->ctl_name);
+fail0:
 	edac_mc_free(mci);
 	sbridge_dev->mci = NULL;
 	return rc;
-- 
2.9.0.GIT

--
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic