Message ID | 6a462190-0af2-094a-daa8-f480d54a1fbf@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v9,1/2] dt-bindings: edac: arm-dmc520.txt |
On Wed, Jan 15, 2020 at 06:32:33AM -0800, Shiping Ji wrote:
> New driver supports error detection and correction on the devices with ARM
> DMC-520 memory controller.
>
> Signed-off-by: Shiping Ji <shiping.linux@gmail.com>
> Signed-off-by: Lei Wang <leiwang_git@outlook.com>
> Reviewed-by: James Morse <james.morse@arm.com>

This mail still has your From: because I guess you pasted the patch in
the mail.

But, if you look at what I wrote here:

https://lkml.kernel.org/r/20200107195606.GM29542@zn.tnic

you'll see the

From: Lei Wang <leiwang_git@outlook.com>

which is the last From: in the mail and that is taken by git as the
author of the patch.

However, if I apply this mail of yours, it will make you the
author. Because in git there can be only one author per patch
and other authors can be additionally accredited with the
Co-developed-by: tag from the same doc I was pointing at before:
Documentation/process/submitting-patches.rst

Looking at this driver, however, you have supplied three authors. And I
think you guys need to discuss it amongst yourselves who is going to be
the author of this driver in the git history.

If there are more questions, I'm pretty sure Sasha would be glad to
explain to you how the whole authorship thing works and what the
implications are.

Thx.

On 1/15/2020 1:38 PM, Borislav Petkov wrote:
> On Wed, Jan 15, 2020 at 06:32:33AM -0800, Shiping Ji wrote:
>> New driver supports error detection and correction on the devices with ARM
>> DMC-520 memory controller.
>>
>> Signed-off-by: Shiping Ji <shiping.linux@gmail.com>
>> Signed-off-by: Lei Wang <leiwang_git@outlook.com>
>> Reviewed-by: James Morse <james.morse@arm.com>
>
> This mail still has your From: because I guess you pasted the patch in
> the mail.
>
> But, if you look at what I wrote here:
>
> https://lkml.kernel.org/r/20200107195606.GM29542@zn.tnic
>
> you'll see the
>
> From: Lei Wang <leiwang_git@outlook.com>
>
> which is the last From: in the mail and that is taken by git as the
> author of the patch.
>
> However, if I apply this mail of yours, it will make you the
> author. Because in git there can be only one author per patch
> and other authors can be additionally accredited with the
> Co-developed-by: tag from the same doc I was pointing at before:
> Documentation/process/submitting-patches.rst
>
> Looking at this driver, however, you have supplied three authors. And I
> think you guys need to discuss it amongst yourselves who is going to be
> the author of this driver in the git history.

Lei will be the author of this driver in the git history. I could ask
her to send the patch again if that's the correct way to go. Please
confirm.

> If there are more questions, I'm pretty sure Sasha would be glad to
> explain to you how the whole authorship thing works and what the
> implications are.

Thanks, Sasha is currently OOF until April 19th.

--
Best regards,
Shiping Ji

On Wed, Jan 15, 2020 at 01:49:56PM -0800, Shiping Ji wrote:
> Lei will be the author of this driver in the git history. I could ask
> her to send the patch again if that's the correct way to go. Please
> confirm.

No need - you only have to send the patch with her From: at the
beginning. Btw, you make her an author in git by doing:

git commit --amend --author="Lei Wang <leiwang_git@outlook.com>"

But before you send again, let me take a look at the rest of the patch
first, tomorrow most likely.

Thx.

On 1/15/2020 2:05 PM, Borislav Petkov wrote:
> On Wed, Jan 15, 2020 at 01:49:56PM -0800, Shiping Ji wrote:
>> Lei will be the author of this driver in the git history. I could ask
>> her to send the patch again if that's the correct way to go. Please
>> confirm.
>
> No need - you only have to send the patch with her From: at the
> beginning. Btw, you make her an author in git by doing:
>
> git commit --amend --author="Lei Wang <leiwang_git@outlook.com>"
>
> But before you send again, let me take a look at the rest of the patch
> first, tomorrow most likely.
>
> Thx.
>

Got it, I will have the following next:

From: Lei Wang <leiwang_git@outlook.com>

<commit message>

Signed-off-by: Lei Wang <leiwang_git@outlook.com>
Signed-off-by: me

--
Best regards,
Shiping Ji

On Wed, Jan 15, 2020 at 06:32:33AM -0800, Shiping Ji wrote: > New driver supports error detection and correction on the devices with ARM > DMC-520 memory controller. > > Signed-off-by: Shiping Ji <shiping.linux@gmail.com> > Signed-off-by: Lei Wang <leiwang_git@outlook.com> > Reviewed-by: James Morse <james.morse@arm.com> > > --- > Changes in v9: > - Removed interrupt-config and replaced with an interrupt map where names and masks are predefined > - Only one ISR function is defined, mask is retrieved from the interrupt map > - "dram_ecc_errc" and "dram_ecc_errd" are implemented > > --- > MAINTAINERS | 6 + > drivers/edac/Kconfig | 7 + > drivers/edac/Makefile | 1 + > drivers/edac/dmc520_edac.c | 670 +++++++++++++++++++++++++++++++++++++ > 4 files changed, 684 insertions(+) > create mode 100644 drivers/edac/dmc520_edac.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index bd5847e802de..386195a019c6 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -5914,6 +5914,12 @@ F: Documentation/driver-api/edac.rst > F: drivers/edac/ > F: include/linux/edac.h > > +EDAC-DMC520 > +M: Lei Wang <lewan@microsoft.com> > +L: linux-edac@vger.kernel.org > +S: Supported > +F: drivers/edac/dmc520_edac.c > + > EDAC-E752X > M: Mark Gross <mark.gross@intel.com> > L: linux-edac@vger.kernel.org > diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig > index 37027c298323..7305bd1ec80e 100644 > --- a/drivers/edac/Kconfig > +++ b/drivers/edac/Kconfig > @@ -523,4 +523,11 @@ config EDAC_BLUEFIELD > Support for error detection and correction on the > Mellanox BlueField SoCs. > > +config EDAC_DMC520 > + tristate "ARM DMC-520 ECC" > + depends on ARM64 > + help > + Support for error detection and correction on the > + SoCs with ARM DMC-520 DRAM controller. 
> + > endif # EDAC > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile > index d77200c9680b..269e15118cea 100644 > --- a/drivers/edac/Makefile > +++ b/drivers/edac/Makefile > @@ -87,3 +87,4 @@ obj-$(CONFIG_EDAC_TI) += ti_edac.o > obj-$(CONFIG_EDAC_QCOM) += qcom_edac.o > obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o > obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o > +obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o > diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c > new file mode 100644 > index 000000000000..55237c5c522c > --- /dev/null > +++ b/drivers/edac/dmc520_edac.c > @@ -0,0 +1,670 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +/* > + * EDAC driver for DMC-520 memory controller. > + * > + * The driver supports 10 interrupt lines, > + * though only dram_ecc_errc and dram_ecc_errd are currently handled. > + * > + * Authors: Rui Zhao <ruizhao@microsoft.com> > + * Lei Wang <lewan@microsoft.com> > + * Shiping Ji <shji@microsoft.com> > + */ > + > +#include <linux/bitfield.h> > +#include <linux/edac.h> > +#include <linux/interrupt.h> > +#include <linux/io.h> > +#include <linux/module.h> > +#include <linux/of.h> > +#include <linux/platform_device.h> > +#include <linux/slab.h> > +#include <linux/spinlock.h> > +#include "edac_mc.h" > + > +/* DMC-520 registers */ > +#define REG_OFFSET_FEATURE_CONFIG 0x130 > +#define REG_OFFSET_ECC_ERRC_COUNT_31_00 0x158 > +#define REG_OFFSET_ECC_ERRC_COUNT_63_32 0x15C > +#define REG_OFFSET_ECC_ERRD_COUNT_31_00 0x160 > +#define REG_OFFSET_ECC_ERRD_COUNT_63_32 0x164 > +#define REG_OFFSET_INTERRUPT_CONTROL 0x500 > +#define REG_OFFSET_INTERRUPT_CLR 0x508 > +#define REG_OFFSET_INTERRUPT_STATUS 0x510 > +#define REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00 0x528 > +#define REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32 0x52C > +#define REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00 0x530 > +#define REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32 0x534 > +#define REG_OFFSET_ADDRESS_CONTROL_NOW 0x1010 > +#define REG_OFFSET_MEMORY_TYPE_NOW 0x1128 > 
+#define REG_OFFSET_SCRUB_CONTROL0_NOW 0x1170 > +#define REG_OFFSET_FORMAT_CONTROL 0x18 > + > +/* DMC-520 types, masks and bitfields */ > +#define RAM_ECC_INT_CE_BIT BIT(0) > +#define RAM_ECC_INT_UE_BIT BIT(1) > +#define DRAM_ECC_INT_CE_BIT BIT(2) > +#define DRAM_ECC_INT_UE_BIT BIT(3) > +#define FAILED_ACCESS_INT_BIT BIT(4) > +#define FAILED_PROG_INT_BIT BIT(5) > +#define LINK_ERR_INT_BIT BIT(6) > +#define TEMPERATURE_EVENT_INT_BIT BIT(7) Align values vertically. > +#define ARCH_FSM_INT_BIT BIT(8) > +#define PHY_REQUEST_INT_BIT BIT(9) > +#define MEMORY_WIDTH_MASK GENMASK(1, 0) > +#define SCRUB_TRIGGER0_NEXT_MASK GENMASK(1, 0) > +#define REG_FIELD_DRAM_ECC_ENABLED GENMASK(1, 0) > +#define REG_FIELD_MEMORY_TYPE GENMASK(2, 0) > +#define REG_FIELD_DEVICE_WIDTH GENMASK(9, 8) > +#define REG_FIELD_ADDRESS_CONTROL_COL GENMASK(2, 0) > +#define REG_FIELD_ADDRESS_CONTROL_ROW GENMASK(10, 8) > +#define REG_FIELD_ADDRESS_CONTROL_BANK GENMASK(18, 16) > +#define REG_FIELD_ADDRESS_CONTROL_RANK GENMASK(25, 24) > +#define REG_FIELD_ERR_INFO_LOW_VALID BIT(0) > +#define REG_FIELD_ERR_INFO_LOW_COL GENMASK(10, 1) > +#define REG_FIELD_ERR_INFO_LOW_ROW GENMASK(28, 11) > +#define REG_FIELD_ERR_INFO_LOW_RANK GENMASK(31, 29) > +#define REG_FIELD_ERR_INFO_HIGH_BANK GENMASK(3, 0) > +#define REG_FIELD_ERR_INFO_HIGH_VALID BIT(31) > + > +#define DRAM_ADDRESS_CONTROL_MIN_COL_BITS 8 > +#define DRAM_ADDRESS_CONTROL_MIN_ROW_BITS 11 > + > +#define DMC520_SCRUB_TRIGGER_ERR_DETECT 2 > +#define DMC520_SCRUB_TRIGGER_IDLE 3 > + > +/* Driver settings */ > +/* > + * The max-length message would be: "rank:7 bank:15 row:262143 col:1023". > + * Max length is 34. Using a 40-size buffer is enough. > + */ > +#define DMC520_MSG_BUF_SIZE 40 > +#define EDAC_MOD_NAME "dmc520-edac" > +#define EDAC_CTL_NAME "dmc520" > + > +/* the data bus width for the attached memory chips. 
*/ > +enum dmc520_mem_width { > + MEM_WIDTH_X32 = 2, > + MEM_WIDTH_X64 = 3 > +}; > + > +/* memory type */ > +enum dmc520_mem_type { > + MEM_TYPE_DDR3 = 1, > + MEM_TYPE_DDR4 = 2 > +}; > + > +/* memory device width */ > +enum dmc520_dev_width { > + DEV_WIDTH_X4 = 0, > + DEV_WIDTH_X8 = 1, > + DEV_WIDTH_X16 = 2 > +}; > + > +struct ecc_error_info { > + u32 col; > + u32 row; > + u32 bank; > + u32 rank; > +}; > + > +/* The interrupt config */ > +struct dmc520_irq_config { > + char *name; > + int mask; > +}; > + > +/* The interrupt mappings */ > +static struct dmc520_irq_config dmc520_irq_configs[] = { > + { Just a nit: ERROR: trailing whitespace #209: FILE: drivers/edac/dmc520_edac.c:119: +^I{ $ IOW, before you send next time, do: $ git log -p -1 | ./scripts/checkpatch.pl to verify you've caught them all. > + .name = "ram_ecc_errc", > + .mask = RAM_ECC_INT_CE_BIT > + }, > + { > + .name = "ram_ecc_errd", > + .mask = RAM_ECC_INT_UE_BIT > + }, > + { > + .name = "dram_ecc_errc", > + .mask = DRAM_ECC_INT_CE_BIT > + }, > + { > + .name = "dram_ecc_errd", > + .mask = DRAM_ECC_INT_UE_BIT > + }, > + { > + .name = "failed_access", > + .mask = FAILED_ACCESS_INT_BIT > + }, > + { > + .name = "failed_prog", > + .mask = FAILED_PROG_INT_BIT > + }, > + { > + .name = "link_err", > + .mask = LINK_ERR_INT_BIT > + }, > + { > + .name = "temperature_event", > + .mask = TEMPERATURE_EVENT_INT_BIT > + }, > + { > + .name = "arch_fsm", > + .mask = ARCH_FSM_INT_BIT > + }, > + { > + .name = "phy_request", > + .mask = PHY_REQUEST_INT_BIT > + } > +}; > + > +#define NUMBER_OF_IRQS ARRAY_SIZE(dmc520_irq_configs) WARNING: please, no space before tabs #251: FILE: drivers/edac/dmc520_edac.c:161: +#define NUMBER_OF_IRQS ^I^I^I^IARRAY_SIZE(dmc520_irq_configs)$ > + > +/* The EDAC driver private data */ > +struct dmc520_edac { > + void __iomem *reg_base; > + spinlock_t ecc_lock; What does that spinlock protect? Also, its name is not very optimal. 
> + int irqs[NUMBER_OF_IRQS]; > + int masks[NUMBER_OF_IRQS]; > +}; > + > +static int dmc520_mc_idx; > + > +static irqreturn_t > +dmc520_edac_dram_all_isr(int irq, struct mem_ctl_info *mci, u32 irq_mask); Move the ISR under dmc520_edac_dram_all_isr() and get rid of that forward declaration. > + > +static irqreturn_t dmc520_isr(int irq, void *data) > +{ > + struct mem_ctl_info *mci; > + struct dmc520_edac *edac; > + int idx; > + u32 mask = 0; > + mci = data; WARNING: Missing a blank line after declarations #272: FILE: drivers/edac/dmc520_edac.c:182: + u32 mask = 0; + mci = data; > + edac = mci->pvt_info; Also, do this: struct mem_ctl_info *mci = data; struct dmc520_edac *pvt = mci->pvt_info; u32 mask = 0; int idx; > + > + for (idx = 0; idx < NUMBER_OF_IRQS; idx++) { > + if (edac->irqs[idx] == irq) { > + mask = edac->masks[idx]; > + break; > + } > + } > + return dmc520_edac_dram_all_isr(irq, mci, mask); > +} > + > +static u32 dmc520_read_reg(struct dmc520_edac *edac, u32 offset) > +{ > + return readl(edac->reg_base + offset); > +} > + > +static void dmc520_write_reg(struct dmc520_edac *edac, u32 val, u32 offset) > +{ > + writel(val, edac->reg_base + offset); > +} > + > +static u32 dmc520_calc_dram_ecc_error(u32 value) > +{ > + u32 total = 0; > + > + /* Each rank's error counter takes one byte. */ > + while (value > 0) { > + total += (value & 0xFF); > + value >>= 8; > + } > + return total; > +} > + > +static u32 dmc520_get_dram_ecc_error_count(struct dmc520_edac *edac, > + bool is_ce) > +{ > + u32 reg_offset_low, reg_offset_high; > + u32 err_low, err_high; > + u32 err_count; > + > + reg_offset_low = is_ce ? REG_OFFSET_ECC_ERRC_COUNT_31_00 : > + REG_OFFSET_ECC_ERRD_COUNT_31_00; > + reg_offset_high = is_ce ? 
REG_OFFSET_ECC_ERRC_COUNT_63_32 : > + REG_OFFSET_ECC_ERRD_COUNT_63_32; > + > + err_low = dmc520_read_reg(edac, reg_offset_low); > + err_high = dmc520_read_reg(edac, reg_offset_high); > + /* Reset error counters */ > + dmc520_write_reg(edac, 0, reg_offset_low); > + dmc520_write_reg(edac, 0, reg_offset_high); > + > + err_count = dmc520_calc_dram_ecc_error(err_low) + > + dmc520_calc_dram_ecc_error(err_high); > + > + return err_count; > +} > + > +static void dmc520_get_dram_ecc_error_info(struct dmc520_edac *edac, > + bool is_ce, > + struct ecc_error_info *info) > +{ > + u32 reg_offset_low, reg_offset_high; > + u32 reg_val_low, reg_val_high; > + bool valid; > + > + reg_offset_low = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00 : > + REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00; > + reg_offset_high = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32 : > + REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32; Those define names could be shorter. > + > + reg_val_low = dmc520_read_reg(edac, reg_offset_low); > + reg_val_high = dmc520_read_reg(edac, reg_offset_high); > + > + valid = (FIELD_GET(REG_FIELD_ERR_INFO_LOW_VALID, reg_val_low) != 0) && > + (FIELD_GET(REG_FIELD_ERR_INFO_HIGH_VALID, reg_val_high) != 0); > + > + if (valid) { > + info->col = > + FIELD_GET(REG_FIELD_ERR_INFO_LOW_COL, reg_val_low); > + info->row = > + FIELD_GET(REG_FIELD_ERR_INFO_LOW_ROW, reg_val_low); > + info->rank = > + FIELD_GET(REG_FIELD_ERR_INFO_LOW_RANK, reg_val_low); > + info->bank = > + FIELD_GET(REG_FIELD_ERR_INFO_HIGH_BANK, reg_val_high); Those too. And let those stick out - the 80 cols rule is not an absolute one. 
> + } else { > + memset(info, 0, sizeof(struct ecc_error_info)); > + } > +} > + > +static bool dmc520_is_ecc_enabled(void __iomem *reg_base) > +{ > + u32 reg_val = readl(reg_base + REG_OFFSET_FEATURE_CONFIG); > + > + return FIELD_GET(REG_FIELD_DRAM_ECC_ENABLED, reg_val); > +} > + > +static enum scrub_type dmc520_get_scrub_type(struct dmc520_edac *edac) > +{ > + enum scrub_type type = SCRUB_NONE; > + u32 reg_val, scrub_cfg; > + > + reg_val = dmc520_read_reg(edac, REG_OFFSET_SCRUB_CONTROL0_NOW); > + scrub_cfg = FIELD_GET(SCRUB_TRIGGER0_NEXT_MASK, reg_val); > + > + if (scrub_cfg == DMC520_SCRUB_TRIGGER_ERR_DETECT || > + scrub_cfg == DMC520_SCRUB_TRIGGER_IDLE) > + type = SCRUB_HW_PROG; > + > + return type; > +} > + > +/* Get the memory data bus width, in number of bytes. */ > +static u32 dmc520_get_memory_width(struct dmc520_edac *edac) > +{ > + enum dmc520_mem_width mem_width_field; > + static u32 mem_width_in_bytes; > + u32 reg_val; > + > + if (mem_width_in_bytes) > + return mem_width_in_bytes; This looks like the memory width in bytes is the same for the whole device so you can do this determination in the probe function and make that mem_width_in_bytes variable global and define it at the beginning of this file. > + reg_val = dmc520_read_reg(edac, REG_OFFSET_FORMAT_CONTROL); > + mem_width_field = FIELD_GET(MEMORY_WIDTH_MASK, reg_val); > + > + if (mem_width_field == MEM_WIDTH_X32) > + mem_width_in_bytes = 4; > + else if (mem_width_field == MEM_WIDTH_X64) > + mem_width_in_bytes = 8; > + return mem_width_in_bytes; > +} ... > + > +static int dmc520_edac_probe(struct platform_device *pdev) > +{ > + bool registered[NUMBER_OF_IRQS] = {false}; > + int irqs[NUMBER_OF_IRQS] = {-ENXIO}; > + int masks[NUMBER_OF_IRQS] = {0}; Add spaces around those initializers. E.g., { -ENXIO }; etc. > + struct mem_ctl_info *mci = NULL; Useless initialization. 
> + struct edac_mc_layer layers[1]; > + struct dmc520_edac *edac; drivers/edac/dmc520_edac.c: In function ‘dmc520_edac_probe’: drivers/edac/dmc520_edac.c:493:22: warning: ‘edac’ may be used uninitialized in this function [-Wmaybe-uninitialized] struct dmc520_edac *edac; ^~~~ Also, that variable name is a bad choice. Most drivers call it "pvt" or so to mean, driver's "private data". > + void __iomem *reg_base; > + u32 irq_mask_all = 0; > + int ret, idx, irq; > + struct resource *res; > + struct device *dev; > + u32 reg_val; > + > + /* Parse the device node */ > + dev = &pdev->dev; > + > + for (idx = 0; idx < NUMBER_OF_IRQS; idx++) { > + irq = platform_get_irq_byname(pdev, dmc520_irq_configs[idx].name); > + irqs[idx] = irq; > + masks[idx] = dmc520_irq_configs[idx].mask; > + if (irq >= 0) { > + irq_mask_all |= dmc520_irq_configs[idx].mask; > + edac_printk(KERN_INFO, EDAC_MOD_NAME, > + "Discovered %s, irq: %d.\n", dmc520_irq_configs[idx].name, irq); Is that something you really wanna say on driver load? I.e., should it be edac_dbg() ? 
> + } > + } > + > + if (irq_mask_all == 0) { if (!irq_mask_all) > + edac_printk(KERN_ERR, EDAC_MOD_NAME, > + "At least one valid interrupt line is expected.\n"); > + return -EINVAL; > + } > + > + /* Initialize dmc520 edac */ > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > + reg_base = devm_ioremap_resource(dev, res); > + if (IS_ERR(reg_base)) > + return PTR_ERR(reg_base); > + > + if (!dmc520_is_ecc_enabled(reg_base)) > + return -ENXIO; > + > + layers[0].type = EDAC_MC_LAYER_CHIP_SELECT; > + layers[0].size = dmc520_get_rank_count(reg_base); > + layers[0].is_virt_csrow = true; > + > + mci = edac_mc_alloc(dmc520_mc_idx++, ARRAY_SIZE(layers), layers, > + sizeof(struct dmc520_edac)); > + if (!mci) { > + edac_printk(KERN_ERR, EDAC_MOD_NAME, > + "Failed to allocate memory for mc instance\n"); > + ret = -ENOMEM; > + goto err; > + } > + > + edac = mci->pvt_info; > + > + edac->reg_base = reg_base; > + spin_lock_init(&edac->ecc_lock); > + memcpy(edac->irqs, irqs, sizeof(irqs)); > + memcpy(edac->masks, masks, sizeof(masks)); > + > + platform_set_drvdata(pdev, mci); > + > + mci->pdev = dev; > + mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR4; > + mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED; > + mci->edac_cap = EDAC_FLAG_SECDED; > + mci->scrub_cap = SCRUB_FLAG_HW_SRC; > + mci->scrub_mode = dmc520_get_scrub_type(edac); > + mci->ctl_name = EDAC_CTL_NAME; > + mci->dev_name = dev_name(mci->pdev); > + mci->mod_name = EDAC_MOD_NAME; > + > + edac_op_state = EDAC_OPSTATE_INT; > + > + dmc520_init_csrow(mci); > + > + /* Clear interrupts, not affecting other unrelated interrupts */ > + reg_val = dmc520_read_reg(edac, REG_OFFSET_INTERRUPT_CONTROL); > + dmc520_write_reg(edac, reg_val & (~irq_mask_all), > + REG_OFFSET_INTERRUPT_CONTROL); > + dmc520_write_reg(edac, irq_mask_all, REG_OFFSET_INTERRUPT_CLR); > + > + for (idx = 0; idx < NUMBER_OF_IRQS; idx++) { > + irq = irqs[idx]; > + if (irq >= 0) { > + ret = devm_request_irq(&pdev->dev, irq, > + dmc520_isr, IRQF_SHARED, > + 
dev_name(&pdev->dev), mci); Align arguments on the opening brace. > + if (ret < 0) { > + edac_printk(KERN_ERR, EDAC_MC, > + "Failed to request irq %d\n", irq); Ditto. > + goto err; > + } > + registered[idx] = true; > + } > + } > + > + /* Reset DRAM CE/UE counters */ > + if (irq_mask_all & DRAM_ECC_INT_CE_BIT) > + dmc520_get_dram_ecc_error_count(edac, true); > + > + if (irq_mask_all & DRAM_ECC_INT_UE_BIT) > + dmc520_get_dram_ecc_error_count(edac, false); > + > + ret = edac_mc_add_mc(mci); > + if (ret) { > + edac_printk(KERN_ERR, EDAC_MOD_NAME, > + "Failed to register with EDAC core\n"); > + goto err; > + } > + > + /* Enable interrupts, not affecting other unrelated interrupts */ > + dmc520_write_reg(edac, reg_val | irq_mask_all, > + REG_OFFSET_INTERRUPT_CONTROL); > + > + return 0; > + > +err: > + for (idx = 0; idx < NUMBER_OF_IRQS; idx++) { > + if (registered[idx]) { > + devm_free_irq(&pdev->dev, edac->irqs[idx], mci); > + } WARNING: braces {} are not necessary for single statement blocks #699: FILE: drivers/edac/dmc520_edac.c:609: + if (registered[idx]) { + devm_free_irq(&pdev->dev, edac->irqs[idx], mci); + } > + } > + if (mci) > + edac_mc_free(mci); > + > + return ret; > +} > + > +static int dmc520_edac_remove(struct platform_device *pdev) > +{ > + struct dmc520_edac *edac; > + struct mem_ctl_info *mci; > + u32 reg_val, idx, irq_mask_all = 0; Please sort function local variables declaration in a reverse christmas tree order: <type A> longest_variable_name; <type B> shorter_var_name; <type C> even_shorter; <type D> i; > + > + mci = platform_get_drvdata(pdev); > + edac = mci->pvt_info; > + > + /* free irq's */ > + for (idx = 0; idx < NUMBER_OF_IRQS; idx++) { > + if (edac->irqs[idx] >= 0) { > + irq_mask_all |= edac->masks[idx]; > + devm_free_irq(&pdev->dev, edac->irqs[idx], mci); > + } > + } > + > + /* Disable interrupts */ > + reg_val = dmc520_read_reg(edac, REG_OFFSET_INTERRUPT_CONTROL); > + dmc520_write_reg(edac, reg_val & (~irq_mask_all), > + 
REG_OFFSET_INTERRUPT_CONTROL); Huh, aren't you supposed to disable the interrupts and *then* free the IRQs? > + > + edac_mc_del_mc(&pdev->dev); > + edac_mc_free(mci); > + > + return 0; > +} > + > +static const struct of_device_id dmc520_edac_driver_id[] = { > + { .compatible = "arm,dmc-520", }, > + { /* end of table */ } > +}; > + > +MODULE_DEVICE_TABLE(of, dmc520_edac_driver_id); > + > +static struct platform_driver dmc520_edac_driver = { > + .driver = { > + .name = "dmc520", > + .of_match_table = dmc520_edac_driver_id, > + }, > + > + .probe = dmc520_edac_probe, > + .remove = dmc520_edac_remove > +}; > + > +module_platform_driver(dmc520_edac_driver); > + > +MODULE_AUTHOR("Rui Zhao <ruizhao@microsoft.com>"); > +MODULE_AUTHOR("Lei Wang <lewan@microsoft.com>"); > +MODULE_AUTHOR("Shiping Ji <shji@microsoft.com>"); > +MODULE_DESCRIPTION("DMC-520 ECC driver"); > +MODULE_LICENSE("GPL v2"); > -- > 2.17.1 >
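On the ordering question raised at the end of the review (disable the interrupts first, then free the IRQs), here is a minimal userspace sketch, not kernel code. Everything in it (`mock_dmc`, `mock_free_irq`, `mock_remove` and the field names) is a hypothetical stand-in made up for illustration: a plain integer plays the role of the interrupt control register and no real EDAC or IRQ API is called. It only shows the order of the two steps.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware and the IRQ core; not kernel APIs. */
#define MOCK_NUM_IRQS 3

struct mock_dmc {
	uint32_t interrupt_control;      /* models the interrupt control register */
	int irqs[MOCK_NUM_IRQS];         /* a negative value means "line not present" */
	uint32_t masks[MOCK_NUM_IRQS];   /* control bit belonging to each line */
	int freed[MOCK_NUM_IRQS];        /* set to 1 once the line has been released */
	uint32_t control_at_first_free;  /* register snapshot taken at the first free */
	int any_freed;
};

/* Record the release of one line and, on the first call, the register state. */
void mock_free_irq(struct mock_dmc *dmc, size_t idx)
{
	if (!dmc->any_freed) {
		dmc->control_at_first_free = dmc->interrupt_control;
		dmc->any_freed = 1;
	}
	dmc->freed[idx] = 1;
}

void mock_remove(struct mock_dmc *dmc)
{
	uint32_t irq_mask_all = 0;
	size_t idx;

	/* Collect the control bits of every line that is actually present. */
	for (idx = 0; idx < MOCK_NUM_IRQS; idx++)
		if (dmc->irqs[idx] >= 0)
			irq_mask_all |= dmc->masks[idx];

	/* Step 1: disable the sources, leaving unrelated bits untouched. */
	dmc->interrupt_control &= ~irq_mask_all;

	/* Step 2: only now release the lines. */
	for (idx = 0; idx < MOCK_NUM_IRQS; idx++)
		if (dmc->irqs[idx] >= 0)
			mock_free_irq(dmc, idx);
}
```

With this ordering, the snapshot taken inside `mock_free_irq()` already has the driver's bits cleared, so in the model no source is still enabled while its handler is being torn down. Whether the real hardware needs anything beyond this (for example clearing pending status) is not covered by the sketch.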
Hi Shiping,

Here is another small change to clean up.

On 2020-01-15 6:32 a.m., Shiping Ji wrote:
> New driver supports error detection and correction on the devices with ARM
> DMC-520 memory controller.
>
> Signed-off-by: Shiping Ji <shiping.linux@gmail.com>
> Signed-off-by: Lei Wang <leiwang_git@outlook.com>
> Reviewed-by: James Morse <james.morse@arm.com>
>
> ---
> Changes in v9:
> - Removed interrupt-config and replaced with an interrupt map where names and masks are predefined
> - Only one ISR function is defined, mask is retrieved from the interrupt map
> - "dram_ecc_errc" and "dram_ecc_errd" are implemented
>
> ---
> +static void dmc520_get_dram_ecc_error_info(struct dmc520_edac *edac,
> +					   bool is_ce,
> +					   struct ecc_error_info *info)
> +{
> +	u32 reg_offset_low, reg_offset_high;
> +	u32 reg_val_low, reg_val_high;
> +	bool valid;
> +
> +	reg_offset_low = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00 :
> +				 REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00;
> +	reg_offset_high = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32 :
> +				  REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32;
> +
> +	reg_val_low = dmc520_read_reg(edac, reg_offset_low);
> +	reg_val_high = dmc520_read_reg(edac, reg_offset_high);
> +
> +	valid = (FIELD_GET(REG_FIELD_ERR_INFO_LOW_VALID, reg_val_low) != 0) &&
> +		(FIELD_GET(REG_FIELD_ERR_INFO_HIGH_VALID, reg_val_high) != 0);
> +
> +	if (valid) {
> +		info->col =
> +			FIELD_GET(REG_FIELD_ERR_INFO_LOW_COL, reg_val_low);
> +		info->row =
> +			FIELD_GET(REG_FIELD_ERR_INFO_LOW_ROW, reg_val_low);
> +		info->rank =
> +			FIELD_GET(REG_FIELD_ERR_INFO_LOW_RANK, reg_val_low);
> +		info->bank =
> +			FIELD_GET(REG_FIELD_ERR_INFO_HIGH_BANK, reg_val_high);
> +	} else {
> +		memset(info, 0, sizeof(struct ecc_error_info));
This should be sizeof(*info), not sizeof(struct ecc_error_info)
for better programming to allow info to change type in the future
without the code changing.
> +	}
> +}
> +
>

On 1/16/2020 4:31 PM, Scott Branden wrote:
> Hi Shiping,
>
> Here is another small change to clean up.
>> +	} else {
>> +		memset(info, 0, sizeof(struct ecc_error_info));
> This should be sizeof(*info), not sizeof(struct ecc_error_info)
> for better programming to allow info to change type in the future
> without the code changing.

Yes, two occurrences will be replaced in the next patch, thanks!

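For reference, the sizeof(*info) form discussed above can be sketched in plain userspace C. This is only an illustration under assumptions: `uint32_t` stands in for the kernel's `u32`, and `clear_error_info()` is a hypothetical helper name that does not appear in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Same field layout as the patch's struct, with uint32_t standing in for u32. */
struct ecc_error_info {
	uint32_t col;
	uint32_t row;
	uint32_t bank;
	uint32_t rank;
};

/* Hypothetical helper: zero whatever object 'info' points to. */
void clear_error_info(struct ecc_error_info *info)
{
	/*
	 * sizeof(*info) is derived from the pointer's own type, so the size
	 * stays correct if the type of 'info' is changed later.
	 */
	memset(info, 0, sizeof(*info));
}
```

Both spellings produce the same size today; the difference only shows up when the declaration of `info` changes and a hard-coded `sizeof(struct ...)` is left behind.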
On 1/16/2020 4:18 PM, Borislav Petkov wrote:
>> +/* The EDAC driver private data */
>> +struct dmc520_edac {
>> +	void __iomem *reg_base;
>> +	spinlock_t ecc_lock;
>
> What does that spinlock protect? Also, its name is not very optimal.

This is to protect concurrent writes to the mci->error_desc as
suggested by James when reviewing the patch v3.

>> +	reg_offset_low = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00 :
>> +				 REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00;
>> +	reg_offset_high = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32 :
>> +				  REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32;
>
> Those define names could be shorter.

I'm trying to find a good scheme to make them shorter, at the moment
they are named according to the TRM.

>> +		if (irq >= 0) {
>> +			ret = devm_request_irq(&pdev->dev, irq,
>> +					dmc520_isr, IRQF_SHARED,
>> +					dev_name(&pdev->dev), mci);
>
> Align arguments on the opening brace.

I'm not sure how this can be done perfectly with tabs only :)

All other comments have been addressed in the next patch, many thanks!

On 2020-01-17 10:31 a.m., Shiping Ji wrote:
>
>>> +		if (irq >= 0) {
>>> +			ret = devm_request_irq(&pdev->dev, irq,
>>> +					dmc520_isr, IRQF_SHARED,
>>> +					dev_name(&pdev->dev), mci);
>> Align arguments on the opening brace.
> I'm not sure how this can be done perfectly with tabs only :)

Tabs are used first, followed by however many spaces (fewer than 8) are
needed to line up at the end.

>
> All other comments have been addressed in the next patch, many thanks!
>

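The tabs-then-spaces rule can be written down as a small calculation. The sketch below assumes 8-column tab stops, as in the kernel coding style; `struct indent`, `continuation_indent()` and `TAB_WIDTH` are hypothetical names used only for this illustration.

```c
#define TAB_WIDTH 8	/* assumed tab stop width */

struct indent {
	int tabs;	/* leading tab characters */
	int spaces;	/* trailing spaces, always fewer than TAB_WIDTH */
};

/*
 * Given the 0-based column at which a continuation line should start
 * (the column just after the opening parenthesis), return how many tabs
 * and then how many spaces are needed to reach it.
 */
struct indent continuation_indent(int column)
{
	struct indent in = { column / TAB_WIDTH, column % TAB_WIDTH };

	return in;
}
```

As an example, a call that starts three tabs deep with `ret = devm_request_irq(` has its first argument at column 24 + 23 = 47, which works out to five tabs followed by seven spaces.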
On Fri, Jan 17, 2020 at 10:31:18AM -0800, Shiping Ji wrote:
> This is to protect concurrent writes to the mci->error_desc as
> suggested by James when reviewing the patch v3.

Please comment that in the structure definition so that it is clear what
it is for.

> I'm trying to find a good scheme to make them shorter, at the moment
> they are named according to the TRM.

Yeah, keeping it the same as the documentation is also a good idea. I
leave it up to you to decide as you'll be staring at that code when bugs
happen. :)

> I'm not sure how this can be done perfectly with tabs only :)

Who says you should use only tabs? :-)

> All other comments have been addressed in the next patch, many thanks!

Thanks too.

diff --git a/MAINTAINERS b/MAINTAINERS index bd5847e802de..386195a019c6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5914,6 +5914,12 @@ F: Documentation/driver-api/edac.rst F: drivers/edac/ F: include/linux/edac.h +EDAC-DMC520 +M: Lei Wang <lewan@microsoft.com> +L: linux-edac@vger.kernel.org +S: Supported +F: drivers/edac/dmc520_edac.c + EDAC-E752X M: Mark Gross <mark.gross@intel.com> L: linux-edac@vger.kernel.org diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index 37027c298323..7305bd1ec80e 100644 --- a/drivers/edac/Kconfig +++ b/drivers/edac/Kconfig @@ -523,4 +523,11 @@ config EDAC_BLUEFIELD Support for error detection and correction on the Mellanox BlueField SoCs. +config EDAC_DMC520 + tristate "ARM DMC-520 ECC" + depends on ARM64 + help + Support for error detection and correction on the + SoCs with ARM DMC-520 DRAM controller. + endif # EDAC diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index d77200c9680b..269e15118cea 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -87,3 +87,4 @@ obj-$(CONFIG_EDAC_TI) += ti_edac.o obj-$(CONFIG_EDAC_QCOM) += qcom_edac.o obj-$(CONFIG_EDAC_ASPEED) += aspeed_edac.o obj-$(CONFIG_EDAC_BLUEFIELD) += bluefield_edac.o +obj-$(CONFIG_EDAC_DMC520) += dmc520_edac.o diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c new file mode 100644 index 000000000000..55237c5c522c --- /dev/null +++ b/drivers/edac/dmc520_edac.c @@ -0,0 +1,670 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * EDAC driver for DMC-520 memory controller. + * + * The driver supports 10 interrupt lines, + * though only dram_ecc_errc and dram_ecc_errd are currently handled. 
+ *
+ * Authors:	Rui Zhao <ruizhao@microsoft.com>
+ *		Lei Wang <lewan@microsoft.com>
+ *		Shiping Ji <shji@microsoft.com>
+ */
+
+#include <linux/bitfield.h>
+#include <linux/edac.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include "edac_mc.h"
+
+/* DMC-520 registers */
+#define REG_OFFSET_FEATURE_CONFIG	0x130
+#define REG_OFFSET_ECC_ERRC_COUNT_31_00	0x158
+#define REG_OFFSET_ECC_ERRC_COUNT_63_32	0x15C
+#define REG_OFFSET_ECC_ERRD_COUNT_31_00	0x160
+#define REG_OFFSET_ECC_ERRD_COUNT_63_32	0x164
+#define REG_OFFSET_INTERRUPT_CONTROL	0x500
+#define REG_OFFSET_INTERRUPT_CLR	0x508
+#define REG_OFFSET_INTERRUPT_STATUS	0x510
+#define REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00	0x528
+#define REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32	0x52C
+#define REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00	0x530
+#define REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32	0x534
+#define REG_OFFSET_ADDRESS_CONTROL_NOW	0x1010
+#define REG_OFFSET_MEMORY_TYPE_NOW	0x1128
+#define REG_OFFSET_SCRUB_CONTROL0_NOW	0x1170
+#define REG_OFFSET_FORMAT_CONTROL	0x18
+
+/* DMC-520 types, masks and bitfields */
+#define RAM_ECC_INT_CE_BIT	BIT(0)
+#define RAM_ECC_INT_UE_BIT	BIT(1)
+#define DRAM_ECC_INT_CE_BIT	BIT(2)
+#define DRAM_ECC_INT_UE_BIT	BIT(3)
+#define FAILED_ACCESS_INT_BIT	BIT(4)
+#define FAILED_PROG_INT_BIT	BIT(5)
+#define LINK_ERR_INT_BIT	BIT(6)
+#define TEMPERATURE_EVENT_INT_BIT	BIT(7)
+#define ARCH_FSM_INT_BIT	BIT(8)
+#define PHY_REQUEST_INT_BIT	BIT(9)
+#define MEMORY_WIDTH_MASK	GENMASK(1, 0)
+#define SCRUB_TRIGGER0_NEXT_MASK	GENMASK(1, 0)
+#define REG_FIELD_DRAM_ECC_ENABLED	GENMASK(1, 0)
+#define REG_FIELD_MEMORY_TYPE	GENMASK(2, 0)
+#define REG_FIELD_DEVICE_WIDTH	GENMASK(9, 8)
+#define REG_FIELD_ADDRESS_CONTROL_COL	GENMASK(2, 0)
+#define REG_FIELD_ADDRESS_CONTROL_ROW	GENMASK(10, 8)
+#define REG_FIELD_ADDRESS_CONTROL_BANK	GENMASK(18, 16)
+#define REG_FIELD_ADDRESS_CONTROL_RANK	GENMASK(25, 24)
+#define REG_FIELD_ERR_INFO_LOW_VALID	BIT(0)
+#define REG_FIELD_ERR_INFO_LOW_COL	GENMASK(10, 1)
+#define REG_FIELD_ERR_INFO_LOW_ROW	GENMASK(28, 11)
+#define REG_FIELD_ERR_INFO_LOW_RANK	GENMASK(31, 29)
+#define REG_FIELD_ERR_INFO_HIGH_BANK	GENMASK(3, 0)
+#define REG_FIELD_ERR_INFO_HIGH_VALID	BIT(31)
+
+#define DRAM_ADDRESS_CONTROL_MIN_COL_BITS	8
+#define DRAM_ADDRESS_CONTROL_MIN_ROW_BITS	11
+
+#define DMC520_SCRUB_TRIGGER_ERR_DETECT	2
+#define DMC520_SCRUB_TRIGGER_IDLE	3
+
+/* Driver settings */
+/*
+ * The max-length message would be: "rank:7 bank:15 row:262143 col:1023".
+ * Max length is 34. Using a 40-size buffer is enough.
+ */
+#define DMC520_MSG_BUF_SIZE	40
+#define EDAC_MOD_NAME	"dmc520-edac"
+#define EDAC_CTL_NAME	"dmc520"
+
+/* the data bus width for the attached memory chips. */
+enum dmc520_mem_width {
+	MEM_WIDTH_X32 = 2,
+	MEM_WIDTH_X64 = 3
+};
+
+/* memory type */
+enum dmc520_mem_type {
+	MEM_TYPE_DDR3 = 1,
+	MEM_TYPE_DDR4 = 2
+};
+
+/* memory device width */
+enum dmc520_dev_width {
+	DEV_WIDTH_X4 = 0,
+	DEV_WIDTH_X8 = 1,
+	DEV_WIDTH_X16 = 2
+};
+
+struct ecc_error_info {
+	u32 col;
+	u32 row;
+	u32 bank;
+	u32 rank;
+};
+
+/* The interrupt config */
+struct dmc520_irq_config {
+	char *name;
+	int mask;
+};
+
+/* The interrupt mappings */
+static struct dmc520_irq_config dmc520_irq_configs[] = {
+	{
+		.name = "ram_ecc_errc",
+		.mask = RAM_ECC_INT_CE_BIT
+	},
+	{
+		.name = "ram_ecc_errd",
+		.mask = RAM_ECC_INT_UE_BIT
+	},
+	{
+		.name = "dram_ecc_errc",
+		.mask = DRAM_ECC_INT_CE_BIT
+	},
+	{
+		.name = "dram_ecc_errd",
+		.mask = DRAM_ECC_INT_UE_BIT
+	},
+	{
+		.name = "failed_access",
+		.mask = FAILED_ACCESS_INT_BIT
+	},
+	{
+		.name = "failed_prog",
+		.mask = FAILED_PROG_INT_BIT
+	},
+	{
+		.name = "link_err",
+		.mask = LINK_ERR_INT_BIT
+	},
+	{
+		.name = "temperature_event",
+		.mask = TEMPERATURE_EVENT_INT_BIT
+	},
+	{
+		.name = "arch_fsm",
+		.mask = ARCH_FSM_INT_BIT
+	},
+	{
+		.name = "phy_request",
+		.mask = PHY_REQUEST_INT_BIT
+	}
+};
+
+#define NUMBER_OF_IRQS	ARRAY_SIZE(dmc520_irq_configs)
+
+/* The EDAC driver private data */
+struct dmc520_edac {
+	void __iomem *reg_base;
+	spinlock_t ecc_lock;
+	int irqs[NUMBER_OF_IRQS];
+	int masks[NUMBER_OF_IRQS];
+};
+
+static int dmc520_mc_idx;
+
+static irqreturn_t
+dmc520_edac_dram_all_isr(int irq, struct mem_ctl_info *mci, u32 irq_mask);
+
+static irqreturn_t dmc520_isr(int irq, void *data)
+{
+	struct mem_ctl_info *mci;
+	struct dmc520_edac *edac;
+	int idx;
+	u32 mask = 0;
+	mci = data;
+	edac = mci->pvt_info;
+
+	for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
+		if (edac->irqs[idx] == irq) {
+			mask = edac->masks[idx];
+			break;
+		}
+	}
+	return dmc520_edac_dram_all_isr(irq, mci, mask);
+}
+
+static u32 dmc520_read_reg(struct dmc520_edac *edac, u32 offset)
+{
+	return readl(edac->reg_base + offset);
+}
+
+static void dmc520_write_reg(struct dmc520_edac *edac, u32 val, u32 offset)
+{
+	writel(val, edac->reg_base + offset);
+}
+
+static u32 dmc520_calc_dram_ecc_error(u32 value)
+{
+	u32 total = 0;
+
+	/* Each rank's error counter takes one byte. */
+	while (value > 0) {
+		total += (value & 0xFF);
+		value >>= 8;
+	}
+	return total;
+}
+
+static u32 dmc520_get_dram_ecc_error_count(struct dmc520_edac *edac,
+					    bool is_ce)
+{
+	u32 reg_offset_low, reg_offset_high;
+	u32 err_low, err_high;
+	u32 err_count;
+
+	reg_offset_low = is_ce ? REG_OFFSET_ECC_ERRC_COUNT_31_00 :
+				 REG_OFFSET_ECC_ERRD_COUNT_31_00;
+	reg_offset_high = is_ce ? REG_OFFSET_ECC_ERRC_COUNT_63_32 :
+				  REG_OFFSET_ECC_ERRD_COUNT_63_32;
+
+	err_low = dmc520_read_reg(edac, reg_offset_low);
+	err_high = dmc520_read_reg(edac, reg_offset_high);
+	/* Reset error counters */
+	dmc520_write_reg(edac, 0, reg_offset_low);
+	dmc520_write_reg(edac, 0, reg_offset_high);
+
+	err_count = dmc520_calc_dram_ecc_error(err_low) +
+		    dmc520_calc_dram_ecc_error(err_high);
+
+	return err_count;
+}
+
+static void dmc520_get_dram_ecc_error_info(struct dmc520_edac *edac,
+					    bool is_ce,
+					    struct ecc_error_info *info)
+{
+	u32 reg_offset_low, reg_offset_high;
+	u32 reg_val_low, reg_val_high;
+	bool valid;
+
+	reg_offset_low = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00 :
+				 REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00;
+	reg_offset_high = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32 :
+				  REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32;
+
+	reg_val_low = dmc520_read_reg(edac, reg_offset_low);
+	reg_val_high = dmc520_read_reg(edac, reg_offset_high);
+
+	valid = (FIELD_GET(REG_FIELD_ERR_INFO_LOW_VALID, reg_val_low) != 0) &&
+		(FIELD_GET(REG_FIELD_ERR_INFO_HIGH_VALID, reg_val_high) != 0);
+
+	if (valid) {
+		info->col =
+			FIELD_GET(REG_FIELD_ERR_INFO_LOW_COL, reg_val_low);
+		info->row =
+			FIELD_GET(REG_FIELD_ERR_INFO_LOW_ROW, reg_val_low);
+		info->rank =
+			FIELD_GET(REG_FIELD_ERR_INFO_LOW_RANK, reg_val_low);
+		info->bank =
+			FIELD_GET(REG_FIELD_ERR_INFO_HIGH_BANK, reg_val_high);
+	} else {
+		memset(info, 0, sizeof(struct ecc_error_info));
+	}
+}
+
+static bool dmc520_is_ecc_enabled(void __iomem *reg_base)
+{
+	u32 reg_val = readl(reg_base + REG_OFFSET_FEATURE_CONFIG);
+
+	return FIELD_GET(REG_FIELD_DRAM_ECC_ENABLED, reg_val);
+}
+
+static enum scrub_type dmc520_get_scrub_type(struct dmc520_edac *edac)
+{
+	enum scrub_type type = SCRUB_NONE;
+	u32 reg_val, scrub_cfg;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_SCRUB_CONTROL0_NOW);
+	scrub_cfg = FIELD_GET(SCRUB_TRIGGER0_NEXT_MASK, reg_val);
+
+	if (scrub_cfg == DMC520_SCRUB_TRIGGER_ERR_DETECT ||
+	    scrub_cfg == DMC520_SCRUB_TRIGGER_IDLE)
+		type = SCRUB_HW_PROG;
+
+	return type;
+}
+
+/* Get the memory data bus width, in number of bytes. */
+static u32 dmc520_get_memory_width(struct dmc520_edac *edac)
+{
+	enum dmc520_mem_width mem_width_field;
+	static u32 mem_width_in_bytes;
+	u32 reg_val;
+
+	if (mem_width_in_bytes)
+		return mem_width_in_bytes;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_FORMAT_CONTROL);
+	mem_width_field = FIELD_GET(MEMORY_WIDTH_MASK, reg_val);
+
+	if (mem_width_field == MEM_WIDTH_X32)
+		mem_width_in_bytes = 4;
+	else if (mem_width_field == MEM_WIDTH_X64)
+		mem_width_in_bytes = 8;
+	return mem_width_in_bytes;
+}
+
+static enum mem_type dmc520_get_mtype(struct dmc520_edac *edac)
+{
+	enum mem_type mt = MEM_UNKNOWN;
+	enum dmc520_mem_type type;
+	u32 reg_val;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_MEMORY_TYPE_NOW);
+	type = FIELD_GET(REG_FIELD_MEMORY_TYPE, reg_val);
+
+	switch (type) {
+	case MEM_TYPE_DDR3:
+		mt = MEM_DDR3;
+		break;
+
+	case MEM_TYPE_DDR4:
+		mt = MEM_DDR4;
+		break;
+	}
+
+	return mt;
+}
+
+static enum dev_type dmc520_get_dtype(struct dmc520_edac *edac)
+{
+	enum dmc520_dev_width device_width;
+	enum dev_type dt = DEV_UNKNOWN;
+	u32 reg_val;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_MEMORY_TYPE_NOW);
+	device_width = FIELD_GET(REG_FIELD_DEVICE_WIDTH, reg_val);
+
+	switch (device_width) {
+	case DEV_WIDTH_X4:
+		dt = DEV_X4;
+		break;
+
+	case DEV_WIDTH_X8:
+		dt = DEV_X8;
+		break;
+
+	case DEV_WIDTH_X16:
+		dt = DEV_X16;
+		break;
+	}
+
+	return dt;
+}
+
+static u32 dmc520_get_rank_count(void __iomem *reg_base)
+{
+	u32 reg_val, rank_bits;
+
+	reg_val = readl(reg_base + REG_OFFSET_ADDRESS_CONTROL_NOW);
+	rank_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_RANK, reg_val);
+
+	return BIT(rank_bits);
+}
+
+static u64 dmc520_get_rank_size(struct dmc520_edac *edac)
+{
+	u32 reg_val, col_bits, row_bits, bank_bits;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_ADDRESS_CONTROL_NOW);
+
+	col_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_COL, reg_val) +
+		   DRAM_ADDRESS_CONTROL_MIN_COL_BITS;
+	row_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_ROW, reg_val) +
+		   DRAM_ADDRESS_CONTROL_MIN_ROW_BITS;
+	bank_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_BANK, reg_val);
+
+	return (u64)dmc520_get_memory_width(edac) <<
+	       (col_bits + row_bits + bank_bits);
+}
+
+static void dmc520_handle_dram_ecc_errors(struct mem_ctl_info *mci,
+					   bool is_ce)
+{
+	char message[DMC520_MSG_BUF_SIZE];
+	struct ecc_error_info info;
+	struct dmc520_edac *edac;
+	u32 cnt;
+
+	edac = mci->pvt_info;
+	dmc520_get_dram_ecc_error_info(edac, is_ce, &info);
+
+	cnt = dmc520_get_dram_ecc_error_count(edac, is_ce);
+	if (!cnt)
+		return;
+
+	snprintf(message, ARRAY_SIZE(message),
+		 "rank:%u bank:%u row:%u col:%u",
+		 info.rank, info.bank,
+		 info.row, info.col);
+
+	spin_lock(&edac->ecc_lock);
+	edac_mc_handle_error((is_ce ? HW_EVENT_ERR_CORRECTED :
+			      HW_EVENT_ERR_UNCORRECTED),
+			     mci, cnt, 0, 0, 0, info.rank, -1, -1,
+			     message, "");
+	spin_unlock(&edac->ecc_lock);
+}
+
+static irqreturn_t dmc520_edac_dram_ecc_isr(int irq, struct mem_ctl_info *mci,
+					     bool is_ce)
+{
+	struct dmc520_edac *edac;
+	u32 i_mask;
+
+	edac = mci->pvt_info;
+
+	i_mask = is_ce ? DRAM_ECC_INT_CE_BIT : DRAM_ECC_INT_UE_BIT;
+
+	dmc520_handle_dram_ecc_errors(mci, is_ce);
+
+	dmc520_write_reg(edac, i_mask, REG_OFFSET_INTERRUPT_CLR);
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t dmc520_edac_dram_all_isr(int irq, struct mem_ctl_info *mci,
+					     u32 irq_mask)
+{
+	irqreturn_t irq_ret = IRQ_NONE;
+	struct dmc520_edac *edac;
+	u32 status;
+
+	edac = mci->pvt_info;
+
+	status = dmc520_read_reg(edac, REG_OFFSET_INTERRUPT_STATUS);
+
+	if ((irq_mask & DRAM_ECC_INT_CE_BIT) &&
+	    (status & DRAM_ECC_INT_CE_BIT))
+		irq_ret = dmc520_edac_dram_ecc_isr(irq, mci, true);
+
+	if ((irq_mask & DRAM_ECC_INT_UE_BIT) &&
+	    (status & DRAM_ECC_INT_UE_BIT))
+		irq_ret = dmc520_edac_dram_ecc_isr(irq, mci, false);
+
+	return irq_ret;
+}
+
+static void dmc520_init_csrow(struct mem_ctl_info *mci)
+{
+	struct dmc520_edac *edac = mci->pvt_info;
+	struct csrow_info *csi;
+	struct dimm_info *dimm;
+	u32 pages_per_rank;
+	enum dev_type dt;
+	enum mem_type mt;
+	int row, ch;
+	u64 rs;
+
+	dt = dmc520_get_dtype(edac);
+	mt = dmc520_get_mtype(edac);
+	rs = dmc520_get_rank_size(edac);
+	pages_per_rank = rs >> PAGE_SHIFT;
+
+	for (row = 0; row < mci->nr_csrows; row++) {
+		csi = mci->csrows[row];
+
+		for (ch = 0; ch < csi->nr_channels; ch++) {
+			dimm = csi->channels[ch]->dimm;
+			dimm->grain = dmc520_get_memory_width(edac);
+			dimm->dtype = dt;
+			dimm->mtype = mt;
+			dimm->edac_mode = EDAC_SECDED;
+			dimm->nr_pages = pages_per_rank / csi->nr_channels;
+		}
+	}
+}
+
+static int dmc520_edac_probe(struct platform_device *pdev)
+{
+	bool registered[NUMBER_OF_IRQS] = {false};
+	int irqs[NUMBER_OF_IRQS] = {-ENXIO};
+	int masks[NUMBER_OF_IRQS] = {0};
+	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[1];
+	struct dmc520_edac *edac;
+	void __iomem *reg_base;
+	u32 irq_mask_all = 0;
+	int ret, idx, irq;
+	struct resource *res;
+	struct device *dev;
+	u32 reg_val;
+
+	/* Parse the device node */
+	dev = &pdev->dev;
+
+	for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
+		irq = platform_get_irq_byname(pdev, dmc520_irq_configs[idx].name);
+		irqs[idx] = irq;
+		masks[idx] = dmc520_irq_configs[idx].mask;
+		if (irq >= 0) {
+			irq_mask_all |= dmc520_irq_configs[idx].mask;
+			edac_printk(KERN_INFO, EDAC_MOD_NAME,
+				    "Discovered %s, irq: %d.\n", dmc520_irq_configs[idx].name, irq);
+		}
+	}
+
+	if (irq_mask_all == 0) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "At least one valid interrupt line is expected.\n");
+		return -EINVAL;
+	}
+
+	/* Initialize dmc520 edac */
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	reg_base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(reg_base))
+		return PTR_ERR(reg_base);
+
+	if (!dmc520_is_ecc_enabled(reg_base))
+		return -ENXIO;
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = dmc520_get_rank_count(reg_base);
+	layers[0].is_virt_csrow = true;
+
+	mci = edac_mc_alloc(dmc520_mc_idx++, ARRAY_SIZE(layers), layers,
+			    sizeof(struct dmc520_edac));
+	if (!mci) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "Failed to allocate memory for mc instance\n");
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	edac = mci->pvt_info;
+
+	edac->reg_base = reg_base;
+	spin_lock_init(&edac->ecc_lock);
+	memcpy(edac->irqs, irqs, sizeof(irqs));
+	memcpy(edac->masks, masks, sizeof(masks));
+
+	platform_set_drvdata(pdev, mci);
+
+	mci->pdev = dev;
+	mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR4;
+	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
+	mci->edac_cap = EDAC_FLAG_SECDED;
+	mci->scrub_cap = SCRUB_FLAG_HW_SRC;
+	mci->scrub_mode = dmc520_get_scrub_type(edac);
+	mci->ctl_name = EDAC_CTL_NAME;
+	mci->dev_name = dev_name(mci->pdev);
+	mci->mod_name = EDAC_MOD_NAME;
+
+	edac_op_state = EDAC_OPSTATE_INT;
+
+	dmc520_init_csrow(mci);
+
+	/* Clear interrupts, not affecting other unrelated interrupts */
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_INTERRUPT_CONTROL);
+	dmc520_write_reg(edac, reg_val & (~irq_mask_all),
+			 REG_OFFSET_INTERRUPT_CONTROL);
+	dmc520_write_reg(edac, irq_mask_all, REG_OFFSET_INTERRUPT_CLR);
+
+	for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
+		irq = irqs[idx];
+		if (irq >= 0) {
+			ret = devm_request_irq(&pdev->dev, irq,
+					       dmc520_isr, IRQF_SHARED,
+					       dev_name(&pdev->dev), mci);
+			if (ret < 0) {
+				edac_printk(KERN_ERR, EDAC_MC,
+					    "Failed to request irq %d\n", irq);
+				goto err;
+			}
+			registered[idx] = true;
+		}
+	}
+
+	/* Reset DRAM CE/UE counters */
+	if (irq_mask_all & DRAM_ECC_INT_CE_BIT)
+		dmc520_get_dram_ecc_error_count(edac, true);
+
+	if (irq_mask_all & DRAM_ECC_INT_UE_BIT)
+		dmc520_get_dram_ecc_error_count(edac, false);
+
+	ret = edac_mc_add_mc(mci);
+	if (ret) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "Failed to register with EDAC core\n");
+		goto err;
+	}
+
+	/* Enable interrupts, not affecting other unrelated interrupts */
+	dmc520_write_reg(edac, reg_val | irq_mask_all,
+			 REG_OFFSET_INTERRUPT_CONTROL);
+
+	return 0;
+
+err:
+	for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
+		if (registered[idx]) {
+			devm_free_irq(&pdev->dev, edac->irqs[idx], mci);
+		}
+	}
+	if (mci)
+		edac_mc_free(mci);
+
+	return ret;
+}
+
+static int dmc520_edac_remove(struct platform_device *pdev)
+{
+	struct dmc520_edac *edac;
+	struct mem_ctl_info *mci;
+	u32 reg_val, idx, irq_mask_all = 0;
+
+	mci = platform_get_drvdata(pdev);
+	edac = mci->pvt_info;
+
+	/* free irq's */
+	for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
+		if (edac->irqs[idx] >= 0) {
+			irq_mask_all |= edac->masks[idx];
+			devm_free_irq(&pdev->dev, edac->irqs[idx], mci);
+		}
+	}
+
+	/* Disable interrupts */
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_INTERRUPT_CONTROL);
+	dmc520_write_reg(edac, reg_val & (~irq_mask_all),
+			 REG_OFFSET_INTERRUPT_CONTROL);
+
+	edac_mc_del_mc(&pdev->dev);
+	edac_mc_free(mci);
+
+	return 0;
+}
+
+static const struct of_device_id dmc520_edac_driver_id[] = {
+	{ .compatible = "arm,dmc-520", },
+	{ /* end of table */ }
+};
+
+MODULE_DEVICE_TABLE(of, dmc520_edac_driver_id);
+
+static struct platform_driver dmc520_edac_driver = {
+	.driver = {
+		.name = "dmc520",
+		.of_match_table = dmc520_edac_driver_id,
+	},
+
+	.probe = dmc520_edac_probe,
+	.remove = dmc520_edac_remove
+};
+
+module_platform_driver(dmc520_edac_driver);
+
+MODULE_AUTHOR("Rui Zhao <ruizhao@microsoft.com>");
+MODULE_AUTHOR("Lei Wang <lewan@microsoft.com>");
+MODULE_AUTHOR("Shiping Ji <shji@microsoft.com>");
+MODULE_DESCRIPTION("DMC-520 ECC driver");
+MODULE_LICENSE("GPL v2");
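
For anyone who wants to check the register decoding in this patch without hardware, the two bit-manipulation steps (summing the one-byte per-rank counters, and splitting the error-info register pair into rank/bank/row/col) can be exercised in userspace. The following is only a minimal sketch under stated assumptions: MASK32() and field_get() are hypothetical stand-ins for the kernel's GENMASK() and FIELD_GET(), struct err_info and the function names are made up for illustration, and the field positions are copied from the REG_FIELD_ERR_INFO_* definitions in the patch rather than from the DMC-520 TRM.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for GENMASK(h, l) on 32-bit values. */
#define MASK32(h, l) ((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Field layout copied from the REG_FIELD_ERR_INFO_* defines in the patch. */
#define ERR_INFO_LOW_VALID	MASK32(0, 0)
#define ERR_INFO_LOW_COL	MASK32(10, 1)
#define ERR_INFO_LOW_ROW	MASK32(28, 11)
#define ERR_INFO_LOW_RANK	MASK32(31, 29)
#define ERR_INFO_HIGH_BANK	MASK32(3, 0)
#define ERR_INFO_HIGH_VALID	MASK32(31, 31)

/* Illustration-only mirror of the driver's struct ecc_error_info. */
struct err_info {
	uint32_t col;
	uint32_t row;
	uint32_t bank;
	uint32_t rank;
};

/* Hypothetical plain-C equivalent of FIELD_GET() for a contiguous mask. */
static uint32_t field_get(uint32_t mask, uint32_t val)
{
	assert(mask != 0);
	/* (mask & -mask) is the lowest set bit; dividing shifts the field down. */
	return (val & mask) / (mask & (~mask + 1));
}

/* Sum the four one-byte per-rank counters packed into one 32-bit register. */
static uint32_t sum_rank_counters(uint32_t value)
{
	uint32_t total = 0;

	while (value > 0) {
		total += value & 0xFF;
		value >>= 8;
	}
	return total;
}

/* Decode the low/high info words; all zero unless both valid bits are set. */
static struct err_info decode_err_info(uint32_t low, uint32_t high)
{
	struct err_info info = { 0, 0, 0, 0 };

	if (field_get(ERR_INFO_LOW_VALID, low) &&
	    field_get(ERR_INFO_HIGH_VALID, high)) {
		info.col = field_get(ERR_INFO_LOW_COL, low);
		info.row = field_get(ERR_INFO_LOW_ROW, low);
		info.rank = field_get(ERR_INFO_LOW_RANK, low);
		info.bank = field_get(ERR_INFO_HIGH_BANK, high);
	}
	return info;
}
```

With every field saturated, decode_err_info() yields rank 7, bank 15, row 262143 and col 1023, which matches the longest message quoted in the DMC520_MSG_BUF_SIZE comment.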