Message ID | 20241009124120.1124-3-shiju.jose@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | EDAC: Scrub: introduce generic EDAC RAS control feature driver + CXL/ACPI-RAS2 drivers | expand |
On Wed, 9 Oct 2024 13:41:03 +0100 <shiju.jose@huawei.com> wrote: > From: Shiju Jose <shiju.jose@huawei.com> > > Add generic EDAC scrub control in order to control the memory scrubbers > in the system. The device with scrub feature registers with EDAC device > driver, which retrieves the scrub descriptor from EDAC scrub driver and > expose the sysfs scrub control attributes for a scrub instance to userspace > in /sys/bus/edac/devices/<dev-name>/scrubX/. > > The common sysfs scrub control interface abstracts the control of an > arbitrary scrubbing functionality to a common set of functions. > The sysfs scrub attribute nodes would be present only if the client driver > has implemented the corresponding attribute callback function and passed > in ops to the EDAC device driver during registration. > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Signed-off-by: Shiju Jose <shiju.jose@huawei.com> One follow on comment. Otherwise LGTM and the new macros definitely help on reducing code. (I'm not normally fan of macros at the function level, but here I'm convinced) > > @@ -657,17 +686,19 @@ int edac_dev_register(struct device *parent, char *name, > > ret = dev_set_name(&ctx->dev, name); > if (ret) > - goto groups_free; > + goto data_mem_free; > > ret = device_register(&ctx->dev); > if (ret) { > put_device(&ctx->dev); > - goto groups_free; > + goto data_mem_free; As in previous patch I think this goto is incorrect and should be dropped. > return ret; > } > > return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev); > > +data_mem_free: > + kfree(ctx->scrub); > groups_free: > kfree(ras_attr_groups); > ctx_free:
On Wed, Oct 09, 2024 at 01:41:03PM +0100, shiju.jose@huawei.com wrote: > diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub > new file mode 100644 > index 000000000000..b4f8f0bba17b > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-edac-scrub > @@ -0,0 +1,69 @@ > +What: /sys/bus/edac/devices/<dev-name>/scrubX > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory > + belongs to an instance of memory scrub control feature, > + where <dev-name> directory corresponds to a device/memory > + region registered with the EDAC device driver for the > + scrub control feature. > + The sysfs scrub attr nodes would be present only if the > + client driver has implemented the corresponding attr > + callback function and passed in ops to the EDAC RAS feature > + driver during registration. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/addr_range_base > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The base of the address range of the memory region > + to be scrubbed (on-demand scrubbing). Why does this sound more complicated than it is? Why isn't this simply "addr" and the next one "size"? On-demand scrubbing should scrub at address "addr" and "size" bytes. Can't get any simpler than that. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/addr_range_size > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The size of the address range of the memory region > + to be scrubbed (on-demand scrubbing). > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/enable_background > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) Start/Stop background(patrol) scrubbing if supported. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/enable_on_demand > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) Start/Stop on-demand scrubbing the memory region > + if supported. Why do you need a separate "enable" flag? Why can't it be: "writing into "addr" starts the on-demand scrubbing"? > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/min_cycle_duration > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RO) Supported minimum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/max_cycle_duration > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RO) Supported maximum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/current_cycle_duration > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@vger.kernel.org > +Description: > + (RW) The current scrub cycle duration in seconds and must be > + within the supported range by the memory scrubber. What are those three good for and why are they exposed? > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile > index 4edfb83ffbee..a96a74de8b9e 100644 > --- a/drivers/edac/Makefile > +++ b/drivers/edac/Makefile > @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o > > edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o > edac_core-y += edac_module.o edac_device_sysfs.o wq.o > +edac_core-y += scrub.o > > edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o > > diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c > index 0b8aa8150239..0c9da55d18bc 100644 > --- a/drivers/edac/edac_device.c > +++ b/drivers/edac/edac_device.c > @@ -576,6 +576,7 @@ static void edac_dev_release(struct device *dev) > { > struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); > > + kfree(ctx->scrub); > kfree(ctx->dev.groups); > kfree(ctx); > } > @@ -610,7 +611,9 @@ int edac_dev_register(struct device *parent, char *name, > const struct edac_dev_feature *ras_features) > { > const struct attribute_group **ras_attr_groups; > + struct edac_dev_data *dev_data; > struct edac_dev_feat_ctx *ctx; > + int scrub_cnt = 0, scrub_inst = 0; > int attr_gcnt = 0; > int ret, feat; The EDAC tree preferred ordering of variable declarations at the beginning of a function is reverse fir tree order:: struct long_struct_name *descriptive_name; unsigned long foo, bar; unsigned int tmp; int ret; The above is faster to parse than the reverse ordering:: int ret; unsigned int tmp; unsigned long foo, bar; struct long_struct_name *descriptive_name; And even more so than random ordering:: unsigned long foo, bar; int ret; struct long_struct_name *descriptive_name; unsigned int tmp; > > @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char *name, > /* Double parse to make space for attributes */ > for (feat = 0; feat < num_features; feat++) { > switch (ras_features[feat].ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + attr_gcnt++; > + scrub_cnt++; > + break; > default: > return -EINVAL; > } > @@ -639,13 +645,36 @@ int edac_dev_register(struct device *parent, char *name, > goto ctx_free; > } > > + if (scrub_cnt) { > + ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), GFP_KERNEL); > + if (!ctx->scrub) { > + ret = -ENOMEM; > + goto groups_free; > + } > + } > + > attr_gcnt = 0; > for (feat = 0; feat < num_features; feat++, ras_features++) { > switch (ras_features->ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + if (!ras_features->scrub_ops) > + continue; Continue? I think fail. What is a scrub feature good for if it doesn't have ops? > + if (scrub_inst != ras_features->instance) > + goto data_mem_free; > + dev_data = &ctx->scrub[scrub_inst]; > + dev_data->instance = scrub_inst; > + dev_data->scrub_ops = ras_features->scrub_ops; > + dev_data->private = ras_features->ctx; > + ret = edac_scrub_get_desc(parent, &ras_attr_groups[attr_gcnt], > + ras_features->instance); > + if (ret) > + goto data_mem_free; > + scrub_inst++; > + attr_gcnt++; > + break; > default: > ret = -EINVAL; > - goto groups_free; > + goto data_mem_free; > } > } > ...
Thanks for the feedbacks. >-----Original Message----- >From: Borislav Petkov <bp@alien8.de> >Sent: 22 October 2024 20:05 >To: Shiju Jose <shiju.jose@huawei.com> >Cc: linux-edac@vger.kernel.org; linux-cxl@vger.kernel.org; linux- >acpi@vger.kernel.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org; >tony.luck@intel.com; rafael@kernel.org; lenb@kernel.org; >mchehab@kernel.org; dan.j.williams@intel.com; dave@stgolabs.net; Jonathan >Cameron <jonathan.cameron@huawei.com>; dave.jiang@intel.com; >alison.schofield@intel.com; vishal.l.verma@intel.com; ira.weiny@intel.com; >david@redhat.com; Vilas.Sridharan@amd.com; leo.duran@amd.com; >Yazen.Ghannam@amd.com; rientjes@google.com; jiaqiyan@google.com; >Jon.Grimm@amd.com; dave.hansen@linux.intel.com; >naoya.horiguchi@nec.com; james.morse@arm.com; jthoughton@google.com; >somasundaram.a@hpe.com; erdemaktas@google.com; pgonda@google.com; >duenwen@google.com; gthelen@google.com; >wschwartz@amperecomputing.com; dferguson@amperecomputing.com; >wbs@os.amperecomputing.com; nifan.cxl@gmail.com; tanxiaofei ><tanxiaofei@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>; Roberto >Sassu <roberto.sassu@huawei.com>; kangkang.shen@futurewei.com; >wanghuiqiang <wanghuiqiang@huawei.com>; Linuxarm ><linuxarm@huawei.com> >Subject: Re: [PATCH v13 02/18] EDAC: Add scrub control feature > >On Wed, Oct 09, 2024 at 01:41:03PM +0100, shiju.jose@huawei.com wrote: >> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub >> b/Documentation/ABI/testing/sysfs-edac-scrub >> new file mode 100644 >> index 000000000000..b4f8f0bba17b >> --- /dev/null >> +++ b/Documentation/ABI/testing/sysfs-edac-scrub >> @@ -0,0 +1,69 @@ >> +What: /sys/bus/edac/devices/<dev-name>/scrubX >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory >> + belongs to an instance of memory scrub control feature, >> + where <dev-name> directory corresponds to a device/memory >> + region registered with the EDAC device driver for the >> + scrub control feature. >> + The sysfs scrub attr nodes would be present only if the >> + client driver has implemented the corresponding attr >> + callback function and passed in ops to the EDAC RAS feature >> + driver during registration. >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/addr_range_base >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RW) The base of the address range of the memory region >> + to be scrubbed (on-demand scrubbing). > >Why does this sound more complicated than it is? > >Why isn't this simply "addr" and the next one "size"? > >On-demand scrubbing should scrub at address "addr" and "size" bytes. Can't get >any simpler than that. Sure. Will modify to addr and size. > >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/addr_range_size >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RW) The size of the address range of the memory region >> + to be scrubbed (on-demand scrubbing). >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/enable_background >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RW) Start/Stop background(patrol) scrubbing if supported. >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/enable_on_demand >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RW) Start/Stop on-demand scrubbing the memory region >> + if supported. > >Why do you need a separate "enable" flag? > >Why can't it be: "writing into "addr" starts the on-demand scrubbing"? If 'enable' attribute is removed , then there is an ordering with setting address + size. Also user space can't check whether scrubbing is enabled or not. > >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/min_cycle_duration >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RO) Supported minimum scrub cycle duration in seconds >> + by the memory scrubber. >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/max_cycle_duration >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RO) Supported maximum scrub cycle duration in seconds >> + by the memory scrubber. >> + >> +What: /sys/bus/edac/devices/<dev- >name>/scrubX/current_cycle_duration >> +Date: Oct 2024 >> +KernelVersion: 6.12 >> +Contact: linux-edac@vger.kernel.org >> +Description: >> + (RW) The current scrub cycle duration in seconds and must be >> + within the supported range by the memory scrubber. > >What are those three good for and why are they exposed? Scrub has an overhead when running and that may want to be reduced by just taking longer to do it. Min and max scrub cycle duration returns the range of scrub rate supported by the device. > >> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index >> 4edfb83ffbee..a96a74de8b9e 100644 >> --- a/drivers/edac/Makefile >> +++ b/drivers/edac/Makefile >> @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o >> >> edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o >> edac_core-y += edac_module.o edac_device_sysfs.o wq.o >> +edac_core-y += scrub.o >> >> edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o >> >> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c >> index 0b8aa8150239..0c9da55d18bc 100644 >> --- a/drivers/edac/edac_device.c >> +++ b/drivers/edac/edac_device.c >> @@ -576,6 +576,7 @@ static void edac_dev_release(struct device *dev) >> { >> struct edac_dev_feat_ctx *ctx = container_of(dev, struct >> edac_dev_feat_ctx, dev); >> >> + kfree(ctx->scrub); >> kfree(ctx->dev.groups); >> kfree(ctx); >> } >> @@ -610,7 +611,9 @@ int edac_dev_register(struct device *parent, char >*name, >> const struct edac_dev_feature *ras_features) { >> const struct attribute_group **ras_attr_groups; >> + struct edac_dev_data *dev_data; >> struct edac_dev_feat_ctx *ctx; >> + int scrub_cnt = 0, scrub_inst = 0; >> int attr_gcnt = 0; >> int ret, feat; > >The EDAC tree preferred ordering of variable declarations at the beginning of a >function is reverse fir tree order:: I used the reverse tree ordering, but missed here. I will correct. > > struct long_struct_name *descriptive_name; > unsigned long foo, bar; > unsigned int tmp; > int ret; > >The above is faster to parse than the reverse ordering:: > > int ret; > unsigned int tmp; > unsigned long foo, bar; > struct long_struct_name *descriptive_name; > >And even more so than random ordering:: > > unsigned long foo, bar; > int ret; > struct long_struct_name *descriptive_name; > unsigned int tmp; > >> >> @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char >*name, >> /* Double parse to make space for attributes */ >> for (feat = 0; feat < num_features; feat++) { >> switch (ras_features[feat].ft_type) { >> - /* Add feature specific code */ >> + case RAS_FEAT_SCRUB: >> + attr_gcnt++; >> + scrub_cnt++; >> + break; >> default: >> return -EINVAL; >> } >> @@ -639,13 +645,36 @@ int edac_dev_register(struct device *parent, char >*name, >> goto ctx_free; >> } >> >> + if (scrub_cnt) { >> + ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), >GFP_KERNEL); >> + if (!ctx->scrub) { >> + ret = -ENOMEM; >> + goto groups_free; >> + } >> + } >> + >> attr_gcnt = 0; >> for (feat = 0; feat < num_features; feat++, ras_features++) { >> switch (ras_features->ft_type) { >> - /* Add feature specific code */ >> + case RAS_FEAT_SCRUB: >> + if (!ras_features->scrub_ops) >> + continue; > >Continue? > >I think fail. What is a scrub feature good for if it doesn't have ops? Here continue to check any other feature (for eg. ECS, memory repair or another scrub instance) listed by the parent device in the ras_features[]. > >> + if (scrub_inst != ras_features->instance) >> + goto data_mem_free; >> + dev_data = &ctx->scrub[scrub_inst]; >> + dev_data->instance = scrub_inst; >> + dev_data->scrub_ops = ras_features->scrub_ops; >> + dev_data->private = ras_features->ctx; >> + ret = edac_scrub_get_desc(parent, >&ras_attr_groups[attr_gcnt], >> + ras_features->instance); >> + if (ret) >> + goto data_mem_free; >> + scrub_inst++; >> + attr_gcnt++; >> + break; >> default: >> ret = -EINVAL; >> - goto groups_free; >> + goto data_mem_free; >> } >> } >> > >... > >-- >Regards/Gruss, > Boris. > >https://people.kernel.org/tglx/notes-about-netiquette Thanks, Shiju
On Wed, Oct 23, 2024 at 04:04:05PM +0000, Shiju Jose wrote: > >Why do you need a separate "enable" flag? > > > >Why can't it be: "writing into "addr" starts the on-demand scrubbing"? > If 'enable' attribute is removed , then there is an ordering with setting address + size. No, there won't be. You clarify the ordering and if someone doesn't adhere to it, you check for 0 values and return. > Also user space can't check whether scrubbing is enabled or not. That one is semi-valid. You can set addr to 0 when scrubbing is done but then userspace might wanna know which address it scrubbed. > >What are those three good for and why are they exposed? > Scrub has an overhead when running and that may want to be reduced by > just taking longer to do it. > Min and max scrub cycle duration returns the range of scrub rate > supported by the device. This *definitely* needs to be part of the documentation explaining the API. > >I think fail. What is a scrub feature good for if it doesn't have ops? > Here continue to check any other feature (for eg. ECS, memory repair or another scrub instance) listed > by the parent device in the ras_features[]. Why would you tolerate a semi-broken feature? This is all open source code. People will fix it when they test their feature which is missing ops. There's no point in allowing any of that. Btw, do me a favor, pls, and trim your mails when you reply just like I do. You don't want to leave text quoted to which you are not replying to. Thx.
diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub new file mode 100644 index 000000000000..b4f8f0bba17b --- /dev/null +++ b/Documentation/ABI/testing/sysfs-edac-scrub @@ -0,0 +1,69 @@ +What: /sys/bus/edac/devices/<dev-name>/scrubX +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory + belongs to an instance of memory scrub control feature, + where <dev-name> directory corresponds to a device/memory + region registered with the EDAC device driver for the + scrub control feature. + The sysfs scrub attr nodes would be present only if the + client driver has implemented the corresponding attr + callback function and passed in ops to the EDAC RAS feature + driver during registration. + +What: /sys/bus/edac/devices/<dev-name>/scrubX/addr_range_base +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RW) The base of the address range of the memory region + to be scrubbed (on-demand scrubbing). + +What: /sys/bus/edac/devices/<dev-name>/scrubX/addr_range_size +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RW) The size of the address range of the memory region + to be scrubbed (on-demand scrubbing). + +What: /sys/bus/edac/devices/<dev-name>/scrubX/enable_background +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RW) Start/Stop background(patrol) scrubbing if supported. + +What: /sys/bus/edac/devices/<dev-name>/scrubX/enable_on_demand +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RW) Start/Stop on-demand scrubbing the memory region + if supported. + +What: /sys/bus/edac/devices/<dev-name>/scrubX/min_cycle_duration +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RO) Supported minimum scrub cycle duration in seconds + by the memory scrubber. + +What: /sys/bus/edac/devices/<dev-name>/scrubX/max_cycle_duration +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RO) Supported maximum scrub cycle duration in seconds + by the memory scrubber. + +What: /sys/bus/edac/devices/<dev-name>/scrubX/current_cycle_duration +Date: Oct 2024 +KernelVersion: 6.12 +Contact: linux-edac@vger.kernel.org +Description: + (RW) The current scrub cycle duration in seconds and must be + within the supported range by the memory scrubber. diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 4edfb83ffbee..a96a74de8b9e 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o edac_core-y += edac_module.o edac_device_sysfs.o wq.o +edac_core-y += scrub.o edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c index 0b8aa8150239..0c9da55d18bc 100644 --- a/drivers/edac/edac_device.c +++ b/drivers/edac/edac_device.c @@ -576,6 +576,7 @@ static void edac_dev_release(struct device *dev) { struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); + kfree(ctx->scrub); kfree(ctx->dev.groups); kfree(ctx); } @@ -610,7 +611,9 @@ int edac_dev_register(struct device *parent, char *name, const struct edac_dev_feature *ras_features) { const struct attribute_group **ras_attr_groups; + struct edac_dev_data *dev_data; struct edac_dev_feat_ctx *ctx; + int scrub_cnt = 0, scrub_inst = 0; int attr_gcnt = 0; int ret, feat; @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char *name, /* Double parse to make space for attributes */ for (feat = 0; feat < num_features; feat++) { switch (ras_features[feat].ft_type) { - /* Add feature specific code */ + case RAS_FEAT_SCRUB: + attr_gcnt++; + scrub_cnt++; + break; default: return -EINVAL; } @@ -639,13 +645,36 @@ int edac_dev_register(struct device *parent, char *name, goto ctx_free; } + if (scrub_cnt) { + ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), GFP_KERNEL); + if (!ctx->scrub) { + ret = -ENOMEM; + goto groups_free; + } + } + attr_gcnt = 0; for (feat = 0; feat < num_features; feat++, ras_features++) { switch (ras_features->ft_type) { - /* Add feature specific code */ + case RAS_FEAT_SCRUB: + if (!ras_features->scrub_ops) + continue; + if (scrub_inst != ras_features->instance) + goto data_mem_free; + dev_data = &ctx->scrub[scrub_inst]; + dev_data->instance = scrub_inst; + dev_data->scrub_ops = ras_features->scrub_ops; + dev_data->private = ras_features->ctx; + ret = edac_scrub_get_desc(parent, &ras_attr_groups[attr_gcnt], + ras_features->instance); + if (ret) + goto data_mem_free; + scrub_inst++; + attr_gcnt++; + break; default: ret = -EINVAL; - goto groups_free; + goto data_mem_free; } } @@ -657,17 +686,19 @@ int edac_dev_register(struct device *parent, char *name, ret = dev_set_name(&ctx->dev, name); if (ret) - goto groups_free; + goto data_mem_free; ret = device_register(&ctx->dev); if (ret) { put_device(&ctx->dev); - goto groups_free; + goto data_mem_free; return ret; } return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev); +data_mem_free: + kfree(ctx->scrub); groups_free: kfree(ras_attr_groups); ctx_free: diff --git a/drivers/edac/scrub.c b/drivers/edac/scrub.c new file mode 100755 index 000000000000..acc36ec7c926 --- /dev/null +++ b/drivers/edac/scrub.c @@ -0,0 +1,223 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Generic EDAC scrub driver in order to control the memory + * scrubbers in the system and the common sysfs scrub interface + * abstracts the control of an arbitrary scrubbing functionality + * to a common set of functions. + * + * Copyright (c) 2024 HiSilicon Limited. + */ + +#define pr_fmt(fmt) "EDAC SCRUB: " fmt + +#include <linux/edac.h> + +enum edac_scrub_attributes { + SCRUB_ADDR_RANGE_BASE, + SCRUB_ADDR_RANGE_SIZE, + SCRUB_ENABLE_BACKGROUND, + SCRUB_ENABLE_ON_DEMAND, + SCRUB_MIN_CYCLE_DURATION, + SCRUB_MAX_CYCLE_DURATION, + SCRUB_CUR_CYCLE_DURATION, + SCRUB_MAX_ATTRS +}; + +struct edac_scrub_dev_attr { + struct device_attribute dev_attr; + u8 instance; +}; + +struct edac_scrub_context { + char name[EDAC_FEAT_NAME_LEN]; + struct edac_scrub_dev_attr scrub_dev_attr[SCRUB_MAX_ATTRS]; + struct attribute *scrub_attrs[SCRUB_MAX_ATTRS + 1]; + struct attribute_group group; +}; + +#define TO_SCRUB_DEV_ATTR(_dev_attr) \ + container_of(_dev_attr, struct edac_scrub_dev_attr, dev_attr) + +#define EDAC_SCRUB_ATTR_SHOW(attrib, cb, type, format) \ +static ssize_t attrib##_show(struct device *ras_feat_dev, \ + struct device_attribute *attr, char *buf) \ +{ \ + u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance; \ + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; \ + type data; \ + int ret; \ + \ + ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, &data); \ + if (ret) \ + return ret; \ + \ + return sysfs_emit(buf, format, data); \ +} + +EDAC_SCRUB_ATTR_SHOW(addr_range_base, read_range_base, u64, "0x%llx\n") +EDAC_SCRUB_ATTR_SHOW(addr_range_size, read_range_size, u64, "0x%llx\n") +EDAC_SCRUB_ATTR_SHOW(enable_background, get_enabled_bg, bool, "%u\n") +EDAC_SCRUB_ATTR_SHOW(enable_on_demand, get_enabled_od, bool, "%u\n") +EDAC_SCRUB_ATTR_SHOW(min_cycle_duration, get_min_cycle, u32, "%u\n") +EDAC_SCRUB_ATTR_SHOW(max_cycle_duration, get_max_cycle, u32, "%u\n") +EDAC_SCRUB_ATTR_SHOW(current_cycle_duration, get_cycle_duration, u32, "%u\n") + +#define EDAC_SCRUB_ATTR_STORE(attrib, cb, type, conv_func) \ +static ssize_t attrib##_store(struct device *ras_feat_dev, \ + struct device_attribute *attr, \ + const char *buf, size_t len) \ +{ \ + u8 inst = TO_SCRUB_DEV_ATTR(attr)->instance; \ + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; \ + type data; \ + int ret; \ + \ + ret = conv_func(buf, 0, &data); \ + if (ret < 0) \ + return ret; \ + \ + ret = ops->cb(ras_feat_dev->parent, ctx->scrub[inst].private, data); \ + if (ret) \ + return ret; \ + \ + return len; \ +} + +EDAC_SCRUB_ATTR_STORE(addr_range_base, write_range_base, u64, kstrtou64) +EDAC_SCRUB_ATTR_STORE(addr_range_size, write_range_size, u64, kstrtou64) +EDAC_SCRUB_ATTR_STORE(enable_background, set_enabled_bg, unsigned long, kstrtoul) +EDAC_SCRUB_ATTR_STORE(enable_on_demand, set_enabled_od, unsigned long, kstrtoul) +EDAC_SCRUB_ATTR_STORE(current_cycle_duration, set_cycle_duration, unsigned long, kstrtoul) + +static umode_t scrub_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id) +{ + struct device *ras_feat_dev = kobj_to_dev(kobj); + struct device_attribute *dev_attr = container_of(a, struct device_attribute, attr); + u8 inst = TO_SCRUB_DEV_ATTR(dev_attr)->instance; + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); + const struct edac_scrub_ops *ops = ctx->scrub[inst].scrub_ops; + + switch (attr_id) { + case SCRUB_ADDR_RANGE_BASE: + if (ops->read_range_base) { + if (ops->write_range_base) + return a->mode; + else + return 0444; + } + break; + case SCRUB_ADDR_RANGE_SIZE: + if (ops->read_range_size) { + if (ops->write_range_size) + return a->mode; + else + return 0444; + } + break; + case SCRUB_ENABLE_BACKGROUND: + if (ops->get_enabled_bg) { + if (ops->set_enabled_bg) + return a->mode; + else + return 0444; + } + break; + case SCRUB_ENABLE_ON_DEMAND: + if (ops->get_enabled_od) { + if (ops->set_enabled_od) + return a->mode; + else + return 0444; + } + break; + case SCRUB_MIN_CYCLE_DURATION: + if (ops->get_min_cycle) + return a->mode; + break; + case SCRUB_MAX_CYCLE_DURATION: + if (ops->get_max_cycle) + return a->mode; + break; + case SCRUB_CUR_CYCLE_DURATION: + if (ops->get_cycle_duration) { + if (ops->set_cycle_duration) + return a->mode; + else + return 0444; + } + break; + default: + break; + } + + return 0; +} + +#define EDAC_SCRUB_ATTR_RO(_name, _instance) \ + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RO(_name), \ + .instance = _instance }) + +#define EDAC_SCRUB_ATTR_WO(_name, _instance) \ + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_WO(_name), \ + .instance = _instance }) + +#define EDAC_SCRUB_ATTR_RW(_name, _instance) \ + ((struct edac_scrub_dev_attr) { .dev_attr = __ATTR_RW(_name), \ + .instance = _instance }) + +static int scrub_create_desc(struct device *scrub_dev, + const struct attribute_group **attr_groups, u8 instance) +{ + struct edac_scrub_context *scrub_ctx; + struct attribute_group *group; + int i; + struct edac_scrub_dev_attr dev_attr[] = { + [SCRUB_ADDR_RANGE_BASE] = EDAC_SCRUB_ATTR_RW(addr_range_base, instance), + [SCRUB_ADDR_RANGE_SIZE] = EDAC_SCRUB_ATTR_RW(addr_range_size, instance), + [SCRUB_ENABLE_BACKGROUND] = EDAC_SCRUB_ATTR_RW(enable_background, instance), + [SCRUB_ENABLE_ON_DEMAND] = EDAC_SCRUB_ATTR_RW(enable_on_demand, instance), + [SCRUB_MIN_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(min_cycle_duration, instance), + [SCRUB_MAX_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RO(max_cycle_duration, instance), + [SCRUB_CUR_CYCLE_DURATION] = EDAC_SCRUB_ATTR_RW(current_cycle_duration, instance) + }; + + scrub_ctx = devm_kzalloc(scrub_dev, sizeof(*scrub_ctx), GFP_KERNEL); + if (!scrub_ctx) + return -ENOMEM; + + group = &scrub_ctx->group; + for (i = 0; i < SCRUB_MAX_ATTRS; i++) { + memcpy(&scrub_ctx->scrub_dev_attr[i], &dev_attr[i], sizeof(dev_attr[i])); + scrub_ctx->scrub_attrs[i] = &scrub_ctx->scrub_dev_attr[i].dev_attr.attr; + } + sprintf(scrub_ctx->name, "%s%d", "scrub", instance); + group->name = scrub_ctx->name; + group->attrs = scrub_ctx->scrub_attrs; + group->is_visible = scrub_attr_visible; + + attr_groups[0] = group; + + return 0; +} + +/** + * edac_scrub_get_desc - get EDAC scrub descriptors + * @scrub_dev: client device, with scrub support + * @attr_groups: pointer to attribute group container + * @instance: device's scrub instance number. + * + * Return: + * * %0 - Success. + * * %-EINVAL - Invalid parameters passed. + * * %-ENOMEM - Dynamic memory allocation failed. + */ +int edac_scrub_get_desc(struct device *scrub_dev, + const struct attribute_group **attr_groups, u8 instance) +{ + if (!scrub_dev || !attr_groups) + return -EINVAL; + + return scrub_create_desc(scrub_dev, attr_groups, instance); +} diff --git a/include/linux/edac.h b/include/linux/edac.h index 1db008a82690..5344e2cf6808 100644 --- a/include/linux/edac.h +++ b/include/linux/edac.h @@ -668,11 +668,47 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci, /* RAS feature type */ enum edac_dev_feat { + RAS_FEAT_SCRUB, RAS_FEAT_MAX }; +/** + * struct scrub_ops - scrub device operations (all elements optional) + * @read_range_base: read base of scrubbing range. + * @read_range_size: read offset of scrubbing range. + * @write_range_base: set base of the scrubbing range. + * @write_range_size: set offset of the scrubbing range. + * @get_enabled_bg: check if currently performing background scrub. + * @set_enabled_bg: start or stop a bg-scrub. + * @get_enabled_od: check if currently performing on-demand scrub. + * @set_enabled_od: start or stop an on-demand scrub. + * @get_min_cycle: get minimum supported scrub cycle duration in seconds. + * @get_max_cycle: get maximum supported scrub cycle duration in seconds. + * @get_cycle_duration: get current scrub cycle duration in seconds. + * @set_cycle_duration: set current scrub cycle duration in seconds. + */ +struct edac_scrub_ops { + int (*read_range_base)(struct device *dev, void *drv_data, u64 *base); + int (*read_range_size)(struct device *dev, void *drv_data, u64 *size); + int (*write_range_base)(struct device *dev, void *drv_data, u64 base); + int (*write_range_size)(struct device *dev, void *drv_data, u64 size); + int (*get_enabled_bg)(struct device *dev, void *drv_data, bool *enable); + int (*set_enabled_bg)(struct device *dev, void *drv_data, bool enable); + int (*get_enabled_od)(struct device *dev, void *drv_data, bool *enable); + int (*set_enabled_od)(struct device *dev, void *drv_data, bool enable); + int (*get_min_cycle)(struct device *dev, void *drv_data, u32 *min); + int (*get_max_cycle)(struct device *dev, void *drv_data, u32 *max); + int (*get_cycle_duration)(struct device *dev, void *drv_data, u32 *cycle); + int (*set_cycle_duration)(struct device *dev, void *drv_data, u32 cycle); +}; + +int edac_scrub_get_desc(struct device *scrub_dev, + const struct attribute_group **attr_groups, + u8 instance); + /* EDAC device feature information structure */ struct edac_dev_data { + const struct edac_scrub_ops *scrub_ops; u8 instance; void *private; }; @@ -682,11 +718,13 @@ struct device; struct edac_dev_feat_ctx { struct device dev; void *private; + struct edac_dev_data *scrub; }; struct edac_dev_feature { enum edac_dev_feat ft_type; u8 instance; + const struct edac_scrub_ops *scrub_ops; void *ctx; };