Message ID | 20250320180450.539-1-shiju.jose@huawei.com (mailing list archive) |
---|---|
Series | cxl: support CXL memory RAS features |
On 20/03/2025 at 19:04, shiju.jose@huawei.com wrote:
> From: Shiju Jose <shiju.jose@huawei.com>
>
> Support for CXL memory RAS features: patrol scrub, ECS, soft-PPR and
> memory sparing.
>
> This CXL series was part of the EDAC series [1].
>
> The code is based on cxl.git next branch [2] merged with ras.git edac-cxl
> branch [3].
>
> 1. https://lore.kernel.org/linux-cxl/20250212143654.1893-1-shiju.jose@huawei.com/
> 2. https://web.git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=next
> 3. https://web.git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-cxl
>
> Userspace code for CXL memory repair features [4] and
> sample boot-script for CXL memory repair [5].
>
> [4]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
> [5]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/

The title for the series is quite confusing; CXL seems to be something else.

There is a series here [1] that removes the CXL driver, but after looking, it
seems to be something completely different.

[1] https://lore.kernel.org/all/20250219070007.177725-1-ajd@linux.ibm.com/

Christophe

>
> Changes
> =======
> v1 -> v2:
> 1. Feedback from Dan Williams on v1,
>    https://lore.kernel.org/linux-mm/20250307091137.00006a0a@huawei.com/T/
>   - Fixed lock issues in region scrubbing, added local cxl_acquire()
>     and cxl_unlock.
>   - Replaced CXL examples using cat and echo from EDAC .rst docs
>     with a short description and a reference to the ABI docs. Also
>     corrected existing descriptions as suggested by Dan.
>   - Added a policy description for the scrub control feature.
>     However, this may require input from CXL experts.
>   - Replaced CONFIG_CXL_RAS_FEATURES with CONFIG_CXL_EDAC_MEM_FEATURES.
>   - A few changes to the depends part of CONFIG_CXL_EDAC_MEM_FEATURES.
>   - Renamed drivers/cxl/core/memfeatures.c to drivers/cxl/core/edac.c
>   - snprintf() -> kasprintf() in a few places.
>
> 2. Feedback from Alison on v1,
>   - In cxl_get_feature_entry() (patch 1), return NULL on failures and
>     reintroduced checks in cxl_get_feature_entry().
>   - Changed the logic of the for loop in the region-based scrubbing code.
>   - Replaced cxl_are_decoders_committed() with
>     cxl_is_memdev_memory_online() and added it as a local function in
>     drivers/cxl/core/edac.c
>   - Changed a few multiline comments to single-line comments.
>   - Removed unnecessary comments from the code.
>   - Reduced the line length of a few macros in ECS and memory repair code.
>   - In new files, changed "GPL-2.0-or-later" -> "GPL-2.0-only".
>   - Ran clang-format on new files and updated them.
> 3. Changes for feedback from Jonathan on v1.
>   - Changed a few multiline comments to single-line comments.
>
> Shiju Jose (8):
>   cxl: Add helper function to retrieve a feature entry
>   EDAC: Update documentation for the CXL memory patrol scrub control
>     feature
>   cxl/edac: Add CXL memory device patrol scrub control feature
>   cxl/edac: Add CXL memory device ECS control feature
>   cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command
>   cxl: Support for finding memory operation attributes from the current
>     boot
>   cxl/memfeature: Add CXL memory device soft PPR control feature
>   cxl/memfeature: Add CXL memory device memory sparing control feature
>
>  Documentation/edac/memory_repair.rst |   31 +
>  Documentation/edac/scrub.rst         |   47 +
>  drivers/cxl/Kconfig                  |   27 +
>  drivers/cxl/core/Makefile            |    1 +
>  drivers/cxl/core/core.h              |    2 +
>  drivers/cxl/core/edac.c              | 1730 ++++++++++++++++++++++++++
>  drivers/cxl/core/features.c          |   23 +
>  drivers/cxl/core/mbox.c              |   45 +-
>  drivers/cxl/core/memdev.c            |    9 +
>  drivers/cxl/core/ras.c               |  145 +++
>  drivers/cxl/core/region.c            |    5 +
>  drivers/cxl/cxlmem.h                 |   73 ++
>  drivers/cxl/mem.c                    |    4 +
>  drivers/cxl/pci.c                    |    3 +
>  drivers/edac/mem_repair.c            |    9 +
>  include/linux/edac.h                 |    7 +
>  16 files changed, 2159 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/cxl/core/edac.c
>
From: Shiju Jose <shiju.jose@huawei.com>

Support for CXL memory RAS features: patrol scrub, ECS, soft-PPR and
memory sparing.

This CXL series was part of the EDAC series [1].

The code is based on cxl.git next branch [2] merged with ras.git edac-cxl
branch [3].

1. https://lore.kernel.org/linux-cxl/20250212143654.1893-1-shiju.jose@huawei.com/
2. https://web.git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=next
3. https://web.git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-cxl

Userspace code for CXL memory repair features [4] and
sample boot-script for CXL memory repair [5].

[4]: https://lore.kernel.org/lkml/20250207143028.1865-1-shiju.jose@huawei.com/
[5]: https://lore.kernel.org/lkml/20250207143028.1865-5-shiju.jose@huawei.com/

Changes
=======
v1 -> v2:
1. Feedback from Dan Williams on v1,
   https://lore.kernel.org/linux-mm/20250307091137.00006a0a@huawei.com/T/
  - Fixed lock issues in region scrubbing, added local cxl_acquire()
    and cxl_unlock.
  - Replaced CXL examples using cat and echo from EDAC .rst docs
    with a short description and a reference to the ABI docs. Also
    corrected existing descriptions as suggested by Dan.
  - Added a policy description for the scrub control feature.
    However, this may require input from CXL experts.
  - Replaced CONFIG_CXL_RAS_FEATURES with CONFIG_CXL_EDAC_MEM_FEATURES.
  - A few changes to the depends part of CONFIG_CXL_EDAC_MEM_FEATURES.
  - Renamed drivers/cxl/core/memfeatures.c to drivers/cxl/core/edac.c
  - snprintf() -> kasprintf() in a few places.

2. Feedback from Alison on v1,
  - In cxl_get_feature_entry() (patch 1), return NULL on failures and
    reintroduced checks in cxl_get_feature_entry().
  - Changed the logic of the for loop in the region-based scrubbing code.
  - Replaced cxl_are_decoders_committed() with
    cxl_is_memdev_memory_online() and added it as a local function in
    drivers/cxl/core/edac.c
  - Changed a few multiline comments to single-line comments.
  - Removed unnecessary comments from the code.
  - Reduced the line length of a few macros in ECS and memory repair code.
  - In new files, changed "GPL-2.0-or-later" -> "GPL-2.0-only".
  - Ran clang-format on new files and updated them.
3. Changes for feedback from Jonathan on v1.
  - Changed a few multiline comments to single-line comments.

Shiju Jose (8):
  cxl: Add helper function to retrieve a feature entry
  EDAC: Update documentation for the CXL memory patrol scrub control
    feature
  cxl/edac: Add CXL memory device patrol scrub control feature
  cxl/edac: Add CXL memory device ECS control feature
  cxl/mbox: Add support for PERFORM_MAINTENANCE mailbox command
  cxl: Support for finding memory operation attributes from the current
    boot
  cxl/memfeature: Add CXL memory device soft PPR control feature
  cxl/memfeature: Add CXL memory device memory sparing control feature

 Documentation/edac/memory_repair.rst |   31 +
 Documentation/edac/scrub.rst         |   47 +
 drivers/cxl/Kconfig                  |   27 +
 drivers/cxl/core/Makefile            |    1 +
 drivers/cxl/core/core.h              |    2 +
 drivers/cxl/core/edac.c              | 1730 ++++++++++++++++++++++++++
 drivers/cxl/core/features.c          |   23 +
 drivers/cxl/core/mbox.c              |   45 +-
 drivers/cxl/core/memdev.c            |    9 +
 drivers/cxl/core/ras.c               |  145 +++
 drivers/cxl/core/region.c            |    5 +
 drivers/cxl/cxlmem.h                 |   73 ++
 drivers/cxl/mem.c                    |    4 +
 drivers/cxl/pci.c                    |    3 +
 drivers/edac/mem_repair.c            |    9 +
 include/linux/edac.h                 |    7 +
 16 files changed, 2159 insertions(+), 2 deletions(-)
 create mode 100644 drivers/cxl/core/edac.c
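
For readers skimming the changelog, the "return NULL on failures" item for
cxl_get_feature_entry() can be illustrated with a small userspace sketch.
This is not the code from patch 1: struct feat_entry, struct feat_table and
get_feature_entry() are hypothetical stand-ins, and the 16-byte identifier is
only assumed to play the role of the feature UUID. It shows a lookup helper
that returns a pointer on success and NULL on every failure path, so callers
need a single check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the kernel's structure layouts. */
struct feat_entry {
	unsigned char uuid[16];		/* assumed 16-byte feature identifier */
	unsigned short get_size;	/* illustrative payload field */
};

struct feat_table {
	const struct feat_entry *entries;
	size_t num;
};

/*
 * Return the entry whose identifier matches uuid, or NULL on any failure:
 * missing table, missing entry array, missing uuid, or no match.
 */
static const struct feat_entry *
get_feature_entry(const struct feat_table *tbl, const unsigned char *uuid)
{
	size_t i;

	if (!tbl || !tbl->entries || !uuid)
		return NULL;

	for (i = 0; i < tbl->num; i++) {
		if (!memcmp(tbl->entries[i].uuid, uuid, 16))
			return &tbl->entries[i];
	}

	return NULL;
}
```

A caller under these assumptions would test the result once, for example
"if (!entry) return -1;", rather than distinguishing between several error
codes.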