From patchwork Fri Feb 23 14:37:11 2024
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 13569146
From: Shiju Jose <shiju.jose@huawei.com>
Subject: [RFC PATCH v7 00/12] memory: scrub: introduce subsystem + CXL/ACPI-RAS2 drivers
Date: Fri, 23 Feb 2024 22:37:11 +0800
Message-ID: <20240223143723.1574-1-shiju.jose@huawei.com>

Previously known as: "cxl: Add support for CXL feature commands, CXL
device patrol scrub control and DDR5 ECS control features"
https://lore.kernel.org/lkml/20240215111455.1462-1-shiju.jose@huawei.com/

Introduce a generic memory scrub subsystem, which allows the user to
control the underlying memory scrubbers in the system via the sysfs
scrub control interface.

Memory scrub is a feature where an ECC engine reads data from each
memory media location, corrects it using the ECC if necessary and
writes the corrected data back to the same memory media location.
More details can be found in Reference [1].

The CXL patrol scrub, DDR5 ECS and ACPI RAS2 HW-based memory patrol
scrub features are added as use cases for the scrub subsystem, which
exposes their scrub controls to the user.

The CXL device patrol scrub and DDR5 ECS features need support for the
CXL feature mailbox commands. The CXL device scrub driver registers
with the memory scrub subsystem to expose the scrub controls for CXL
device patrol and ECS scrubs to the user.

The RAS2 HW-based memory patrol scrub needs the RAS2 PCC interfaces and
the ACPI RAS2 driver for communication between the kernel and firmware.
The ACPI RAS2 driver adds a platform device for each memory feature,
which binds to the memory RAS2 driver. The memory RAS2 driver registers
with the memory scrub subsystem to expose the RAS2 scrub controls to
the user.

The series adds:
1. A scrub subsystem driver that supports configuring memory scrubs in
   the system.
2. Support for CXL feature mailbox commands.
3. A CXL device scrub driver supporting the patrol scrub control and
   ECS control features.
4. Registration of CXL device patrol scrub and ECS with the scrub
   subsystem.
5. A common library for RAS2 PCC interfaces.
6. An ACPI RAS2 driver for the ACPI RAS2 feature table (RAS2).
7. A memory RAS2 driver, which registers with the scrub subsystem.

The QEMU series to support the CXL specific scrub features is
available here:
https://lore.kernel.org/qemu-devel/20240223085902.1549-1-shiju.jose@huawei.com/

References:
1. Discussions on kernel support of memory error detection and patrol
   scrubber can be found here:
   https://lore.kernel.org/all/20221103155029.2451105-1-jiaqiyan@google.com/
2. Discussions on RASF:
   https://lore.kernel.org/lkml/20230915172818.761-1-shiju.jose@huawei.com/#r
   https://patchwork.kernel.org/project/linux-arm-kernel/patch/CS1PR84MB0038718F49DBC0FF03919E1184390@CS1PR84MB0038.NAMPRD84.PROD.OUTLOOK.COM/

Changes
v6 -> v7:
1. Main changes for comments from Jonathan.
1.1. CXL
 - Changes to deal with a small mailbox and to support multipart
   feature data transfers.
 - Provide more specific parameters to the mbox supported/get/set
   features interface functions.
 - kvmalloc -> kmalloc in CXL scrub mem allocation for feature
   commands.
 - Changed the way __free(kfree) is used.
 - Removed readback and verify for setting CXL scrub patrol and ECS
   parameters. Could be added later if needed.
 - In the is_visible() callback functions for the scrub control sysfs
   attrs, changed to write back the default attribute mode value
   instead of setting it per attribute.
 - Add documentation for the sysfs interfaces for CXL ECS scrub
   control.
1.2. RAS2
 - In the rasf common code, rename rasf to ras2 because RASF seems
   obsolete.
 - Replace pr_* with dev_* log function calls in the ACPI RAS2 and
   memory RAS2 drivers.
 - Removed an unnecessary .h include from the memory RAS2 driver.
 - In the is_visible() callback functions for the scrub control sysfs
   attrs, changed to write back the default attribute mode value
   instead of setting it per attribute.
2. Changes for comments from Fan.
 - Add a debug message if the cxl patrol scrub and ecs init function
   calls fail.
3. Updated cover letter for feedback from Dan Williams.

v5 -> v6:
1. Changes for comments from Davidlohr, Thanks.
 - Update CXL feature code based on spec 3.1.
 - attrb -> attr
 - Use enums with default counting.
2. Rebased to the latest kernel.

v4 -> v5:
1. Following are the main changes made based on the feedback from
   Dan Williams on v4.
1.1. In the scrub subsystem, the common scrub control attributes are
     statically defined instead of dynamically created.
1.2. Add scrub subsystem support for externally defined attribute
     groups.
     The CXL ECS driver defines an ECS-specific attribute group and
     passes it to the scrub subsystem.
1.3. Move cxl_mem_ecs_init() to cxl/core/region.c so that the CXL
     region_id is used in the registration with the scrub subsystem.
1.4. Add previously posted RASF common and RAS2 patches to this scrub
     series.
2. Add support for the 'enable_background_scrub' attribute for RAS2,
   on request from Bill Schwartz (wschwartz@amperecomputing.com).

v3 -> v4:
1. Fixes for the warnings/errors reported by kernel test robot.
2. Add support for reading the 'enable' attribute of CXL patrol scrub.

Changes
v2 -> v3:
1. Changes for comments from Davidlohr, Thanks.
 - Updated cxl scrub kconfig.
 - Removed usage of the flag is_support_feature from the function
   cxl_mem_get_supported_feature_entry().
 - Corrected spelling error.
 - Removed unnecessary debug message.
 - Removed the export of feature commands to userspace.
2. Possible fix for the warnings/errors reported by kernel test robot.
3. Add documentation for the common scrub configure attributes.

v1 -> v2:
1. Changes for comments from Dave Jiang, Thanks.
 - Split patches.
 - Reversed xmas tree declarations.
 - Declared flags as enums.
 - Removed a few unnecessary variable initializations.
 - Replaced PTR_ERR_OR_ZERO() with IS_ERR() and PTR_ERR().
 - Added auto clean declarations.
 - Replaced while loop with for loop.
 - Removed allocation from cxl_get_supported_features() and
   cxl_get_feature(), which now take an allocated memory pointer from
   the caller.
 - Replaced if/else with switch case.
 - Replaced sprintf() with sysfs_emit() in 2 places.
 - Replaced goto label with return in a few functions.
2. Removed unused code for supported attributes from ecs.
3. Included the following common patch for the scrub configure driver
   in this series:
   "memory: scrub: Add scrub driver supports configuring memory scrubbers in the system"

A Somasundaram (1):
  ACPI:RAS2: Add common library for RAS2 PCC interfaces

Shiju Jose (11):
  cxl/mbox: Add GET_SUPPORTED_FEATURES mailbox command
  cxl/mbox: Add GET_FEATURE mailbox command
  cxl/mbox: Add SET_FEATURE mailbox command
  cxl/memscrub: Add CXL device patrol scrub control feature
  cxl/memscrub: Add CXL device ECS control feature
  memory: scrub: Add scrub subsystem driver supports configuring memory scrubs in the system
  cxl/memscrub: Register CXL device patrol scrub with scrub subsystem driver
  cxl/memscrub: Register CXL device ECS with scrub subsystem driver
  ACPICA: ACPI 6.5: Add support for RAS2 table
  ACPI:RAS2: Add driver for ACPI RAS2 feature table (RAS2)
  memory: RAS2: Add memory RAS2 driver

 .../ABI/testing/sysfs-class-cxl-ecs-configure |  79 ++
 .../ABI/testing/sysfs-class-scrub-configure   |  91 ++
 drivers/acpi/Kconfig                          |  14 +
 drivers/acpi/Makefile                         |   1 +
 drivers/acpi/ras2_acpi.c                      |  97 ++
 drivers/acpi/ras2_acpi_common.c               | 272 +++++
 drivers/cxl/Kconfig                           |  21 +
 drivers/cxl/core/Makefile                     |   1 +
 drivers/cxl/core/mbox.c                       | 143 +++
 drivers/cxl/core/memscrub.c                   | 954 ++++++++++++++++++
 drivers/cxl/core/region.c                     |   3 +
 drivers/cxl/cxlmem.h                          | 124 +++
 drivers/cxl/pci.c                             |   4 +
 drivers/memory/Kconfig                        |  15 +
 drivers/memory/Makefile                       |   3 +
 drivers/memory/ras2.c                         | 364 +++++++
 drivers/memory/ras2_common.c                  | 282 ++++++
 drivers/memory/scrub/Kconfig                  |  11 +
 drivers/memory/scrub/Makefile                 |   6 +
 drivers/memory/scrub/memory-scrub.c           | 369 +++++++
 include/acpi/actbl2.h                         | 137 +++
 include/acpi/ras2_acpi.h                      |  59 ++
 include/memory/memory-scrub.h                 |  79 ++
 include/memory/ras2.h                         |  88 ++
 24 files changed, 3217 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-cxl-ecs-configure
 create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure
 create mode 100755 drivers/acpi/ras2_acpi.c
 create mode 100755 drivers/acpi/ras2_acpi_common.c
 create mode 100644 drivers/cxl/core/memscrub.c
 create mode 100644 drivers/memory/ras2.c
 create mode 100644 drivers/memory/ras2_common.c
 create mode 100644 drivers/memory/scrub/Kconfig
 create mode 100644 drivers/memory/scrub/Makefile
 create mode 100755 drivers/memory/scrub/memory-scrub.c
 create mode 100644 include/acpi/ras2_acpi.h
 create mode 100755 include/memory/memory-scrub.h
 create mode 100755 include/memory/ras2.h
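
Illustration (not part of the series): the scrub operation described in
this cover letter (read each memory location, correct it via ECC if
necessary, write the corrected data back) can be sketched as a toy
user-space C model. This is only a sketch under loud assumptions: it
uses a Hamming(7,4) single-error-correcting code over a byte array,
which is not claimed to be what the CXL, DDR5 ECS or RAS2 hardware
implements, and the names hamming74_encode(), hamming74_syndrome(),
hamming74_decode() and scrub_pass() are hypothetical and do not appear
in any patch of this series.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model only (assumption): each "memory location" holds one
 * Hamming(7,4) codeword in the low 7 bits of a byte. Bit (pos - 1)
 * of the byte is codeword position pos, with pos in 1..7.
 */

/* Syndrome: XOR of the 1-based positions of all set bits (0 == clean). */
unsigned int hamming74_syndrome(uint8_t cw)
{
	unsigned int pos, syn = 0;

	for (pos = 1; pos <= 7; pos++)
		if (cw & (1u << (pos - 1)))
			syn ^= pos;
	return syn;
}

/* Encode a 4-bit value: data at positions 3, 5, 6, 7; parity at 1, 2, 4. */
uint8_t hamming74_encode(uint8_t nibble)
{
	unsigned int cw = 0, syn;

	if (nibble & 1)
		cw |= 1u << 2;
	if (nibble & 2)
		cw |= 1u << 4;
	if (nibble & 4)
		cw |= 1u << 5;
	if (nibble & 8)
		cw |= 1u << 6;

	/* Set the parity bits so that the overall syndrome becomes zero. */
	syn = hamming74_syndrome((uint8_t)cw);
	if (syn & 1)
		cw |= 1u << 0;
	if (syn & 2)
		cw |= 1u << 1;
	if (syn & 4)
		cw |= 1u << 3;
	return (uint8_t)cw;
}

/* Extract the 4 data bits from a codeword (no correction done here). */
uint8_t hamming74_decode(uint8_t cw)
{
	return (uint8_t)(((cw >> 2) & 1) | (((cw >> 4) & 1) << 1) |
			 (((cw >> 5) & 1) << 2) | (((cw >> 6) & 1) << 3));
}

/*
 * One patrol pass: read every location, and where the syndrome is
 * non-zero, flip the indicated bit and write the corrected codeword
 * back. Returns the number of locations that were corrected.
 */
size_t scrub_pass(uint8_t *mem, size_t n)
{
	size_t i, corrected = 0;

	for (i = 0; i < n; i++) {
		unsigned int cw = mem[i] & 0x7f;		/* read */
		unsigned int syn = hamming74_syndrome((uint8_t)cw);

		if (syn) {
			cw ^= 1u << (syn - 1);			/* correct */
			mem[i] = (uint8_t)cw;			/* write back */
			corrected++;
		}
	}
	return corrected;
}
```

In this toy model, encoding a few values into an array, flipping a
single bit in one entry and calling scrub_pass() reports one
correction, and a second pass reports none. Real scrubbers perform the
equivalent work in hardware; the subsystem proposed here only exposes
their controls.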