From patchwork Fri Feb 10 09:05:21 2023
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13135547
Subject: [PATCH v2 00/20] CXL RAM and the 'Soft Reserved' => 'System RAM' default
From: Dan Williams
To: linux-cxl@vger.kernel.org
Cc: Ira Weiny, David Hildenbrand, Dave Jiang, Davidlohr Bueso, Kees Cook,
 Jonathan Cameron, Vishal Verma, Dave Hansen, Michal Hocko,
 Jonathan Cameron, Gregory Price, Fan Ni, linux-mm@kvack.org,
 linux-acpi@vger.kernel.org
Date: Fri, 10 Feb 2023 01:05:21 -0800
Message-ID: <167601992097.1924368.18291887895351917895.stgit@dwillia2-xfh.jf.intel.com>
User-Agent: StGit/0.18-3-g996c

Changes since v1: [1]
- Add a fix for memdev removal racing port removal (found by unit tests)
- Add a fix to unwind region target list updates on error in
  cxl_region_attach() (Jonathan)
- Move the passthrough decoder fix for submission for v6.2-final (Greg)
- Fix wrong initcall for cxl_core (Gregory and Davidlohr)
- Add an endpoint decoder state (CXL_DECODER_STATE_AUTO) to replace
  the flag CXL_DECODER_F_AUTO (Jonathan)
- Reflow cmp_decode_pos() to reduce levels of indentation (Jonathan)
- Fix a leaked reference count in cxl_add_to_region() (Jonathan)
- Make cxl_add_to_region() return an error (Jonathan)
- Fix several spurious whitespace changes (Jonathan)
- Clean up some spurious changes from the tools/testing/cxl update
  (Jonathan)
- Test for == CXL_CONFIG_COMMIT rather than >= CXL_CONFIG_COMMIT
  (Jonathan)
- Add comment to clarify device_attach() return code expectation in
  cxl_add_to_region() (Jonathan)
- Add a patch to split cxl_port_probe() into switch and endpoint port
  probe calls (Jonathan)
- Collect reviewed-by and tested-by tags

[1]: http://lore.kernel.org/r/167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com

---
Cover letter same as v1

Summary:
--------

CXL RAM support allows for the dynamic provisioning of new CXL RAM
regions, and more routinely, assembling a region from an existing
configuration established by platform-firmware. The latter is motivated
by CXL memory RAS (Reliability, Availability and Serviceability)
support, which requires associating device events with System Physical
Address ranges and vice versa.

The 'Soft Reserved' policy rework arranges for performance-differentiated
memory like CXL attached DRAM, or high-bandwidth memory, to be designated
for 'System RAM' by default, rather than the device-dax dedicated access
mode. That current device-dax default is confusing and surprising for
the majority of users, who do not expect memory to be quarantined for
dedicated access by default. Most users expect all 'System RAM'-capable
memory to show up in free(1).

Details:
--------

Recall that the Linux 'Soft Reserved' designation for memory is a
reaction to platform-firmware, like EFI EDK2, delineating memory with
the EFI Specific Purpose Memory attribute (EFI_MEMORY_SP). An
alternative way to think of that attribute is that it specifies the
*not* general-purpose memory pool. It is memory that may be too precious
for general usage or not performant enough for some hot data structures.
However, in the absence of explicit policy it should just be 'System
RAM' by default.

Rather than require every distribution to ship a udev policy to assign
dax devices to dax_kmem (the device-memory hotplug driver) just make
that the kernel default. This is similar to the rationale in:

commit 8604d9e534a3 ("memory_hotplug: introduce CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE")

With this change the relatively niche use case of accessing this memory
via mapping a device-dax instance can be achieved by building with
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n, or specifying
memhp_default_state=offline at boot, and then using:

    daxctl reconfigure-device $device -m devdax --force

...to shift the corresponding address range to device-dax access.

The process of assembling a device-dax instance for a given CXL region
device configuration is similar to the process of assembling a
Device-Mapper or MDRAID storage-device array. Specifically, asynchronous
probing by the PCI and driver core enumerates all CXL endpoints and
their decoders.
Then, once enough decoders have arrived to describe a given region,
that region is passed to the device-dax subsystem where it is subject
to the above 'dax_kmem' policy. This assignment and policy choice is
only possible if memory is set aside by the 'Soft Reserved'
designation. Otherwise, CXL memory that is mapped as 'System RAM'
becomes immutable by CXL driver mechanisms, but is still enumerated for
RAS purposes.

This series is also available via:

https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.3/cxl-ram-region

...and has gone through some preview testing in various forms.

---

Dan Williams (20):
      cxl/memdev: Fix endpoint port removal
      cxl/Documentation: Update references to attributes added in v6.0
      cxl/region: Add a mode attribute for regions
      cxl/region: Support empty uuids for non-pmem regions
      cxl/region: Validate region mode vs decoder mode
      cxl/region: Add volatile region creation support
      cxl/region: Refactor attach_target() for autodiscovery
      cxl/region: Cleanup target list on attach error
      cxl/region: Move region-position validation to a helper
      kernel/range: Uplevel the cxl subsystem's range_contains() helper
      cxl/region: Enable CONFIG_CXL_REGION to be toggled
      cxl/port: Split endpoint and switch port probe
      cxl/region: Add region autodiscovery
      tools/testing/cxl: Define a fixed volatile configuration to parse
      dax/hmem: Move HMAT and Soft reservation probe initcall level
      dax/hmem: Drop unnecessary dax_hmem_remove()
      dax/hmem: Convey the dax range via memregion_info()
      dax/hmem: Move hmem device registration to dax_hmem.ko
      dax: Assign RAM regions to memory-hotplug by default
      cxl/dax: Create dax devices for CXL RAM regions


 Documentation/ABI/testing/sysfs-bus-cxl |   64 +-
 MAINTAINERS                             |    1
 drivers/acpi/numa/hmat.c                |    4
 drivers/cxl/Kconfig                     |   12
 drivers/cxl/acpi.c                      |    3
 drivers/cxl/core/core.h                 |    7
 drivers/cxl/core/hdm.c                  |   25 +
 drivers/cxl/core/memdev.c               |    1
 drivers/cxl/core/pci.c                  |    5
 drivers/cxl/core/port.c                 |   92 ++-
 drivers/cxl/core/region.c               |  851 ++++++++++++++++++++++++++++---
 drivers/cxl/cxl.h                       |   57 ++
 drivers/cxl/cxlmem.h                    |    5
 drivers/cxl/port.c                      |  113 +++-
 drivers/dax/Kconfig                     |   17 +
 drivers/dax/Makefile                    |    2
 drivers/dax/bus.c                       |   53 +-
 drivers/dax/bus.h                       |   12
 drivers/dax/cxl.c                       |   53 ++
 drivers/dax/device.c                    |    3
 drivers/dax/hmem/Makefile               |    3
 drivers/dax/hmem/device.c               |  102 ++--
 drivers/dax/hmem/hmem.c                 |  148 +++++
 drivers/dax/kmem.c                      |    1
 include/linux/dax.h                     |    7
 include/linux/memregion.h               |    2
 include/linux/range.h                   |    5
 lib/stackinit_kunit.c                   |    6
 tools/testing/cxl/test/cxl.c            |  147 +++++
 29 files changed, 1484 insertions(+), 317 deletions(-)
 create mode 100644 drivers/dax/cxl.c

base-commit: 172738bbccdb4ef76bdd72fc72a315c741c39161