From patchwork Mon Feb 6 01:02:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129212 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E480EC636D4 for ; Mon, 6 Feb 2023 01:02:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229468AbjBFBCh (ORCPT ); Sun, 5 Feb 2023 20:02:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52932 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229448AbjBFBCg (ORCPT ); Sun, 5 Feb 2023 20:02:36 -0500 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9BD8E126DD; Sun, 5 Feb 2023 17:02:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645355; x=1707181355; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+oQ5qejLH1X6RwG1ebhyZ7IMLgkFgjFEswwe1eorwQo=; b=mi73eNNtc/D/T/pYHG/NlXM30QWei6mK/JO3JE3yOIBNv8GFYfGiTJLv E04h6L2OOGSh3PQDyduuiV5EI49MJldHgg27cR1++VvrjXu2CKzsOauwM f1u1POyKARsp/o5tJqEQ4dPDZwKO9MmRbsuvxbg2s7+4I61zmRydjfn92 fSxM0RpzxJg0aAXWp/W87TGo2jVvL+amtCKYYo4lZ3YdrNY9Wh79BglDD vhV1PpZa+bUY3ClOOGetlc5XessOzsEId3Vh0d679ibn30+a9rJ8pGpWx mfBSZ8Rjhmylu7GNyVTyHIdFX7Uid5RPn1VijZnJ2wNINL0xEIVXnFuQI A==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="331243774" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="331243774" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:35 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="643855715" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="643855715" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:35 -0800 Subject: [PATCH 01/18] cxl/Documentation: Update references to attributes added in v6.0 From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:02:35 -0800 Message-ID: <167564535494.847146.12120939572640882946.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Prior to Linus deciding that the kernel that following v5.19 would be v6.0, the CXL ABI documentation already referenced v5.20. In preparation for updating these entries update the kernel version to v6.0. Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Gregory Price Reviewed-by: Davidlohr Bueso Reviewed-by: Ira Weiny Reviewed-by: Dave Jiang Reviewed-by: Vishal Verma --- Documentation/ABI/testing/sysfs-bus-cxl | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 329a7e46c805..5be032313e29 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -198,7 +198,7 @@ Description: What: /sys/bus/cxl/devices/endpointX/CDAT Date: July, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RO) If this sysfs entry is not present no DOE mailbox was @@ -209,7 +209,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/mode Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it @@ -229,7 +229,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/dpa_resource Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RO) When a CXL decoder is of devtype "cxl_decoder_endpoint", @@ -240,7 +240,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/dpa_size Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it @@ -260,7 +260,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/interleave_ways Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RO) The number of targets across which this decoder's host @@ -275,7 +275,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/interleave_granularity Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RO) The number of consecutive bytes of host physical address @@ -287,7 +287,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) Write a string in the form 'regionZ' to start the process @@ -303,7 +303,7 @@ Description: What: /sys/bus/cxl/devices/decoderX.Y/delete_region Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (WO) Write a string in the form 'regionZ' to delete that region, @@ -312,7 +312,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/uuid Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) Write a unique identifier for the region. This field must @@ -322,7 +322,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/interleave_granularity Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) Set the number of consecutive bytes each device in the @@ -333,7 +333,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/interleave_ways Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) Configures the number of devices participating in the @@ -343,7 +343,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/size Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) System physical address space to be consumed by the region. @@ -360,7 +360,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/resource Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RO) A region is a contiguous partition of a CXL root decoder @@ -372,7 +372,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/target[0..N] Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) Write an endpoint decoder object name to 'targetX' where X @@ -391,7 +391,7 @@ Description: What: /sys/bus/cxl/devices/regionZ/commit Date: May, 2022 -KernelVersion: v5.20 +KernelVersion: v6.0 Contact: linux-cxl@vger.kernel.org Description: (RW) Write a boolean 'true' string value to this attribute to From patchwork Mon Feb 6 01:02:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129213 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C3C3DC636CC for ; Mon, 6 Feb 2023 01:02:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229503AbjBFBCo (ORCPT ); Sun, 5 Feb 2023 20:02:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229500AbjBFBCn (ORCPT ); Sun, 5 Feb 2023 20:02:43 -0500 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AF7912874; Sun, 5 Feb 2023 17:02:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645361; x=1707181361; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UOKI+/A98jKePPsZHuytj7Ccuyq4WgZ+jxkAtz+DXQ4=; b=JBq5/8tjjChrlSIbXtpzVmElLFRJbt4qZKfb8xkCnrhAN51PLeBfh7Gu 1BgG10LCNGDvEtwjZPdFaMsbk1euVlUZlwjo0RNz4QKHgQR68jMY1FcvW 9GtjYhal95kUFr42WM0sjKU1z3rK0OIUmw/VV2k3XYFlsZ+qTO1XrRx+x PFGOUSDRLWkDPeAwoES1EAp08ZTN9MLYMleqpqXIu+OyNjApidWZWXkm6 OKdk0rEACHO38vXFuwIwMSklmK1b0HAYksJn3Mh3Dnj9qQcsiQEF/TqWO z+ev0j4zrORXdVfoDKIGhhULlZZV5GUQ2fJvPS6l2dy5AVd1kJmAN2O5C w==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="331243785" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="331243785" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:40 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="643855730" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="643855730" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:40 -0800 Subject: [PATCH 02/18] cxl/region: Add a mode attribute for regions From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:02:40 -0800 Message-ID: <167564536041.847146.11330354943211409793.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In preparation for a new region type, "ram" regions, add a mode attribute to clarify the mode of the decoders that can be added to a region. Share the internals of mode_show() (for decoders) with the region case. Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Gregory Price Reviewed-by: Ira Weiny Reviewed-by: Dave Jiang Reviewed-by: Vishal Verma --- Documentation/ABI/testing/sysfs-bus-cxl | 11 +++++++++++ drivers/cxl/core/port.c | 12 +----------- drivers/cxl/core/region.c | 10 ++++++++++ drivers/cxl/cxl.h | 14 ++++++++++++++ 4 files changed, 36 insertions(+), 11 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 5be032313e29..058b0c45001f 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -358,6 +358,17 @@ Description: results in the same address being allocated. +What: /sys/bus/cxl/devices/regionZ/mode +Date: January, 2023 +KernelVersion: v6.3 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) The mode of a region is established at region creation time + and dictates the mode of the endpoint decoder that comprise the + region. For more details on the possible modes see + /sys/bus/cxl/devices/decoderX.Y/mode + + What: /sys/bus/cxl/devices/regionZ/resource Date: May, 2022 KernelVersion: v6.0 diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 410c036c09fa..8566451cb22f 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -180,17 +180,7 @@ static ssize_t mode_show(struct device *dev, struct device_attribute *attr, { struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev); - switch (cxled->mode) { - case CXL_DECODER_RAM: - return sysfs_emit(buf, "ram\n"); - case CXL_DECODER_PMEM: - return sysfs_emit(buf, "pmem\n"); - case CXL_DECODER_NONE: - return sysfs_emit(buf, "none\n"); - case CXL_DECODER_MIXED: - default: - return sysfs_emit(buf, "mixed\n"); - } + return sysfs_emit(buf, "%s\n", cxl_decoder_mode_name(cxled->mode)); } static ssize_t mode_store(struct device *dev, struct device_attribute *attr, diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 60828d01972a..17d2d0c12725 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -458,6 +458,15 @@ static ssize_t resource_show(struct device *dev, struct device_attribute *attr, } static DEVICE_ATTR_RO(resource); +static ssize_t mode_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%s\n", cxl_decoder_mode_name(cxlr->mode)); +} +static DEVICE_ATTR_RO(mode); + static int alloc_hpa(struct cxl_region *cxlr, resource_size_t size) { struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent); @@ -585,6 +594,7 @@ static struct attribute *cxl_region_attrs[] = { &dev_attr_interleave_granularity.attr, &dev_attr_resource.attr, &dev_attr_size.attr, + &dev_attr_mode.attr, NULL, }; diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index aa3af3bb73b2..ca76879af1de 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -320,6 +320,20 @@ enum cxl_decoder_mode { CXL_DECODER_DEAD, }; +static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode) +{ + static const char * const names[] = { + [CXL_DECODER_NONE] = "none", + [CXL_DECODER_RAM] = "ram", + [CXL_DECODER_PMEM] = "pmem", + [CXL_DECODER_MIXED] = "mixed", + }; + + if (mode >= CXL_DECODER_NONE && mode <= CXL_DECODER_MIXED) + return names[mode]; + return "mixed"; +} + /** * struct cxl_endpoint_decoder - Endpoint / SPA to DPA decoder * @cxld: base cxl_decoder_object From patchwork Mon Feb 6 01:02:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129214 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADEA3C636D4 for ; Mon, 6 Feb 2023 01:02:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229510AbjBFBCs (ORCPT ); Sun, 5 Feb 2023 20:02:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53042 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229521AbjBFBCr (ORCPT ); Sun, 5 Feb 2023 20:02:47 -0500 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76F7C2738; Sun, 5 Feb 2023 17:02:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645366; x=1707181366; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yb8ZrYfywa+KhatYaUiHluR1JdOlK0Jjr6L/f1PwJLc=; b=e7t2nDwoNb+uIEsh993w0izEa8mmYBvCqYCqwTBSq4PDXkOy/vxOB6oU wpd21Z4fF/iH0hBdxWocp3LViKoQDIMX3lXnJpj6gZHRjHB2ubl8TVBvt XOFMcPKyyp/DKBzzlyosbg4HVpjX5vTvG8J2AQ/rwbV6SD9NGB/gGYd3B jhrca1sILgvOUYmSpDrd8X0UREgYGTEQW31ik5RgQ6vlTqETADsRN2UJb wukBZvCO+XMkiZBJUYGJP6yelijxsi6DtaUixEbC8US5XT7NxlxKtr7me Cr2EKOXvWCKh2m9Y+xAZxE8M9+w/hjY41ON/80xpfzZveEE8EacckMD1V A==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="331243790" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="331243790" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:46 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="643855738" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="643855738" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:46 -0800 Subject: [PATCH 03/18] cxl/region: Support empty uuids for non-pmem regions From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:02:45 -0800 Message-ID: <167564536587.847146.12703125206459604597.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Shipping versions of the cxl-cli utility expect all regions to have a 'uuid' attribute. In preparation for 'ram' regions, update the 'uuid' attribute to return an empty string which satisfies the current expectations of 'cxl list -R'. Otherwise, 'cxl list -R' fails in the presence of regions with the 'uuid' attribute missing. Force the attribute to be read-only as there is no facility or expectation for a 'ram' region to recall its uuid from one boot to the next. Signed-off-by: Dan Williams Reviewed-by: Vishal Verma --- Documentation/ABI/testing/sysfs-bus-cxl | 3 ++- drivers/cxl/core/region.c | 7 +++++-- 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 058b0c45001f..4c4e1cbb1169 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -317,7 +317,8 @@ Contact: linux-cxl@vger.kernel.org Description: (RW) Write a unique identifier for the region. This field must be set for persistent regions and it must not conflict with the - UUID of another region. + UUID of another region. For volatile ram regions this + attribute is a read-only empty string. What: /sys/bus/cxl/devices/regionZ/interleave_granularity diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 17d2d0c12725..c9e7f05caa0f 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -45,7 +45,10 @@ static ssize_t uuid_show(struct device *dev, struct device_attribute *attr, rc = down_read_interruptible(&cxl_region_rwsem); if (rc) return rc; - rc = sysfs_emit(buf, "%pUb\n", &p->uuid); + if (cxlr->mode != CXL_DECODER_PMEM) + rc = sysfs_emit(buf, "\n"); + else + rc = sysfs_emit(buf, "%pUb\n", &p->uuid); up_read(&cxl_region_rwsem); return rc; @@ -301,7 +304,7 @@ static umode_t cxl_region_visible(struct kobject *kobj, struct attribute *a, struct cxl_region *cxlr = to_cxl_region(dev); if (a == &dev_attr_uuid.attr && cxlr->mode != CXL_DECODER_PMEM) - return 0; + return 0444; return a->mode; } From patchwork Mon Feb 6 01:02:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129215 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 313DDC636D4 for ; Mon, 6 Feb 2023 01:02:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229525AbjBFBCy (ORCPT ); Sun, 5 Feb 2023 20:02:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53144 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229523AbjBFBCx (ORCPT ); Sun, 5 Feb 2023 20:02:53 -0500 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08BF48A71; Sun, 5 Feb 2023 17:02:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645372; x=1707181372; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yszGLPkFMXedVGllSqe6JjeO5D0jX4yLcgUfwilZAzI=; b=FTdgOzxiK/n8EQrct1NKXny2jtk1ViZwiVEpgp9OZkPYE4CXUjZBJSUj VZqMuRxyEbtehyZ11M36/P/r7n4tsbt9qu/15yhULYXKG+d0NwadZA5uE SM+sw2ibRu3KiXXkjtyL6zw95dwgLgx/KWw2mmruy7mpKlYo5yO/RQ931 Z1TkR8raZZ56FFpisyDfzoUnARRtGBkVagXqkbGJE0jhqf899RhjZMwSl PAz64pP9vJ7oILUYmdcQ/JfH3g6hgKj8YfEyejrG04X1WngqADCNrvHQe 8P1ktKEjs18+ZoWNCYjnZ9s2rjiUuqdvRREd76PUOnofiVCkEWeAWO4Ow Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="331243798" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="331243798" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:51 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="643855741" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="643855741" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:51 -0800 Subject: [PATCH 04/18] cxl/region: Validate region mode vs decoder mode From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:02:51 -0800 Message-ID: <167564537131.847146.9020072654741860107.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In preparation for a new region mode, do not, for example, allow 'ram' decoders to be assigned to 'pmem' regions and vice versa. Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny Reviewed-by: Gregory Price Reviewed-by: Dave Jiang Reviewed-by: Vishal Verma --- drivers/cxl/core/region.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index c9e7f05caa0f..53d6dbe4de6d 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1217,6 +1217,12 @@ static int cxl_region_attach(struct cxl_region *cxlr, struct cxl_dport *dport; int i, rc = -ENXIO; + if (cxled->mode != cxlr->mode) { + dev_dbg(&cxlr->dev, "%s region mode: %d mismatch: %d\n", + dev_name(&cxled->cxld.dev), cxlr->mode, cxled->mode); + return -EINVAL; + } + if (cxled->mode == CXL_DECODER_DEAD) { dev_dbg(&cxlr->dev, "%s dead\n", dev_name(&cxled->cxld.dev)); return -ENODEV; From patchwork Mon Feb 6 01:02:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53C27C636CC for ; Mon, 6 Feb 2023 01:03:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229523AbjBFBDA (ORCPT ); Sun, 5 Feb 2023 20:03:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229526AbjBFBDA (ORCPT ); Sun, 5 Feb 2023 20:03:00 -0500 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7B640DBFE; Sun, 5 Feb 2023 17:02:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645377; x=1707181377; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2NFt5YIQN4Pv/UJUgAAJbE/vCys4h+IOB5VcXpizhjY=; b=dL0oz9mOolNG2TuoGvxn4nrTmr9p9Q4GzxTt7u9Vs/Ne4ZpTmlBMojLr JZnB6iS3wbppxMKVcJIIENUdQ9hLZiG5wJ25P2qpjTXvfdhSbv9TJLzsh AeHfczri/tiaxuaxrFzo4Uzqx5tWOScpxW/TK7Mgb8mR22LZ24Jh23Hzv oXr6KQpGa4AyWmadad0Htu63v3XB5lX5btTEDAR/CYsJTc/dD5YzxJkK8 Spln6wehYlxKj1At75mrrNR9BFkVxfhMq8Ounrmj+83HPFuWTScHMq5pe DMoXKZzCWo+u/rmGI+qU9CiCJ8Ayrc5ifVjRnxsIGSanhOe3A7pvTj2V8 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="331243807" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="331243807" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:57 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="643855748" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="643855748" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:02:57 -0800 Subject: [PATCH 05/18] cxl/region: Add volatile region creation support From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:02:56 -0800 Message-ID: <167564537678.847146.4066579806086171091.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Expand the region creation infrastructure to enable 'ram' (volatile-memory) regions. The internals of create_pmem_region_store() and create_pmem_region_show() are factored out into helpers __create_region() and __create_region_show() for the 'ram' case to reuse. Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny Reviewed-by: Gregory Price Reviewed-by: Dave Jiang Reviewed-by: Vishal Verma --- Documentation/ABI/testing/sysfs-bus-cxl | 22 +++++----- drivers/cxl/core/core.h | 1 drivers/cxl/core/port.c | 14 ++++++ drivers/cxl/core/region.c | 71 +++++++++++++++++++++++++------ 4 files changed, 82 insertions(+), 26 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 4c4e1cbb1169..3acf2f17a73f 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -285,20 +285,20 @@ Description: interleave_granularity). -What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region -Date: May, 2022 -KernelVersion: v6.0 +What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region +Date: May, 2022, January, 2023 +KernelVersion: v6.0 (pmem), v6.3 (ram) Contact: linux-cxl@vger.kernel.org Description: (RW) Write a string in the form 'regionZ' to start the process - of defining a new persistent memory region (interleave-set) - within the decode range bounded by root decoder 'decoderX.Y'. - The value written must match the current value returned from - reading this attribute. An atomic compare exchange operation is - done on write to assign the requested id to a region and - allocate the region-id for the next creation attempt. EBUSY is - returned if the region name written does not match the current - cached value. + of defining a new persistent, or volatile memory region + (interleave-set) within the decode range bounded by root decoder + 'decoderX.Y'. The value written must match the current value + returned from reading this attribute. An atomic compare exchange + operation is done on write to assign the requested id to a + region and allocate the region-id for the next creation attempt. + EBUSY is returned if the region name written does not match the + current cached value. What: /sys/bus/cxl/devices/decoderX.Y/delete_region diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 8c04672dca56..5eb873da5a30 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -11,6 +11,7 @@ extern struct attribute_group cxl_base_attribute_group; #ifdef CONFIG_CXL_REGION extern struct device_attribute dev_attr_create_pmem_region; +extern struct device_attribute dev_attr_create_ram_region; extern struct device_attribute dev_attr_delete_region; extern struct device_attribute dev_attr_region; extern const struct device_type cxl_pmem_region_type; diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 8566451cb22f..47e450c3a5a9 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -294,6 +294,7 @@ static struct attribute *cxl_decoder_root_attrs[] = { &dev_attr_cap_type3.attr, &dev_attr_target_list.attr, SET_CXL_REGION_ATTR(create_pmem_region) + SET_CXL_REGION_ATTR(create_ram_region) SET_CXL_REGION_ATTR(delete_region) NULL, }; @@ -305,6 +306,13 @@ static bool can_create_pmem(struct cxl_root_decoder *cxlrd) return (cxlrd->cxlsd.cxld.flags & flags) == flags; } +static bool can_create_ram(struct cxl_root_decoder *cxlrd) +{ + unsigned long flags = CXL_DECODER_F_TYPE3 | CXL_DECODER_F_RAM; + + return (cxlrd->cxlsd.cxld.flags & flags) == flags; +} + static umode_t cxl_root_decoder_visible(struct kobject *kobj, struct attribute *a, int n) { struct device *dev = kobj_to_dev(kobj); @@ -313,7 +321,11 @@ static umode_t cxl_root_decoder_visible(struct kobject *kobj, struct attribute * if (a == CXL_REGION_ATTR(create_pmem_region) && !can_create_pmem(cxlrd)) return 0; - if (a == CXL_REGION_ATTR(delete_region) && !can_create_pmem(cxlrd)) + if (a == CXL_REGION_ATTR(create_ram_region) && !can_create_ram(cxlrd)) + return 0; + + if (a == CXL_REGION_ATTR(delete_region) && + !(can_create_pmem(cxlrd) || can_create_ram(cxlrd))) return 0; return a->mode; diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 53d6dbe4de6d..8dea49c021b8 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1685,6 +1685,15 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd, struct device *dev; int rc; + switch (mode) { + case CXL_DECODER_RAM: + case CXL_DECODER_PMEM: + break; + default: + dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode); + return ERR_PTR(-EINVAL); + } + cxlr = cxl_region_alloc(cxlrd, id); if (IS_ERR(cxlr)) return cxlr; @@ -1713,12 +1722,38 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd, return ERR_PTR(rc); } +static ssize_t __create_region_show(struct cxl_root_decoder *cxlrd, char *buf) +{ + return sysfs_emit(buf, "region%u\n", atomic_read(&cxlrd->region_id)); +} + static ssize_t create_pmem_region_show(struct device *dev, struct device_attribute *attr, char *buf) { - struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev); + return __create_region_show(to_cxl_root_decoder(dev), buf); +} - return sysfs_emit(buf, "region%u\n", atomic_read(&cxlrd->region_id)); +static ssize_t create_ram_region_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return __create_region_show(to_cxl_root_decoder(dev), buf); +} + +static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd, + enum cxl_decoder_mode mode, int id) +{ + int rc; + + rc = memregion_alloc(GFP_KERNEL); + if (rc < 0) + return ERR_PTR(rc); + + if (atomic_cmpxchg(&cxlrd->region_id, id, rc) != id) { + memregion_free(rc); + return ERR_PTR(-EBUSY); + } + + return devm_cxl_add_region(cxlrd, id, mode, CXL_DECODER_EXPANDER); } static ssize_t create_pmem_region_store(struct device *dev, @@ -1727,29 +1762,37 @@ static ssize_t create_pmem_region_store(struct device *dev, { struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev); struct cxl_region *cxlr; - int id, rc; + int rc, id; rc = sscanf(buf, "region%d\n", &id); if (rc != 1) return -EINVAL; - rc = memregion_alloc(GFP_KERNEL); - if (rc < 0) - return rc; + cxlr = __create_region(cxlrd, CXL_DECODER_PMEM, id); + if (IS_ERR(cxlr)) + return PTR_ERR(cxlr); + return len; +} +DEVICE_ATTR_RW(create_pmem_region); - if (atomic_cmpxchg(&cxlrd->region_id, id, rc) != id) { - memregion_free(rc); - return -EBUSY; - } +static ssize_t create_ram_region_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(dev); + struct cxl_region *cxlr; + int rc, id; - cxlr = devm_cxl_add_region(cxlrd, id, CXL_DECODER_PMEM, - CXL_DECODER_EXPANDER); + rc = sscanf(buf, "region%d\n", &id); + if (rc != 1) + return -EINVAL; + + cxlr = __create_region(cxlrd, CXL_DECODER_RAM, id); if (IS_ERR(cxlr)) return PTR_ERR(cxlr); - return len; } -DEVICE_ATTR_RW(create_pmem_region); +DEVICE_ATTR_RW(create_ram_region); static ssize_t region_show(struct device *dev, struct device_attribute *attr, char *buf) From patchwork Mon Feb 6 01:03:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129217 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E27DC636D4 for ; Mon, 6 Feb 2023 01:03:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229526AbjBFBDG (ORCPT ); Sun, 5 Feb 2023 20:03:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229537AbjBFBDE (ORCPT ); Sun, 5 Feb 2023 20:03:04 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2ABF7144B4; Sun, 5 Feb 2023 17:03:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645383; x=1707181383; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gxV5boq5z7i+vQyYxApsVbkxX+I+JHzPNbJoiDtgews=; b=XNfxh7Guj5o3I0pbrQJOsQxMwAt3pCT5WR46zYQtPuCaoX/yVq6Y6SlZ H6CT1JC/gqRaXGS2ue+gybJReMgZLe/KDnbR3rICH/CNE5s7J3a5TD15/ 9tri7gck7nODENbdi7l5O1Y8VHjsCPcQP8SOVB95IRTL1w/zhzjz/6Sh9 it/UhQt05OXiih7Nvr5TUf+QNxfAl+i5pCiFnnymFSncQROqjkMLtdiDM /I5YB6ogI+6tp8Is5moLM4h8gYn88XRdNulRgduUiu1XvtJIaEcdmgp8I Q1Yl9ORP8KCfl9DYCYR0YX1mBNynCRlVCez4DJAAuLY0VI23JtiV9yXDo w==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="312763180" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="312763180" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:02 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="616291294" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="616291294" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:02 -0800 Subject: [PATCH 06/18] cxl/region: Refactor attach_target() for autodiscovery From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:02 -0800 Message-ID: <167564538227.847146.16305045998592488364.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Region autodiscovery is the process of kernel creating 'struct cxl_region' object to represent active CXL memory ranges it finds already active in hardware when the driver loads. Typically this happens when platform firmware establishes CXL memory regions and then publishes them in the memory map. However, this can also happen in the case of kexec-reboot after the kernel has created regions. In the autodiscovery case the region creation process starts with a known endpoint decoder. Refactor attach_target() into a helper that is suitable to be called from either sysfs, for runtime region creation, or from cxl_port_probe() after it has enumerated all endpoint decoders. The cxl_port_probe() context is an async device-core probing context, so it is not appropriate to allow SIGTERM to interrupt the assembly process. Refactor attach_target() to take @cxled and @state as arguments where @state indicates whether waiting from the region rwsem is interruptible or not. No behavior change is intended. Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron Reviewed-by: Ira Weiny Reviewed-by: Dave Jiang Reviewed-by: Vishal Verma --- drivers/cxl/core/region.c | 47 +++++++++++++++++++++++++++------------------ 1 file changed, 28 insertions(+), 19 deletions(-) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 8dea49c021b8..97eafdd75675 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1418,31 +1418,25 @@ void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled) up_write(&cxl_region_rwsem); } -static int attach_target(struct cxl_region *cxlr, const char *decoder, int pos) +static int attach_target(struct cxl_region *cxlr, + struct cxl_endpoint_decoder *cxled, int pos, + unsigned int state) { - struct device *dev; - int rc; - - dev = bus_find_device_by_name(&cxl_bus_type, NULL, decoder); - if (!dev) - return -ENODEV; - - if (!is_endpoint_decoder(dev)) { - put_device(dev); - return -EINVAL; - } + int rc = 0; - rc = down_write_killable(&cxl_region_rwsem); + if (state == TASK_INTERRUPTIBLE) + rc = down_write_killable(&cxl_region_rwsem); + else + down_write(&cxl_region_rwsem); if (rc) - goto out; + return rc; + down_read(&cxl_dpa_rwsem); - rc = cxl_region_attach(cxlr, to_cxl_endpoint_decoder(dev), pos); + rc = cxl_region_attach(cxlr, cxled, pos); if (rc == 0) set_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags); up_read(&cxl_dpa_rwsem); up_write(&cxl_region_rwsem); -out: - put_device(dev); return rc; } @@ -1480,8 +1474,23 @@ static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos, if (sysfs_streq(buf, "\n")) rc = detach_target(cxlr, pos); - else - rc = attach_target(cxlr, buf, pos); + else { + struct device *dev; + + dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf); + if (!dev) + return -ENODEV; + + if (!is_endpoint_decoder(dev)) { + rc = -EINVAL; + goto out; + } + + rc = attach_target(cxlr, to_cxl_endpoint_decoder(dev), pos, + TASK_INTERRUPTIBLE); +out: + put_device(dev); + } if (rc < 0) return rc; From patchwork Mon Feb 6 01:03:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129218 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80430C636D4 for ; Mon, 6 Feb 2023 01:03:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229543AbjBFBDM (ORCPT ); Sun, 5 Feb 2023 20:03:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53244 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229517AbjBFBDJ (ORCPT ); Sun, 5 Feb 2023 20:03:09 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6EEDC126DD; Sun, 5 Feb 2023 17:03:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645388; x=1707181388; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pd/OzsallZam53iFg2v2xJwCJFvmTCwZnqxjvamzJrs=; b=V6ixechAb0aRGXZJn8LGYRHnpWFwokMukFUkZSNgO3O286d93Qz+2pRd It6QuUCPzNPz0q8zDU3uyInALTW715NE1d5UHEvYJ4dQY1uG8EQKL8aRY D2URgHZpvHKBYmAywnZ42EzXD6oR/gwarEtJqq8oqWb7/oAIZbdA2nX5m d4j+HK6AKaUT45dMwcqzy3kbiHPeNqSGj8o/nLBmmj+DPjH6Of5/UkeLC q/VWl5vpnBB9fLyF438ixhX8JGihlu+UBioMRKxbQIKGe6FBUeg8yE7+G sJQ8F6LSVC2AxjxNeqII65+G5I1vbFsw9zSKaHKOXwU8kf2HLN5riSvFO w==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="312763186" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="312763186" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:08 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="616291299" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="616291299" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:08 -0800 Subject: [PATCH 07/18] cxl/region: Move region-position validation to a helper From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:07 -0800 Message-ID: <167564538779.847146.8356062886811511706.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In preparation for region autodiscovery, that needs all devices discovered before their relative position in the region can be determined, consolidate all position dependent validation in a helper. Recall that in the on-demand region creation flow the end-user picks the position of a given endpoint decoder in a region. In the autodiscovery case the position of an endpoint decoder can only be determined after all other endpoint decoders that claim to decode the region's address range have been enumerated and attached. So, in the autodiscovery case endpoint decoders may be attached before their relative position is known. Once all decoders arrive, then positions can be determined and validated with cxl_region_validate_position() the same as user initiated on-demand creation. Signed-off-by: Dan Williams Reviewed-by: Vishal Verma --- drivers/cxl/core/region.c | 119 +++++++++++++++++++++++++++++---------------- 1 file changed, 76 insertions(+), 43 deletions(-) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 97eafdd75675..c82d3b6f3d1f 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1207,35 +1207,13 @@ static int cxl_region_setup_targets(struct cxl_region *cxlr) return 0; } -static int cxl_region_attach(struct cxl_region *cxlr, - struct cxl_endpoint_decoder *cxled, int pos) +static int cxl_region_validate_position(struct cxl_region *cxlr, + struct cxl_endpoint_decoder *cxled, + int pos) { - struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent); struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); - struct cxl_port *ep_port, *root_port, *iter; struct cxl_region_params *p = &cxlr->params; - struct cxl_dport *dport; - int i, rc = -ENXIO; - - if (cxled->mode != cxlr->mode) { - dev_dbg(&cxlr->dev, "%s region mode: %d mismatch: %d\n", - dev_name(&cxled->cxld.dev), cxlr->mode, cxled->mode); - return -EINVAL; - } - - if (cxled->mode == CXL_DECODER_DEAD) { - dev_dbg(&cxlr->dev, "%s dead\n", dev_name(&cxled->cxld.dev)); - return -ENODEV; - } - - /* all full of members, or interleave config not established? */ - if (p->state > CXL_CONFIG_INTERLEAVE_ACTIVE) { - dev_dbg(&cxlr->dev, "region already active\n"); - return -EBUSY; - } else if (p->state < CXL_CONFIG_INTERLEAVE_ACTIVE) { - dev_dbg(&cxlr->dev, "interleave config missing\n"); - return -ENXIO; - } + int i; if (pos < 0 || pos >= p->interleave_ways) { dev_dbg(&cxlr->dev, "position %d out of range %d\n", pos, @@ -1274,6 +1252,71 @@ static int cxl_region_attach(struct cxl_region *cxlr, } } + return 0; +} + +static int cxl_region_attach_position(struct cxl_region *cxlr, + struct cxl_root_decoder *cxlrd, + struct cxl_endpoint_decoder *cxled, + const struct cxl_dport *dport, int pos) +{ + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); + struct cxl_port *iter; + int rc; + + if (cxlrd->calc_hb(cxlrd, pos) != dport) { + dev_dbg(&cxlr->dev, "%s:%s invalid target position for %s\n", + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), + dev_name(&cxlrd->cxlsd.cxld.dev)); + return -ENXIO; + } + + for (iter = cxled_to_port(cxled); !is_cxl_root(iter); + iter = to_cxl_port(iter->dev.parent)) { + rc = cxl_port_attach_region(iter, cxlr, cxled, pos); + if (rc) + goto err; + } + + return 0; + +err: + for (iter = cxled_to_port(cxled); !is_cxl_root(iter); + iter = to_cxl_port(iter->dev.parent)) + cxl_port_detach_region(iter, cxlr, cxled); + return rc; +} + +static int cxl_region_attach(struct cxl_region *cxlr, + struct cxl_endpoint_decoder *cxled, int pos) +{ + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent); + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); + struct cxl_region_params *p = &cxlr->params; + struct cxl_port *ep_port, *root_port; + struct cxl_dport *dport; + int rc = -ENXIO; + + if (cxled->mode != cxlr->mode) { + dev_dbg(&cxlr->dev, "%s region mode: %d mismatch: %d\n", + dev_name(&cxled->cxld.dev), cxlr->mode, cxled->mode); + return -EINVAL; + } + + if (cxled->mode == CXL_DECODER_DEAD) { + dev_dbg(&cxlr->dev, "%s dead\n", dev_name(&cxled->cxld.dev)); + return -ENODEV; + } + + /* all full of members, or interleave config not established? */ + if (p->state > CXL_CONFIG_INTERLEAVE_ACTIVE) { + dev_dbg(&cxlr->dev, "region already active\n"); + return -EBUSY; + } else if (p->state < CXL_CONFIG_INTERLEAVE_ACTIVE) { + dev_dbg(&cxlr->dev, "interleave config missing\n"); + return -ENXIO; + } + ep_port = cxled_to_port(cxled); root_port = cxlrd_to_port(cxlrd); dport = cxl_find_dport_by_dev(root_port, ep_port->host_bridge); @@ -1284,13 +1327,6 @@ static int cxl_region_attach(struct cxl_region *cxlr, return -ENXIO; } - if (cxlrd->calc_hb(cxlrd, pos) != dport) { - dev_dbg(&cxlr->dev, "%s:%s invalid target position for %s\n", - dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), - dev_name(&cxlrd->cxlsd.cxld.dev)); - return -ENXIO; - } - if (cxled->cxld.target_type != cxlr->type) { dev_dbg(&cxlr->dev, "%s:%s type mismatch: %d vs %d\n", dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), @@ -1314,12 +1350,13 @@ static int cxl_region_attach(struct cxl_region *cxlr, return -EINVAL; } - for (iter = ep_port; !is_cxl_root(iter); - iter = to_cxl_port(iter->dev.parent)) { - rc = cxl_port_attach_region(iter, cxlr, cxled, pos); - if (rc) - goto err; - } + rc = cxl_region_validate_position(cxlr, cxled, pos); + if (rc) + return rc; + + rc = cxl_region_attach_position(cxlr, cxlrd, cxled, dport, pos); + if (rc) + return rc; p->targets[pos] = cxled; cxled->pos = pos; @@ -1343,10 +1380,6 @@ static int cxl_region_attach(struct cxl_region *cxlr, err_decrement: p->nr_targets--; -err: - for (iter = ep_port; !is_cxl_root(iter); - iter = to_cxl_port(iter->dev.parent)) - cxl_port_detach_region(iter, cxlr, cxled); return rc; } From patchwork Mon Feb 6 01:03:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129219 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CC859C636CC for ; Mon, 6 Feb 2023 01:03:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229542AbjBFBDZ (ORCPT ); Sun, 5 Feb 2023 20:03:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229536AbjBFBDY (ORCPT ); Sun, 5 Feb 2023 20:03:24 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D447144B5; Sun, 5 Feb 2023 17:03:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645402; x=1707181402; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=AeIQeRHiJX+pQkHKAuoYi54yr8qg0sN4RL762t1p8Jc=; b=bx49lvqJ4e8uv2OvNwbT3WZ5NpqFhkCUeqsUs9woWs3AymVqkit2JdnO VqJ2HOcySlldDGhb3WFd2/OI6DKAX7YilwP7ChfYS2N/jEmGbjbr8GJl4 sKEimG8+avDUIt7H5zRZfhGjgrtPqQToelaAJRk7I7gEerF2GVPWkz3rw 4s8ST54R4tMZw5L17v//qNXjzJFCSR7OkayENtmZU029jlgxUBZS1P+EZ D8Y4nCTlf8NYG7h6kd00B0iyihin9C+D8HnAFVYZZtuKbZh7rHMfD2gnn Eq11ApoGoLhjfG7enK3eFmAju5NOl3f4lwRX4/CqLUgprYXyLWg7Z8YO7 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="312763198" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="312763198" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:13 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="616291316" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="616291316" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:13 -0800 Subject: [PATCH 08/18] kernel/range: Uplevel the cxl subsystem's range_contains() helper From: Dan Williams To: linux-cxl@vger.kernel.org Cc: Kees Cook , dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:13 -0800 Message-ID: <167564539327.847146.788601375229324484.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In support of the CXL subsystem's use of 'struct range' to track decode address ranges, add a common range_contains() implementation with identical semantics as resource_contains(); The existing 'range_contains()' in lib/stackinit_kunit.c is namespaced with a 'stackinit_' prefix. Cc: Kees Cook Signed-off-by: Dan Williams Reviewed-by: Gregory Price Reviewed-by: Ira Weiny Reviewed-by: Dave Jiang Reviewed-by: Jonathan Cameron Reviewed-by: Vishal Verma --- drivers/cxl/core/pci.c | 5 ----- include/linux/range.h | 5 +++++ lib/stackinit_kunit.c | 6 +++--- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 1d1492440287..9ed2120dbf8a 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -214,11 +214,6 @@ static int devm_cxl_enable_mem(struct device *host, struct cxl_dev_state *cxlds) return devm_add_action_or_reset(host, clear_mem_enable, cxlds); } -static bool range_contains(struct range *r1, struct range *r2) -{ - return r1->start <= r2->start && r1->end >= r2->end; -} - /* require dvsec ranges to be covered by a locked platform window */ static int dvsec_range_allowed(struct device *dev, void *arg) { diff --git a/include/linux/range.h b/include/linux/range.h index 274681cc3154..7efb6a9b069b 100644 --- a/include/linux/range.h +++ b/include/linux/range.h @@ -13,6 +13,11 @@ static inline u64 range_len(const struct range *range) return range->end - range->start + 1; } +static inline bool range_contains(struct range *r1, struct range *r2) +{ + return r1->start <= r2->start && r1->end >= r2->end; +} + int add_range(struct range *range, int az, int nr_range, u64 start, u64 end); diff --git a/lib/stackinit_kunit.c b/lib/stackinit_kunit.c index 4591d6cf5e01..05947a2feb93 100644 --- a/lib/stackinit_kunit.c +++ b/lib/stackinit_kunit.c @@ -31,8 +31,8 @@ static volatile u8 forced_mask = 0xff; static void *fill_start, *target_start; static size_t fill_size, target_size; -static bool range_contains(char *haystack_start, size_t haystack_size, - char *needle_start, size_t needle_size) +static bool stackinit_range_contains(char *haystack_start, size_t haystack_size, + char *needle_start, size_t needle_size) { if (needle_start >= haystack_start && needle_start + needle_size <= haystack_start + haystack_size) @@ -175,7 +175,7 @@ static noinline void test_ ## name (struct kunit *test) \ \ /* Validate that compiler lined up fill and target. */ \ KUNIT_ASSERT_TRUE_MSG(test, \ - range_contains(fill_start, fill_size, \ + stackinit_range_contains(fill_start, fill_size, \ target_start, target_size), \ "stack fill missed target!? " \ "(fill %zu wide, target offset by %d)\n", \ From patchwork Mon Feb 6 01:03:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129220 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7EE30C63797 for ; Mon, 6 Feb 2023 01:03:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229546AbjBFBDZ (ORCPT ); Sun, 5 Feb 2023 20:03:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53312 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229517AbjBFBDY (ORCPT ); Sun, 5 Feb 2023 20:03:24 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB28A144B8; Sun, 5 Feb 2023 17:03:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645403; x=1707181403; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1wW75nbCrf3Mo9olc/rXs6RitxwA0HhPGRVwUk3MkpQ=; b=cdOXogEpJQ4b9DkUVRe1uYinj//tFtgvOE5QcSTD1vu/IVKOXL+gzj9/ CvyzAiFU73EqSyw8XwG33BVTDiNtb4QUJ5HPSIVnmRrOPFp+6kLOJ4pAj y1acNyf7qWYTzzURDoI6Fjia0qRMZUUSSZ0Xad01zkUj6Z8/if/nmhSW9 J3XOQQCv92VXD0RPjsx4fPGu6dpXeERvKZI3Gl/GHlDKnRqsNO/FUKxJ5 t0O9lJJw8/4s9g7UeIBRX+MAf7ixNUukW6HP72hjH35VKGJP76zJxRfjj Y1GcvQXGi43QvKVevNqGadQ7XJWW14tfgYfToFLCq8sw5K8mBJYMJudVj Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="312763209" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="312763209" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:19 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="616291323" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="616291323" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:19 -0800 Subject: [PATCH 09/18] cxl/region: Enable CONFIG_CXL_REGION to be toggled From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:18 -0800 Message-ID: <167564539875.847146.16213498614174558767.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Add help text and a label so the CXL_REGION config option can be toggled. This is mainly to enable compile testing without region support. Signed-off-by: Dan Williams Reviewed-by: Gregory Price Reviewed-by: Jonathan Cameron Reviewed-by: Vishal Verma --- drivers/cxl/Kconfig | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 0ac53c422c31..163c094e67ae 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -104,12 +104,22 @@ config CXL_SUSPEND depends on SUSPEND && CXL_MEM config CXL_REGION - bool + bool "CXL: Region Support" default CXL_BUS # For MAX_PHYSMEM_BITS depends on SPARSEMEM select MEMREGION select GET_FREE_REGION + help + Enable the CXL core to enumerate and provision CXL regions. A CXL + region is defined by one or more CXL expanders that decode a given + system-physical address range. For CXL regions established by + platform-firmware this option enables memory error handling to + identify the devices participating in a given interleaved memory + range. Otherwise, platform-firmware managed CXL is enabled by being + placed in the system address map and does not need a driver. + + If unsure say 'y' config CXL_REGION_INVALIDATION_TEST bool "CXL: Region Cache Management Bypass (TEST)" From patchwork Mon Feb 6 01:03:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129221 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E215C64EC3 for ; Mon, 6 Feb 2023 01:03:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229559AbjBFBD0 (ORCPT ); Sun, 5 Feb 2023 20:03:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229552AbjBFBDZ (ORCPT ); Sun, 5 Feb 2023 20:03:25 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 108C5D537; Sun, 5 Feb 2023 17:03:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645405; x=1707181405; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p3cJbdfBQzbXgLwCpMurOEl3oekEj08Qxoo3cX7rCec=; b=X5XV5SF7dVYfv53QwVVGbEhdhn4OT71WXRkFOX8nhVyBJ2+Ja407c0Bs TFKgzn6X1Aiw0yC4T3+vxvVIYnymDbCjHNVXEaCuiLZQTseU+f7nx0+wr V8G0r1jiorHVAWHcEeVVvAPXjmN/nafADqKP+Rbqp6EVY5hj1FciAP1Z7 ViP6u/gWCxZuNH8IdaOxDR0NPI/lOOuQoQ5Hn0w8vtQ8LwTzf+40w3wZv Dn+an039/64w6HArMtcFMxi5ifEC9lQiw16tG1/7gX4Yu2o+6yBuJTKGb lP0ZIhtaLU7jfCBAzDGK1bHjtEAu9U7MXOSQa6/ZjMziR+i73UwfnDBHz g==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="312763215" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="312763215" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:24 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="616291328" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="616291328" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:24 -0800 Subject: [PATCH 10/18] cxl/region: Fix passthrough-decoder detection From: Dan Williams To: linux-cxl@vger.kernel.org Cc: stable@vger.kernel.org, dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:24 -0800 Message-ID: <167564540422.847146.13816934143225777888.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org A passthrough decoder is a decoder that maps only 1 target. It is a special case because it does not impose any constraints on the interleave-math as compared to a decoder with multiple targets. Extend the passthrough case to multi-target-capable decoders that only have one target selected. I.e. the current code was only considering passthrough *ports* which are only a subset of the potential passthrough decoder scenarios. Fixes: e4f6dfa9ef75 ("cxl/region: Fix 'distance' calculation with passthrough ports") Cc: Signed-off-by: Dan Williams Reviewed-by: Dave Jiang Reviewed-by: Vishal Verma --- drivers/cxl/core/region.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index c82d3b6f3d1f..34cf95217901 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1019,10 +1019,10 @@ static int cxl_port_setup_targets(struct cxl_port *port, int i, distance; /* - * Passthrough ports impose no distance requirements between + * Passthrough decoders impose no distance requirements between * peers */ - if (port->nr_dports == 1) + if (cxl_rr->nr_targets == 1) distance = 0; else distance = p->nr_targets / cxl_rr->nr_targets; From patchwork Mon Feb 6 01:03:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129222 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2BBCC636CD for ; Mon, 6 Feb 2023 01:03:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229517AbjBFBDd (ORCPT ); Sun, 5 Feb 2023 20:03:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229519AbjBFBDc (ORCPT ); Sun, 5 Feb 2023 20:03:32 -0500 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F12ED537; Sun, 5 Feb 2023 17:03:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645410; x=1707181410; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=npXpg/Au3yCYgiHvmbPs27wvuVW7voTuMbho/HBxtfI=; b=QstRgqCj2DZtwt7cG7/QpzK/nqQGPVp1mrYfyN50b40VtSXG2pgOOooH MlgvFaum29lZ1DXcFXpBT6UdN6qbArnBdKBdsn68sowelv8KY08PdX2a9 Ev6+6SPcfxqKsvyDhkIBPA1f5Px5Ulda5o+L6ISeCZ5dRieblkKhTOgLP nEWCBwChKG1gHdKb+6JeUFUQLJgYWhIOb4EnqmuGto/OGTB0g7ldhA8mp MGggDWFD88lfu4ojnq3KdFt45fHSCTR+BGQ1jPYO+TIwYcAB3hPX1jIAk ulo2qtyOC38l/YSXw24k/iop1uuruRd33PTNv4PIupA1U+J+13tzGjrRt g==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="312763222" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="312763222" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:30 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="616291339" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="616291339" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:30 -0800 Subject: [PATCH 11/18] cxl/region: Add region autodiscovery From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:29 -0800 Message-ID: <167564540972.847146.17096178433176097831.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Region autodiscovery is an asynchrounous state machine advanced by cxl_port_probe(). After the decoders on an endpoint port are enumerated they are scanned for actively enabled instances. Each active decoder is flagged for auto-assembly CXL_DECODER_F_AUTO and attached to a region. If a region does not already exist for the address range setting of the decoder one is created. That creation process may race with other decoders of the same region being discovered since cxl_port_probe() is asynchronous. A new 'struct cxl_root_decoder' lock, @range_lock, is introduced to mitigate that race. Once all decoders have arrived, "p->nr_targets == p->interleave_ways", they are sorted by their relative decode position. The sort algorithm involves finding the point in the cxl_port topology where one leg of the decode leads to deviceA and the other deviceB. At that point in the topology the target order in the 'struct cxl_switch_decoder' indicates the relative position of those endpoint decoders in the region. >From that point the region goes through the same setup and validation steps as user-created regions, but instead of programming the decoders it validates that driver would have written the same values to the decoders as were already present. Signed-off-by: Dan Williams --- drivers/cxl/core/hdm.c | 5 drivers/cxl/core/port.c | 2 drivers/cxl/core/region.c | 496 ++++++++++++++++++++++++++++++++++++++++++++- drivers/cxl/cxl.h | 16 + drivers/cxl/port.c | 26 ++ 5 files changed, 531 insertions(+), 14 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index dcc16d7cb8f3..174cddfec6e8 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -674,7 +674,7 @@ static int cxl_decoder_reset(struct cxl_decoder *cxld) up_read(&cxl_dpa_rwsem); port->commit_end--; - cxld->flags &= ~CXL_DECODER_F_ENABLE; + cxld->flags &= ~(CXL_DECODER_F_ENABLE | CXL_DECODER_F_AUTO); return 0; } @@ -719,7 +719,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, /* decoders are enabled if committed */ if (committed) { - cxld->flags |= CXL_DECODER_F_ENABLE; + cxld->flags |= CXL_DECODER_F_ENABLE | CXL_DECODER_F_AUTO; if (ctrl & CXL_HDM_DECODER0_CTRL_LOCK) cxld->flags |= CXL_DECODER_F_LOCK; if (FIELD_GET(CXL_HDM_DECODER0_CTRL_TYPE, ctrl)) @@ -783,6 +783,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, return rc; } *dpa_base += dpa_size + skip; + return 0; } diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 47e450c3a5a9..8130430ffbcf 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -446,6 +446,7 @@ bool is_endpoint_decoder(struct device *dev) { return dev->type == &cxl_decoder_endpoint_type; } +EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, CXL); bool is_root_decoder(struct device *dev) { @@ -1622,6 +1623,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port, } cxlrd->calc_hb = calc_hb; + mutex_init(&cxlrd->range_lock); cxld = &cxlsd->cxld; cxld->dev.type = &cxl_decoder_root_type; diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 34cf95217901..6fe8c70790df 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -520,7 +521,12 @@ static void cxl_region_iomem_release(struct cxl_region *cxlr) if (device_is_registered(&cxlr->dev)) lockdep_assert_held_write(&cxl_region_rwsem); if (p->res) { - remove_resource(p->res); + /* + * Autodiscovered regions may not have been able to insert their + * resource. + */ + if (p->res->parent) + remove_resource(p->res); kfree(p->res); p->res = NULL; } @@ -1101,12 +1107,35 @@ static int cxl_port_setup_targets(struct cxl_port *port, return rc; } - cxld->interleave_ways = iw; - cxld->interleave_granularity = ig; - cxld->hpa_range = (struct range) { - .start = p->res->start, - .end = p->res->end, - }; + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) { + if (cxld->interleave_ways != iw || + cxld->interleave_granularity != ig || + cxld->hpa_range.start != p->res->start || + cxld->hpa_range.end != p->res->end || + ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) { + dev_err(&cxlr->dev, + "%s:%s %s expected iw: %d ig: %d %pr\n", + dev_name(port->uport), dev_name(&port->dev), + __func__, iw, ig, p->res); + dev_err(&cxlr->dev, + "%s:%s %s got iw: %d ig: %d state: %s %#llx:%#llx\n", + dev_name(port->uport), dev_name(&port->dev), + __func__, cxld->interleave_ways, + cxld->interleave_granularity, + (cxld->flags & CXL_DECODER_F_ENABLE) ? + "enabled" : + "disabled", + cxld->hpa_range.start, cxld->hpa_range.end); + return -ENXIO; + } + } else { + cxld->interleave_ways = iw; + cxld->interleave_granularity = ig; + cxld->hpa_range = (struct range) { + .start = p->res->start, + .end = p->res->end, + }; + } dev_dbg(&cxlr->dev, "%s:%s iw: %d ig: %d\n", dev_name(port->uport), dev_name(&port->dev), iw, ig); add_target: @@ -1117,7 +1146,17 @@ static int cxl_port_setup_targets(struct cxl_port *port, dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), pos); return -ENXIO; } - cxlsd->target[cxl_rr->nr_targets_set] = ep->dport; + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) { + if (cxlsd->target[cxl_rr->nr_targets_set] != ep->dport) { + dev_dbg(&cxlr->dev, "%s:%s: %s expected %s at %d\n", + dev_name(port->uport), dev_name(&port->dev), + dev_name(&cxlsd->cxld.dev), + dev_name(ep->dport->dport), + cxl_rr->nr_targets_set); + return -ENXIO; + } + } else + cxlsd->target[cxl_rr->nr_targets_set] = ep->dport; inc = 1; out_target_set: cxl_rr->nr_targets_set += inc; @@ -1159,6 +1198,13 @@ static void cxl_region_teardown_targets(struct cxl_region *cxlr) struct cxl_ep *ep; int i; + /* + * In the auto-discovery case skip automatic teardown since the + * address space is already active + */ + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) + return; + for (i = 0; i < p->nr_targets; i++) { cxled = p->targets[i]; cxlmd = cxled_to_memdev(cxled); @@ -1191,8 +1237,8 @@ static int cxl_region_setup_targets(struct cxl_region *cxlr) iter = to_cxl_port(iter->dev.parent); /* - * Descend the topology tree programming targets while - * looking for conflicts. + * Descend the topology tree programming / validating + * targets while looking for conflicts. */ for (ep = cxl_ep_load(iter, cxlmd); iter; iter = ep->next, ep = cxl_ep_load(iter, cxlmd)) { @@ -1287,6 +1333,182 @@ static int cxl_region_attach_position(struct cxl_region *cxlr, return rc; } +static int cxl_region_attach_auto(struct cxl_region *cxlr, + struct cxl_endpoint_decoder *cxled, int pos) +{ + struct cxl_region_params *p = &cxlr->params; + + if (!(cxled->cxld.flags & CXL_DECODER_F_AUTO)) { + dev_err(&cxlr->dev, + "%s: unable to add decoder to autodetected region\n", + dev_name(&cxled->cxld.dev)); + return -EINVAL; + } + + if (pos >= 0) { + dev_dbg(&cxlr->dev, "%s: expected auto position, not %d\n", + dev_name(&cxled->cxld.dev), pos); + return -EINVAL; + } + + if (p->nr_targets >= p->interleave_ways) { + dev_err(&cxlr->dev, "%s: no more target slots available\n", + dev_name(&cxled->cxld.dev)); + return -ENXIO; + } + + /* + * Temporarily record the endpoint decoder into the target array. Yes, + * this means that userspace can view devices in the wrong position + * before the region activates, and must be careful to understand when + * it might be racing region autodiscovery. + */ + pos = p->nr_targets; + p->targets[pos] = cxled; + cxled->pos = pos; + p->nr_targets++; + + return 0; +} + +static struct cxl_port *next_port(struct cxl_port *port) +{ + if (!port->parent_dport) + return NULL; + return port->parent_dport->port; +} + +static int decoder_match_range(struct device *dev, void *data) +{ + struct cxl_endpoint_decoder *cxled = data; + struct cxl_switch_decoder *cxlsd; + + if (!is_switch_decoder(dev)) + return 0; + + cxlsd = to_cxl_switch_decoder(dev); + return range_contains(&cxlsd->cxld.hpa_range, &cxled->cxld.hpa_range); +} + +static void find_positions(const struct cxl_switch_decoder *cxlsd, + const struct cxl_port *iter_a, + const struct cxl_port *iter_b, int *a_pos, + int *b_pos) +{ + int i; + + for (i = 0, *a_pos = -1, *b_pos = -1; i < cxlsd->nr_targets; i++) { + if (cxlsd->target[i] == iter_a->parent_dport) + *a_pos = i; + else if (cxlsd->target[i] == iter_b->parent_dport) + *b_pos = i; + if (*a_pos >= 0 && *b_pos >= 0) + break; + } +} + +static int cmp_decode_pos(const void *a, const void *b) +{ + struct cxl_endpoint_decoder *cxled_a = *(typeof(cxled_a) *)a; + struct cxl_endpoint_decoder *cxled_b = *(typeof(cxled_b) *)b; + struct cxl_memdev *cxlmd_a = cxled_to_memdev(cxled_a); + struct cxl_memdev *cxlmd_b = cxled_to_memdev(cxled_b); + struct cxl_port *port_a = cxled_to_port(cxled_a); + struct cxl_port *port_b = cxled_to_port(cxled_b); + struct cxl_port *iter_a, *iter_b; + + /* Exit early if any prior sorting failed */ + if (cxled_a->pos < 0 || cxled_b->pos < 0) + return 0; + + /* + * Walk up the hierarchy to find a shared port, find the decoder + * that maps the range, compare the relative position of those + * dport mappings. + */ + for (iter_a = port_a; iter_a; iter_a = next_port(iter_a)) { + struct cxl_port *next_a, *next_b, *port; + struct cxl_switch_decoder *cxlsd; + + next_a = next_port(iter_a); + for (iter_b = port_b; iter_b; iter_b = next_port(iter_b)) { + int a_pos, b_pos, result; + struct device *dev; + unsigned int seq; + + next_b = next_port(iter_b); + if (next_a != next_b) + continue; + if (!next_a) + goto out; + port = next_a; + dev = device_find_child(&port->dev, cxled_a, + decoder_match_range); + if (!dev) { + struct range *range = &cxled_a->cxld.hpa_range; + + dev_err(port->uport, + "failed to find decoder that maps %#llx-:%#llx\n", + range->start, range->end); + cxled_a->pos = -1; + return 0; + } + + cxlsd = to_cxl_switch_decoder(dev); + do { + seq = read_seqbegin(&cxlsd->target_lock); + find_positions(cxlsd, iter_a, iter_b, &a_pos, + &b_pos); + } while (read_seqretry(&cxlsd->target_lock, seq)); + + if (a_pos < 0 || b_pos < 0) { + dev_err(port->uport, + "failed to find shared decoder for %s and %s\n", + dev_name(cxlmd_a->dev.parent), + dev_name(cxlmd_b->dev.parent)); + cxled_a->pos = -1; + result = 0; + } else { + result = a_pos - b_pos; + dev_dbg(port->uport, "%s: %s comes %s %s\n", + dev_name(&cxlsd->cxld.dev), + dev_name(cxlmd_a->dev.parent), + result < 0 ? "before" : "after", + dev_name(cxlmd_b->dev.parent)); + } + + put_device(dev); + + return result; + } + } +out: + dev_err(cxlmd_a->dev.parent, "failed to find shared port with %s\n", + dev_name(cxlmd_b->dev.parent)); + cxled_a->pos = -1; + return 0; +} + +static int cxl_region_sort_targets(struct cxl_region *cxlr) +{ + struct cxl_region_params *p = &cxlr->params; + int i, rc = 0; + + sort(p->targets, p->nr_targets, sizeof(p->targets[0]), cmp_decode_pos, + NULL); + + for (i = 0; i < p->nr_targets; i++) { + struct cxl_endpoint_decoder *cxled = p->targets[i]; + + if (cxled->pos < 0) + rc = -ENXIO; + cxled->pos = i; + } + + dev_dbg(&cxlr->dev, "region sort %s\n", rc ? "failed" : "successful"); + return rc; +} + static int cxl_region_attach(struct cxl_region *cxlr, struct cxl_endpoint_decoder *cxled, int pos) { @@ -1350,6 +1572,50 @@ static int cxl_region_attach(struct cxl_region *cxlr, return -EINVAL; } + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) { + int i; + + rc = cxl_region_attach_auto(cxlr, cxled, pos); + if (rc) + return rc; + + /* await more targets to arrive... */ + if (p->nr_targets < p->interleave_ways) + return 0; + + /* + * All targets are here, which implies all PCI enumeration that + * affects this region has been completed. Walk the topology to + * sort the devices into their relative region decode position. + */ + rc = cxl_region_sort_targets(cxlr); + if (rc) + return rc; + + for (i = 0; i < p->nr_targets; i++) { + cxled = p->targets[i]; + ep_port = cxled_to_port(cxled); + dport = cxl_find_dport_by_dev(root_port, + ep_port->host_bridge); + rc = cxl_region_attach_position(cxlr, cxlrd, cxled, + dport, i); + if (rc) + return rc; + } + + rc = cxl_region_setup_targets(cxlr); + if (rc) + return rc; + + /* + * If target setup succeeds in the autodiscovery case + * then the region is already committed. + */ + p->state = CXL_CONFIG_COMMIT; + + return 0; + } + rc = cxl_region_validate_position(cxlr, cxled, pos); if (rc) return rc; @@ -1500,8 +1766,8 @@ static int detach_target(struct cxl_region *cxlr, int pos) return rc; } -static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos, - size_t len) +static ssize_t store_targetN(struct cxl_region *cxlr, const char *buf, int pos, + size_t len) { int rc; @@ -2079,6 +2345,191 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr) return rc; } +static int match_decoder_by_range(struct device *dev, void *data) +{ + struct range *r1, *r2 = data; + struct cxl_root_decoder *cxlrd; + + if (!is_root_decoder(dev)) + return 0; + + cxlrd = to_cxl_root_decoder(dev); + r1 = &cxlrd->cxlsd.cxld.hpa_range; + return range_contains(r1, r2); +} + +static int match_region_by_range(struct device *dev, void *data) +{ + struct cxl_region_params *p; + struct cxl_region *cxlr; + struct range *r = data; + int rc = 0; + + if (!is_cxl_region(dev)) + return 0; + + cxlr = to_cxl_region(dev); + p = &cxlr->params; + + down_read(&cxl_region_rwsem); + if (p->res && p->res->start == r->start && p->res->end == r->end) + rc = 1; + up_read(&cxl_region_rwsem); + + return rc; +} + +/* Establish an empty region covering the given HPA range */ +static struct cxl_region *construct_region(struct cxl_root_decoder *cxlrd, + struct cxl_endpoint_decoder *cxled) +{ + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); + struct cxl_port *port = cxlrd_to_port(cxlrd); + struct range *hpa = &cxled->cxld.hpa_range; + struct cxl_region_params *p; + struct cxl_region *cxlr; + struct resource *res; + int rc; + + do { + cxlr = __create_region(cxlrd, cxled->mode, + atomic_read(&cxlrd->region_id)); + } while (IS_ERR(cxlr) && PTR_ERR(cxlr) == -EBUSY); + + if (IS_ERR(cxlr)) { + dev_err(cxlmd->dev.parent, + "%s:%s: %s failed assign region: %ld\n", + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), + __func__, PTR_ERR(cxlr)); + return cxlr; + } + + down_write(&cxl_region_rwsem); + p = &cxlr->params; + if (p->state >= CXL_CONFIG_INTERLEAVE_ACTIVE) { + dev_err(cxlmd->dev.parent, + "%s:%s: %s autodiscovery interrupted\n", + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), + __func__); + rc = -EBUSY; + goto err; + } + + set_bit(CXL_REGION_F_AUTO, &cxlr->flags); + + res = kmalloc(sizeof(*res), GFP_KERNEL); + if (!res) { + rc = -ENOMEM; + goto err; + } + + *res = DEFINE_RES_MEM_NAMED(hpa->start, range_len(hpa), + dev_name(&cxlr->dev)); + rc = insert_resource(cxlrd->res, res); + if (rc) { + /* + * Platform-firmware may not have split resources like "System + * RAM" on CXL window boundaries see cxl_region_iomem_release() + */ + dev_warn(cxlmd->dev.parent, + "%s:%s: %s %s cannot insert resource\n", + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), + __func__, dev_name(&cxlr->dev)); + } + + p->res = res; + p->interleave_ways = cxled->cxld.interleave_ways; + p->interleave_granularity = cxled->cxld.interleave_granularity; + p->state = CXL_CONFIG_INTERLEAVE_ACTIVE; + + rc = sysfs_update_group(&cxlr->dev.kobj, get_cxl_region_target_group()); + if (rc) + goto err; + + dev_dbg(cxlmd->dev.parent, "%s:%s: %s %s res: %pr iw: %d ig: %d\n", + dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev), __func__, + dev_name(&cxlr->dev), p->res, p->interleave_ways, + p->interleave_granularity); + + /* ...to match put_device() in cxl_add_to_region() */ + get_device(&cxlr->dev); + up_write(&cxl_region_rwsem); + + return cxlr; + +err: + up_write(&cxl_region_rwsem); + devm_release_action(port->uport, unregister_region, cxlr); + return ERR_PTR(rc); +} + +void cxl_add_to_region(struct cxl_endpoint_decoder *cxled) +{ + struct cxl_memdev *cxlmd = cxled_to_memdev(cxled); + struct range *hpa = &cxled->cxld.hpa_range; + struct cxl_decoder *cxld = &cxled->cxld; + struct cxl_root_decoder *cxlrd; + struct cxl_region_params *p; + struct cxl_region *cxlr; + struct cxl_port *root; + bool attach = false; + struct device *dev; + + root = find_cxl_root(&cxlmd->dev); + if (!root) { + dev_err(cxlmd->dev.parent, "%s: failed to map CXL root\n", + dev_name(&cxlmd->dev)); + return; + } + + dev = device_find_child(&root->dev, &cxld->hpa_range, + match_decoder_by_range); + if (!dev) { + dev_err(cxlmd->dev.parent, + "%s:%s no CXL window for range %#llx:%#llx\n", + dev_name(&cxlmd->dev), dev_name(&cxld->dev), + cxld->hpa_range.start, cxld->hpa_range.end); + goto out; + } + cxlrd = to_cxl_root_decoder(dev); + + mutex_lock(&cxlrd->range_lock); + dev = device_find_child(&cxlrd->cxlsd.cxld.dev, hpa, + match_region_by_range); + if (!dev) + cxlr = construct_region(cxlrd, cxled); + else + cxlr = to_cxl_region(dev); + mutex_unlock(&cxlrd->range_lock); + + if (IS_ERR(cxlr)) + goto out; + + attach_target(cxlr, cxled, -1, TASK_UNINTERRUPTIBLE); + + down_read(&cxl_region_rwsem); + p = &cxlr->params; + attach = p->state >= CXL_CONFIG_COMMIT; + up_read(&cxl_region_rwsem); + + if (attach) { + int rc = device_attach(&cxlr->dev); + + /* + * If device_attach() fails the range may still be + * active via the platform-firmware memory map + */ + if (rc < 0) + dev_err(&cxlr->dev, "failed to enable, range: %pr\n", + p->res); + } + + put_device(&cxlr->dev); +out: + put_device(&root->dev); +} +EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, CXL); + static int cxl_region_invalidate_memregion(struct cxl_region *cxlr) { if (!test_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags)) @@ -2103,6 +2554,15 @@ static int cxl_region_invalidate_memregion(struct cxl_region *cxlr) return 0; } +static int is_system_ram(struct resource *res, void *arg) +{ + struct cxl_region *cxlr = arg; + struct cxl_region_params *p = &cxlr->params; + + dev_dbg(&cxlr->dev, "%pr has System RAM: %pr\n", p->res, res); + return 1; +} + static int cxl_region_probe(struct device *dev) { struct cxl_region *cxlr = to_cxl_region(dev); @@ -2136,6 +2596,18 @@ static int cxl_region_probe(struct device *dev) switch (cxlr->mode) { case CXL_DECODER_PMEM: return devm_cxl_add_pmem_region(cxlr); + case CXL_DECODER_RAM: + /* + * The region can not be manged by CXL if any portion of + * it is already online as 'System RAM' + */ + if (walk_iomem_res_desc(IORES_DESC_NONE, + IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, + p->res->start, p->res->end, cxlr, + is_system_ram) > 0) + return 0; + dev_dbg(dev, "TODO: hookup devdax\n"); + return 0; default: dev_dbg(&cxlr->dev, "unsupported region mode: %d\n", cxlr->mode); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index ca76879af1de..9b3765c5c81a 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -261,6 +261,8 @@ resource_size_t cxl_rcrb_to_component(struct device *dev, * cxl_decoder flags that define the type of memory / devices this * decoder supports as well as configuration lock status See "CXL 2.0 * 8.2.5.12.7 CXL HDM Decoder 0 Control Register" for details. + * Additionally indicate whether decoder settings were autodetected, + * user customized. */ #define CXL_DECODER_F_RAM BIT(0) #define CXL_DECODER_F_PMEM BIT(1) @@ -268,6 +270,7 @@ resource_size_t cxl_rcrb_to_component(struct device *dev, #define CXL_DECODER_F_TYPE3 BIT(3) #define CXL_DECODER_F_LOCK BIT(4) #define CXL_DECODER_F_ENABLE BIT(5) +#define CXL_DECODER_F_AUTO BIT(6) #define CXL_DECODER_F_MASK GENMASK(5, 0) enum cxl_decoder_type { @@ -380,6 +383,7 @@ typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd, * @region_id: region id for next region provisioning event * @calc_hb: which host bridge covers the n'th position by granularity * @platform_data: platform specific configuration data + * @range_lock: sync region autodiscovery by address range * @cxlsd: base cxl switch decoder */ struct cxl_root_decoder { @@ -387,6 +391,7 @@ struct cxl_root_decoder { atomic_t region_id; cxl_calc_hb_fn calc_hb; void *platform_data; + struct mutex range_lock; struct cxl_switch_decoder cxlsd; }; @@ -436,6 +441,13 @@ struct cxl_region_params { */ #define CXL_REGION_F_INCOHERENT 0 +/* + * Indicate whether this region has been assembled by autodetection or + * userspace assembly. Prevent endpoint decoders outside of automatic + * detection from being added to the region. + */ +#define CXL_REGION_F_AUTO 1 + /** * struct cxl_region - CXL region * @dev: This region's device @@ -699,6 +711,7 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct device *dev); #ifdef CONFIG_CXL_REGION bool is_cxl_pmem_region(struct device *dev); struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev); +void cxl_add_to_region(struct cxl_endpoint_decoder *cxled); #else static inline bool is_cxl_pmem_region(struct device *dev) { @@ -708,6 +721,9 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev) { return NULL; } +static inline void cxl_add_to_region(struct cxl_endpoint_decoder *cxled) +{ +} #endif /* diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c index 5453771bf330..012a0c6f8476 100644 --- a/drivers/cxl/port.c +++ b/drivers/cxl/port.c @@ -30,6 +30,23 @@ static void schedule_detach(void *cxlmd) schedule_cxl_memdev_detach(cxlmd); } +static int discover_region(struct device *dev, void *data) +{ + const unsigned long flags = CXL_DECODER_F_ENABLE | CXL_DECODER_F_AUTO; + struct cxl_endpoint_decoder *cxled; + + if (!is_endpoint_decoder(dev)) + return 0; + + cxled = to_cxl_endpoint_decoder(dev); + if ((cxled->cxld.flags & flags) != flags) + return 0; + + cxl_add_to_region(cxled); + + return 0; +} + static int cxl_port_probe(struct device *dev) { struct cxl_port *port = to_cxl_port(dev); @@ -78,6 +95,15 @@ static int cxl_port_probe(struct device *dev) return rc; } + if (!is_cxl_endpoint(port)) + return 0; + + /* + * Now that all endpoint decoders are successfully enumerated, + * try to assemble regions from committed decoders + */ + device_for_each_child(dev, NULL, discover_region); + return 0; } From patchwork Mon Feb 6 01:03:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129223 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18D2CC636CC for ; Mon, 6 Feb 2023 01:03:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229493AbjBFBDj (ORCPT ); Sun, 5 Feb 2023 20:03:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53396 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229565AbjBFBDh (ORCPT ); Sun, 5 Feb 2023 20:03:37 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7EDC614EA0; Sun, 5 Feb 2023 17:03:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645416; x=1707181416; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=t66eRnuPtljjvIsyyQm2MInJpJjgFLhSqg1tPwrvkw4=; b=PAO+1yeOOtU3dk87SargORah/X+hU2QTiF7A3L5MgaPgz0Xki/8JmMZz k0PovZ738Fu3vgkyMY9qPUNwb2aakCcJLHo9/wXzxZJWqhRr0zSnjZlI8 StnTb1fNDIhYtVssoxnTRdlAJfapSPf6gEi9m90lr9LVwk5gWVF/rTTr4 Nkx7l23Hcwlwdy6tBqloFfC2++TtTfgGM1kq2Y4wz7eglqHZRXhHI49VG t0xOedRDZ0TPQof1k8tdMZcf1olkDK5xurNvaqbOS5O/XZ56ka307UtdU rd9y0k3pYyrca38A8g51SjOFBk2vYOPjVG9U6vzBR5Iko3A6l3fAG4Bi/ A==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="309442656" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="309442656" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:36 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="775006645" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="775006645" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:35 -0800 Subject: [PATCH 12/18] tools/testing/cxl: Define a fixed volatile configuration to parse From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:35 -0800 Message-ID: <167564541523.847146.12199636368812381475.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Take two endpoints attached to the first switch on the first host-bridge in the cxl_test topology and define a pre-initialized region. This is a x2 interleave underneath a x1 CXL Window. $ modprobe cxl_test $ # cxl list -Ru { "region":"region3", "resource":"0xf010000000", "size":"512.00 MiB (536.87 MB)", "interleave_ways":2, "interleave_granularity":4096, "decode_state":"commit" } Signed-off-by: Dan Williams --- drivers/cxl/core/core.h | 3 - drivers/cxl/core/hdm.c | 3 + drivers/cxl/core/port.c | 4 + drivers/cxl/cxl.h | 4 + drivers/cxl/cxlmem.h | 3 + tools/testing/cxl/test/cxl.c | 146 +++++++++++++++++++++++++++++++++++++++--- 6 files changed, 148 insertions(+), 15 deletions(-) diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 5eb873da5a30..479f01da6d35 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -57,9 +57,6 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled); resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled); extern struct rw_semaphore cxl_dpa_rwsem; -bool is_switch_decoder(struct device *dev); -struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev); - int cxl_memdev_init(void); void cxl_memdev_exit(void); void cxl_mbox_init(void); diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 174cddfec6e8..28fa4d835259 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -279,7 +279,7 @@ static int __cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, return 0; } -static int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, +int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, resource_size_t base, resource_size_t len, resource_size_t skipped) { @@ -295,6 +295,7 @@ static int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, return devm_add_action_or_reset(&port->dev, cxl_dpa_release, cxled); } +EXPORT_SYMBOL_NS_GPL(devm_cxl_dpa_reserve, CXL); resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled) { diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 8130430ffbcf..8d0895cbae93 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -458,6 +458,7 @@ bool is_switch_decoder(struct device *dev) { return is_root_decoder(dev) || dev->type == &cxl_decoder_switch_type; } +EXPORT_SYMBOL_NS_GPL(is_switch_decoder, CXL); struct cxl_decoder *to_cxl_decoder(struct device *dev) { @@ -485,6 +486,7 @@ struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev) return NULL; return container_of(dev, struct cxl_switch_decoder, cxld.dev); } +EXPORT_SYMBOL_NS_GPL(to_cxl_switch_decoder, CXL); static void cxl_ep_release(struct cxl_ep *ep) { @@ -528,7 +530,7 @@ static const struct device_type cxl_port_type = { bool is_cxl_port(struct device *dev) { - return dev->type == &cxl_port_type; + return dev && dev->type == &cxl_port_type; } EXPORT_SYMBOL_NS_GPL(is_cxl_port, CXL); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 9b3765c5c81a..4c6ee6c96f23 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -452,6 +452,7 @@ struct cxl_region_params { * struct cxl_region - CXL region * @dev: This region's device * @id: This region's id. Id is globally unique across all regions + * @fixed: At least one decoder in this region was locked down at init * @mode: Endpoint decoder allocation / access mode * @type: Endpoint decoder target type * @cxl_nvb: nvdimm bridge for coordinating @cxlr_pmem setup / shutdown @@ -462,6 +463,7 @@ struct cxl_region_params { struct cxl_region { struct device dev; int id; + bool fixed; enum cxl_decoder_mode mode; enum cxl_decoder_type type; struct cxl_nvdimm_bridge *cxl_nvb; @@ -643,8 +645,10 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port, struct cxl_decoder *to_cxl_decoder(struct device *dev); struct cxl_root_decoder *to_cxl_root_decoder(struct device *dev); +struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev); struct cxl_endpoint_decoder *to_cxl_endpoint_decoder(struct device *dev); bool is_root_decoder(struct device *dev); +bool is_switch_decoder(struct device *dev); bool is_endpoint_decoder(struct device *dev); struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port, unsigned int nr_targets, diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index ab138004f644..82a430cfc3ff 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -79,6 +79,9 @@ static inline bool is_cxl_endpoint(struct cxl_port *port) } struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds); +int devm_cxl_dpa_reserve(struct cxl_endpoint_decoder *cxled, + resource_size_t base, resource_size_t len, + resource_size_t skipped); static inline struct cxl_ep *cxl_ep_load(struct cxl_port *port, struct cxl_memdev *cxlmd) diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index 920bd969c554..ea761afa446f 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -703,6 +703,141 @@ static int mock_decoder_reset(struct cxl_decoder *cxld) return 0; } +static void default_mock_decoder(struct cxl_decoder *cxld) +{ + cxld->hpa_range = (struct range){ + .start = 0, + .end = -1, + }; + + cxld->interleave_ways = 1; + cxld->interleave_granularity = 256; + cxld->target_type = CXL_DECODER_EXPANDER; + cxld->commit = mock_decoder_commit; + cxld->reset = mock_decoder_reset; +} + +static int first_decoder(struct device *dev, void *data) +{ + struct cxl_decoder *cxld; + + if (!is_switch_decoder(dev)) + return 0; + cxld = to_cxl_decoder(dev); + if (cxld->id == 0) + return 1; + return 0; +} + +static void mock_init_hdm_decoder(struct cxl_decoder *cxld) +{ + struct acpi_cedt_cfmws *window = mock_cfmws[0]; + struct platform_device *pdev = NULL; + struct cxl_endpoint_decoder *cxled; + struct cxl_switch_decoder *cxlsd; + struct cxl_port *port, *iter; + const int size = SZ_512M; + struct cxl_memdev *cxlmd; + struct cxl_dport *dport; + struct device *dev; + bool hb0 = false; + u64 base; + int i; + + if (is_endpoint_decoder(&cxld->dev)) { + cxled = to_cxl_endpoint_decoder(&cxld->dev); + cxlmd = cxled_to_memdev(cxled); + WARN_ON(!dev_is_platform(cxlmd->dev.parent)); + pdev = to_platform_device(cxlmd->dev.parent); + + /* check is endpoint is attach to host-bridge0 */ + port = cxled_to_port(cxled); + do { + if (port->uport == &cxl_host_bridge[0]->dev) { + hb0 = true; + break; + } + if (is_cxl_port(port->dev.parent)) + port = to_cxl_port(port->dev.parent); + else + port = NULL; + } while (port); + port = cxled_to_port(cxled); + } + + /* + * The first decoder on the first 2 devices on the first switch + * attached to host-bridge0 mock a fake / static RAM region. All + * other decoders are default disabled. Given the round robin + * assignment those devices are named cxl_mem.0, and cxl_mem.4. + * + * See 'cxl list -BMPu -m cxl_mem.0,cxl_mem.4' + */ + if (!hb0 || pdev->id % 4 || pdev->id > 4 || cxld->id > 0) { + default_mock_decoder(cxld); + return; + } + + base = window->base_hpa; + cxld->hpa_range = (struct range) { + .start = base, + .end = base + size - 1, + }; + + cxld->interleave_ways = 2; + eig_to_granularity(window->granularity, &cxld->interleave_granularity); + cxld->target_type = CXL_DECODER_EXPANDER; + cxld->flags = CXL_DECODER_F_ENABLE | CXL_DECODER_F_AUTO; + port->commit_end = cxld->id; + devm_cxl_dpa_reserve(cxled, 0, size / cxld->interleave_ways, 0); + cxld->commit = mock_decoder_commit; + cxld->reset = mock_decoder_reset; + + /* + * Now that endpoint decoder is set up, walk up the hierarchy + * and setup the switch and root port decoders targeting @cxlmd. + */ + iter = port; + for (i = 0; i < 2; i++) { + dport = iter->parent_dport; + iter = dport->port; + dev = device_find_child(&iter->dev, NULL, first_decoder); + /* + * Ancestor ports are guaranteed to be enumerated before + * @port, and all ports have at least one decoder. + */ + if (WARN_ON(!dev)) + continue; + cxlsd = to_cxl_switch_decoder(dev); + if (i == 0) { + /* put cxl_mem.4 second in the decode order */ + if (pdev->id == 4) + cxlsd->target[1] = dport; + else + cxlsd->target[0] = dport; + } else + cxlsd->target[0] = dport; + cxld = &cxlsd->cxld; + cxld->target_type = CXL_DECODER_EXPANDER; + cxld->flags = CXL_DECODER_F_ENABLE; + iter->commit_end = 0; + /* + * Switch targets 2 endpoints, while host bridge targets + * one root port + */ + if (i == 0) + cxld->interleave_ways = 2; + else + cxld->interleave_ways = 1; + cxld->interleave_granularity = 256; + cxld->hpa_range = (struct range) { + .start = base, + .end = base + size - 1, + }; + put_device(dev); + } +} + static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) { struct cxl_port *port = cxlhdm->port; @@ -748,16 +883,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) cxld = &cxled->cxld; } - cxld->hpa_range = (struct range) { - .start = 0, - .end = -1, - }; - - cxld->interleave_ways = min_not_zero(target_count, 1); - cxld->interleave_granularity = SZ_4K; - cxld->target_type = CXL_DECODER_EXPANDER; - cxld->commit = mock_decoder_commit; - cxld->reset = mock_decoder_reset; + mock_init_hdm_decoder(cxld); if (target_count) { rc = device_for_each_child(port->uport, &ctx, From patchwork Mon Feb 6 01:03:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129224 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0795C636CD for ; Mon, 6 Feb 2023 01:03:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229565AbjBFBDo (ORCPT ); Sun, 5 Feb 2023 20:03:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53416 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229553AbjBFBDm (ORCPT ); Sun, 5 Feb 2023 20:03:42 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E31EF144B5; Sun, 5 Feb 2023 17:03:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645421; x=1707181421; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=efc+FneV0jKpVBpxm5XWbyZ/alUdKgjj2AHpNBEopmM=; b=QuD+LaJh/jx2c/zkUgdfIpNkBA83Y+y8LkZhlaNy52yLpt7LdRcn7dnR Dl1Ju3SXaveEBpyi54gpugW2xCqq0t7JtkR0yIq18lo/rmgnoFDMPvfzU 9UNnOqUgL9nkswx+XXa3LLfQVSFG3CI8GWqJ2Bznf+INiKF9QDOXTNlUo jhg7HqL1gbwxVLeN82fnVQOe1HGZwcI39e0Q2Z0YiibvSnXLaI8x/Jyo2 LIqB/5YvDROE1vtFzpsYUdqxvMlTMJBFE+Fc6hgxQrsCm7/cAWJeLXefG DFYuYElemXlV6Ca8020cjnZxgk9x5cSBNWPvwh6PILmoy0r6z+iBC7yrb A==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="309442667" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="309442667" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:41 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="775006670" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="775006670" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:41 -0800 Subject: [PATCH 13/18] dax/hmem: Move HMAT and Soft reservation probe initcall level From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:41 -0800 Message-ID: <167564542109.847146.10113972881782419363.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In preparation for moving more filtering of "hmem" ranges into the dax_hmem.ko module, update the initcall levels. HMAT range registration moves to subsys_initcall() to be done before Soft Reservation probing, and Soft Reservation probing is moved to device_initcall() to be done before dax_hmem.ko initialization if it is built-in. Signed-off-by: Dan Williams --- drivers/acpi/numa/hmat.c | 2 +- drivers/dax/hmem/Makefile | 3 ++- drivers/dax/hmem/device.c | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index 605a0c7053be..ff24282301ab 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -869,4 +869,4 @@ static __init int hmat_init(void) acpi_put_table(tbl); return 0; } -device_initcall(hmat_init); +subsys_initcall(hmat_init); diff --git a/drivers/dax/hmem/Makefile b/drivers/dax/hmem/Makefile index 57377b4c3d47..d4c4cd6bccd7 100644 --- a/drivers/dax/hmem/Makefile +++ b/drivers/dax/hmem/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o +# device_hmem.o deliberately precedes dax_hmem.o for initcall ordering obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device_hmem.o +obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o device_hmem-y := device.o dax_hmem-y := hmem.o diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c index 903325aac991..20749c7fab81 100644 --- a/drivers/dax/hmem/device.c +++ b/drivers/dax/hmem/device.c @@ -104,4 +104,4 @@ static __init int hmem_init(void) * As this is a fallback for address ranges unclaimed by the ACPI HMAT * parsing it must be at an initcall level greater than hmat_init(). */ -late_initcall(hmem_init); +device_initcall(hmem_init); From patchwork Mon Feb 6 01:03:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129260 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A365EC636CC for ; Mon, 6 Feb 2023 01:03:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229568AbjBFBDt (ORCPT ); Sun, 5 Feb 2023 20:03:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229553AbjBFBDs (ORCPT ); Sun, 5 Feb 2023 20:03:48 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 298208A53; Sun, 5 Feb 2023 17:03:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645428; x=1707181428; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ME8XRCvN/v2Itlb7KzpHyRZDGDqzMok/X5JUlaNFKjM=; b=gx6W2Yk7XxUKIxbtRO/Apx7OhpEQTJ0eQl6sXp0DGths2dlKfgaZ1P4r 0D2q2vibkUfC6QgEdK5GRqChWGVIMrOzisFzB9Zf8wxbjcID73o89BksI Ksz0ca6bY8e7LUixQLwHvXt0ggbUdx4c6OqVRG4RVs6ovr/1XSy3wYUfN zx16ZDXa1ABvplS9N76bKv7VbSQbAKOrQZSc32Wa7acroDOhOcnZufNsC g6zjwEf6PxeGynkPREXyfhGKqfVWEANJrNgCyulAPx4UavrwVeADpR1QV hVYJbz6AaJgT9rIXsUO9vIc9NrjnWKLc55Z0SlV8z72qCoOZkBmi2iMQ3 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="309442676" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="309442676" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:47 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="775006679" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="775006679" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:47 -0800 Subject: [PATCH 14/18] dax/hmem: Drop unnecessary dax_hmem_remove() From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:46 -0800 Message-ID: <167564542679.847146.17174404738816053065.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org Empty driver remove callbacks can just be elided. Signed-off-by: Dan Williams Reviewed-by: Gregory Price Reviewed-by: Jonathan Cameron --- drivers/dax/hmem/hmem.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index 1bf040dbc834..c7351e0dc8ff 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -44,15 +44,8 @@ static int dax_hmem_probe(struct platform_device *pdev) return 0; } -static int dax_hmem_remove(struct platform_device *pdev) -{ - /* devm handles teardown */ - return 0; -} - static struct platform_driver dax_hmem_driver = { .probe = dax_hmem_probe, - .remove = dax_hmem_remove, .driver = { .name = "hmem", }, From patchwork Mon Feb 6 01:03:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66E31C636D4 for ; Mon, 6 Feb 2023 01:03:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229570AbjBFBD4 (ORCPT ); Sun, 5 Feb 2023 20:03:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229553AbjBFBDz (ORCPT ); Sun, 5 Feb 2023 20:03:55 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E6A1CDF9; Sun, 5 Feb 2023 17:03:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645434; x=1707181434; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UGjDlEaM3SmHqxTQJem08iZ8rEDpl61Vdevr59NhbRE=; b=av4WjYS/GIFWEccOeYzYOEsQ0c5yprUOqTWRziN3M+HEke2oc95HsbpP rurWoUn1GLC7IenNpl30oNu5tbai5dJRi1+ZUtMJSGJbMTMpdEDCmaSXU q7iJ/H//ICzTYR2MUa2JwY308x1cw6MbTqIuldYvbZTWe5Vwi/VScIlyZ 64IL8CuOWd6g0qYaGHWfjuwPRoyKBgkqm7iHf3PVvm9YlGGrlTZiMG9M9 H5S+htDW2oEDcdSLB7S37L+0J+gCj7svwN6sTjgaaYnR128/SpcOyuU1m hC5Li1DWuRhg1I3RdYPK1t7cTit+9l8nurCdcSvtQ4M6SceXs4cDial9K w==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="309442686" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="309442686" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:54 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="775006720" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="775006720" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:53 -0800 Subject: [PATCH 15/18] dax/hmem: Convey the dax range via memregion_info() From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:53 -0800 Message-ID: <167564543303.847146.11045895213318648441.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In preparation for hmem platform devices to be unregistered, stop using platform_device_add_resources() to convey the address range. The platform_device_add_resources() API causes an existing "Soft Reserved" iomem resource to be re-parented under an inserted platform device resource. When that platform device is deleted it removes the platform device resource and all children. Instead, it is sufficient to convey just the address range and let request_mem_region() insert resources to indicate the devices active in the range. This allows the "Soft Reserved" resource to be re-enumerated upon the next probe event. Signed-off-by: Dan Williams Reviewed-by: Jonathan Cameron --- drivers/dax/hmem/device.c | 37 ++++++++++++++----------------------- drivers/dax/hmem/hmem.c | 14 +++----------- include/linux/memregion.h | 2 ++ 3 files changed, 19 insertions(+), 34 deletions(-) diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c index 20749c7fab81..b1b339bccfe5 100644 --- a/drivers/dax/hmem/device.c +++ b/drivers/dax/hmem/device.c @@ -15,15 +15,8 @@ static struct resource hmem_active = { .flags = IORESOURCE_MEM, }; -void hmem_register_device(int target_nid, struct resource *r) +void hmem_register_device(int target_nid, struct resource *res) { - /* define a clean / non-busy resource for the platform device */ - struct resource res = { - .start = r->start, - .end = r->end, - .flags = IORESOURCE_MEM, - .desc = IORES_DESC_SOFT_RESERVED, - }; struct platform_device *pdev; struct memregion_info info; int rc, id; @@ -31,55 +24,53 @@ void hmem_register_device(int target_nid, struct resource *r) if (nohmem) return; - rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM, - IORES_DESC_SOFT_RESERVED); + rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM, + IORES_DESC_SOFT_RESERVED); if (rc != REGION_INTERSECTS) return; id = memregion_alloc(GFP_KERNEL); if (id < 0) { - pr_err("memregion allocation failure for %pr\n", &res); + pr_err("memregion allocation failure for %pr\n", res); return; } pdev = platform_device_alloc("hmem", id); if (!pdev) { - pr_err("hmem device allocation failure for %pr\n", &res); + pr_err("hmem device allocation failure for %pr\n", res); goto out_pdev; } - if (!__request_region(&hmem_active, res.start, resource_size(&res), + if (!__request_region(&hmem_active, res->start, resource_size(res), dev_name(&pdev->dev), 0)) { - dev_dbg(&pdev->dev, "hmem range %pr already active\n", &res); + dev_dbg(&pdev->dev, "hmem range %pr already active\n", res); goto out_active; } pdev->dev.numa_node = numa_map_to_online_node(target_nid); info = (struct memregion_info) { .target_node = target_nid, + .range = { + .start = res->start, + .end = res->end, + }, }; rc = platform_device_add_data(pdev, &info, sizeof(info)); if (rc < 0) { - pr_err("hmem memregion_info allocation failure for %pr\n", &res); - goto out_resource; - } - - rc = platform_device_add_resources(pdev, &res, 1); - if (rc < 0) { - pr_err("hmem resource allocation failure for %pr\n", &res); + pr_err("hmem memregion_info allocation failure for %pr\n", res); goto out_resource; } rc = platform_device_add(pdev); if (rc < 0) { - dev_err(&pdev->dev, "device add failed for %pr\n", &res); + dev_err(&pdev->dev, "device add failed for %pr\n", res); goto out_resource; } return; out_resource: - __release_region(&hmem_active, res.start, resource_size(&res)); + __release_region(&hmem_active, res->start, resource_size(res)); out_active: platform_device_put(pdev); out_pdev: diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index c7351e0dc8ff..5025a8c9850b 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -15,25 +15,17 @@ static int dax_hmem_probe(struct platform_device *pdev) struct memregion_info *mri; struct dev_dax_data data; struct dev_dax *dev_dax; - struct resource *res; - struct range range; - - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - if (!res) - return -ENOMEM; mri = dev->platform_data; - range.start = res->start; - range.end = res->end; - dax_region = alloc_dax_region(dev, pdev->id, &range, mri->target_node, - PMD_SIZE, 0); + dax_region = alloc_dax_region(dev, pdev->id, &mri->range, + mri->target_node, PMD_SIZE, 0); if (!dax_region) return -ENOMEM; data = (struct dev_dax_data) { .dax_region = dax_region, .id = -1, - .size = region_idle ? 0 : resource_size(res), + .size = region_idle ? 0 : range_len(&mri->range), }; dev_dax = devm_create_dev_dax(&data); if (IS_ERR(dev_dax)) diff --git a/include/linux/memregion.h b/include/linux/memregion.h index bf83363807ac..c01321467789 100644 --- a/include/linux/memregion.h +++ b/include/linux/memregion.h @@ -3,10 +3,12 @@ #define _MEMREGION_H_ #include #include +#include #include struct memregion_info { int target_node; + struct range range; }; #ifdef CONFIG_MEMREGION From patchwork Mon Feb 6 01:03:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129262 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4FEAC636CD for ; Mon, 6 Feb 2023 01:04:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229581AbjBFBEC (ORCPT ); Sun, 5 Feb 2023 20:04:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53468 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229553AbjBFBEB (ORCPT ); Sun, 5 Feb 2023 20:04:01 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5B3ACCDF9; Sun, 5 Feb 2023 17:04:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645440; x=1707181440; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=udm2GNyy9Pfhnt9VZa/fhxu0/423zpDc7iWLFcwUZ5U=; b=LIqmGTc6B/dR6Ei3RnkaMQXOvqRkdICSn5dXYt3Pfu2QZqu/OFa1fX2K vgtweA5YADEuqqgL2TbP/PMDuiPFkHk5+YP/qD1kjsVOHkY7y07J2XoSX nER0LCGnqIgejM4WlFkNr5+S28nqFfE1WWo3GEQaEV68NoOSACoWzpDcw VduUNRY6VJE1TBKuhuih5bpM/Wg87/0zhopHB0e3ejTKHxy5Zy7Mc9B+b 9+8WXLagwqFplSxUqokwwPMZnhb1Tvh5qdC63kEzSjh03nNf9n8owSYlL 90D08XQd6UPkTVicwG2bc8ZgNto0IYtukD+8Yo/cQmLvhodVMP5k1SCBz A==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="309442696" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="309442696" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:04:00 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="775006734" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="775006734" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:03:59 -0800 Subject: [PATCH 16/18] dax/hmem: Move hmem device registration to dax_hmem.ko From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:03:59 -0800 Message-ID: <167564543923.847146.9030380223622044744.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org In preparation for the CXL region driver to take over the responsibility of registering device-dax instances for CXL regions, move the registration of "hmem" devices to dax_hmem.ko. Previously the builtin component of this enabling (drivers/dax/hmem/device.o) would register platform devices for each address range and trigger the dax_hmem.ko module to load and attach device-dax instances to those devices. Now, the ranges are collected from the HMAT and EFI memory map walking, but the device creation is deferred. A new "hmem_platform" device is created which triggers dax_hmem.ko to load and register the platform devices. Signed-off-by: Dan Williams --- drivers/acpi/numa/hmat.c | 2 - drivers/dax/Kconfig | 2 - drivers/dax/hmem/device.c | 91 +++++++++++++++++++-------------------- drivers/dax/hmem/hmem.c | 105 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/dax.h | 7 ++- 5 files changed, 155 insertions(+), 52 deletions(-) diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c index ff24282301ab..bba268ecd802 100644 --- a/drivers/acpi/numa/hmat.c +++ b/drivers/acpi/numa/hmat.c @@ -718,7 +718,7 @@ static void hmat_register_target_devices(struct memory_target *target) for (res = target->memregions.child; res; res = res->sibling) { int target_nid = pxm_to_node(target->memory_pxm); - hmem_register_device(target_nid, res); + hmem_register_resource(target_nid, res); } } diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index 5fdf269a822e..d13c889c2a64 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -46,7 +46,7 @@ config DEV_DAX_HMEM Say M if unsure. config DEV_DAX_HMEM_DEVICES - depends on DEV_DAX_HMEM && DAX=y + depends on DEV_DAX_HMEM && DAX def_bool y config DEV_DAX_KMEM diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c index b1b339bccfe5..f9e1a76a04a9 100644 --- a/drivers/dax/hmem/device.c +++ b/drivers/dax/hmem/device.c @@ -8,6 +8,8 @@ static bool nohmem; module_param_named(disable, nohmem, bool, 0444); +static bool platform_initialized; +static DEFINE_MUTEX(hmem_resource_lock); static struct resource hmem_active = { .name = "HMEM devices", .start = 0, @@ -15,71 +17,66 @@ static struct resource hmem_active = { .flags = IORESOURCE_MEM, }; -void hmem_register_device(int target_nid, struct resource *res) +int walk_hmem_resources(struct device *host, walk_hmem_fn fn) +{ + struct resource *res; + int rc = 0; + + mutex_lock(&hmem_resource_lock); + for (res = hmem_active.child; res; res = res->sibling) { + rc = fn(host, (int) res->desc, res); + if (rc) + break; + } + mutex_unlock(&hmem_resource_lock); + return rc; +} +EXPORT_SYMBOL_GPL(walk_hmem_resources); + +static void __hmem_register_resource(int target_nid, struct resource *res) { struct platform_device *pdev; - struct memregion_info info; - int rc, id; + struct resource *new; + int rc; - if (nohmem) + new = __request_region(&hmem_active, res->start, resource_size(res), "", + 0); + if (!new) { + pr_debug("hmem range %pr already active\n", res); return; + } - rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM, - IORES_DESC_SOFT_RESERVED); - if (rc != REGION_INTERSECTS) - return; + new->desc = target_nid; - id = memregion_alloc(GFP_KERNEL); - if (id < 0) { - pr_err("memregion allocation failure for %pr\n", res); + if (platform_initialized) return; - } - pdev = platform_device_alloc("hmem", id); + pdev = platform_device_alloc("hmem_platform", 0); if (!pdev) { - pr_err("hmem device allocation failure for %pr\n", res); - goto out_pdev; - } - - if (!__request_region(&hmem_active, res->start, resource_size(res), - dev_name(&pdev->dev), 0)) { - dev_dbg(&pdev->dev, "hmem range %pr already active\n", res); - goto out_active; - } - - pdev->dev.numa_node = numa_map_to_online_node(target_nid); - info = (struct memregion_info) { - .target_node = target_nid, - .range = { - .start = res->start, - .end = res->end, - }, - }; - rc = platform_device_add_data(pdev, &info, sizeof(info)); - if (rc < 0) { - pr_err("hmem memregion_info allocation failure for %pr\n", res); - goto out_resource; + pr_err_once("failed to register device-dax hmem_platform device\n"); + return; } rc = platform_device_add(pdev); - if (rc < 0) { - dev_err(&pdev->dev, "device add failed for %pr\n", res); - goto out_resource; - } + if (rc) + platform_device_put(pdev); + else + platform_initialized = true; +} - return; +void hmem_register_resource(int target_nid, struct resource *res) +{ + if (nohmem) + return; -out_resource: - __release_region(&hmem_active, res->start, resource_size(res)); -out_active: - platform_device_put(pdev); -out_pdev: - memregion_free(id); + mutex_lock(&hmem_resource_lock); + __hmem_register_resource(target_nid, res); + mutex_unlock(&hmem_resource_lock); } static __init int hmem_register_one(struct resource *res, void *data) { - hmem_register_device(phys_to_target_node(res->start), res); + hmem_register_resource(phys_to_target_node(res->start), res); return 0; } diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index 5025a8c9850b..e7bdff3132fa 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -3,6 +3,7 @@ #include #include #include +#include #include "../bus.h" static bool region_idle; @@ -43,8 +44,110 @@ static struct platform_driver dax_hmem_driver = { }, }; -module_platform_driver(dax_hmem_driver); +static void release_memregion(void *data) +{ + memregion_free((long) data); +} + +static void release_hmem(void *pdev) +{ + platform_device_unregister(pdev); +} + +static int hmem_register_device(struct device *host, int target_nid, + const struct resource *res) +{ + struct platform_device *pdev; + struct memregion_info info; + long id; + int rc; + + rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM, + IORES_DESC_SOFT_RESERVED); + if (rc != REGION_INTERSECTS) + return 0; + + id = memregion_alloc(GFP_KERNEL); + if (id < 0) { + dev_err(host, "memregion allocation failure for %pr\n", res); + return -ENOMEM; + } + rc = devm_add_action_or_reset(host, release_memregion, (void *) id); + if (rc) + return rc; + + pdev = platform_device_alloc("hmem", id); + if (!pdev) { + dev_err(host, "device allocation failure for %pr\n", res); + return -ENOMEM; + } + + pdev->dev.numa_node = numa_map_to_online_node(target_nid); + info = (struct memregion_info) { + .target_node = target_nid, + .range = { + .start = res->start, + .end = res->end, + }, + }; + rc = platform_device_add_data(pdev, &info, sizeof(info)); + if (rc < 0) { + dev_err(host, "memregion_info allocation failure for %pr\n", + res); + goto out_put; + } + + rc = platform_device_add(pdev); + if (rc < 0) { + dev_err(host, "%s add failed for %pr\n", dev_name(&pdev->dev), + res); + goto out_put; + } + + return devm_add_action_or_reset(host, release_hmem, pdev); + +out_put: + platform_device_put(pdev); + return rc; +} + +static int dax_hmem_platform_probe(struct platform_device *pdev) +{ + return walk_hmem_resources(&pdev->dev, hmem_register_device); +} + +static struct platform_driver dax_hmem_platform_driver = { + .probe = dax_hmem_platform_probe, + .driver = { + .name = "hmem_platform", + }, +}; + +static __init int dax_hmem_init(void) +{ + int rc; + + rc = platform_driver_register(&dax_hmem_platform_driver); + if (rc) + return rc; + + rc = platform_driver_register(&dax_hmem_driver); + if (rc) + platform_driver_unregister(&dax_hmem_platform_driver); + + return rc; +} + +static __exit void dax_hmem_exit(void) +{ + platform_driver_unregister(&dax_hmem_driver); + platform_driver_unregister(&dax_hmem_platform_driver); +} + +module_init(dax_hmem_init); +module_exit(dax_hmem_exit); MODULE_ALIAS("platform:hmem*"); +MODULE_ALIAS("platform:hmem_platform*"); MODULE_LICENSE("GPL v2"); MODULE_AUTHOR("Intel Corporation"); diff --git a/include/linux/dax.h b/include/linux/dax.h index 2b5ecb591059..bf6258472e49 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -262,11 +262,14 @@ static inline bool dax_mapping(struct address_space *mapping) } #ifdef CONFIG_DEV_DAX_HMEM_DEVICES -void hmem_register_device(int target_nid, struct resource *r); +void hmem_register_resource(int target_nid, struct resource *r); #else -static inline void hmem_register_device(int target_nid, struct resource *r) +static inline void hmem_register_resource(int target_nid, struct resource *r) { } #endif +typedef int (*walk_hmem_fn)(struct device *dev, int target_nid, + const struct resource *res); +int walk_hmem_resources(struct device *dev, walk_hmem_fn fn); #endif From patchwork Mon Feb 6 01:04:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129263 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8603C636CC for ; Mon, 6 Feb 2023 01:04:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229498AbjBFBEI (ORCPT ); Sun, 5 Feb 2023 20:04:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229485AbjBFBEH (ORCPT ); Sun, 5 Feb 2023 20:04:07 -0500 Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7034B8A53; Sun, 5 Feb 2023 17:04:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645446; x=1707181446; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hrvnto86ChTGsi4fHJ4Lqo9DG2au7l7fsMPus7NStq0=; b=GK1gs7gjNinPobEQFOc3k1Nm8HHd2JJM9E/4bB4HdriIxfEMKG5XNFt4 Jd5gEST7yqgEKCU9g485QmHjb1eas/X6CWWiF5J3NKnaaQCQ6Y0Usl0tH 5Jxi0PPg6cdsoLbfgzzLaWATrN8ECoQ8FpYmIqw26RJjxotTkde0r6sEu JDf59wAnKEN9RTg3MmJZaKr7n5oYDm+Khup3md3v11wwwjcp6n7sO4dTd h2AGgKld6nS6NzNNLBqKLsmE5jpj7Qjwg0tdkFQvfRrdmFsKiG9UjglUx prwWkl53zzNWPOhyljA7QUOC3JSDkk1lx1AhNnLbaYpAHvjzovuJo/qNG Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="309442702" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="309442702" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:04:06 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="775006766" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="775006766" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:04:05 -0800 Subject: [PATCH 17/18] dax: Assign RAM regions to memory-hotplug by default From: Dan Williams To: linux-cxl@vger.kernel.org Cc: Michal Hocko , David Hildenbrand , Dave Hansen , linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:04:05 -0800 Message-ID: <167564544513.847146.4645646177864365755.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org The default mode for device-dax instances is backwards for RAM-regions as evidenced by the fact that it tends to catch end users by surprise. "Where is my memory?". Recall that platforms are increasingly shipping with performance-differentiated memory pools beyond typical DRAM and NUMA effects. This includes HBM (high-bandwidth-memory) and CXL (dynamic interleave, varied media types, and future fabric attached possibilities). For this reason the EFI_MEMORY_SP (EFI Special Purpose Memory => Linux 'Soft Reserved') attribute is expected to be applied to all memory-pools that are not the general purpose pool. This designation gives an Operating System a chance to defer usage of a memory pool until later in the boot process where its performance properties can be interrogated and administrator policy can be applied. 'Soft Reserved' memory can be anything from too limited and precious to be part of the general purpose pool (HBM), too slow to host hot kernel data structures (some PMEM media), or anything in between. However, in the absence of an explicit policy, the memory should at least be made usable by default. The current device-dax default hides all non-general-purpose memory behind a device interface. The expectation is that the distribution of users that want the memory online by default vs device-dedicated-access by default follows the Pareto principle. A small number of enlightened users may want to do userspace memory management through a device, but general users just want the kernel to make the memory available with an option to get more advanced later. Arrange for all device-dax instances not backed by PMEM to default to attaching to the dax_kmem driver. From there the baseline memory hotplug policy (CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE / memhp_default_state=) gates whether the memory comes online or stays offline. Where, if it stays offline, it can be reliably converted back to device-mode where it can be partitioned, or fronted by a userspace allocator. So, if someone wants device-dax instances for their 'Soft Reserved' memory: 1/ Build a kernel with CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n or boot with memhp_default_state=offline, or roll the dice and hope that the kernel has not pinned a page in that memory before step 2. 2/ Write a udev rule to convert the target dax device(s) from 'system-ram' mode to 'devdax' mode: daxctl reconfigure-device $dax -m devdax -f Cc: Michal Hocko Cc: David Hildenbrand Cc: Dave Hansen Signed-off-by: Dan Williams Reviewed-by: Gregory Price --- drivers/dax/Kconfig | 2 +- drivers/dax/bus.c | 53 ++++++++++++++++++++--------------------------- drivers/dax/bus.h | 12 +++++++++-- drivers/dax/device.c | 3 +-- drivers/dax/hmem/hmem.c | 12 ++++++++++- drivers/dax/kmem.c | 1 + 6 files changed, 46 insertions(+), 37 deletions(-) diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index d13c889c2a64..1163eb62e5f6 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -50,7 +50,7 @@ config DEV_DAX_HMEM_DEVICES def_bool y config DEV_DAX_KMEM - tristate "KMEM DAX: volatile-use of persistent memory" + tristate "KMEM DAX: map dax-devices as System-RAM" default DEV_DAX depends on DEV_DAX depends on MEMORY_HOTPLUG # for add_memory() and friends diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 1dad813ee4a6..012d576004e9 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -56,6 +56,25 @@ static int dax_match_id(struct dax_device_driver *dax_drv, struct device *dev) return match; } +static int dax_match_type(struct dax_device_driver *dax_drv, struct device *dev) +{ + enum dax_driver_type type = DAXDRV_DEVICE_TYPE; + struct dev_dax *dev_dax = to_dev_dax(dev); + + if (dev_dax->region->res.flags & IORESOURCE_DAX_KMEM) + type = DAXDRV_KMEM_TYPE; + + if (dax_drv->type == type) + return 1; + + /* default to device mode if dax_kmem is disabled */ + if (dax_drv->type == DAXDRV_DEVICE_TYPE && + !IS_ENABLED(CONFIG_DEV_DAX_KMEM)) + return 1; + + return 0; +} + enum id_action { ID_REMOVE, ID_ADD, @@ -216,14 +235,9 @@ static int dax_bus_match(struct device *dev, struct device_driver *drv) { struct dax_device_driver *dax_drv = to_dax_drv(drv); - /* - * All but the 'device-dax' driver, which has 'match_always' - * set, requires an exact id match. - */ - if (dax_drv->match_always) + if (dax_match_id(dax_drv, dev)) return 1; - - return dax_match_id(dax_drv, dev); + return dax_match_type(dax_drv, dev); } /* @@ -1413,13 +1427,10 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data) } EXPORT_SYMBOL_GPL(devm_create_dev_dax); -static int match_always_count; - int __dax_driver_register(struct dax_device_driver *dax_drv, struct module *module, const char *mod_name) { struct device_driver *drv = &dax_drv->drv; - int rc = 0; /* * dax_bus_probe() calls dax_drv->probe() unconditionally. @@ -1434,26 +1445,7 @@ int __dax_driver_register(struct dax_device_driver *dax_drv, drv->mod_name = mod_name; drv->bus = &dax_bus_type; - /* there can only be one default driver */ - mutex_lock(&dax_bus_lock); - match_always_count += dax_drv->match_always; - if (match_always_count > 1) { - match_always_count--; - WARN_ON(1); - rc = -EINVAL; - } - mutex_unlock(&dax_bus_lock); - if (rc) - return rc; - - rc = driver_register(drv); - if (rc && dax_drv->match_always) { - mutex_lock(&dax_bus_lock); - match_always_count -= dax_drv->match_always; - mutex_unlock(&dax_bus_lock); - } - - return rc; + return driver_register(drv); } EXPORT_SYMBOL_GPL(__dax_driver_register); @@ -1463,7 +1455,6 @@ void dax_driver_unregister(struct dax_device_driver *dax_drv) struct dax_id *dax_id, *_id; mutex_lock(&dax_bus_lock); - match_always_count -= dax_drv->match_always; list_for_each_entry_safe(dax_id, _id, &dax_drv->ids, list) { list_del(&dax_id->list); kfree(dax_id); diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index fbb940293d6d..8cd79ab34292 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -11,7 +11,10 @@ struct dax_device; struct dax_region; void dax_region_put(struct dax_region *dax_region); -#define IORESOURCE_DAX_STATIC (1UL << 0) +/* dax bus specific ioresource flags */ +#define IORESOURCE_DAX_STATIC BIT(0) +#define IORESOURCE_DAX_KMEM BIT(1) + struct dax_region *alloc_dax_region(struct device *parent, int region_id, struct range *range, int target_node, unsigned int align, unsigned long flags); @@ -25,10 +28,15 @@ struct dev_dax_data { struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data); +enum dax_driver_type { + DAXDRV_KMEM_TYPE, + DAXDRV_DEVICE_TYPE, +}; + struct dax_device_driver { struct device_driver drv; struct list_head ids; - int match_always; + enum dax_driver_type type; int (*probe)(struct dev_dax *dev); void (*remove)(struct dev_dax *dev); }; diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 5494d745ced5..ecdff79e31f2 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -475,8 +475,7 @@ EXPORT_SYMBOL_GPL(dev_dax_probe); static struct dax_device_driver device_dax_driver = { .probe = dev_dax_probe, - /* all probe actions are unwound by devm, so .remove isn't necessary */ - .match_always = 1, + .type = DAXDRV_DEVICE_TYPE, }; static int __init dax_init(void) diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index e7bdff3132fa..5ec08f9f8a57 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -11,15 +11,25 @@ module_param_named(region_idle, region_idle, bool, 0644); static int dax_hmem_probe(struct platform_device *pdev) { + unsigned long flags = IORESOURCE_DAX_KMEM; struct device *dev = &pdev->dev; struct dax_region *dax_region; struct memregion_info *mri; struct dev_dax_data data; struct dev_dax *dev_dax; + /* + * @region_idle == true indicates that an administrative agent + * wants to manipulate the range partitioning before the devices + * are created, so do not send them to the dax_kmem driver by + * default. + */ + if (region_idle) + flags = 0; + mri = dev->platform_data; dax_region = alloc_dax_region(dev, pdev->id, &mri->range, - mri->target_node, PMD_SIZE, 0); + mri->target_node, PMD_SIZE, flags); if (!dax_region) return -ENOMEM; diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index 4852a2dbdb27..918d01d3fbaa 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -239,6 +239,7 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax) static struct dax_device_driver device_dax_kmem_driver = { .probe = dev_dax_kmem_probe, .remove = dev_dax_kmem_remove, + .type = DAXDRV_KMEM_TYPE, }; static int __init dax_kmem_init(void) From patchwork Mon Feb 6 01:04:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13129264 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 055E3C636CC for ; Mon, 6 Feb 2023 01:04:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229503AbjBFBEQ (ORCPT ); Sun, 5 Feb 2023 20:04:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53592 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229468AbjBFBEP (ORCPT ); Sun, 5 Feb 2023 20:04:15 -0500 Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1360BD537; Sun, 5 Feb 2023 17:04:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675645454; x=1707181454; h=subject:from:to:cc:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wPmMkCp6DCx1PzgRrsiFSVETspBvehWCurZIdpJuvvk=; b=D80baL58UYzAjhHmJYADerqUAj7KeGawsfDrfKHEkTLOHtzIgJTfBu7M lNo323RclCpB/YMZfJhY8ym5YZCeFkgjKSu3NKdg3C0PTaA9EmGLZtIEl nbWwTAsnQKSDslTmw0qlix2xcpTrDpuC7GKz8TyiGd3lLVufcmsV95qod NXWjDrfvJjP8OR8HUj1sJ+gyi730aLcSARcikg9VUw0OkZKtXBpd/LAek hCJfCHcIp+xDUykkEayEahpcGlJ/NKoZlAGGj08/sSjL0MPhRXAlvwxrQ TKB0rKMdidgUeH0n5mUL/dyjKHtBK7/j/UdWnnbkQZZuh3UXY7XSlr0Q6 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="329123119" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="329123119" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:04:12 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10612"; a="698692901" X-IronPort-AV: E=Sophos;i="5.97,276,1669104000"; d="scan'208";a="698692901" Received: from mkrysak-mobl.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.212.255.187]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2023 17:04:11 -0800 Subject: [PATCH 18/18] cxl/dax: Create dax devices for CXL RAM regions From: Dan Williams To: linux-cxl@vger.kernel.org Cc: dave.hansen@linux.intel.com, linux-mm@kvack.org, linux-acpi@vger.kernel.org Date: Sun, 05 Feb 2023 17:04:11 -0800 Message-ID: <167564545116.847146.4741351262959589920.stgit@dwillia2-xfh.jf.intel.com> In-Reply-To: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> References: <167564534874.847146.5222419648551436750.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org While platform firmware takes some responsibility for mapping the RAM capacity of CXL devices present at boot, the OS is responsible for mapping the remainder and hot-added devices. Platform firmware is also responsible for identifying the platform general purpose memory pool, typically DDR attached DRAM, and arranging for the remainder to be 'Soft Reserved'. That reservation allows the CXL subsystem to route the memory to core-mm via memory-hotplug (dax_kmem), or leave it for dedicated access (device-dax). The new 'struct cxl_dax_region' object allows for a CXL memory resource (region) to be published, but also allow for udev and module policy to act on that event. It also prevents cxl_core.ko from having a module loading dependency on any drivers/dax/ modules. Signed-off-by: Dan Williams --- MAINTAINERS | 1 drivers/cxl/acpi.c | 3 + drivers/cxl/core/core.h | 3 + drivers/cxl/core/port.c | 2 + drivers/cxl/core/region.c | 108 ++++++++++++++++++++++++++++++++++++++++++++- drivers/cxl/cxl.h | 12 +++++ drivers/dax/Kconfig | 13 +++++ drivers/dax/Makefile | 2 + drivers/dax/cxl.c | 53 ++++++++++++++++++++++ drivers/dax/hmem/hmem.c | 14 ++++++ 10 files changed, 208 insertions(+), 3 deletions(-) create mode 100644 drivers/dax/cxl.c diff --git a/MAINTAINERS b/MAINTAINERS index 7f86d02cb427..73a9f3401e0e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6035,6 +6035,7 @@ M: Dan Williams M: Vishal Verma M: Dave Jiang L: nvdimm@lists.linux.dev +L: linux-cxl@vger.kernel.org S: Supported F: drivers/dax/ diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index ad0849af42d7..8ebb9a74790d 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -731,7 +731,8 @@ static void __exit cxl_acpi_exit(void) cxl_bus_drain(); } -module_init(cxl_acpi_init); +/* load before dax_hmem sees 'Soft Reserved' CXL ranges */ +subsys_initcall(cxl_acpi_init); module_exit(cxl_acpi_exit); MODULE_LICENSE("GPL v2"); MODULE_IMPORT_NS(CXL); diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 479f01da6d35..cde475e13216 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -15,12 +15,14 @@ extern struct device_attribute dev_attr_create_ram_region; extern struct device_attribute dev_attr_delete_region; extern struct device_attribute dev_attr_region; extern const struct device_type cxl_pmem_region_type; +extern const struct device_type cxl_dax_region_type; extern const struct device_type cxl_region_type; void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled); #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr) #define CXL_REGION_TYPE(x) (&cxl_region_type) #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr), #define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type) +#define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type) int cxl_region_init(void); void cxl_region_exit(void); #else @@ -38,6 +40,7 @@ static inline void cxl_region_exit(void) #define CXL_REGION_TYPE(x) NULL #define SET_CXL_REGION_ATTR(x) #define CXL_PMEM_REGION_TYPE(x) NULL +#define CXL_DAX_REGION_TYPE(x) NULL #endif struct cxl_send_command; diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 8d0895cbae93..0faeb1ffc212 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -46,6 +46,8 @@ static int cxl_device_id(struct device *dev) return CXL_DEVICE_NVDIMM; if (dev->type == CXL_PMEM_REGION_TYPE()) return CXL_DEVICE_PMEM_REGION; + if (dev->type == CXL_DAX_REGION_TYPE()) + return CXL_DEVICE_DAX_REGION; if (is_cxl_port(dev)) { if (is_cxl_root(to_cxl_port(dev))) return CXL_DEVICE_ROOT; diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 6fe8c70790df..86fa54015a99 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -2261,6 +2261,75 @@ static struct cxl_pmem_region *cxl_pmem_region_alloc(struct cxl_region *cxlr) return cxlr_pmem; } +static void cxl_dax_region_release(struct device *dev) +{ + struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev); + + kfree(cxlr_dax); +} + +static const struct attribute_group *cxl_dax_region_attribute_groups[] = { + &cxl_base_attribute_group, + NULL, +}; + +const struct device_type cxl_dax_region_type = { + .name = "cxl_dax_region", + .release = cxl_dax_region_release, + .groups = cxl_dax_region_attribute_groups, +}; + +static bool is_cxl_dax_region(struct device *dev) +{ + return dev->type == &cxl_dax_region_type; +} + +struct cxl_dax_region *to_cxl_dax_region(struct device *dev) +{ + if (dev_WARN_ONCE(dev, !is_cxl_dax_region(dev), + "not a cxl_dax_region device\n")) + return NULL; + return container_of(dev, struct cxl_dax_region, dev); +} +EXPORT_SYMBOL_NS_GPL(to_cxl_dax_region, CXL); + +static struct lock_class_key cxl_dax_region_key; + +static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr) +{ + struct cxl_region_params *p = &cxlr->params; + struct cxl_dax_region *cxlr_dax; + struct device *dev; + + down_read(&cxl_region_rwsem); + if (p->state != CXL_CONFIG_COMMIT) { + cxlr_dax = ERR_PTR(-ENXIO); + goto out; + } + + cxlr_dax = kzalloc(sizeof(*cxlr_dax), GFP_KERNEL); + if (!cxlr_dax) { + cxlr_dax = ERR_PTR(-ENOMEM); + goto out; + } + + cxlr_dax->hpa_range.start = p->res->start; + cxlr_dax->hpa_range.end = p->res->end; + + dev = &cxlr_dax->dev; + cxlr_dax->cxlr = cxlr; + device_initialize(dev); + lockdep_set_class(&dev->mutex, &cxl_dax_region_key); + device_set_pm_not_required(dev); + dev->parent = &cxlr->dev; + dev->bus = &cxl_bus_type; + dev->type = &cxl_dax_region_type; +out: + up_read(&cxl_region_rwsem); + + return cxlr_dax; +} + static void cxlr_pmem_unregister(void *_cxlr_pmem) { struct cxl_pmem_region *cxlr_pmem = _cxlr_pmem; @@ -2345,6 +2414,42 @@ static int devm_cxl_add_pmem_region(struct cxl_region *cxlr) return rc; } +static void cxlr_dax_unregister(void *_cxlr_dax) +{ + struct cxl_dax_region *cxlr_dax = _cxlr_dax; + + device_unregister(&cxlr_dax->dev); +} + +static int devm_cxl_add_dax_region(struct cxl_region *cxlr) +{ + struct cxl_dax_region *cxlr_dax; + struct device *dev; + int rc; + + cxlr_dax = cxl_dax_region_alloc(cxlr); + if (IS_ERR(cxlr_dax)) + return PTR_ERR(cxlr_dax); + + dev = &cxlr_dax->dev; + rc = dev_set_name(dev, "dax_region%d", cxlr->id); + if (rc) + goto err; + + rc = device_add(dev); + if (rc) + goto err; + + dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent), + dev_name(dev)); + + return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister, + cxlr_dax); +err: + put_device(dev); + return rc; +} + static int match_decoder_by_range(struct device *dev, void *data) { struct range *r1, *r2 = data; @@ -2606,8 +2711,7 @@ static int cxl_region_probe(struct device *dev) p->res->start, p->res->end, cxlr, is_system_ram) > 0) return 0; - dev_dbg(dev, "TODO: hookup devdax\n"); - return 0; + return devm_cxl_add_dax_region(cxlr); default: dev_dbg(&cxlr->dev, "unsupported region mode: %d\n", cxlr->mode); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 4c6ee6c96f23..336b65d333a2 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -505,6 +505,12 @@ struct cxl_pmem_region { struct cxl_pmem_region_mapping mapping[]; }; +struct cxl_dax_region { + struct device dev; + struct cxl_region *cxlr; + struct range hpa_range; +}; + /** * struct cxl_port - logical collection of upstream port devices and * downstream port devices to construct a CXL memory @@ -699,6 +705,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv); #define CXL_DEVICE_MEMORY_EXPANDER 5 #define CXL_DEVICE_REGION 6 #define CXL_DEVICE_PMEM_REGION 7 +#define CXL_DEVICE_DAX_REGION 8 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*") #define CXL_MODALIAS_FMT "cxl:t%d" @@ -715,6 +722,7 @@ struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct device *dev); #ifdef CONFIG_CXL_REGION bool is_cxl_pmem_region(struct device *dev); struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev); +struct cxl_dax_region *to_cxl_dax_region(struct device *dev); void cxl_add_to_region(struct cxl_endpoint_decoder *cxled); #else static inline bool is_cxl_pmem_region(struct device *dev) @@ -725,6 +733,10 @@ static inline struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev) { return NULL; } +static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev) +{ + return NULL; +} static inline void cxl_add_to_region(struct cxl_endpoint_decoder *cxled) { } diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index 1163eb62e5f6..bd06e16c7ac8 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -45,6 +45,19 @@ config DEV_DAX_HMEM Say M if unsure. +config DEV_DAX_CXL + tristate "CXL DAX: direct access to CXL RAM regions" + depends on CXL_REGION && DEV_DAX + default CXL_REGION && DEV_DAX + help + CXL RAM regions are either mapped by platform-firmware + and published in the initial system-memory map as "System RAM", mapped + by platform-firmware as "Soft Reserved", or dynamically provisioned + after boot by the CXL driver. In the latter two cases a device-dax + instance is created to access that unmapped-by-default address range. + Per usual it can remain as dedicated access via a device interface, or + converted to "System RAM" via the dax_kmem facility. + config DEV_DAX_HMEM_DEVICES depends on DEV_DAX_HMEM && DAX def_bool y diff --git a/drivers/dax/Makefile b/drivers/dax/Makefile index 90a56ca3b345..5ed5c39857c8 100644 --- a/drivers/dax/Makefile +++ b/drivers/dax/Makefile @@ -3,10 +3,12 @@ obj-$(CONFIG_DAX) += dax.o obj-$(CONFIG_DEV_DAX) += device_dax.o obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o +obj-$(CONFIG_DEV_DAX_CXL) += dax_cxl.o dax-y := super.o dax-y += bus.o device_dax-y := device.o dax_pmem-y := pmem.o +dax_cxl-y := cxl.o obj-y += hmem/ diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c new file mode 100644 index 000000000000..ccdf8de85bd5 --- /dev/null +++ b/drivers/dax/cxl.c @@ -0,0 +1,53 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright(c) 2023 Intel Corporation. All rights reserved. */ +#include +#include + +#include "../cxl/cxl.h" +#include "bus.h" + +static int cxl_dax_region_probe(struct device *dev) +{ + struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev); + int nid = phys_to_target_node(cxlr_dax->hpa_range.start); + struct cxl_region *cxlr = cxlr_dax->cxlr; + struct dax_region *dax_region; + struct dev_dax_data data; + struct dev_dax *dev_dax; + + if (nid == NUMA_NO_NODE) + nid = memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start); + + dax_region = alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid, + PMD_SIZE, IORESOURCE_DAX_KMEM); + if (!dax_region) + return -ENOMEM; + + data = (struct dev_dax_data) { + .dax_region = dax_region, + .id = -1, + .size = range_len(&cxlr_dax->hpa_range), + }; + dev_dax = devm_create_dev_dax(&data); + if (IS_ERR(dev_dax)) + return PTR_ERR(dev_dax); + + /* child dev_dax instances now own the lifetime of the dax_region */ + dax_region_put(dax_region); + return 0; +} + +static struct cxl_driver cxl_dax_region_driver = { + .name = "cxl_dax_region", + .probe = cxl_dax_region_probe, + .id = CXL_DEVICE_DAX_REGION, + .drv = { + .suppress_bind_attrs = true, + }, +}; + +module_cxl_driver(cxl_dax_region_driver); +MODULE_ALIAS_CXL(CXL_DEVICE_DAX_REGION); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Intel Corporation"); +MODULE_IMPORT_NS(CXL); diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index 5ec08f9f8a57..e5fe8b39fb94 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -72,6 +72,13 @@ static int hmem_register_device(struct device *host, int target_nid, long id; int rc; + if (IS_ENABLED(CONFIG_CXL_REGION) && + region_intersects(res->start, resource_size(res), IORESOURCE_MEM, + IORES_DESC_CXL) != REGION_DISJOINT) { + dev_dbg(host, "deferring range to CXL: %pr\n", res); + return 0; + } + rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM, IORES_DESC_SOFT_RESERVED); if (rc != REGION_INTERSECTS) @@ -157,6 +164,13 @@ static __exit void dax_hmem_exit(void) module_init(dax_hmem_init); module_exit(dax_hmem_exit); +/* Allow for CXL to define its own dax regions */ +#if IS_ENABLED(CONFIG_CXL_REGION) +#if IS_MODULE(CONFIG_CXL_ACPI) +MODULE_SOFTDEP("pre: cxl_acpi"); +#endif +#endif + MODULE_ALIAS("platform:hmem*"); MODULE_ALIAS("platform:hmem_platform*"); MODULE_LICENSE("GPL v2");