From patchwork Wed Apr 13 18:37:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812372 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 60027291E; Wed, 13 Apr 2022 18:38:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875091; x=1681411091; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aE4N3k3trSLj1FrxNaxvn8G0EZkVgDXfaelpiPyEsI4=; b=Q6UTjdWliT4jpBTktAtcXI62K8BQImwSM8sQoI1NZSYOn48Ehdqg45cD +Fko7Si/6ceCvcZlA2/SjwvzEyl9qCbaXOSVeCQTO9EkXu+Cqrb9yNaOJ 1Kxp7bz0n7vgUp79FiSf/+Hxo32dIZe8BpOitfq01g5m7bRix1dH++7KP CezHADDJDxJaO1829qg/KWqh4zk8n0fzcgaD3p7Z8IScRV1Mskm9cF2hJ jpHDt5Ftcs95gHWqF7HpbiOoN6sC9bfsu1LTY2aooRk4QlHgtaTL6N/OF y+nHHUUpRckQUb1VwcaPvz/C3py94BXrLzBKiy6K4aowYwQyQ57Qu0XsQ g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631838" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631838" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:47 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013563" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:47 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 01/15] cxl/core: Use is_endpoint_decoder Date: Wed, 13 Apr 2022 11:37:06 -0700 Message-Id: 
<20220413183720.2444089-2-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Save some characters and directly check decoder type rather than port type. There's no need to check if the port is an endpoint port since we already know the decoder, after alloc, has a specified type. Signed-off-by: Ben Widawsky Reviewed-by: Dan Williams --- drivers/cxl/core/hdm.c | 2 +- drivers/cxl/core/port.c | 2 +- drivers/cxl/cxl.h | 1 + 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 0e89a7a932d4..bfc8ee876278 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -197,7 +197,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, else cxld->target_type = CXL_DECODER_ACCELERATOR; - if (is_cxl_endpoint(to_cxl_port(cxld->dev.parent))) + if (is_endpoint_decoder(&cxld->dev)) return 0; target_list.value = diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 2ab1ba4499b3..74c8e47bf915 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -272,7 +272,7 @@ static const struct device_type cxl_decoder_root_type = { .groups = cxl_decoder_root_attribute_groups, }; -static bool is_endpoint_decoder(struct device *dev) +bool is_endpoint_decoder(struct device *dev) { return dev->type == &cxl_decoder_endpoint_type; } diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 990b6670222e..5102491e8d13 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -340,6 +340,7 @@ struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port, struct cxl_decoder *to_cxl_decoder(struct device *dev); bool is_root_decoder(struct device *dev); +bool is_endpoint_decoder(struct device *dev); bool is_cxl_decoder(struct device *dev); 
struct cxl_decoder *cxl_root_decoder_alloc(struct cxl_port *port, unsigned int nr_targets); From patchwork Wed Apr 13 18:37:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812371 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B9B9323AD; Wed, 13 Apr 2022 18:38:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875090; x=1681411090; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9JmB6oJKPBBDOztp/9A53kUt0HnYQkAsrrNo+Sut5u0=; b=WDK14tT7Uk3Knm4zUL8vpuFVJGonlK3WOJC3k9UTlhLxk/l0aR8tRxhU p19YMHBYSzp15gzhyMBNm5hGkNmyJAlMtADpoxlRnX3shGr8c7Szn9RJH /ktu6hSP2R9D1MnpS/GASaSC718jIQtdUOfHmrxAl6u9tTRPc00O+pEAb y3Xt4Up/U0qTrwKFjIsnfOPEzS+IotEva+knmX4tVHb3n5zPpRErLb9dY UPiLRvvPXWQqMomSnkHPvmGZuX8gMFD81+vk7eV0MayACx4sWIAHB22yw zl27XX3qswQgUImPqSx1aK2aoG+L6M+i+8q5xeOnZuGNCDXWkV1sjG7aV Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631839" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631839" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:48 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013569" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:47 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 02/15] cxl/core/hdm: Bail on 
endpoint init fail Date: Wed, 13 Apr 2022 11:37:07 -0700 Message-Id: <20220413183720.2444089-3-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Endpoint decoder enumeration is the only way in which we can determine Device Physical Address (DPA) -> Host Physical Address (HPA) mappings. Information is obtained only when the register state can be read sequentially. If a failure occurs while enumerating the decoders, all other decoders must also fail, since the decoders can no longer be accurately managed (unless it is the last decoder, in which case it can still work). Signed-off-by: Ben Widawsky --- drivers/cxl/core/hdm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index bfc8ee876278..c3c021b54079 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -255,6 +255,8 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) cxlhdm->regs.hdm_decoder, i); if (rc) { put_device(&cxld->dev); + if (is_endpoint_decoder(&cxld->dev)) + return rc; failed++; continue; } From patchwork Wed Apr 13 18:37:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812374 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F15C22F32; Wed, 13 Apr 2022 18:38:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875093; x=1681411093; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; 
bh=Aw4BdmJs9nutTAxQq4LYU5ScJx7qqBCM2l7xDnAPCjc=; b=bFMWwwT7lHtmgaUDDkNUWzaG8E82cRMk/wVgeo2DdqFmIxCmuMa2ob70 MuB+P6ZaA+DWqbiVaOoiWF/z3RMxvKC94ZdcplH6Ftl/Nb3gK+4QQF/5h Bq5DQ+mUyXHRcnZClv2yOJqH4JqbplEXBMeTDnz/E6JJpKoklzqdm95+B fx+EtdN1yknZgM6oTTerp2uKI7LUeHw+3AFfAptSMpbTUDX/545WPqa05 o/4FbzmUBwcRXLsbClgletCEb+UreLeYmI415Pgoz6LtThAyoCM+vxEa6 8+AvCW3ELpZfwcmCLGUpneol9z0FM6hxFHlW1/ccz3V3jL+jurjcEMBQT A==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631841" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631841" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:48 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013573" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:48 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 03/15] Revert "cxl/core: Convert decoder range to resource" Date: Wed, 13 Apr 2022 11:37:08 -0700 Message-Id: <20220413183720.2444089-4-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This reverts commit 608135db1b790170d22848815c4671407af74e37. All decoders do have a host physical address space and the revert allows us to keep that uniformity. Decoder disambiguation will allow for decoder type-specific members which is needed, but will be handled separately. 
Signed-off-by: Ben Widawsky --- The explanation for why it is impossible to make CFMWS ranges be iomem_resources is explained in a later patch. --- drivers/cxl/acpi.c | 17 ++++++++++------- drivers/cxl/core/hdm.c | 2 +- drivers/cxl/core/port.c | 28 ++++++---------------------- drivers/cxl/cxl.h | 8 ++------ 4 files changed, 19 insertions(+), 36 deletions(-) diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index d15a6aec0331..9b69955b90cb 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -108,8 +108,10 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions); cxld->target_type = CXL_DECODER_EXPANDER; - cxld->platform_res = (struct resource)DEFINE_RES_MEM(cfmws->base_hpa, - cfmws->window_size); + cxld->range = (struct range){ + .start = cfmws->base_hpa, + .end = cfmws->base_hpa + cfmws->window_size - 1, + }; cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws); cxld->interleave_granularity = CFMWS_INTERLEAVE_GRANULARITY(cfmws); @@ -119,13 +121,14 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, else rc = cxl_decoder_autoremove(dev, cxld); if (rc) { - dev_err(dev, "Failed to add decoder for %pr\n", - &cxld->platform_res); + dev_err(dev, "Failed to add decoder for %#llx-%#llx\n", + cfmws->base_hpa, + cfmws->base_hpa + cfmws->window_size - 1); return 0; } - dev_dbg(dev, "add: %s node: %d range %pr\n", dev_name(&cxld->dev), - phys_to_target_node(cxld->platform_res.start), - &cxld->platform_res); + dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n", + dev_name(&cxld->dev), phys_to_target_node(cxld->range.start), + cfmws->base_hpa, cfmws->base_hpa + cfmws->window_size - 1); return 0; } diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index c3c021b54079..3055e246aab9 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -172,7 +172,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, return -ENXIO; } 
- cxld->decoder_range = (struct range) { + cxld->range = (struct range) { .start = base, .end = base + size - 1, }; diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 74c8e47bf915..86f451ecb7ed 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -73,14 +73,8 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr, char *buf) { struct cxl_decoder *cxld = to_cxl_decoder(dev); - u64 start; - if (is_root_decoder(dev)) - start = cxld->platform_res.start; - else - start = cxld->decoder_range.start; - - return sysfs_emit(buf, "%#llx\n", start); + return sysfs_emit(buf, "%#llx\n", cxld->range.start); } static DEVICE_ATTR_ADMIN_RO(start); @@ -88,14 +82,8 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr, char *buf) { struct cxl_decoder *cxld = to_cxl_decoder(dev); - u64 size; - if (is_root_decoder(dev)) - size = resource_size(&cxld->platform_res); - else - size = range_len(&cxld->decoder_range); - - return sysfs_emit(buf, "%#llx\n", size); + return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range)); } static DEVICE_ATTR_RO(size); @@ -1228,7 +1216,10 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, cxld->interleave_ways = 1; cxld->interleave_granularity = PAGE_SIZE; cxld->target_type = CXL_DECODER_EXPANDER; - cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0); + cxld->range = (struct range) { + .start = 0, + .end = -1, + }; return cxld; err: @@ -1342,13 +1333,6 @@ int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map) if (rc) return rc; - /* - * Platform decoder resources should show up with a reasonable name. All - * other resources are just sub ranges within the main decoder resource. 
- */ - if (is_root_decoder(dev)) - cxld->platform_res.name = dev_name(dev); - return device_add(dev); } EXPORT_SYMBOL_NS_GPL(cxl_decoder_add_locked, CXL); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 5102491e8d13..6517d5cdf5ee 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -197,8 +197,7 @@ enum cxl_decoder_type { * struct cxl_decoder - CXL address range decode configuration * @dev: this decoder's device * @id: kernel device name id - * @platform_res: address space resources considered by root decoder - * @decoder_range: address space resources considered by midlevel decoder + * @range: address range considered by this decoder * @interleave_ways: number of cxl_dports in this decode * @interleave_granularity: data stride per dport * @target_type: accelerator vs expander (type2 vs type3) selector @@ -210,10 +209,7 @@ enum cxl_decoder_type { struct cxl_decoder { struct device dev; int id; - union { - struct resource platform_res; - struct range decoder_range; - }; + struct range range; int interleave_ways; int interleave_granularity; enum cxl_decoder_type target_type; From patchwork Wed Apr 13 18:37:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812377 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E0752F43; Wed, 13 Apr 2022 18:38:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875094; x=1681411094; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2cspjk3ehP2RxkSlZOKEdhvz3HCPZyVpB/cKlunYSRc=; b=XEhqxpjpyJfMgZUrWWIIyxD1CJS3YOq6rrulsB+dLsK+s+88ejZ+PLOT /uF0SfGnrUmakRkWcVop+n2zwu2/bC0nV0gSohrmbeKqYOqhuvUp3I3Ua 
FV0ItotXqKoqhwJZKmiqGWksZnF0C7bjNRSGOV6rw+1j/MqwLq2RvPtZB gm01rFNrmp/1N+/lzB2gPBZul3GhKiUFUly6KWBBIqqJBSP1jLe3j9liN UQz1N88U5+oO+DYlkmiprabphdsFT9qOdEFo2jI+U1Q1OPHXxR0ZPDIGa 0L2ibrzsvyrYa1pmL9XJNRuresCHBFACYoRPsj5t6Xvu2p9Mmtapt0hh8 g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631842" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631842" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:48 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013581" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:48 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 04/15] cxl/core: Create distinct decoder structs Date: Wed, 13 Apr 2022 11:37:09 -0700 Message-Id: <20220413183720.2444089-5-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 CXL HDM decoders have distinct properties at each level in the hierarchy. Root decoders manage host physical address space. Switch decoders manage demultiplexing of data to downstream targets. Endpoint decoders must be aware of physical media size constraints. To properly support these unique needs, create these unique structures. CXL HDM decoders do have similar architectural properties at all levels: interleave properties, flags, types and consumption of host physical address space. Those are retained and when possible, still utilized. 
Signed-off-by: Ben Widawsky --- drivers/cxl/core/hdm.c | 3 +- drivers/cxl/core/port.c | 102 ++++++++++++++++++++++++----------- drivers/cxl/cxl.h | 69 +++++++++++++++++++++--- tools/testing/cxl/test/cxl.c | 2 +- 4 files changed, 137 insertions(+), 39 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 3055e246aab9..37c09c77e9a7 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -6,6 +6,7 @@ #include "cxlmem.h" #include "core.h" +#include "cxl.h" /** * DOC: cxl core hdm @@ -242,7 +243,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) struct cxl_decoder *cxld; if (is_cxl_endpoint(port)) - cxld = cxl_endpoint_decoder_alloc(port); + cxld = &cxl_endpoint_decoder_alloc(port)->base; else cxld = cxl_switch_decoder_alloc(port, target_count); if (IS_ERR(cxld)) { diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 86f451ecb7ed..8dd29c97e318 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -121,18 +121,19 @@ static DEVICE_ATTR_RO(target_type); static ssize_t emit_target_list(struct cxl_decoder *cxld, char *buf) { + struct cxl_decoder_targets *t = cxl_get_decoder_targets(cxld); ssize_t offset = 0; int i, rc = 0; for (i = 0; i < cxld->interleave_ways; i++) { - struct cxl_dport *dport = cxld->target[i]; + struct cxl_dport *dport = t->target[i]; struct cxl_dport *next = NULL; if (!dport) break; if (i + 1 < cxld->interleave_ways) - next = cxld->target[i + 1]; + next = t->target[i + 1]; rc = sysfs_emit_at(buf, offset, "%d%s", dport->port_id, next ? 
"," : ""); if (rc < 0) @@ -147,14 +148,15 @@ static ssize_t target_list_show(struct device *dev, struct device_attribute *attr, char *buf) { struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_decoder_targets *t = cxl_get_decoder_targets(cxld); ssize_t offset; unsigned int seq; int rc; do { - seq = read_seqbegin(&cxld->target_lock); + seq = read_seqbegin(&t->target_lock); rc = emit_target_list(cxld, buf); - } while (read_seqretry(&cxld->target_lock, seq)); + } while (read_seqretry(&t->target_lock, seq)); if (rc < 0) return rc; @@ -199,23 +201,6 @@ static const struct attribute_group *cxl_decoder_root_attribute_groups[] = { NULL, }; -static struct attribute *cxl_decoder_switch_attrs[] = { - &dev_attr_target_type.attr, - &dev_attr_target_list.attr, - NULL, -}; - -static struct attribute_group cxl_decoder_switch_attribute_group = { - .attrs = cxl_decoder_switch_attrs, -}; - -static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = { - &cxl_decoder_switch_attribute_group, - &cxl_decoder_base_attribute_group, - &cxl_base_attribute_group, - NULL, -}; - static struct attribute *cxl_decoder_endpoint_attrs[] = { &dev_attr_target_type.attr, NULL, @@ -232,6 +217,12 @@ static const struct attribute_group *cxl_decoder_endpoint_attribute_groups[] = { NULL, }; +static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = { + &cxl_decoder_base_attribute_group, + &cxl_base_attribute_group, + NULL, +}; + static void cxl_decoder_release(struct device *dev) { struct cxl_decoder *cxld = to_cxl_decoder(dev); @@ -264,6 +255,7 @@ bool is_endpoint_decoder(struct device *dev) { return dev->type == &cxl_decoder_endpoint_type; } +EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, CXL); bool is_root_decoder(struct device *dev) { @@ -1136,6 +1128,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_find_dport_by_dev, CXL); static int decoder_populate_targets(struct cxl_decoder *cxld, struct cxl_port *port, int *target_map) { + struct cxl_decoder_targets *t = 
cxl_get_decoder_targets(cxld); int i, rc = 0; if (!target_map) @@ -1146,21 +1139,72 @@ static int decoder_populate_targets(struct cxl_decoder *cxld, if (list_empty(&port->dports)) return -EINVAL; - write_seqlock(&cxld->target_lock); - for (i = 0; i < cxld->nr_targets; i++) { + write_seqlock(&t->target_lock); + for (i = 0; i < t->nr_targets; i++) { struct cxl_dport *dport = find_dport(port, target_map[i]); if (!dport) { rc = -ENXIO; break; } - cxld->target[i] = dport; + t->target[i] = dport; } - write_sequnlock(&cxld->target_lock); + write_sequnlock(&t->target_lock); return rc; } +static struct cxl_decoder *__cxl_decoder_alloc(struct cxl_port *port, + unsigned int nr_targets) +{ + struct cxl_decoder *cxld; + + if (is_cxl_endpoint(port)) { + struct cxl_endpoint_decoder *cxled; + + cxled = kzalloc(sizeof(*cxled), GFP_KERNEL); + if (!cxled) + return NULL; + cxld = &cxled->base; + } else if (is_cxl_root(port)) { + struct cxl_root_decoder *cxlrd; + + cxlrd = kzalloc(sizeof(*cxlrd), GFP_KERNEL); + if (!cxlrd) + return NULL; + + cxlrd->targets = + kzalloc(struct_size(cxlrd->targets, target, nr_targets), + GFP_KERNEL); + if (!cxlrd->targets) { + kfree(cxlrd); + return NULL; + } + cxlrd->targets->nr_targets = nr_targets; + seqlock_init(&cxlrd->targets->target_lock); + cxld = &cxlrd->base; + } else { + struct cxl_switch_decoder *cxlsd; + + cxlsd = kzalloc(sizeof(*cxlsd), GFP_KERNEL); + if (!cxlsd) + return NULL; + + cxlsd->targets = + kzalloc(struct_size(cxlsd->targets, target, nr_targets), + GFP_KERNEL); + if (!cxlsd->targets) { + kfree(cxlsd); + return NULL; + } + cxlsd->targets->nr_targets = nr_targets; + seqlock_init(&cxlsd->targets->target_lock); + cxld = &cxlsd->base; + } + + return cxld; +} + /** * cxl_decoder_alloc - Allocate a new CXL decoder * @port: owning port of this decoder @@ -1186,7 +1230,7 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, if (nr_targets > CXL_DECODER_MAX_INTERLEAVE) return ERR_PTR(-EINVAL); - cxld = 
kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL); + cxld = __cxl_decoder_alloc(port, nr_targets); if (!cxld) return ERR_PTR(-ENOMEM); @@ -1198,8 +1242,6 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, get_device(&port->dev); cxld->id = rc; - cxld->nr_targets = nr_targets; - seqlock_init(&cxld->target_lock); dev = &cxld->dev; device_initialize(dev); device_set_pm_not_required(dev); @@ -1274,12 +1316,12 @@ EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL); * * Return: A new cxl decoder to be registered by cxl_decoder_add() */ -struct cxl_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port) +struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port) { if (!is_cxl_endpoint(port)) return ERR_PTR(-EINVAL); - return cxl_decoder_alloc(port, 0); + return to_cxl_endpoint_decoder(cxl_decoder_alloc(port, 0)); } EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 6517d5cdf5ee..85fd5e84f978 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -193,6 +193,18 @@ enum cxl_decoder_type { */ #define CXL_DECODER_MAX_INTERLEAVE 16 +/** + * struct cxl_decoder_targets - Target information for root and switch decoders. 
+ * @target_lock: coordinate coherent reads of the target list + * @nr_targets: number of elements in @target + * @target: active ordered target list in current decoder configuration + */ +struct cxl_decoder_targets { + seqlock_t target_lock; + int nr_targets; + struct cxl_dport *target[]; +}; + /** * struct cxl_decoder - CXL address range decode configuration * @dev: this decoder's device @@ -202,9 +214,6 @@ enum cxl_decoder_type { * @interleave_granularity: data stride per dport * @target_type: accelerator vs expander (type2 vs type3) selector * @flags: memory type capabilities and locking - * @target_lock: coordinate coherent reads of the target list - * @nr_targets: number of elements in @target - * @target: active ordered target list in current decoder configuration */ struct cxl_decoder { struct device dev; @@ -214,11 +223,46 @@ struct cxl_decoder { int interleave_granularity; enum cxl_decoder_type target_type; unsigned long flags; - seqlock_t target_lock; - int nr_targets; - struct cxl_dport *target[]; }; +/** + * struct cxl_endpoint_decoder - A decoder residing in a CXL endpoint. + * @base: Base class decoder + */ +struct cxl_endpoint_decoder { + struct cxl_decoder base; +}; + +/** + * struct cxl_switch_decoder - A decoder in a switch or hostbridge. + * @base: Base class decoder + * @targets: Downstream targets for this switch. + */ +struct cxl_switch_decoder { + struct cxl_decoder base; + struct cxl_decoder_targets *targets; +}; + +/** + * struct cxl_root_decoder - A toplevel/platform decoder + * @base: Base class decoder + * @targets: Downstream targets (i.e. hostbridges). 
+ */ +struct cxl_root_decoder { + struct cxl_decoder base; + struct cxl_decoder_targets *targets; +}; + +#define _to_cxl_decoder(x) \ + static inline struct cxl_##x##_decoder *to_cxl_##x##_decoder( \ + struct cxl_decoder *cxld) \ + { \ + return container_of(cxld, struct cxl_##x##_decoder, base); \ + } + +_to_cxl_decoder(root) +_to_cxl_decoder(switch) +_to_cxl_decoder(endpoint) /** * enum cxl_nvdimm_brige_state - state machine for managing bus rescans @@ -343,11 +387,22 @@ struct cxl_decoder *cxl_root_decoder_alloc(struct cxl_port *port, struct cxl_decoder *cxl_switch_decoder_alloc(struct cxl_port *port, unsigned int nr_targets); int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map); -struct cxl_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port); +struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port); int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map); int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld); int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint); +static inline struct cxl_decoder_targets * +cxl_get_decoder_targets(struct cxl_decoder *cxld) +{ + if (is_root_decoder(&cxld->dev)) + return to_cxl_root_decoder(cxld)->targets; + else if (is_endpoint_decoder(&cxld->dev)) + return NULL; + else + return to_cxl_switch_decoder(cxld)->targets; +} + struct cxl_hdm; struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port); int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm); diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index 431f2bddf6c8..0534d96486eb 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -454,7 +454,7 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) if (target_count) cxld = cxl_switch_decoder_alloc(port, target_count); else - cxld = cxl_endpoint_decoder_alloc(port); + cxld = &cxl_endpoint_decoder_alloc(port)->base; if (IS_ERR(cxld)) { dev_warn(&port->dev, "Failed to 
allocate the decoder\n"); From patchwork Wed Apr 13 18:37:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812382 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 50A35320B; Wed, 13 Apr 2022 18:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875097; x=1681411097; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ueQH3WwkSvY8y69EXKcXNLoaONnTPsnAVGBYWnEIn+E=; b=DdG60cki6n9rlDEuEN92LDlNMMBQKC3aciNSB9A+HI2bPbDCBp0qBxel YyTkEbTJobpW+ttMvN+FxlWEvumtDzPtipuxtqLL37mQnazTGEB783xMb rLpPm5/VM5jP1pGDIGEDoahWmpVrMjnjSVxuJuFWglna1TFpccoWklwC8 3xoF/jqoHgGrbne0LTplVbkaZQAnXIXUl1jksYF0vBZCvMYszV5MTEPUd PocrMdQKZ+lFPFlPWZIOs0V6Hq8ofdNcRykw85edMogr4VnMIdFEf5/CY xuHBCHUenaS2GXZkSn+YPVC0lQdKXDNvEvIcyCF96T6hoOcGZWpzMeRzi g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631844" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631844" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:49 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013586" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:48 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Dan Williams , Alison Schofield , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 05/15] cxl/acpi: Reserve CXL resources from request_free_mem_region Date: Wed, 13 Apr 2022 11:37:10 
-0700 Message-Id: <20220413183720.2444089-6-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Define an API which allows CXL drivers to manage CXL address space. CXL is unique in that the address space and various properties are only known after CXL drivers come up, and therefore cannot be part of core memory enumeration. Compute Express Link 2.0 [ECN] defines a concept called CXL Fixed Memory Window Structures (CFMWS). Each CFMWS conveys a region of host physical address (HPA) space which has certain properties that are familiar to CXL, mainly interleave properties, and restrictions, such as persistence. The HPA ranges therefore should be owned, or at least guided, by the relevant CXL driver, cxl_acpi [1]. It would be desirable to simply insert this address space into iomem_resource with a new flag to denote this is CXL memory. This would permit request_free_mem_region() to be reused for CXL memory provided it learned some new tricks. For that, it is tempting to simply use insert_resource(). The API was designed specifically for cases where new devices may offer new address space. This cannot work in the general case. Boot firmware can pass some, none, or all of the CFMWS range as various types of memory to the kernel, and this may be left alone, merged, or even expanded. As a result, iomem_resource may intersect CFMWS regions in ways insert_resource() cannot handle [2]. Similar reasoning applies to allocate_resource(). With the insert_resource() option out, the only reasonable approach left is to let the CXL driver manage the address space independently of iomem_resource and attempt to prevent users of device private memory APIs from using CXL memory. 
In the case where cxl_acpi comes up first, the new API allows cxl to block use of any CFMWS-defined address space by assuming everything above the highest CFMWS entry is fair game. It is expected that this effectively will prevent usage of device private memory, but if such behavior is undesired, cxl_acpi can be blocked from loading, or unloaded. When device private memory is used before CXL comes up, or there are intersections as described above, the CXL driver will have to make sure to not reuse sysram that is BUSY. [1]: The specification defines enumeration via ACPI, however, one could envision devicetree, or some other hardcoded mechanisms for doing the same thing. [2]: A common way to hit this case is when BIOS creates a volatile region with extra space for hotplug. In this case, you're likely to have
|<--------------HPA space---------------------->|
|<---iomem_resource -->|
| DDR  | CXL Volatile  |
|      | CFMWS for volatile w/ hotplug          |
Suggested-by: Dan Williams Signed-off-by: Ben Widawsky --- drivers/cxl/acpi.c | 26 ++++++++++++++++++++++++++ include/linux/ioport.h | 1 + kernel/resource.c | 11 ++++++++++- 3 files changed, 37 insertions(+), 1 deletion(-) diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 9b69955b90cb..0870904fe4b5 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -76,6 +76,7 @@ static int cxl_acpi_cfmws_verify(struct device *dev, struct cxl_cfmws_context { struct device *dev; struct cxl_port *root_port; + struct acpi_cedt_cfmws *high_cfmws; }; static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, @@ -126,6 +127,14 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, cfmws->base_hpa + cfmws->window_size - 1); return 0; } + + if (ctx->high_cfmws) { + if (cfmws->base_hpa > ctx->high_cfmws->base_hpa) + ctx->high_cfmws = cfmws; + } else { + ctx->high_cfmws = cfmws; + } + dev_dbg(dev, "add: %s node: %d range %#llx-%#llx\n", dev_name(&cxld->dev), phys_to_target_node(cxld->range.start), 
cfmws->base_hpa, cfmws->base_hpa + cfmws->window_size - 1); @@ -299,6 +308,7 @@ static int cxl_acpi_probe(struct platform_device *pdev) ctx = (struct cxl_cfmws_context) { .dev = host, .root_port = root_port, + .high_cfmws = NULL, }; acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, cxl_parse_cfmws, &ctx); @@ -317,10 +327,25 @@ static int cxl_acpi_probe(struct platform_device *pdev) if (rc < 0) return rc; + if (ctx.high_cfmws) { + resource_size_t end = + ctx.high_cfmws->base_hpa + ctx.high_cfmws->window_size; + dev_dbg(host, + "Disabling free device private regions below %#llx\n", + end); + set_request_free_min_base(end); + } + /* In case PCI is scanned before ACPI re-trigger memdev attach */ return cxl_bus_rescan(); } +static int cxl_acpi_remove(struct platform_device *pdev) +{ + set_request_free_min_base(0); + return 0; +} + static const struct acpi_device_id cxl_acpi_ids[] = { { "ACPI0017" }, { }, @@ -329,6 +354,7 @@ MODULE_DEVICE_TABLE(acpi, cxl_acpi_ids); static struct platform_driver cxl_acpi_driver = { .probe = cxl_acpi_probe, + .remove = cxl_acpi_remove, .driver = { .name = KBUILD_MODNAME, .acpi_match_table = cxl_acpi_ids, diff --git a/include/linux/ioport.h b/include/linux/ioport.h index ec5f71f7135b..dc41e4be5635 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -325,6 +325,7 @@ extern int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end, void *arg, int (*func)(struct resource *, void *)); +void set_request_free_min_base(resource_size_t val); struct resource *devm_request_free_mem_region(struct device *dev, struct resource *base, unsigned long size); struct resource *request_free_mem_region(struct resource *base, diff --git a/kernel/resource.c b/kernel/resource.c index 34eaee179689..a4750689e529 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1774,6 +1774,14 @@ void resource_list_free(struct list_head *head) EXPORT_SYMBOL(resource_list_free); #ifdef CONFIG_DEVICE_PRIVATE +static resource_size_t 
request_free_min_base; + +void set_request_free_min_base(resource_size_t val) +{ + request_free_min_base = val; +} +EXPORT_SYMBOL_GPL(set_request_free_min_base); + static struct resource *__request_free_mem_region(struct device *dev, struct resource *base, unsigned long size, const char *name) { @@ -1799,7 +1807,8 @@ static struct resource *__request_free_mem_region(struct device *dev, } write_lock(&resource_lock); - for (; addr > size && addr >= base->start; addr -= size) { + for (; addr > size && addr >= max(base->start, request_free_min_base); + addr -= size) { if (__region_intersects(addr, size, 0, IORES_DESC_NONE) != REGION_DISJOINT) continue; From patchwork Wed Apr 13 18:37:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812373 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A31452C80; Wed, 13 Apr 2022 18:38:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875092; x=1681411092; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xCFUJR0nc5jPOvEXzPtBuiu7iwfD2xyQ9QD7Fv5gBes=; b=ilMFh4nNhe9o782EUk5hmcJO8x1dz5kJD/sDPJW787WeXYF898h/JBZR gSTn5zVHFPQkwBCRG/IgFKgCNjkz51DtBfdtpFXMAk1lGxC4eBxi9seZT 3lV2V+6e8h3c0J56WfdkEWsUIf5IJLZRnbvpMHHzqx/atzNVtMinAl2R+ sS2yy9uBGdqJ4Scj4c4thrwATLrcYX3y372K1JmRETrgPQlMd/61B+KWj MByLGyCwHgX/kF1euy2/l/FmH3CDawk6gG5ZhZzZdXvcjvSOAJHLDwccz hWwHgmzF0+PpQznnAaWkPbC5DrDBZj+D5YoafMfDgCSCZRfc2u5t3epVY w==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631845" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631845" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:49 -0700
X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013590"
Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:49 -0700
From: Ben Widawsky
To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 06/15] cxl/acpi: Manage root decoder's address space
Date: Wed, 13 Apr 2022 11:37:11 -0700
Message-Id: <20220413183720.2444089-7-ben.widawsky@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com>
References: <20220413183720.2444089-1-ben.widawsky@intel.com>
Precedence: bulk
X-Mailing-List: nvdimm@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Use a gen_pool to manage the physical address space that is routed by the platform decoder (root decoder). As described in 'cxl/acpi: Reserve CXL resources from request_free_mem_region', the address space does not coexist well if part or all of it is conveyed in the memory map to the kernel.

Since the existing resource APIs of interest all rely on the root decoder's address space being in iomem_resource, the choices are to roll a new allocator based on struct resource, or use gen_pool. gen_pool is a good choice because it already has all the capabilities needed to satisfy CXL programming.

Signed-off-by: Ben Widawsky
--- drivers/cxl/acpi.c | 36 ++++++++++++++++++++++++++++++++++++ drivers/cxl/cxl.h | 2 ++ 2 files changed, 38 insertions(+) diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index 0870904fe4b5..a6b0c3181d0e 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright(c) 2021 Intel Corporation. All rights reserved.
*/ #include +#include #include #include #include @@ -79,6 +80,25 @@ struct cxl_cfmws_context { struct acpi_cedt_cfmws *high_cfmws; }; +static int cfmws_cookie; + +static int fill_busy_mem(struct resource *res, void *_window) +{ + struct gen_pool *window = _window; + struct genpool_data_fixed gpdf; + unsigned long addr; + void *type; + + gpdf.offset = res->start; + addr = gen_pool_alloc_algo_owner(window, resource_size(res), + gen_pool_fixed_alloc, &gpdf, &type); + if (addr != res->start || (res->start == 0 && type != &cfmws_cookie)) + return -ENXIO; + + pr_devel("%pR removed from CFMWS\n", res); + return 0; +} + static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, const unsigned long end) { @@ -88,6 +108,8 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, struct device *dev = ctx->dev; struct acpi_cedt_cfmws *cfmws; struct cxl_decoder *cxld; + struct gen_pool *window; + char name[64]; int rc, i; cfmws = (struct acpi_cedt_cfmws *) header; @@ -116,6 +138,20 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws); cxld->interleave_granularity = CFMWS_INTERLEAVE_GRANULARITY(cfmws); + sprintf(name, "cfmws@%#llx", cfmws->base_hpa); + window = devm_gen_pool_create(dev, ilog2(SZ_256M), NUMA_NO_NODE, name); + if (IS_ERR(window)) + return 0; + + gen_pool_add_owner(window, cfmws->base_hpa, -1, cfmws->window_size, + NUMA_NO_NODE, &cfmws_cookie); + + /* Area claimed by other resources, remove those from the gen_pool. 
*/ + walk_iomem_res_desc(IORES_DESC_NONE, 0, cfmws->base_hpa, + cfmws->base_hpa + cfmws->window_size - 1, window, + fill_busy_mem); + to_cxl_root_decoder(cxld)->window = window; + rc = cxl_decoder_add(cxld, target_map); if (rc) put_device(&cxld->dev); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 85fd5e84f978..0e1c65761ead 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -246,10 +246,12 @@ struct cxl_switch_decoder { /** * struct cxl_root_decoder - A toplevel/platform decoder * @base: Base class decoder + * @window: host address space allocator * @targets: Downstream targets (ie. hostbridges). */ struct cxl_root_decoder { struct cxl_decoder base; + struct gen_pool *window; struct cxl_decoder_targets *targets; }; From patchwork Wed Apr 13 18:37:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812376 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 96E2C2F48; Wed, 13 Apr 2022 18:38:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875094; x=1681411094; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9hd4g8zIfuHktQme6cUN0eLi2pbyX+YysGO8+PPESAs=; b=fwe0ryLxOQf7IF9EGJpJaIlm54+QSO5dE7XzWKe+ERawQbKRhAdGjyUD B9iQ/ymxsShpgZ0FBj5uie0GbsVAWoRplseh8rXJmW3zDvbhAaP0csTm7 OD9YLn/ilLEcnM4ixqUbDbwlSJHtb6gjb3Vx6rHl9gEeylrSl80dLPF8q J3ZLG/JxbJboT9T0M1rZfYJKheCjMq+6K8bRW2l7QWB18197dxbc9SoFD KOM/4xoPAXguyosPJnUvLljmU8Pglal8G0FGKeNcIPXZdW6wjMy7kTOzO /W//ck2Lt90+WFx0wyRfW9Iesl9x89iTQGZ9tqyJBO+nup/8VKi5noU82 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631846" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631846" Received: from 
orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:49 -0700
X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013594"
Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:49 -0700
From: Ben Widawsky
To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams, Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 07/15] cxl/port: Surface ram and pmem resources
Date: Wed, 13 Apr 2022 11:37:12 -0700
Message-Id: <20220413183720.2444089-8-ben.widawsky@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com>
References: <20220413183720.2444089-1-ben.widawsky@intel.com>
Precedence: bulk
X-Mailing-List: nvdimm@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

CXL Type 2 and 3 endpoints may contain Host-managed Device Memory (HDM). This memory can be volatile, persistent, or some combination of both. Similar to the root decoder, the port's resources can be considered the memory out of which decoders allocate. Unlike the root decoder resource, device resources are in the device physical address space domain.

The CXL specification mandates a specific partitioning of volatile vs. persistent capacities. While an endpoint may contain one or both capacities, the volatile capacity will always be first. To accommodate this, two parameters are added to port creation: the offset of the split and the total capacity.
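To illustrate the layout described above, the sketch below classifies a device physical address (DPA) from the two values this patch stores on the port. It is an illustration only, not code from the series: `capacity` and `pmem_offset` mirror the new `struct cxl_port` fields, while `enum dpa_kind` and `classify_dpa()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

enum dpa_kind { DPA_INVALID, DPA_VOLATILE, DPA_PERSISTENT };

/*
 * Volatile capacity occupies [0, pmem_offset - 1] and persistent
 * capacity occupies [pmem_offset, capacity - 1].
 */
static enum dpa_kind classify_dpa(uint64_t dpa, uint64_t capacity,
				  uint64_t pmem_offset)
{
	/* The split cannot lie beyond the end of the media. */
	assert(pmem_offset <= capacity);

	if (dpa >= capacity)
		return DPA_INVALID;
	return dpa < pmem_offset ? DPA_VOLATILE : DPA_PERSISTENT;
}
```

For a device with 256 MiB of volatile and 256 MiB of persistent capacity, `create_endpoint()` in this patch would pass a capacity of 512 MiB and a `pmem_offset` of 256 MiB. A pmem-only device has a `pmem_offset` of 0, so every valid DPA is persistent.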
Signed-off-by: Ben Widawsky --- drivers/cxl/core/port.c | 19 +++++++++++++++++++ drivers/cxl/cxl.h | 11 +++++++++++ drivers/cxl/mem.c | 7 +++++-- 3 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 8dd29c97e318..0d946711685b 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -2,6 +2,7 @@ /* Copyright(c) 2020 Intel Corporation. All rights reserved. */ #include #include +#include #include #include #include @@ -469,6 +470,24 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport, } EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL); +struct cxl_port *devm_cxl_add_endpoint_port(struct device *host, + struct device *uport, + resource_size_t component_reg_phys, + u64 capacity, u64 pmem_offset, + struct cxl_port *parent_port) +{ + struct cxl_port *ep = + devm_cxl_add_port(host, uport, component_reg_phys, parent_port); + if (IS_ERR(ep) || !capacity) + return ep; + + ep->capacity = capacity; + ep->pmem_offset = pmem_offset; + + return ep; +} +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_endpoint_port, CXL); + struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port) { /* There is no pci_bus associated with a CXL platform-root port */ diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 0e1c65761ead..52295548a071 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -309,6 +309,9 @@ struct cxl_nvdimm { * @component_reg_phys: component register capability base address (optional) * @dead: last ep has been removed, force port re-creation * @depth: How deep this port is relative to the root. depth 0 is the root. + * @capacity: How much total storage the media can hold (endpoint only) + * @pmem_offset: Partition dividing volatile, [0, pmem_offset -1 ], and persistent + * [pmem_offset, capacity - 1] addresses. 
*/ struct cxl_port { struct device dev; @@ -320,6 +323,9 @@ struct cxl_port { resource_size_t component_reg_phys; bool dead; unsigned int depth; + + u64 capacity; + u64 pmem_offset; }; /** @@ -368,6 +374,11 @@ struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port); struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport, resource_size_t component_reg_phys, struct cxl_port *parent_port); +struct cxl_port *devm_cxl_add_endpoint_port(struct device *host, + struct device *uport, + resource_size_t component_reg_phys, + u64 capacity, u64 pmem_offset, + struct cxl_port *parent_port); struct cxl_port *find_cxl_root(struct device *dev); int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd); int cxl_bus_rescan(void); diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c index 49a4b1c47299..b27ce13c1872 100644 --- a/drivers/cxl/mem.c +++ b/drivers/cxl/mem.c @@ -50,9 +50,12 @@ static int create_endpoint(struct cxl_memdev *cxlmd, { struct cxl_dev_state *cxlds = cxlmd->cxlds; struct cxl_port *endpoint; + u64 partition = range_len(&cxlds->ram_range); + u64 size = range_len(&cxlds->ram_range) + range_len(&cxlds->pmem_range); - endpoint = devm_cxl_add_port(&parent_port->dev, &cxlmd->dev, - cxlds->component_reg_phys, parent_port); + endpoint = devm_cxl_add_endpoint_port(&parent_port->dev, &cxlmd->dev, + cxlds->component_reg_phys, size, + partition, parent_port); if (IS_ERR(endpoint)) return PTR_ERR(endpoint); From patchwork Wed Apr 13 18:37:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812380 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32AAD2F43; Wed, 13 Apr 2022 18:38:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; 
s=Intel; t=1649875096; x=1681411096; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pRLZars87pwyZ+h8xfv4rMp3ejVZKIggcp6TbwcwTjk=; b=Lvl6lUESA0bx1p7Qby2tzZcGnNtWVVIAm3Sroy7SqMRfaIxl8oFeHPln rONrKWj6ltEX7mOJup/lHEsWR6EWki4YHjTO5dA2SiHoSuwWGDnVFx4YA VFbE37ncDAKe0EbVjkboxwSNDScfgqXUaCB0KUGooLu3+HnTFa7Q3VTQR AtuzbYwKd6vHj502z4EXQgCpMTo+To0uQZ8AAAexp8ax3TcM1sHN56Z6t oHnws/e49X/0PSBjTg1CUsujTK5wdwzBa5P7bH3VTNhcHetHXK+WtUVvI MmaPHcyFI9WlWJQpcWyEltBXFFKrxaKZ2LyfAWJWAorw7R2CQS31MRaJN Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631847" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631847" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:50 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013599" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:49 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 08/15] cxl/core/hdm: Allocate resources from the media Date: Wed, 13 Apr 2022 11:37:13 -0700 Message-Id: <20220413183720.2444089-9-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Similar to how decoders consume address space for the root decoder, they also consume space on the device's physical media. For future allocations, it's required to mark those as used/busy. 
The CXL specification requires that HDM decoders are programmed in ascending physical address order. The device's address space can therefore be managed by a simple allocator. Fragmentation may occur if devices are taken in and out of active decoding. Fixing this is left to userspace.

Signed-off-by: Ben Widawsky
--- drivers/cxl/core/core.h | 3 +++ drivers/cxl/core/hdm.c | 26 +++++++++++++++++++++++++- drivers/cxl/core/port.c | 9 ++++++++- drivers/cxl/cxl.h | 10 ++++++++++ 4 files changed, 46 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 1a50c0fc399c..a507a2502127 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -9,6 +9,9 @@ extern const struct device_type cxl_nvdimm_type; extern struct attribute_group cxl_base_attribute_group; +extern struct device_attribute dev_attr_create_pmem_region; +extern struct device_attribute dev_attr_delete_region; + struct cxl_send_command; struct cxl_mem_query_commands; int cxl_query_cmd(struct cxl_memdev *cxlmd, diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 37c09c77e9a7..5326a2cd6968 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright(c) 2022 Intel Corporation. All rights reserved.
*/ #include +#include #include #include @@ -198,8 +199,11 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, else cxld->target_type = CXL_DECODER_ACCELERATOR; - if (is_endpoint_decoder(&cxld->dev)) + if (is_endpoint_decoder(&cxld->dev)) { + to_cxl_endpoint_decoder(cxld)->skip = + ioread64_hi_lo(hdm + CXL_HDM_DECODER0_TL_LOW(which)); return 0; + } target_list.value = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_TL_LOW(which)); @@ -218,6 +222,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) void __iomem *hdm = cxlhdm->regs.hdm_decoder; struct cxl_port *port = cxlhdm->port; int i, committed, failed; + u64 base = 0; u32 ctrl; /* @@ -240,6 +245,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) for (i = 0, failed = 0; i < cxlhdm->decoder_count; i++) { int target_map[CXL_DECODER_MAX_INTERLEAVE] = { 0 }; int rc, target_count = cxlhdm->target_count; + struct cxl_endpoint_decoder *cxled; struct cxl_decoder *cxld; if (is_cxl_endpoint(port)) @@ -267,6 +273,24 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm) "Failed to add decoder to port\n"); return rc; } + + if (!is_cxl_endpoint(port)) + continue; + + cxled = to_cxl_endpoint_decoder(cxld); + cxled->drange = (struct range) { + .start = base, + .end = base + range_len(&cxld->range) - 1, + }; + + if (!range_len(&cxld->range)) + continue; + + dev_dbg(&cxld->dev, + "Enumerated decoder with DPA range %#llx-%#llx\n", base, + base + range_len(&cxled->drange)); + base += cxled->skip + range_len(&cxld->range); + port->last_cxled = cxled; } if (failed == cxlhdm->decoder_count) { diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 0d946711685b..9ef8d69dbfa5 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -84,7 +84,14 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr, { struct cxl_decoder *cxld = to_cxl_decoder(dev); - return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range)); + if (is_endpoint_decoder(dev)) { + 
struct cxl_endpoint_decoder *cxled; + + cxled = to_cxl_endpoint_decoder(cxld); + return sysfs_emit(buf, "%#llx\n", range_len(&cxled->drange)); + } else { + return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range)); + } } static DEVICE_ATTR_RO(size); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 52295548a071..33f8a55f2f84 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -228,9 +228,13 @@ struct cxl_decoder { /** * struct cxl_endpoint_decoder - An decoder residing in a CXL endpoint. * @base: Base class decoder + * @drange: Device physical address space this decoder is using + * @skip: The skip count as specified in the CXL specification. */ struct cxl_endpoint_decoder { struct cxl_decoder base; + struct range drange; + u64 skip; }; /** @@ -248,11 +252,15 @@ struct cxl_switch_decoder { * @base: Base class decoder * @window: host address space allocator * @targets: Downstream targets (ie. hostbridges). + * @next_region_id: The pre-cached next region id. + * @id_lock: Protects next_region_id */ struct cxl_root_decoder { struct cxl_decoder base; struct gen_pool *window; struct cxl_decoder_targets *targets; + int next_region_id; + struct mutex id_lock; /* synchronizes access to next_region_id */ }; #define _to_cxl_decoder(x) \ @@ -312,6 +320,7 @@ struct cxl_nvdimm { * @capacity: How much total storage the media can hold (endpoint only) * @pmem_offset: Partition dividing volatile, [0, pmem_offset -1 ], and persistent * [pmem_offset, capacity - 1] addresses. 
+ * @last_cxled: Last active decoder doing decode (endpoint only) */ struct cxl_port { struct device dev; @@ -326,6 +335,7 @@ struct cxl_port { u64 capacity; u64 pmem_offset; + struct cxl_endpoint_decoder *last_cxled; }; /** From patchwork Wed Apr 13 18:37:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812379 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F2872F3B; Wed, 13 Apr 2022 18:38:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875095; x=1681411095; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pGYR65MFaiiuy5Lo24QFC6NqUiqMmD6aLbcUy9Q7Rcg=; b=Jg8tWTZAy5cXv487f70A0d5clX2AdyAOM4AYo5W5R4Lus19O2ETxeMn3 5r/oDWz6FPHLkzl/3Z/nlCKfjSe1buxcvV/Nrfk6Pn2BwMi5LUl5TXbnr TVCLLGlhly3wEVvnpFRzUV6qugLYmD6QIBp3Px/kLAxeK6i9X527srtPo j/wJ9WbtKy8IaZv04AiCiENbw7/kb74huzNN/uYaXMNBPzCWB16YuUBSz SHJ9jaaGEuQp+WFWF0NRustC9JQxtM8CaRbDc57ql90KceA7C9IVbKtqy aOUrSTR4A2go4h+DJIm9zfeHr2Okg6ddWecHzeFQtScso0mBejzh3II0g g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631851" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631851" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:50 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013608" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:50 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky 
, Alison Schofield, Dan Williams, Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 09/15] cxl/core/port: Add attrs for size and volatility
Date: Wed, 13 Apr 2022 11:37:14 -0700
Message-Id: <20220413183720.2444089-10-ben.widawsky@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com>
References: <20220413183720.2444089-1-ben.widawsky@intel.com>
Precedence: bulk
X-Mailing-List: nvdimm@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Endpoint decoders are unique among decoders in two ways: their range is constrained by the media they are a part of, and they have a concrete need to disambiguate between volatile and persistent capacity (due to partitioning). As part of region programming, these decoders must be pre-configured, i.e., have their size and volatility set.

Endpoint decoders must consider two different address spaces for address allocation. Sysram will need to be mapped for use of this memory if not set up in the EFI memory map. Additionally, the CXL device itself has its own address space domain, which requires allocation and management. Device address space is managed with a simple allocator, and host physical address space is managed by the region driver/core.
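The "simple allocator" for device address space can be pictured as a bump allocator: each new reservation starts where the previous one ends, and fails once the media capacity would be exceeded. The following is a userspace approximation under stated assumptions, not the kernel code: `struct media`, `media_alloc()` and `ALLOC_GRANULE` are hypothetical names, and the 256 MiB granule is taken from the `SZ_256M` check in `size_store()`. Freeing a reservation (writing a size of 0 in the patch) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define ALLOC_GRANULE (256ULL << 20)	/* 256 MiB, as in the size_store() check */

/* Hypothetical model of one endpoint's media. */
struct media {
	uint64_t capacity;	/* total bytes of device address space */
	uint64_t next_free;	/* first DPA not yet claimed by a decoder */
};

/*
 * Reserve 'size' bytes of device physical address space.
 * On success returns 0 and stores the inclusive range in start/end.
 */
static int media_alloc(struct media *m, uint64_t size, uint64_t *start,
		       uint64_t *end)
{
	/* Invariant of the sketch: never hand out more than the media has. */
	assert(m->next_free <= m->capacity);

	if (size == 0 || size % ALLOC_GRANULE)
		return -EINVAL;
	if (size > m->capacity - m->next_free)
		return -ENOSPC;

	*start = m->next_free;
	*end = m->next_free + size - 1;
	m->next_free += size;
	return 0;
}
```

Because ranges are handed out strictly in ascending order, a single cursor is enough here; the price is the fragmentation that results when a reservation other than the last one is released.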
/sys/bus/cxl/devices/decoder3.0
├── devtype
├── interleave_granularity
├── interleave_ways
├── locked
├── modalias
├── size
├── start
├── subsystem -> ../../../../../../../bus/cxl
├── target_type
├── uevent
└── volatile

Signed-off-by: Ben Widawsky
--- Documentation/ABI/testing/sysfs-bus-cxl | 13 ++- drivers/cxl/Kconfig | 3 +- drivers/cxl/core/port.c | 145 +++++++++++++++++++++++- drivers/cxl/cxl.h | 6 + 4 files changed, 163 insertions(+), 4 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 7c2b846521f3..01fee09b8473 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -117,7 +117,9 @@ Description: range is fixed. For decoders of devtype "cxl_decoder_switch" the address is bounded by the decode range of the cxl_port ancestor of the decoder's cxl_port, and dynamically updates based on the - active memory regions in that address space. + active memory regions in that address space. For decoders of + devtype "cxl_decoder_endpoint", size is a mutable value which + carves our space from the physical media. What: /sys/bus/cxl/devices/decoderX.Y/locked Date: June, 2021 @@ -163,3 +165,12 @@ Description: memory (type-3). The 'target_type' attribute indicates the current setting which may dynamically change based on what memory regions are activated in this decode hierarchy. + +What: /sys/bus/cxl/devices/decoderX.Y/volatile +Date: March, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Provide a knob to set/get whether the desired media is volatile + or persistent. This applies only to decoders of devtype + "cxl_decoder_endpoint", diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index b88ab956bb7c..8796fd4b22bc 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -95,7 +95,8 @@ config CXL_MEM If unsure say 'm'.
config CXL_PORT - default CXL_BUS tristate + default CXL_BUS + select DEVICE_PRIVATE endif diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 9ef8d69dbfa5..bdafdec80d98 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -4,6 +4,7 @@ #include #include #include +#include #include #include #include @@ -80,7 +81,7 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr, static DEVICE_ATTR_ADMIN_RO(start); static ssize_t size_show(struct device *dev, struct device_attribute *attr, - char *buf) + char *buf) { struct cxl_decoder *cxld = to_cxl_decoder(dev); @@ -93,7 +94,144 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr, return sysfs_emit(buf, "%#llx\n", range_len(&cxld->range)); } } -static DEVICE_ATTR_RO(size); + +static struct cxl_endpoint_decoder * +get_prev_decoder(struct cxl_endpoint_decoder *cxled) +{ + struct cxl_port *port = to_cxl_port(cxled->base.dev.parent); + struct device *cxldd; + char *name; + + if (cxled->base.id == 0) + return NULL; + + name = kasprintf(GFP_KERNEL, "decoder%u.%u", port->id, cxled->base.id); + if (!name) + return ERR_PTR(-ENOMEM); + + cxldd = device_find_child_by_name(&port->dev, name); + kfree(name); + if (cxldd) { + struct cxl_decoder *cxld = to_cxl_decoder(cxldd); + + if (dev_WARN_ONCE(&port->dev, + (cxld->flags & CXL_DECODER_F_ENABLE) == 0, + "%s should be enabled\n", + dev_name(&cxld->dev))) + return NULL; + return to_cxl_endpoint_decoder(cxld); + } + + return NULL; +} + +static ssize_t size_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(cxld); + struct cxl_port *port = to_cxl_port(cxled->base.dev.parent); + struct cxl_endpoint_decoder *prev = get_prev_decoder(cxled); + u64 size, dpa_base = 0; + int rc; + + rc = kstrtou64(buf, 0, &size); + if (rc) + return rc; + + if (size % SZ_256M) + 
return -EINVAL; + + rc = mutex_lock_interruptible(&cxled->res_lock); + if (rc) + return rc; + + /* No change */ + if (range_len(&cxled->drange) == size) + goto out; + + rc = mutex_lock_interruptible(&port->media_lock); + if (rc) + goto out; + + /* Extent was previously set */ + if (port->last_cxled == cxled) { + if (size == range_len(&cxled->drange)) { + mutex_unlock(&port->media_lock); + goto out; + } + + if (!size) { + dev_dbg(dev, + "freeing previous reservation %#llx-%#llx\n", + cxled->drange.start, cxled->drange.end); + port->last_cxled = prev; + mutex_unlock(&port->media_lock); + goto out; + } + } + + if (prev) + dpa_base = port->last_cxled->drange.end + 1; + + if ((dpa_base + size) > port->capacity) + rc = -ENOSPC; + else + port->last_cxled = cxled; + + mutex_unlock(&port->media_lock); + if (rc) + goto out; + + cxled->drange = (struct range) { + .start = dpa_base, + .end = dpa_base + size - 1 + }; + + dev_dbg(dev, "Allocated %#llx-%#llx from media\n", cxled->drange.start, + cxled->drange.end); + +out: + mutex_unlock(&cxled->res_lock); + return rc ? 
rc : len; +} +static DEVICE_ATTR_RW(size); + +static ssize_t volatile_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(cxld); + + return sysfs_emit(buf, "%u\n", cxled->volatil); +} + +static ssize_t volatile_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(cxld); + bool p; + int rc; + + rc = kstrtobool(buf, &p); + if (rc) + return rc; + + rc = mutex_lock_interruptible(&cxled->res_lock); + if (rc) + return rc; + + if (range_len(&cxled->drange) > 0) + rc = -EBUSY; + mutex_unlock(&cxled->res_lock); + if (rc) + return rc; + + cxled->volatil = p; + return len; +} +static DEVICE_ATTR_RW(volatile); #define CXL_DECODER_FLAG_ATTR(name, flag) \ static ssize_t name##_show(struct device *dev, \ @@ -211,6 +349,7 @@ static const struct attribute_group *cxl_decoder_root_attribute_groups[] = { static struct attribute *cxl_decoder_endpoint_attrs[] = { &dev_attr_target_type.attr, + &dev_attr_volatile.attr, NULL, }; @@ -413,6 +552,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport, ida_init(&port->decoder_ida); INIT_LIST_HEAD(&port->dports); INIT_LIST_HEAD(&port->endpoints); + mutex_init(&port->media_lock); device_initialize(dev); device_set_pm_not_required(dev); @@ -1191,6 +1331,7 @@ static struct cxl_decoder *__cxl_decoder_alloc(struct cxl_port *port, cxled = kzalloc(sizeof(*cxled), GFP_KERNEL); if (!cxled) return NULL; + mutex_init(&cxled->res_lock); cxld = &cxled->base; } else if (is_cxl_root(port)) { struct cxl_root_decoder *cxlrd; diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 33f8a55f2f84..07df13f05d3d 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -230,11 +230,15 @@ struct cxl_decoder { * @base: Base class decoder * @drange: Device physical address space 
this decoder is using * @skip: The skip count as specified in the CXL specification. + * @res_lock: Synchronize device's resource usage + * @volatil: Configuration param. Decoder target is non-persistent mem */ struct cxl_endpoint_decoder { struct cxl_decoder base; struct range drange; u64 skip; + struct mutex res_lock; /* sync access to decoder's resource */ + bool volatil; }; /** @@ -321,6 +325,7 @@ struct cxl_nvdimm { * @pmem_offset: Partition dividing volatile, [0, pmem_offset -1 ], and persistent * [pmem_offset, capacity - 1] addresses. * @last_cxled: Last active decoder doing decode (endpoint only) + * @media_lock: Synchronizes use of allocation of media (endpoint only) */ struct cxl_port { struct device dev; @@ -336,6 +341,7 @@ struct cxl_port { u64 capacity; u64 pmem_offset; struct cxl_endpoint_decoder *last_cxled; + struct mutex media_lock; /* sync access to media allocator */ }; /** From patchwork Wed Apr 13 18:37:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812378 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF5792F56; Wed, 13 Apr 2022 18:38:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875095; x=1681411095; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZwNEpFjzgDKXYjZD1HIgwd1pFDMWLxI8kgblOJUA2nE=; b=Z7sc8AmTEgkbybBOprvHx5B5aHdD+2Ft9V2GRErVD+YqvafMAPxuJM1+ kIELsJDxZiiRHynZby4k63HTm+m/vdPvdlJBgGn45Kh+8rthcNZn6G4Q5 bIfosuvSICeHAAf3RKJEeM68XeIjsdmU96UPNitIG99d9TsDS/+IYodwl CS5NyxalMOC/Cood999UHhgg4DJY/VoW+uTlNsWltf9mqeoZayzfpZtwZ 2LM+EEL+t8/gdQOh8hdlOtTBOJuT3evhzBLAaQysoRSNxkpwjQlxmJBpR sGTUdfrnQBYWgBQhD87iSh3hT0sQS28p1knlEfg3lKK8XVnZ5ooa8lrEi 
g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631854" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631854" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:50 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013614" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:50 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 10/15] cxl/core: Extract IW/IG decoding Date: Wed, 13 Apr 2022 11:37:15 -0700 Message-Id: <20220413183720.2444089-11-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Interleave granularity and ways have specification defined encodings. Extracting this functionality into the common header file allows other consumers to make use of it. 
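As a rough illustration of the encodings being extracted, the standalone userspace sketch below mirrors the two helpers this patch adds to cxl.h. The `sketch_` names are hypothetical, and the if/else chain stands in for the GNU case-range syntax used in the kernel version; it is not the driver code itself.

```c
#include <stdint.h>

/*
 * Hypothetical userspace mirror of cxl_to_interleave_granularity():
 * an encoded value of 0 means 256 bytes and each step doubles it.
 */
static inline int sketch_to_interleave_granularity(uint16_t ig)
{
	return 256 << ig;
}

/*
 * Hypothetical userspace mirror of cxl_to_interleave_ways():
 * encodings 0..4 select 1, 2, 4, 8, 16 ways, encodings 8..10 select
 * 3, 6, 12 ways, and anything else is reported as 0.
 */
static inline int sketch_to_interleave_ways(uint8_t eniw)
{
	if (eniw <= 4)
		return 1 << eniw;
	if (eniw >= 8 && eniw <= 10)
		return 3 << (eniw - 8);
	return 0;
}
```

With these, an encoded granularity of 2 decodes to 1024 bytes and an encoded ways value of 9 decodes to 6 ways. An unlisted encoding such as 5 yields 0, which callers would presumably need to treat as an error.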
Signed-off-by: Ben Widawsky --- drivers/cxl/core/hdm.c | 11 ++--------- drivers/cxl/cxl.h | 17 +++++++++++++++++ 2 files changed, 19 insertions(+), 9 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 5326a2cd6968..b4b65aa55bd2 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -134,21 +134,14 @@ static int to_interleave_granularity(u32 ctrl) { int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IG_MASK, ctrl); - return 256 << val; + return cxl_to_interleave_granularity(val); } static int to_interleave_ways(u32 ctrl) { int val = FIELD_GET(CXL_HDM_DECODER0_CTRL_IW_MASK, ctrl); - switch (val) { - case 0 ... 4: - return 1 << val; - case 8 ... 10: - return 3 << (val - 8); - default: - return 0; - } + return cxl_to_interleave_ways(val); } static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld, diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 07df13f05d3d..0586c3d4592c 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -64,6 +64,23 @@ static inline int cxl_hdm_decoder_count(u32 cap_hdr) return val ? val * 2 : 1; } +static inline int cxl_to_interleave_granularity(u16 ig) +{ + return 256 << ig; +} + +static inline int cxl_to_interleave_ways(u8 eniw) +{ + switch (eniw) { + case 0 ... 4: + return 1 << eniw; + case 8 ... 
10: + return 3 << (eniw - 8); + default: + return 0; + } +} + /* CXL 2.0 8.2.8.1 Device Capabilities Array Register */ #define CXLDEV_CAP_ARRAY_OFFSET 0x0 #define CXLDEV_CAP_ARRAY_CAP_ID 0 From patchwork Wed Apr 13 18:37:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812381 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 39F963205; Wed, 13 Apr 2022 18:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875097; x=1681411097; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rab7+/1vK899gfd3hJN113ssNf+C+OwnTeELqFJrFc0=; b=ZAd/9LRaMw9MIQpk9TTE50O+RH83lj4CW24vkMQKYDU2UyFuM6vFwiYl /b/C/f1UQ3vgkiW8722mGpqUVqd/C0Fs6hSLefycB+D3bakdznVkY+UWk FryQQycbRiocbLg/T4z9EuvhOrYfGPYiYpQR/czFxpCLJVteIJsrCtboX jSLcv5In0Ib0y/cGStPL9xsrApx3YbsX1fGRGUWx88PvAMk9Uent/7Mfw IL47vZYeZ6rbSVjle3kWxWsoBJhpVaA5gLUwd9s6aSR0ExhlFZ5uMsqYO wYVz7g4k5A5FYopE0IJvIps2kvH3oF2keREdUsllK2Da0i1yw11QttNMd g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631855" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631855" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:51 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013619" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:50 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , 
Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 11/15] cxl/acpi: Use common IW/IG decoding Date: Wed, 13 Apr 2022 11:37:16 -0700 Message-Id: <20220413183720.2444089-12-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Now that functionality to decode interleave ways and granularity is in a common place, use that functionality in the cxl_acpi driver. Signed-off-by: Ben Widawsky --- drivers/cxl/acpi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c index a6b0c3181d0e..50e54e5d58c0 100644 --- a/drivers/cxl/acpi.c +++ b/drivers/cxl/acpi.c @@ -11,8 +11,8 @@ #include "cxl.h" /* Encode defined in CXL 2.0 8.2.5.12.7 HDM Decoder Control Register */ -#define CFMWS_INTERLEAVE_WAYS(x) (1 << (x)->interleave_ways) -#define CFMWS_INTERLEAVE_GRANULARITY(x) ((x)->granularity + 8) +#define CFMWS_INTERLEAVE_WAYS(x) (cxl_to_interleave_ways((x)->interleave_ways)) +#define CFMWS_INTERLEAVE_GRANULARITY(x) (cxl_to_interleave_granularity((x)->granularity)) static unsigned long cfmws_to_decoder_flags(int restrictions) { From patchwork Wed Apr 13 18:37:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812383 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0A873D8D; Wed, 13 Apr 2022 18:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875097; x=1681411097; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=A08ZtqBkpxmYlMioCtWhkKAwbN/JwTf6N/V87wbOrSU=; b=FE6VXa4tpdGKY1egNPoVGBHzdid3IrEViD3z8ijoPHSCYMyO3Hb7y1do KUpeHS+Z8SaM2TfgDNnL7xfSC6ZwIxs1ItQVYyzMtRAmgP3RjHiMmUcku 54QBDrA40MN8H9aqzxFwTPRBhHx7g/DkJkzWwS1x6uTb5rEFBD33Anlbz 88UBBaA2NdNPLdTIhi7UBnFIpHclSh+jBGlz4+Wk8G/LWJYWCl44E9CQP zRSetTq869bf77uaVPqIZygqD1JgX1tEvuEqJmOEHcSI+871NcuemiiX6 c1btpRh2ZurG0YKScn9v7t3s7nkxKvF6GtiaF/iRqIVcv3Cmqh+4Ekh94 g==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631858" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631858" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:51 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013626" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:51 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 12/15] cxl/region: Add region creation ABI Date: Wed, 13 Apr 2022 11:37:17 -0700 Message-Id: <20220413183720.2444089-13-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Regions are created as a child of the decoder that encompasses an address space with constraints. Regions have a number of attributes that must be configured before the region can be activated. Multiple processes which are trying not to race with each other shouldn't need special userspace synchronization to do so. 
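The arbitration described above can be modelled in a few lines of plain C. This is only a sketch: `sketch_root_decoder` and both functions are hypothetical, locking is omitted, and the simple increment is an assumption (the driver in this patch obtains the next id from memregion_alloc()).

```c
#include <errno.h>

/* Hypothetical model of a root decoder's region-name arbitration. */
struct sketch_root_decoder {
	int next_region_id; /* id a read of create_pmem_region would report */
};

/* Read side: report the id that a writer must echo back. */
static inline int sketch_create_show(const struct sketch_root_decoder *rd)
{
	return rd->next_region_id;
}

/*
 * Write side: succeed only if the caller echoes the current id, then
 * advance to a fresh id. A writer holding a stale id gets -EINVAL and
 * is expected to re-read and retry.
 */
static inline int sketch_create_store(struct sketch_root_decoder *rd, int id)
{
	if (id != rd->next_region_id)
		return -EINVAL;
	rd->next_region_id++; /* assumption: stands in for memregion_alloc() */
	return 0;
}
```

If two callers read the same id, only the first write succeeds; the second sees -EINVAL, reads the new id, and tries again. No coordination between the callers is required beyond that retry loop.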
// Allocate a new region name region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) // Create a new region by name while region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) ! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region do true; done // Region now exists in sysfs stat -t /sys/bus/cxl/devices/decoder0.0/$region // Delete the region, and name echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region Signed-off-by: Ben Widawsky --- Documentation/ABI/testing/sysfs-bus-cxl | 23 ++ .../driver-api/cxl/memory-devices.rst | 11 + drivers/cxl/Kconfig | 5 + drivers/cxl/core/Makefile | 1 + drivers/cxl/core/port.c | 39 ++- drivers/cxl/core/region.c | 234 ++++++++++++++++++ drivers/cxl/cxl.h | 7 + drivers/cxl/region.h | 29 +++ tools/testing/cxl/Kbuild | 1 + 9 files changed, 347 insertions(+), 3 deletions(-) create mode 100644 drivers/cxl/core/region.c create mode 100644 drivers/cxl/region.h diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 01fee09b8473..5229f4bd109a 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -174,3 +174,26 @@ Description: Provide a knob to set/get whether the desired media is volatile or persistent. This applies only to decoders of devtype "cxl_decoder_endpoint", + +What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Write an integer value to instantiate a new region to be named + regionZ within the decode range bounded by decoderX.Y. Where X, + Y, and Z are unsigned integers, and where decoderX.Y exists in + the CXL sysfs topology. The value written must match the current + value returned from reading this attribute. This behavior lets + the kernel arbitrate racing attempts to create a region. The + thread that fails to write loops and tries the next value. 
+ Regions must subsequently be configured and bound to a region + driver before they can be used. + +What: /sys/bus/cxl/devices/decoderX.Y/delete_region +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Deletes the named region. The attribute expects a region number + as an integer. diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index db476bb170b6..66ddc58a21b1 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -362,6 +362,17 @@ CXL Core .. kernel-doc:: drivers/cxl/core/mbox.c :doc: cxl mbox +CXL Regions +----------- +.. kernel-doc:: drivers/cxl/region.h + :identifiers: + +.. kernel-doc:: drivers/cxl/core/region.c + :doc: cxl core region + +.. kernel-doc:: drivers/cxl/core/region.c + :identifiers: + External Interfaces =================== diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 8796fd4b22bc..7ce86eee8bda 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -99,4 +99,9 @@ config CXL_PORT default CXL_BUS select DEVICE_PRIVATE +config CXL_REGION + tristate + default CXL_BUS + select MEMREGION + endif diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile index 6d37cd78b151..39ce8f2f2373 100644 --- a/drivers/cxl/core/Makefile +++ b/drivers/cxl/core/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_CXL_BUS) += cxl_core.o ccflags-y += -I$(srctree)/drivers/cxl cxl_core-y := port.o cxl_core-y += pmem.o +cxl_core-y += region.o cxl_core-y += regs.o cxl_core-y += memdev.o cxl_core-y += mbox.o diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index bdafdec80d98..5ef8a6e1ea23 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright(c) 2020 Intel Corporation. All rights reserved.
*/ #include +#include #include #include #include @@ -11,6 +12,7 @@ #include #include #include +#include #include #include "core.h" @@ -328,6 +330,8 @@ static struct attribute_group cxl_decoder_base_attribute_group = { }; static struct attribute *cxl_decoder_root_attrs[] = { + &dev_attr_create_pmem_region.attr, + &dev_attr_delete_region.attr, &dev_attr_cap_pmem.attr, &dev_attr_cap_ram.attr, &dev_attr_cap_type2.attr, @@ -375,6 +379,8 @@ static void cxl_decoder_release(struct device *dev) struct cxl_decoder *cxld = to_cxl_decoder(dev); struct cxl_port *port = to_cxl_port(dev->parent); + if (is_root_decoder(dev)) + memregion_free(to_cxl_root_decoder(cxld)->next_region_id); ida_free(&port->decoder_ida, cxld->id); kfree(cxld); put_device(&port->dev); @@ -1414,12 +1420,22 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, device_set_pm_not_required(dev); dev->parent = &port->dev; dev->bus = &cxl_bus_type; - if (is_cxl_root(port)) + if (is_cxl_root(port)) { + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + cxld->dev.type = &cxl_decoder_root_type; - else if (is_cxl_endpoint(port)) + mutex_init(&cxlrd->id_lock); + rc = memregion_alloc(GFP_KERNEL); + if (rc < 0) + goto err; + + cxlrd->next_region_id = rc; + cxld->dev.type = &cxl_decoder_root_type; + } else if (is_cxl_endpoint(port)) { cxld->dev.type = &cxl_decoder_endpoint_type; - else + } else { cxld->dev.type = &cxl_decoder_switch_type; + } /* Pre initialize an "empty" decoder */ cxld->interleave_ways = 1; @@ -1582,6 +1598,17 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL); static void cxld_unregister(void *dev) { + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_endpoint_decoder *cxled; + + if (!is_endpoint_decoder(&cxld->dev)) + goto out; + + mutex_lock(&cxled->cxlr->remove_lock); + device_release_driver(&cxled->cxlr->dev); + mutex_unlock(&cxled->cxlr->remove_lock); + +out: device_unregister(dev); } @@ -1681,6 +1708,12 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev 
*cxlmd) } EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL); +bool schedule_cxl_region_unregister(struct cxl_region *cxlr) +{ + return queue_work(cxl_bus_wq, &cxlr->detach_work); +} +EXPORT_SYMBOL_NS_GPL(schedule_cxl_region_unregister, CXL); + /* for user tooling to ensure port disable work has completed */ static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count) { diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c new file mode 100644 index 000000000000..16829bf2f73a --- /dev/null +++ b/drivers/cxl/core/region.c @@ -0,0 +1,234 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright(c) 2022 Intel Corporation. All rights reserved. */ +#include +#include +#include +#include +#include +#include +#include +#include +#include "core.h" + +/** + * DOC: cxl core region + * + * CXL Regions represent mapped memory capacity in system physical address + * space. Whereas the CXL Root Decoders identify the bounds of potential CXL + * Memory ranges, Regions represent the active mapped capacity by the HDM + * Decoder Capability structures throughout the Host Bridges, Switches, and + * Endpoints in the topology. 
+ */ + +static struct cxl_region *to_cxl_region(struct device *dev); + +static void cxl_region_release(struct device *dev) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + memregion_free(cxlr->id); + kfree(cxlr); +} + +static const struct device_type cxl_region_type = { + .name = "cxl_region", + .release = cxl_region_release, +}; + +bool is_cxl_region(struct device *dev) +{ + return dev->type == &cxl_region_type; +} +EXPORT_SYMBOL_NS_GPL(is_cxl_region, CXL); + +static struct cxl_region *to_cxl_region(struct device *dev) +{ + if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type, + "not a cxl_region device\n")) + return NULL; + + return container_of(dev, struct cxl_region, dev); +} + +static void unregister_region(struct work_struct *work) +{ + struct cxl_region *cxlr; + + cxlr = container_of(work, typeof(*cxlr), detach_work); + device_unregister(&cxlr->dev); +} + +static void schedule_unregister(void *cxlr) +{ + schedule_cxl_region_unregister(cxlr); +} + +static struct cxl_region *cxl_region_alloc(struct cxl_decoder *cxld) +{ + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + struct cxl_region *cxlr; + struct device *dev; + int rc; + + lockdep_assert_held(&cxlrd->id_lock); + + rc = memregion_alloc(GFP_KERNEL); + if (rc < 0) { + dev_dbg(dev, "Failed to get next cached id (%d)\n", rc); + return ERR_PTR(rc); + } + + cxlr = kzalloc(sizeof(*cxlr), GFP_KERNEL); + if (!cxlr) { + memregion_free(rc); + return ERR_PTR(-ENOMEM); + } + + cxlr->id = cxlrd->next_region_id; + cxlrd->next_region_id = rc; + + dev = &cxlr->dev; + device_initialize(dev); + dev->parent = &cxld->dev; + device_set_pm_not_required(dev); + dev->bus = &cxl_bus_type; + dev->type = &cxl_region_type; + INIT_WORK(&cxlr->detach_work, unregister_region); + mutex_init(&cxlr->remove_lock); + + return cxlr; +} + +/** + * devm_cxl_add_region - Adds a region to a decoder + * @cxld: Parent decoder. + * + * This is the second step of region initialization. 
Regions exist within an + * address space which is mapped by a @cxld. That @cxld must be a root decoder, + * and it enforces constraints upon the region as it is configured. + * + * Return: the new region if it was added to the @cxld, else an ERR_PTR() + * encoded negative error code. The region will be named "regionZ" where Z + * is the region id. + */ +static struct cxl_region *devm_cxl_add_region(struct cxl_decoder *cxld) +{ + struct cxl_port *port = to_cxl_port(cxld->dev.parent); + struct cxl_region *cxlr; + struct device *dev; + int rc; + + cxlr = cxl_region_alloc(cxld); + if (IS_ERR(cxlr)) + return cxlr; + + dev = &cxlr->dev; + + rc = dev_set_name(dev, "region%d", cxlr->id); + if (rc) + goto err_out; + + rc = device_add(dev); + if (rc) + goto err_put; + + rc = devm_add_action_or_reset(port->uport, schedule_unregister, cxlr); + if (rc) + goto err_put; + + return cxlr; + +err_put: + put_device(&cxld->dev); + +err_out: + put_device(dev); + return ERR_PTR(rc); +} + +static ssize_t create_pmem_region_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + size_t rc; + + /* + * There's no point in returning known bad answers when the lock is held + * on the store side. Even though the answer given here may be + * immediately invalidated as soon as the lock is dropped, it's still + * useful to throttle readers in the presence of writers.
+ */ + rc = mutex_lock_interruptible(&cxlrd->id_lock); + if (rc) + return rc; + rc = sysfs_emit(buf, "%d\n", cxlrd->next_region_id); + mutex_unlock(&cxlrd->id_lock); + + return rc; +} + +static ssize_t create_pmem_region_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + struct cxl_region *cxlr; + size_t id, rc; + + rc = kstrtoul(buf, 10, &id); + if (rc) + return rc; + + rc = mutex_lock_interruptible(&cxlrd->id_lock); + if (rc) + return rc; + + if (cxlrd->next_region_id != id) { + rc = -EINVAL; + goto out; + } + + cxlr = devm_cxl_add_region(cxld); + rc = 0; + dev_dbg(dev, "Created %s\n", dev_name(&cxlr->dev)); + +out: + mutex_unlock(&cxlrd->id_lock); + if (rc) + return rc; + return len; +} +DEVICE_ATTR_RW(create_pmem_region); + +static struct cxl_region *cxl_find_region_by_name(struct cxl_decoder *cxld, + const char *name) +{ + struct device *region_dev; + + region_dev = device_find_child_by_name(&cxld->dev, name); + if (!region_dev) + return ERR_PTR(-ENOENT); + + return to_cxl_region(region_dev); +} + +static ssize_t delete_region_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_port *port = to_cxl_port(dev->parent); + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_region *cxlr; + + cxlr = cxl_find_region_by_name(cxld, buf); + if (IS_ERR(cxlr)) + return PTR_ERR(cxlr); + + /* Reference held for wq */ + devm_release_action(port->uport, schedule_unregister, cxlr); + + return len; +} +DEVICE_ATTR_WO(delete_region); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 0586c3d4592c..3abc8b0cf8f4 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -249,6 +249,7 @@ struct cxl_decoder { * @skip: The skip count as specified in the CXL specification. * @res_lock: Synchronize device's resource usage * @volatil: Configuration param. 
Decoder target is non-persistent mem + * @cxlr: Region this decoder belongs to. */ struct cxl_endpoint_decoder { struct cxl_decoder base; @@ -256,6 +257,7 @@ struct cxl_endpoint_decoder { u64 skip; struct mutex res_lock; /* sync access to decoder's resource */ bool volatil; + struct cxl_region *cxlr; }; /** @@ -454,6 +456,8 @@ struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port); int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm); int devm_cxl_add_passthrough_decoder(struct cxl_port *port); +bool is_cxl_region(struct device *dev); + extern struct bus_type cxl_bus_type; struct cxl_driver { @@ -508,6 +512,7 @@ enum cxl_lock_class { CXL_ANON_LOCK, CXL_NVDIMM_LOCK, CXL_NVDIMM_BRIDGE_LOCK, + CXL_REGION_LOCK, CXL_PORT_LOCK, /* * Be careful to add new lock classes here, CXL_PORT_LOCK is @@ -536,6 +541,8 @@ static inline void cxl_nested_lock(struct device *dev) mutex_lock_nested(&dev->lockdep_mutex, CXL_NVDIMM_BRIDGE_LOCK); else if (is_cxl_nvdimm(dev)) mutex_lock_nested(&dev->lockdep_mutex, CXL_NVDIMM_LOCK); + else if (is_cxl_region(dev)) + mutex_lock_nested(&dev->lockdep_mutex, CXL_REGION_LOCK); else mutex_lock_nested(&dev->lockdep_mutex, CXL_ANON_LOCK); } diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h new file mode 100644 index 000000000000..66d9ba195c34 --- /dev/null +++ b/drivers/cxl/region.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright(c) 2021 Intel Corporation. */ +#ifndef __CXL_REGION_H__ +#define __CXL_REGION_H__ + +#include + +#include "cxl.h" + +/** + * struct cxl_region - CXL region + * @dev: This region's device. + * @id: This region's id. Id is globally unique across all regions. + * @flags: Flags representing the current state of the region. + * @detach_work: Async unregister to allow attrs to take device_lock. 
+ * @remove_lock: Coordinates region removal against decoder removal + */ +struct cxl_region { + struct device dev; + int id; + unsigned long flags; +#define REGION_DEAD 0 + struct work_struct detach_work; + struct mutex remove_lock; /* serialize region removal */ +}; + +bool schedule_cxl_region_unregister(struct cxl_region *cxlr); + +#endif diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild index 82e49ab0937d..3fe6d34e6d59 100644 --- a/tools/testing/cxl/Kbuild +++ b/tools/testing/cxl/Kbuild @@ -46,6 +46,7 @@ cxl_core-y += $(CXL_CORE_SRC)/memdev.o cxl_core-y += $(CXL_CORE_SRC)/mbox.o cxl_core-y += $(CXL_CORE_SRC)/pci.o cxl_core-y += $(CXL_CORE_SRC)/hdm.o +cxl_core-y += $(CXL_CORE_SRC)/region.o cxl_core-y += config_check.o obj-m += test/ From patchwork Wed Apr 13 18:37:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812384 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CCEEE3D97; Wed, 13 Apr 2022 18:38:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875098; x=1681411098; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=x8b/LibVqBx1G9rOTQThgsQbECDv7IbBtNdHI9yYHsw=; b=X+CM8J9Vqot2rvwhB3DUnZXtTupHUmj6MqyhY7lvrCBISnIE+a5V9qup DrSuKR7t3zvF/wxQN1LjLe3dQknxTJy0dRqO6SwIk1XqPdlatgAZL7Pjs wXymvc0o/4oTUu7HLkfNpX+Lmavt7rEyU4b2EpLLe4ctsFsd7gbUVSZz0 q7Pr5SeRhTgNwLS17HCGxb34exOCfHWfSwqM1wUHNTrn2+r9fp6BeAIgV Xk5ZZrPRcYxLJWnGwsOEY5EqZzS9U4Z1qSTsrxCUhbFS+t+/nLAUVMccV 0G8OfXqDRRVuIHniMokVSEqSG4aQE9O2/K5AHjR8gXeIZhEZa3e7EmnIN A==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631861" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631861" 
Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:51 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013632" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:51 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 13/15] cxl/core/port: Add attrs for root ways & granularity Date: Wed, 13 Apr 2022 11:37:18 -0700 Message-Id: <20220413183720.2444089-14-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Region programming requires knowledge of root decoder attributes. For example, if the root decoder supports only 256b granularity then a region with > 256b granularity cannot work. Add sysfs attributes in order to provide this information to userspace. The CXL driver controls programming of switch and endpoint decoders, but the attributes are also exported for informational purposes. 
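A userspace consumer of these attributes might look roughly like the sketch below. It assumes the attribute text has already been read into a buffer (the attributes emit a decimal value followed by a newline), and it takes the commit message's granularity example at face value. Both function names are hypothetical and are not part of the driver or of any CXL tooling.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse the text of a decimal sysfs attribute such as "256\n".
 * Returns 0 and stores the value on success, -EINVAL on malformed input.
 */
static inline int sketch_parse_decoder_attr(const char *text, int *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (end == text || errno || val < INT_MIN || val > INT_MAX)
		return -EINVAL;
	if (*end != '\n' && *end != '\0')
		return -EINVAL;
	*out = (int)val;
	return 0;
}

/*
 * Following the example in the commit message: a region whose granularity
 * exceeds what the root decoder reports is rejected. Hypothetical helper;
 * the real constraint checking lives in the driver.
 */
static inline int sketch_region_granularity_fits(int root_ig, int region_ig)
{
	return region_ig <= root_ig;
}
```

A tool could parse the root decoder's interleave_granularity once and run the check before attempting to configure a region, rather than discovering the mismatch when the region fails to activate.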
Signed-off-by: Ben Widawsky --- drivers/cxl/core/port.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 5ef8a6e1ea23..19cf1fd16118 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -318,10 +318,31 @@ static ssize_t target_list_show(struct device *dev, } static DEVICE_ATTR_RO(target_list); +static ssize_t interleave_granularity_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + + return sysfs_emit(buf, "%d\n", cxld->interleave_granularity); +} +static DEVICE_ATTR_RO(interleave_granularity); + +static ssize_t interleave_ways_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + + return sysfs_emit(buf, "%d\n", cxld->interleave_ways); +} +static DEVICE_ATTR_RO(interleave_ways); + static struct attribute *cxl_decoder_base_attrs[] = { &dev_attr_start.attr, &dev_attr_size.attr, &dev_attr_locked.attr, + &dev_attr_interleave_granularity.attr, + &dev_attr_interleave_ways.attr, NULL, }; From patchwork Wed Apr 13 18:37:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812386 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7162D3D9C; Wed, 13 Apr 2022 18:38:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875099; x=1681411099; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GE8WO4JA2nFKTniAJnVBzWWQMIr5tk1hbaCLNLMyyK0=; b=Fvw8tZQ8CtgMG0jIxBNqBNC8jaO9yvHmSkaTCXqmZXqlaRlmCWtgUXud W0+yn/DQfzLZsBLe4+P5MehtdU5U2myf24kQD2inX78dp6NpYKp8UoUIQ 
sNtICYRAGkvk6jiM8Jct+dPTS7zb0BpZEcjDDL+wNu68J7orTBvRh4ccm 8ShpExmeSAX56bZQcmpN6IeNaRL6BZeSotagfdrBBazMF7LtSI1ZKukZg DDsKI9Wnmg/wm4VQHhC+NGyjw4obQnATE0yeXJtXASC2jBU1TNkaByTC2 qm+x8ZIr8xU7XtXLRpVEsUmfTv76/IIghhJS5iDuCIfVFl/Nqrw0jAhfg Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631862" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631862" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:52 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013635" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:51 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 14/15] cxl/region: Introduce configuration Date: Wed, 13 Apr 2022 11:37:19 -0700 Message-Id: <20220413183720.2444089-15-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The region creation APIs create a vacant region. Configuring the region works in the same way as similar subsystems such as devdax. Sysfs attrs will be provided to allow userspace to configure the region. Finally, once all configuration is complete, userspace may activate the region by binding the driver. Introduced here are the most basic attributes needed to configure a region. Details of these attributes are described in the ABI Documentation.
An example is provided below: /sys/bus/cxl/devices/region0 ├── devtype ├── interleave_granularity ├── interleave_ways ├── modalias ├── offset ├── size ├── subsystem -> ../../../../../../bus/cxl ├── target0 ├── uevent └── uuid Signed-off-by: Ben Widawsky --- Documentation/ABI/testing/sysfs-bus-cxl | 64 +++- drivers/cxl/core/region.c | 455 +++++++++++++++++++++++- drivers/cxl/cxl.h | 15 + drivers/cxl/region.h | 76 ++++ 4 files changed, 598 insertions(+), 12 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 5229f4bd109a..9ace58635942 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -195,5 +195,65 @@ Date: January, 2022 KernelVersion: v5.19 Contact: linux-cxl@vger.kernel.org Description: - Deletes the named region. The attribute expects a region number - as an integer. + Deletes the named region. The attribute expects a region name in + the form regionZ where Z is an integer value. + +What: /sys/bus/cxl/devices/decoderX.Y/regionZ/resource +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + A region is a contiguous partition of a CXL root decoder address + space. Region capacity is allocated by writing to the size + attribute; the resulting physical address space determined by + the driver is reflected here. + +What: /sys/bus/cxl/devices/decoderX.Y/regionZ/size +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + System physical address space to be consumed by the region. When + written to, this attribute will allocate space out of the CXL + root decoder's address space. When read, the size of the address + space is reported and should match the span of the region's + resource attribute.
+ +What: /sys/bus/cxl/devices/decoderX.Y/regionZ/interleave_ways +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + The number of devices participating in the region is set by + writing this value. Each device will provide + 1/interleave_ways of storage for the region. + +What: /sys/bus/cxl/devices/decoderX.Y/regionZ/interleave_granularity +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Set the number of consecutive bytes each device in the + interleave set will claim. The possible interleave granularity + values are determined by the CXL spec and the participating + devices. + +What: /sys/bus/cxl/devices/decoderX.Y/regionZ/uuid +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Write a unique identifier for the region. This field must be set + for persistent regions and it must not conflict with the UUID of + another region. If this field is set for volatile regions, the + value is ignored. + +What: /sys/bus/cxl/devices/decoderX.Y/regionZ/target[0..interleave_ways] +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Write a [endpoint] decoder object that is unused and will + participate in decoding memory transactions for the interleave + set, i.e. decoderX.Y. All required attributes of the decoder must + be populated. diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 16829bf2f73a..4766d897f4bf 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -4,9 +4,12 @@ #include #include #include +#include #include +#include #include #include +#include #include #include "core.h" @@ -18,21 +21,453 @@ * Memory ranges, Regions represent the active mapped capacity by the HDM * Decoder Capability structures throughout the Host Bridges, Switches, and * Endpoints in the topology.
+ * + * Region configuration has ordering constraints: + * - Targets: Must be set after size + * - Size: Must be set after interleave ways + * - Interleave ways: Must be set after Interleave Granularity + * + * UUID may be set at any time before binding the driver to the region. */ -static struct cxl_region *to_cxl_region(struct device *dev); +static const struct attribute_group region_interleave_group; + +static void remove_target(struct cxl_region *cxlr, int target) +{ + struct cxl_endpoint_decoder *cxled; + + mutex_lock(&cxlr->remove_lock); + cxled = cxlr->targets[target]; + if (cxled) { + cxled->cxlr = NULL; + put_device(&cxled->base.dev); + } + cxlr->targets[target] = NULL; + mutex_unlock(&cxlr->remove_lock); +} static void cxl_region_release(struct device *dev) { struct cxl_region *cxlr = to_cxl_region(dev); + int i; memregion_free(cxlr->id); + for (i = 0; i < cxlr->interleave_ways; i++) + remove_target(cxlr, i); kfree(cxlr); } +static ssize_t interleave_ways_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%d\n", cxlr->interleave_ways); +} + +static ssize_t interleave_ways_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + struct cxl_decoder *rootd; + int rc, val; + + rc = kstrtoint(buf, 0, &val); + if (rc) + return rc; + + cxl_device_lock(dev); + + if (dev->driver) { + cxl_device_unlock(dev); + return -EBUSY; + } + + if (cxlr->interleave_ways) { + cxl_device_unlock(dev); + return -EEXIST; + } + + if (!cxlr->interleave_granularity) { + dev_dbg(&cxlr->dev, "IG must be set before IW\n"); + cxl_device_unlock(dev); + return -EILSEQ; + } + + rootd = to_cxl_decoder(cxlr->dev.parent); + if (!cxl_region_ways_valid(rootd, val, cxlr->interleave_granularity)) { + cxl_device_unlock(dev); + return -EINVAL; + } + + cxlr->interleave_ways = val; + cxl_device_unlock(dev); + + rc = 
sysfs_update_group(&cxlr->dev.kobj, &region_interleave_group); + if (rc < 0) { + cxlr->interleave_ways = 0; + return rc; + } + + return len; +} +static DEVICE_ATTR_RW(interleave_ways); + +static ssize_t interleave_granularity_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%d\n", cxlr->interleave_granularity); +} + +static ssize_t interleave_granularity_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + struct cxl_decoder *rootd; + int val, ret; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + cxl_device_lock(dev); + + if (dev->driver) { + cxl_device_unlock(dev); + return -EBUSY; + } + + if (cxlr->interleave_granularity) { + cxl_device_unlock(dev); + return -EEXIST; + } + + rootd = to_cxl_decoder(cxlr->dev.parent); + if (!cxl_region_granularity_valid(rootd, val)) { + cxl_device_unlock(dev); + return -EINVAL; + } + + cxlr->interleave_granularity = val; + cxl_device_unlock(dev); + + return len; +} +static DEVICE_ATTR_RW(interleave_granularity); + +static ssize_t resource_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%#llx\n", cxlr->range.start); +} +static DEVICE_ATTR_RO(resource); + +static ssize_t size_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + struct cxl_root_decoder *cxlrd; + struct cxl_decoder *rootd; + unsigned long addr; + u64 val; + int rc; + + rc = kstrtou64(buf, 0, &val); + if (rc) + return rc; + + if (!cxl_region_size_valid(val, cxlr->interleave_ways)) { + dev_dbg(&cxlr->dev, "Size must be a multiple of %dM\n", + cxlr->interleave_ways * 256); + return -EINVAL; + } + + cxl_device_lock(dev); + + if (dev->driver) { + cxl_device_unlock(dev); + return -EBUSY; + }
+ + if (!cxlr->interleave_ways) { + dev_dbg(&cxlr->dev, "IW must be set before size\n"); + cxl_device_unlock(dev); + return -EILSEQ; + } + + rootd = to_cxl_decoder(cxlr->dev.parent); + cxlrd = to_cxl_root_decoder(rootd); + + addr = gen_pool_alloc(cxlrd->window, val); + if (addr == 0 && rootd->range.start != 0) { + rc = -ENOSPC; + goto out; + } + + cxlr->range = (struct range) { + .start = addr, + .end = addr + val - 1, + }; + +out: + cxl_device_unlock(dev); + return rc ? rc : len; +} + +static ssize_t size_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%#llx\n", range_len(&cxlr->range)); +} +static DEVICE_ATTR_RW(size); + +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%pUb\n", &cxlr->uuid); +} + +static int is_dupe(struct device *match, void *_cxlr) +{ + struct cxl_region *c, *cxlr = _cxlr; + + if (!is_cxl_region(match)) + return 0; + + if (&cxlr->dev == match) + return 0; + + c = to_cxl_region(match); + if (uuid_equal(&c->uuid, &cxlr->uuid)) + return -EEXIST; + + return 0; +} + +static ssize_t uuid_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + ssize_t rc; + uuid_t temp; + + if (len != UUID_STRING_LEN + 1) + return -EINVAL; + + rc = uuid_parse(buf, &temp); + if (rc) + return rc; + + cxl_device_lock(dev); + + if (dev->driver) { + cxl_device_unlock(dev); + return -EBUSY; + } + + if (!uuid_is_null(&cxlr->uuid)) { + cxl_device_unlock(dev); + return -EEXIST; + } + + rc = bus_for_each_dev(&cxl_bus_type, NULL, cxlr, is_dupe); + if (rc < 0) { + cxl_device_unlock(dev); + return rc; + } + + cxlr->uuid = temp; + cxl_device_unlock(dev); + return len; +} +static DEVICE_ATTR_RW(uuid); + +static struct attribute *region_attrs[] = { + &dev_attr_resource.attr, +
&dev_attr_interleave_ways.attr, + &dev_attr_interleave_granularity.attr, + &dev_attr_size.attr, + &dev_attr_uuid.attr, + NULL, +}; + +static const struct attribute_group region_group = { + .attrs = region_attrs, +}; + +static size_t show_targetN(struct cxl_region *cxlr, char *buf, int n) +{ + if (!cxlr->targets[n]) + return sysfs_emit(buf, "\n"); + + return sysfs_emit(buf, "%s\n", dev_name(&cxlr->targets[n]->base.dev)); +} + +static size_t store_targetN(struct cxl_region *cxlr, const char *buf, int n, + size_t len) +{ + struct cxl_endpoint_decoder *cxled; + struct cxl_decoder *cxld; + struct device *cxld_dev; + struct cxl_port *port; + + cxl_device_lock(&cxlr->dev); + + if (cxlr->dev.driver) { + cxl_device_unlock(&cxlr->dev); + return -EBUSY; + } + + /* The target attrs don't exist until ways are set. No need to check */ + + if (cxlr->targets[n]) { + cxl_device_unlock(&cxlr->dev); + return -EEXIST; + } + + cxld_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf); + if (!cxld_dev) { + cxl_device_unlock(&cxlr->dev); + return -ENOENT; + } + + if (!is_cxl_decoder(cxld_dev)) { + put_device(cxld_dev); + cxl_device_unlock(&cxlr->dev); + dev_info(cxld_dev, "Not a decoder\n"); + return -EINVAL; + } + + if (!is_cxl_endpoint(to_cxl_port(cxld_dev->parent))) { + put_device(cxld_dev); + cxl_device_unlock(&cxlr->dev); + dev_info(cxld_dev, "Not an endpoint decoder\n"); + return -EINVAL; + } + + cxld = to_cxl_decoder(cxld_dev); + if (cxld->flags & CXL_DECODER_F_ENABLE) { + put_device(cxld_dev); + cxl_device_unlock(&cxlr->dev); + return -EBUSY; + } + + /* Decoder reference is held until region probe can complete. 
*/ + cxled = to_cxl_endpoint_decoder(cxld); + + if (range_len(&cxled->drange) != + range_len(&cxlr->range) / cxlr->interleave_ways) { + put_device(cxld_dev); + cxl_device_unlock(&cxlr->dev); + dev_info(cxld_dev, "Decoder is the wrong size\n"); + return -EINVAL; + } + + port = to_cxl_port(cxld->dev.parent); + if (port->last_cxled && + cxlr->range.start <= port->last_cxled->drange.start) { + put_device(cxld_dev); + cxl_device_unlock(&cxlr->dev); + dev_info(cxld_dev, "Decoder in set has higher HPA than region. Try different device\n"); + return -EINVAL; + } + + cxlr->targets[n] = cxled; + cxled->cxlr = cxlr; + + cxl_device_unlock(&cxlr->dev); + + return len; +} + +#define TARGET_ATTR_RW(n) \ + static ssize_t target##n##_show( \ + struct device *dev, struct device_attribute *attr, char *buf) \ + { \ + return show_targetN(to_cxl_region(dev), buf, (n)); \ + } \ + static ssize_t target##n##_store(struct device *dev, \ + struct device_attribute *attr, \ + const char *buf, size_t len) \ + { \ + return store_targetN(to_cxl_region(dev), buf, (n), len); \ + } \ + static DEVICE_ATTR_RW(target##n) + +TARGET_ATTR_RW(0); +TARGET_ATTR_RW(1); +TARGET_ATTR_RW(2); +TARGET_ATTR_RW(3); +TARGET_ATTR_RW(4); +TARGET_ATTR_RW(5); +TARGET_ATTR_RW(6); +TARGET_ATTR_RW(7); +TARGET_ATTR_RW(8); +TARGET_ATTR_RW(9); +TARGET_ATTR_RW(10); +TARGET_ATTR_RW(11); +TARGET_ATTR_RW(12); +TARGET_ATTR_RW(13); +TARGET_ATTR_RW(14); +TARGET_ATTR_RW(15); + +static struct attribute *interleave_attrs[] = { + &dev_attr_target0.attr, + &dev_attr_target1.attr, + &dev_attr_target2.attr, + &dev_attr_target3.attr, + &dev_attr_target4.attr, + &dev_attr_target5.attr, + &dev_attr_target6.attr, + &dev_attr_target7.attr, + &dev_attr_target8.attr, + &dev_attr_target9.attr, + &dev_attr_target10.attr, + &dev_attr_target11.attr, + &dev_attr_target12.attr, + &dev_attr_target13.attr, + &dev_attr_target14.attr, + &dev_attr_target15.attr, + NULL, +}; + +static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n) 
+{ + struct device *dev = container_of(kobj, struct device, kobj); + struct cxl_region *cxlr = to_cxl_region(dev); + + if (n < cxlr->interleave_ways) + return a->mode; + return 0; +} + +static const struct attribute_group region_interleave_group = { + .attrs = interleave_attrs, + .is_visible = visible_targets, +}; + +static const struct attribute_group *region_groups[] = { + &region_group, + &region_interleave_group, + &cxl_base_attribute_group, + NULL, +}; + static const struct device_type cxl_region_type = { .name = "cxl_region", .release = cxl_region_release, + .groups = region_groups }; bool is_cxl_region(struct device *dev) @@ -41,7 +476,7 @@ bool is_cxl_region(struct device *dev) } EXPORT_SYMBOL_NS_GPL(is_cxl_region, CXL); -static struct cxl_region *to_cxl_region(struct device *dev) +struct cxl_region *to_cxl_region(struct device *dev) { if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type, "not a cxl_region device\n")) @@ -49,6 +484,7 @@ static struct cxl_region *to_cxl_region(struct device *dev) return container_of(dev, struct cxl_region, dev); } +EXPORT_SYMBOL_NS_GPL(to_cxl_region, CXL); static void unregister_region(struct work_struct *work) { @@ -96,20 +532,20 @@ static struct cxl_region *cxl_region_alloc(struct cxl_decoder *cxld) INIT_WORK(&cxlr->detach_work, unregister_region); mutex_init(&cxlr->remove_lock); + cxlr->range = (struct range) { + .start = 0, + .end = -1, + }; + return cxlr; } /** * devm_cxl_add_region - Adds a region to a decoder - * @cxld: Parent decoder. - * - * This is the second step of region initialization. Regions exist within an - * address space which is mapped by a @cxld. That @cxld must be a root decoder, - * and it enforces constraints upon the region as it is configured. + * @cxld: Root decoder. * * Return: 0 if the region was added to the @cxld, else returns negative error - * code. The region will be named "regionX.Y.Z" where X is the port, Y is the - * decoder id, and Z is the region number. + * code.
The region will be named "regionZ" where Z is the region number. */ static struct cxl_region *devm_cxl_add_region(struct cxl_decoder *cxld) { @@ -191,7 +627,6 @@ static ssize_t create_pmem_region_store(struct device *dev, } cxlr = devm_cxl_add_region(cxld); - rc = 0; dev_dbg(dev, "Created %s\n", dev_name(&cxlr->dev)); out: diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 3abc8b0cf8f4..db69dfa16f71 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -81,6 +81,19 @@ static inline int cxl_to_interleave_ways(u8 eniw) } } +static inline int cxl_from_ways(u8 ways) +{ + if (is_power_of_2(ways)) + return ilog2(ways); + + return ilog2(ways / 3) + 8; +} + +static inline int cxl_from_granularity(u16 g) +{ + return ilog2(g) - 8; +} + /* CXL 2.0 8.2.8.1 Device Capabilities Array Register */ #define CXLDEV_CAP_ARRAY_OFFSET 0x0 #define CXLDEV_CAP_ARRAY_CAP_ID 0 @@ -277,6 +290,7 @@ struct cxl_switch_decoder { * @targets: Downstream targets (ie. hostbridges). * @next_region_id: The pre-cached next region id. * @id_lock: Protects next_region_id + * @regions: List of active regions in this decoder's address space */ struct cxl_root_decoder { struct cxl_decoder base; @@ -284,6 +298,7 @@ struct cxl_root_decoder { struct cxl_decoder_targets *targets; int next_region_id; struct mutex id_lock; /* synchronizes access to next_region_id */ + struct list_head regions; }; #define _to_cxl_decoder(x) \ diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h index 66d9ba195c34..e6457ea3d388 100644 --- a/drivers/cxl/region.h +++ b/drivers/cxl/region.h @@ -14,6 +14,12 @@ * @flags: Flags representing the current state of the region. * @detach_work: Async unregister to allow attrs to take device_lock. * @remove_lock: Coordinates region removal against decoder removal + * @list: Node in decoder's region list. + * @range: Resource this region carves out of the platform decode range. + * @uuid: The UUID for this region.
+ * @interleave_ways: Number of interleave ways this region is configured for. + * @interleave_granularity: Interleave granularity of region + * @targets: The memory devices comprising the region. */ struct cxl_region { struct device dev; @@ -22,8 +28,78 @@ struct cxl_region { #define REGION_DEAD 0 struct work_struct detach_work; struct mutex remove_lock; /* serialize region removal */ + + struct list_head list; + struct range range; + + uuid_t uuid; + int interleave_ways; + int interleave_granularity; + struct cxl_endpoint_decoder *targets[CXL_DECODER_MAX_INTERLEAVE]; }; +bool is_cxl_region(struct device *dev); +struct cxl_region *to_cxl_region(struct device *dev); bool schedule_cxl_region_unregister(struct cxl_region *cxlr); +/** + * cxl_region_ways_valid - Determine if ways is valid for the given + * decoder. + * @rootd: The decoder for which validity will be checked + * @ways: Determination if ways is valid given @rootd and @granularity + * @granularity: The granularity the region will be interleaved + */ +static inline bool cxl_region_ways_valid(const struct cxl_decoder *rootd, + u8 ways, u16 granularity) +{ + int root_ig, region_ig, root_eniw; + + switch (ways) { + case 0 ... 
4: + case 6: + case 8: + case 12: + case 16: + break; + default: + return false; + } + + if (rootd->interleave_ways == 1) + return true; + + root_ig = cxl_from_granularity(rootd->interleave_granularity); + region_ig = cxl_from_granularity(granularity); + root_eniw = cxl_from_ways(rootd->interleave_ways); + + return ((1 << (root_ig - region_ig)) * (1 << root_eniw)) <= ways; +} + +static inline bool cxl_region_granularity_valid(const struct cxl_decoder *rootd, + int ig) +{ + int rootd_hbig; + + if (!is_power_of_2(ig)) + return false; + + /* 16K is the max */ + if (ig >> 15) + return false; + + rootd_hbig = cxl_from_granularity(rootd->interleave_granularity); + if (rootd_hbig < cxl_from_granularity(ig)) + return false; + + return true; +} + +static inline bool cxl_region_size_valid(u64 size, int ways) +{ + int rem; + + div_u64_rem(size, SZ_256M * ways, &rem); + return rem == 0; +} + #endif From patchwork Wed Apr 13 18:37:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12812385 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 649793D94; Wed, 13 Apr 2022 18:38:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649875098; x=1681411098; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aeZKwA6KXw/WgR7UJB3SvHnRt3gEatPvEZCT9hDGrhA=; b=Sr9tSRbhRV7XCYYvaGkUF0LMugeusSdJM0K3rbWIpjRqHyit0z5NTchp YkOR0MTRLqi/yTWIxR3WPUDYmO/S25t09ZgDzIKN83ZZXVm1uyNRAVT9H AOLm6ttc5vo8zDa79qCJ9H0mxJU2wJQCZnkvTjt1mAg82+Wp47SK7oJkx qTOdaQB0w5rgpcd41NJnIhEwIuqxejn4GZFfLmAQXKbzRNpxgJZK+L5mN +z02NdK20h15ex101kU5I4F+rNlCr839vGyTRyjY831ntPAeFkiSu1CmY 9FmKLw3G0UxYSr3qRvg4It4ImTwKfMt1IPtFT5wzfREcAZBN7NIMqK4J8 A==; 
X-IronPort-AV: E=McAfee;i="6400,9594,10316"; a="244631866" X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="244631866" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:52 -0700 X-IronPort-AV: E=Sophos;i="5.90,257,1643702400"; d="scan'208";a="725013640" Received: from sushobhi-mobl.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.131.238]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Apr 2022 11:37:52 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev Cc: patches@lists.linux.dev, Ben Widawsky , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 15/15] cxl/region: Introduce a cxl_region driver Date: Wed, 13 Apr 2022 11:37:20 -0700 Message-Id: <20220413183720.2444089-16-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220413183720.2444089-1-ben.widawsky@intel.com> References: <20220413183720.2444089-1-ben.widawsky@intel.com> Precedence: bulk X-Mailing-List: nvdimm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The cxl_region driver is responsible for managing the HDM decoder programming in the CXL topology. Once a region is created it must be configured and bound to the driver in order to activate it. 
The following is a sample of how such controls might work: region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region) echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region echo 256 > /sys/bus/cxl/devices/decoder0.0/region0/interleave_granularity echo 2 > /sys/bus/cxl/devices/decoder0.0/region0/interleave_ways echo $((256<<20)) > /sys/bus/cxl/devices/decoder0.0/region0/size echo decoder3.0 > /sys/bus/cxl/devices/decoder0.0/region0/target0 echo decoder4.0 > /sys/bus/cxl/devices/decoder0.0/region0/target1 echo region0 > /sys/bus/cxl/drivers/cxl_region/bind Note that the above is not complete as the endpoint decoders also need configuration. Signed-off-by: Ben Widawsky --- .../driver-api/cxl/memory-devices.rst | 3 + drivers/cxl/Kconfig | 4 + drivers/cxl/Makefile | 2 + drivers/cxl/core/core.h | 1 + drivers/cxl/core/port.c | 2 + drivers/cxl/core/region.c | 2 +- drivers/cxl/cxl.h | 6 + drivers/cxl/region.c | 333 ++++++++++++++++++ 8 files changed, 352 insertions(+), 1 deletion(-) create mode 100644 drivers/cxl/region.c diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 66ddc58a21b1..8cb4dece5b17 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -364,6 +364,9 @@ CXL Core CXL Regions ----------- +.. kernel-doc:: drivers/cxl/region.c + :doc: cxl region + .. 
kernel-doc:: drivers/cxl/region.h :identifiers: diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 7ce86eee8bda..d5c41c96971f 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -104,4 +104,8 @@ config CXL_REGION default CXL_BUS select MEMREGION +config CXL_REGION + default CXL_PORT + tristate + endif diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index ce267ef11d93..02a4776e7ab9 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -5,9 +5,11 @@ obj-$(CONFIG_CXL_MEM) += cxl_mem.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o obj-$(CONFIG_CXL_PORT) += cxl_port.o +obj-$(CONFIG_CXL_REGION) += cxl_region.o cxl_mem-y := mem.o cxl_pci-y := pci.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o cxl_port-y := port.o +cxl_region-y := region.o diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index a507a2502127..8871a3385604 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -6,6 +6,7 @@ extern const struct device_type cxl_nvdimm_bridge_type; extern const struct device_type cxl_nvdimm_type; +extern const struct device_type cxl_region_type; extern struct attribute_group cxl_base_attribute_group; diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 19cf1fd16118..f22579cd031d 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -53,6 +53,8 @@ static int cxl_device_id(struct device *dev) } if (is_cxl_memdev(dev)) return CXL_DEVICE_MEMORY_EXPANDER; + if (dev->type == &cxl_region_type) + return CXL_DEVICE_REGION; return 0; } diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 4766d897f4bf..1c28d9623cb8 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -464,7 +464,7 @@ static const struct attribute_group *region_groups[] = { NULL, }; -static const struct device_type cxl_region_type = { +const struct device_type cxl_region_type = { .name = "cxl_region", .release = cxl_region_release, .groups = region_groups diff 
--git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index db69dfa16f71..184af920113d 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -212,6 +212,10 @@ void __iomem *devm_cxl_iomap_block(struct device *dev, resource_size_t addr, #define CXL_DECODER_F_ENABLE BIT(5) #define CXL_DECODER_F_MASK GENMASK(5, 0) +#define cxl_is_pmem_t3(flags) \ + (((flags) & (CXL_DECODER_F_TYPE3 | CXL_DECODER_F_PMEM)) == \ + (CXL_DECODER_F_TYPE3 | CXL_DECODER_F_PMEM)) + enum cxl_decoder_type { CXL_DECODER_ACCELERATOR = 2, CXL_DECODER_EXPANDER = 3, @@ -440,6 +444,7 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port, resource_size_t component_reg_phys); struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port, const struct device *dev); +struct cxl_port *ep_find_cxl_port(struct cxl_memdev *cxlmd, unsigned int depth); struct cxl_decoder *to_cxl_decoder(struct device *dev); bool is_root_decoder(struct device *dev); @@ -501,6 +506,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv); #define CXL_DEVICE_PORT 3 #define CXL_DEVICE_ROOT 4 #define CXL_DEVICE_MEMORY_EXPANDER 5 +#define CXL_DEVICE_REGION 6 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*") #define CXL_MODALIAS_FMT "cxl:t%d" diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c new file mode 100644 index 000000000000..f5de640623c0 --- /dev/null +++ b/drivers/cxl/region.c @@ -0,0 +1,333 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright(c) 2021 Intel Corporation. All rights reserved. */ +#include +#include +#include +#include +#include "cxlmem.h" +#include "region.h" +#include "cxl.h" + +/** + * DOC: cxl region + * + * This module implements a region driver that is capable of programming CXL + * hardware to setup regions. + * + * A CXL region encompasses a chunk of host physical address space that may be + * consumed by a single device (x1 interleave aka linear) or across multiple + * devices (xN interleaved). 
The region driver has the following + * responsibilities: + * + * * Walk topology to obtain decoder resources for region configuration. + * * Program decoder resources based on region configuration. + * * Bridge CXL regions to LIBNVDIMM + * * Initiate reading and configuring LSA regions + * * Enumerate regions created by BIOS (typically volatile) + */ + +#define for_each_cxled(cxled, idx, cxlr) \ + for (idx = 0; idx < cxlr->interleave_ways && (cxled = cxlr->targets[idx]); idx++) + +static struct cxl_decoder *rootd_from_region(const struct cxl_region *cxlr) +{ + struct device *d = cxlr->dev.parent; + + if (WARN_ONCE(!is_root_decoder(d), + "Corrupt topology for root region\n")) + return NULL; + + return to_cxl_decoder(d); +} + +static struct cxl_port *get_hostbridge(const struct cxl_memdev *ep) +{ + struct cxl_port *port = dev_get_drvdata(&ep->dev); + + while (!is_cxl_root(port)) { + port = to_cxl_port(port->dev.parent); + if (port->depth == 1) + return port; + } + + BUG(); + return NULL; +} + +static struct cxl_port *get_root_decoder(const struct cxl_memdev *endpoint) +{ + struct cxl_port *hostbridge = get_hostbridge(endpoint); + + if (hostbridge) + return to_cxl_port(hostbridge->dev.parent); + + return NULL; +} + +/** + * validate_region() - Check if region is reasonably configured + * @cxlr: The region to check + * + * Determination as to whether or not a region can possibly be configured is + * described in CXL Memory Device SW Guide. In order to implement the algorithms + * described there, certain more basic configuration parameters must first + * be validated. That is accomplished by this function. + * + * Returns 0 if the region is reasonably configured, else returns a negative + * error code.
+ */ +static int validate_region(const struct cxl_region *cxlr) +{ + const struct cxl_decoder *rootd = rootd_from_region(cxlr); + const int gran = cxlr->interleave_granularity; + const int ways = cxlr->interleave_ways; + struct cxl_endpoint_decoder *cxled; + int i; + + /* + * Interleave attributes should be caught by later math, but it's + * easiest to find those issues here, now. + */ + if (!cxl_region_granularity_valid(rootd, gran)) { + dev_dbg(&cxlr->dev, "Invalid interleave granularity\n"); + return -ENXIO; + } + + if (!cxl_region_ways_valid(rootd, ways, gran)) { + dev_dbg(&cxlr->dev, "Invalid number of ways\n"); + return -ENXIO; + } + + if (!cxl_region_size_valid(range_len(&cxlr->range), ways)) { + dev_dbg(&cxlr->dev, "Invalid size. Must be multiple of %uM\n", + 256 * ways); + return -ENXIO; + } + + for_each_cxled(cxled, i, cxlr) { + struct cxl_memdev *cxlmd; + struct cxl_port *port; + + port = to_cxl_port(cxled->base.dev.parent); + cxlmd = to_cxl_memdev(port->uport); + if (!cxlmd->dev.driver) { + dev_dbg(&cxlr->dev, "%s isn't CXL.mem capable\n", + dev_name(&cxled->base.dev)); + return -ENODEV; + } + + if ((range_len(&cxlr->range) / ways) != + range_len(&cxled->drange)) { + dev_dbg(&cxlr->dev, "%s is the wrong size\n", + dev_name(&cxled->base.dev)); + return -ENXIO; + } + } + + if (i != cxlr->interleave_ways) { + dev_dbg(&cxlr->dev, "Missing memory device target%u", i); + return -ENXIO; + } + + return 0; +} + +/** + * find_cdat_dsmas() - Find a valid DSMAS for the region + * @cxlr: The region + */ +static bool find_cdat_dsmas(const struct cxl_region *cxlr) +{ + return true; +} + +/** + * qtg_match() - Does this root decoder have desirable QTG for the endpoint + * @rootd: The root decoder for the region + * + * Prior to calling this function, the caller should verify that all endpoints + * in the region have the same QTG ID. 
+ * + * Returns true if the QTG ID of the root decoder matches the endpoint + */ +static bool qtg_match(const struct cxl_decoder *rootd) +{ + /* TODO: */ + return true; +} + +/** + * region_xhb_config_valid() - determine cross host bridge validity + * @cxlr: The region being programmed + * @rootd: The root decoder to check against + * + * The algorithm is outlined in 2.13.14 "Verify XHB configuration sequence" of + * the CXL Memory Device SW Guide (Rev1p0). + * + * Returns true if the configuration is valid. + */ +static bool region_xhb_config_valid(const struct cxl_region *cxlr, + const struct cxl_decoder *rootd) +{ + /* TODO: */ + return true; +} + +/** + * region_hb_rp_config_valid() - determine root port ordering is correct + * @cxlr: Region to validate + * @rootd: root decoder for this @cxlr + * + * The algorithm is outlined in 2.13.15 "Verify HB root port configuration + * sequence" of the CXL Memory Device SW Guide (Rev1p0). + * + * Returns true if the configuration is valid. + */ +static bool region_hb_rp_config_valid(const struct cxl_region *cxlr, + const struct cxl_decoder *rootd) +{ + /* TODO: */ + return true; +} + +/** + * rootd_contains() - determine if this region can exist in the root decoder + * @rootd: root decoder that potentially decodes to this region + * @cxlr: region to be routed by the @rootd + */ +static bool rootd_contains(const struct cxl_region *cxlr, + const struct cxl_decoder *rootd) +{ + /* TODO: */ + return true; +} + +static bool rootd_valid(const struct cxl_region *cxlr, + const struct cxl_decoder *rootd) +{ + if (!qtg_match(rootd)) + return false; + + if (!cxl_is_pmem_t3(rootd->flags)) + return false; + + if (!region_xhb_config_valid(cxlr, rootd)) + return false; + + if (!region_hb_rp_config_valid(cxlr, rootd)) + return false; + + if (!rootd_contains(cxlr, rootd)) + return false; + + return true; +} + +struct rootd_context { + const struct cxl_region *cxlr; + struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE]; + int count; +}; + 
+static int rootd_match(struct device *dev, void *data)
+{
+	struct rootd_context *ctx = (struct rootd_context *)data;
+	const struct cxl_region *cxlr = ctx->cxlr;
+
+	if (!is_root_decoder(dev))
+		return 0;
+
+	return !!rootd_valid(cxlr, to_cxl_decoder(dev));
+}
+
+/*
+ * This is a roughly equivalent implementation to "Figure 45 - High-level
+ * sequence: Finding CFMWS for region" from the CXL Memory Device SW Guide
+ * Rev1p0.
+ */
+static struct cxl_decoder *find_rootd(const struct cxl_region *cxlr,
+				      const struct cxl_port *root)
+{
+	struct rootd_context ctx;
+	struct device *ret;
+
+	ctx.cxlr = cxlr;
+
+	ret = device_find_child((struct device *)&root->dev, &ctx, rootd_match);
+	if (ret)
+		return to_cxl_decoder(ret);
+
+	return NULL;
+}
+
+static int bind_region(const struct cxl_region *cxlr)
+{
+	struct cxl_endpoint_decoder *cxled;
+	int i;
+	/* TODO: */
+
+	/*
+	 * Natural decoder teardown can occur at this point, put the
+	 * reference which was taken when the target was set.
+	 */
+	for_each_cxled(cxled, i, cxlr)
+		put_device(&cxled->base.dev);
+
+	WARN_ON(i != cxlr->interleave_ways);
+	return 0;
+}
+
+static int cxl_region_probe(struct device *dev)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+	struct cxl_port *root_port, *ep_port;
+	struct cxl_decoder *rootd, *ours;
+	struct cxl_memdev *cxlmd;
+	int ret;
+
+	if (uuid_is_null(&cxlr->uuid))
+		uuid_gen(&cxlr->uuid);
+
+	/* TODO: What about volatile, and LSA generated regions? */
+
+	ret = validate_region(cxlr);
+	if (ret)
+		return ret;
+
+	if (!find_cdat_dsmas(cxlr))
+		return -ENXIO;
+
+	rootd = rootd_from_region(cxlr);
+	if (!rootd) {
+		dev_err(dev, "Couldn't find root decoder\n");
+		return -ENXIO;
+	}
+
+	if (!rootd_valid(cxlr, rootd)) {
+		dev_err(dev, "Picked invalid rootd\n");
+		return -ENXIO;
+	}
+
+	ep_port = to_cxl_port(cxlr->targets[0]->base.dev.parent);
+	cxlmd = to_cxl_memdev(ep_port->uport);
+	root_port = get_root_decoder(cxlmd);
+	ours = find_rootd(cxlr, root_port);
+	/* find_rootd() may return NULL; don't dereference it for the name */
+	if (ours != rootd)
+		dev_dbg(dev, "Picked different rootd %s %s\n",
+			dev_name(&rootd->dev),
+			ours ? dev_name(&ours->dev) : "(none)");
+	if (ours)
+		put_device(&ours->dev);
+
+	return bind_region(cxlr);
+}
+
+static struct cxl_driver cxl_region_driver = {
+	.name = "cxl_region",
+	.probe = cxl_region_probe,
+	.id = CXL_DEVICE_REGION,
+};
+module_cxl_driver(cxl_region_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(CXL);
+MODULE_ALIAS_CXL(CXL_DEVICE_REGION);