From patchwork Wed Mar 16 23:02:57 2022
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 12783246
From: Ben Widawsky
To: linux-cxl@vger.kernel.org
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams,
 Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 1/7] cxl/core: Use is_endpoint_decoder
Date: Wed, 16 Mar 2022 16:02:57 -0700
Message-Id: <20220316230303.1813397-2-ben.widawsky@intel.com>
In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com>
References: <20220316230303.1813397-1-ben.widawsky@intel.com>

Save some characters and check the decoder type directly rather than the
port type. There's no need to check whether the port is an endpoint port
since we already know that the decoder, after allocation, has a
specified type.
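[Editor's note: the check above reduces to a pointer comparison on the
decoder's own device_type, with no walk up to the parent port. A minimal,
self-contained C sketch of the pattern follows; the struct definitions are
simplified stand-ins, not the kernel's real ones.]

#include <stdbool.h>
#include <stdio.h>

/* Reduced stand-ins for the kernel's struct device / device_type. */
struct device_type { const char *name; };
struct device { const struct device_type *type; };

static const struct device_type cxl_decoder_endpoint_type = {
	.name = "cxl_decoder_endpoint",
};

/* Classify by the decoder's own device type; no parent-port lookup. */
static bool is_endpoint_decoder(struct device *dev)
{
	return dev->type == &cxl_decoder_endpoint_type;
}

int main(void)
{
	struct device dec = { .type = &cxl_decoder_endpoint_type };

	printf("endpoint? %d\n", is_endpoint_decoder(&dec));
	return 0;
}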
Signed-off-by: Ben Widawsky
---
 drivers/cxl/core/hdm.c  | 2 +-
 drivers/cxl/core/port.c | 2 +-
 drivers/cxl/cxl.h       | 1 +
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index a9ae8620701f..808b19215425 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -190,7 +190,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 	else
 		cxld->target_type = CXL_DECODER_ACCELERATOR;
 
-	if (is_cxl_endpoint(to_cxl_port(cxld->dev.parent)))
+	if (is_endpoint_decoder(&cxld->dev))
 		return 0;
 
 	target_list.value =
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 1a0c3d757a29..bda40e91af2b 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -293,7 +293,7 @@ static const struct device_type cxl_decoder_root_type = {
 	.groups = cxl_decoder_root_attribute_groups,
 };
 
-static bool is_endpoint_decoder(struct device *dev)
+bool is_endpoint_decoder(struct device *dev)
 {
 	return dev->type == &cxl_decoder_endpoint_type;
 }
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 4c1963c0c8d2..4a93d409328f 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -357,6 +357,7 @@ struct cxl_dport *cxl_find_dport_by_dev(struct cxl_port *port,
 struct cxl_decoder *to_cxl_decoder(struct device *dev);
 bool is_root_decoder(struct device *dev);
+bool is_endpoint_decoder(struct device *dev);
 bool is_cxl_decoder(struct device *dev);
 struct cxl_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 					   unsigned int nr_targets);

From patchwork Wed Mar 16 23:02:58 2022
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 12783250
From: Ben Widawsky
To: linux-cxl@vger.kernel.org
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams,
 Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 2/7] cxl/core: Distinguish cxl_decoder into types
Date: Wed, 16 Mar 2022 16:02:58 -0700
Message-Id: <20220316230303.1813397-3-ben.widawsky@intel.com>
In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com>
References: <20220316230303.1813397-1-ben.widawsky@intel.com>

CXL HDM decoders have distinct properties at each level in the
hierarchy. Root decoders manage host physical address space. Switch
decoders manage demultiplexing of data to downstream targets. Endpoint
decoders must be aware of physical media size constraints. To properly
support these distinct needs, introduce a dedicated structure for each
decoder type. As endpoint decoders don't yet handle media size
accounting, that is saved for a later patch.

CXL HDM decoders do share some architectural properties at all levels:
interleave properties, flags, and types. Those are retained in the
common base structure and, where possible, still used directly.
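[Editor's note: for readers unfamiliar with the base/derived layout this
patch adopts, here is a minimal standalone C sketch of the container_of()
up/downcast pattern it relies on. The struct contents are illustrative
stand-ins, not the kernel definitions.]

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cxl_decoder { int id; };			/* common base */

struct cxl_endpoint_decoder {
	struct cxl_decoder base;		/* embedded base class */
	long long range_start, range_end;	/* endpoint-only state */
};

/* Downcast from the shared base back to the type-specific struct. */
static struct cxl_endpoint_decoder *
to_cxl_endpoint_decoder(struct cxl_decoder *cxld)
{
	return container_of(cxld, struct cxl_endpoint_decoder, base);
}

int main(void)
{
	struct cxl_endpoint_decoder cxled = { .base = { .id = 3 } };
	struct cxl_decoder *cxld = &cxled.base;	/* upcast is just &member */

	printf("id=%d\n", to_cxl_endpoint_decoder(cxld)->base.id);
	return 0;
}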
Signed-off-by: Ben Widawsky
---
 drivers/cxl/acpi.c           |   9 ++-
 drivers/cxl/core/hdm.c       |   8 +--
 drivers/cxl/core/port.c      |  99 ++++++++++++++++++++---------
 drivers/cxl/cxl.h            | 118 +++++++++++++++++++++++++++++++----
 tools/testing/cxl/test/cxl.c |   7 +--
 5 files changed, 186 insertions(+), 55 deletions(-)

diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
index 09d6811736f2..822b615a25f4 100644
--- a/drivers/cxl/acpi.c
+++ b/drivers/cxl/acpi.c
@@ -108,8 +108,7 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 	cxld->flags = cfmws_to_decoder_flags(cfmws->restrictions);
 	cxld->target_type = CXL_DECODER_EXPANDER;
-	cxld->platform_res = (struct resource)DEFINE_RES_MEM(cfmws->base_hpa,
-							     cfmws->window_size);
+	cxl_set_decoder_extent(cxld, cfmws->base_hpa, cfmws->window_size);
 	cxld->interleave_ways = CFMWS_INTERLEAVE_WAYS(cfmws);
 	cxld->interleave_granularity = CFMWS_INTERLEAVE_GRANULARITY(cfmws);
 
@@ -120,12 +119,12 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg,
 	rc = cxl_decoder_autoremove(dev, cxld);
 	if (rc) {
 		dev_err(dev, "Failed to add decoder for %pr\n",
-			&cxld->platform_res);
+			&to_cxl_root_decoder(cxld)->res);
 		return 0;
 	}
 	dev_dbg(dev, "add: %s node: %d range %pr\n", dev_name(&cxld->dev),
-		phys_to_target_node(cxld->platform_res.start),
-		&cxld->platform_res);
+		phys_to_target_node(to_cxl_root_decoder(cxld)->res.start),
+		&to_cxl_root_decoder(cxld)->res);
 
 	return 0;
 }
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 808b19215425..83404cdb846b 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -6,6 +6,7 @@
 #include "cxlmem.h"
 #include "core.h"
+#include "cxl.h"
 
 /**
  * DOC: cxl core hdm
@@ -165,10 +166,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 		return -ENXIO;
 	}
 
-	cxld->decoder_range = (struct range) {
-		.start = base,
-		.end = base + size - 1,
-	};
+	cxl_set_decoder_extent(cxld, base, size);
 
 	/* switch decoders are always enabled if committed */
 	if (ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED) {
@@ -235,7 +233,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 		struct cxl_decoder *cxld;
 
 		if (is_cxl_endpoint(port))
-			cxld = cxl_endpoint_decoder_alloc(port);
+			cxld = &cxl_endpoint_decoder_alloc(port)->base;
 		else
 			cxld = cxl_switch_decoder_alloc(port, target_count);
 		if (IS_ERR(cxld)) {
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index bda40e91af2b..c46f0b01ce3c 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -73,14 +73,8 @@ static ssize_t start_show(struct device *dev, struct device_attribute *attr,
 			  char *buf)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
-	u64 start;
 
-	if (is_root_decoder(dev))
-		start = cxld->platform_res.start;
-	else
-		start = cxld->decoder_range.start;
-
-	return sysfs_emit(buf, "%#llx\n", start);
+	return sysfs_emit(buf, "%#llx\n", cxl_get_decoder_extent(cxld).start);
 }
 static DEVICE_ATTR_ADMIN_RO(start);
 
@@ -88,14 +82,9 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
 			 char *buf)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
-	u64 size;
+	struct range r = cxl_get_decoder_extent(cxld);
 
-	if (is_root_decoder(dev))
-		size = resource_size(&cxld->platform_res);
-	else
-		size = range_len(&cxld->decoder_range);
-
-	return sysfs_emit(buf, "%#llx\n", size);
+	return sysfs_emit(buf, "%#llx\n", range_len(&r));
 }
 static DEVICE_ATTR_RO(size);
 
@@ -152,18 +141,19 @@ static DEVICE_ATTR_RO(target_type);
 
 static ssize_t emit_target_list(struct cxl_decoder *cxld, char *buf)
 {
+	struct cxl_decoder_targets *t = cxl_get_decoder_targets(cxld);
 	ssize_t offset = 0;
 	int i, rc = 0;
 
 	for (i = 0; i < cxld->interleave_ways; i++) {
-		struct cxl_dport *dport = cxld->target[i];
+		struct cxl_dport *dport = t->target[i];
 		struct cxl_dport *next = NULL;
 
 		if (!dport)
 			break;
 
 		if (i + 1 < cxld->interleave_ways)
-			next = cxld->target[i + 1];
+			next = t->target[i + 1];
 		rc = sysfs_emit_at(buf, offset, "%d%s", dport->port_id,
 				   next ? "," : "");
 		if (rc < 0)
@@ -178,14 +168,15 @@ static ssize_t target_list_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
 	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	struct cxl_decoder_targets *t = cxl_get_decoder_targets(cxld);
 	ssize_t offset;
 	unsigned int seq;
 	int rc;
 
 	do {
-		seq = read_seqbegin(&cxld->target_lock);
+		seq = read_seqbegin(&t->target_lock);
 		rc = emit_target_list(cxld, buf);
-	} while (read_seqretry(&cxld->target_lock, seq));
+	} while (read_seqretry(&t->target_lock, seq));
 
 	if (rc < 0)
 		return rc;
@@ -297,6 +288,7 @@ bool is_endpoint_decoder(struct device *dev)
 {
 	return dev->type == &cxl_decoder_endpoint_type;
 }
+EXPORT_SYMBOL_NS_GPL(is_endpoint_decoder, CXL);
 
 bool is_root_decoder(struct device *dev)
 {
@@ -1167,6 +1159,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_find_dport_by_dev, CXL);
 static int decoder_populate_targets(struct cxl_decoder *cxld,
 				    struct cxl_port *port, int *target_map)
 {
+	struct cxl_decoder_targets *t = cxl_get_decoder_targets(cxld);
 	int i, rc = 0;
 
 	if (!target_map)
@@ -1177,21 +1170,70 @@ static int decoder_populate_targets(struct cxl_decoder *cxld,
 	if (list_empty(&port->dports))
 		return -EINVAL;
 
-	write_seqlock(&cxld->target_lock);
-	for (i = 0; i < cxld->nr_targets; i++) {
+	write_seqlock(&t->target_lock);
+	for (i = 0; i < t->nr_targets; i++) {
 		struct cxl_dport *dport = find_dport(port, target_map[i]);
 
 		if (!dport) {
 			rc = -ENXIO;
 			break;
 		}
-		cxld->target[i] = dport;
+		t->target[i] = dport;
 	}
-	write_sequnlock(&cxld->target_lock);
+	write_sequnlock(&t->target_lock);
 
 	return rc;
 }
 
+static struct cxl_decoder *__cxl_decoder_alloc(struct cxl_port *port,
+					       unsigned int nr_targets)
+{
+	struct cxl_decoder *cxld;
+
+	if (is_cxl_endpoint(port)) {
+		struct cxl_endpoint_decoder *cxled;
+
+		cxled = kzalloc(sizeof(*cxled), GFP_KERNEL);
+		if (!cxled)
+			return NULL;
+		cxld = &cxled->base;
+	} else if (is_cxl_root(port)) {
+		struct cxl_root_decoder *cxlrd;
+
+		cxlrd = kzalloc(sizeof(*cxlrd), GFP_KERNEL);
+		if (!cxlrd)
+			return NULL;
+
+		cxlrd->targets =
+			kzalloc(struct_size(cxlrd->targets, target, nr_targets),
+				GFP_KERNEL);
+		if (!cxlrd->targets) {
+			kfree(cxlrd);
+			return NULL;
+		}
+		cxlrd->targets->nr_targets = nr_targets;
+		seqlock_init(&cxlrd->targets->target_lock);
+		cxld = &cxlrd->base;
+	} else {
+		struct cxl_switch_decoder *cxlsd;
+
+		cxlsd = kzalloc(sizeof(*cxlsd), GFP_KERNEL);
+		if (!cxlsd)
+			return NULL;
+
+		cxlsd->targets =
+			kzalloc(struct_size(cxlsd->targets, target, nr_targets),
+				GFP_KERNEL);
+		if (!cxlsd->targets) {
+			kfree(cxlsd);
+			return NULL;
+		}
+		cxlsd->targets->nr_targets = nr_targets;
+		seqlock_init(&cxlsd->targets->target_lock);
+		cxld = &cxlsd->base;
+	}
+
+	return cxld;
+}
+
 /**
  * cxl_decoder_alloc - Allocate a new CXL decoder
  * @port: owning port of this decoder
@@ -1217,9 +1259,10 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	if (nr_targets > CXL_DECODER_MAX_INTERLEAVE)
 		return ERR_PTR(-EINVAL);
 
-	cxld = kzalloc(struct_size(cxld, target, nr_targets), GFP_KERNEL);
+	cxld = __cxl_decoder_alloc(port, nr_targets);
 	if (!cxld)
 		return ERR_PTR(-ENOMEM);
 
 	rc = ida_alloc(&port->decoder_ida, GFP_KERNEL);
 	if (rc < 0)
@@ -1229,8 +1272,6 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	get_device(&port->dev);
 	cxld->id = rc;
 
-	cxld->nr_targets = nr_targets;
-	seqlock_init(&cxld->target_lock);
 	dev = &cxld->dev;
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
@@ -1247,7 +1288,7 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port,
 	cxld->interleave_ways = 1;
 	cxld->interleave_granularity = PAGE_SIZE;
 	cxld->target_type = CXL_DECODER_EXPANDER;
-	cxld->platform_res = (struct resource)DEFINE_RES_MEM(0, 0);
+	cxl_set_decoder_extent(cxld, 0, 0);
 
 	return cxld;
 err:
@@ -1302,12 +1343,12 @@ EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL);
  *
  * Return: A new cxl decoder to be registered by cxl_decoder_add()
  */
-struct cxl_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
+struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port)
 {
 	if (!is_cxl_endpoint(port))
 		return ERR_PTR(-EINVAL);
 
-	return cxl_decoder_alloc(port, 0);
+	return to_cxl_endpoint_decoder(cxl_decoder_alloc(port, 0));
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL);
 
@@ -1366,7 +1407,7 @@ int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map)
 	 * other resources are just sub ranges within the main decoder resource.
 	 */
 	if (is_root_decoder(dev))
-		cxld->platform_res.name = dev_name(dev);
+		to_cxl_root_decoder(cxld)->res.name = dev_name(dev);
 
 	cxl_set_lock_class(dev);
 	return device_add(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 4a93d409328f..f523268060fd 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -210,6 +210,18 @@ enum cxl_decoder_type {
  */
 #define CXL_DECODER_MAX_INTERLEAVE 16
 
+/**
+ * struct cxl_decoder_targets - Target information for root and switch decoders.
+ * @target_lock: coordinate coherent reads of the target list
+ * @nr_targets: number of elements in @target
+ * @target: active ordered target list in current decoder configuration
+ */
+struct cxl_decoder_targets {
+	seqlock_t target_lock;
+	int nr_targets;
+	struct cxl_dport *target[];
+};
+
 /**
  * struct cxl_decoder - CXL address range decode configuration
  * @dev: this decoder's device
@@ -220,26 +232,60 @@ enum cxl_decoder_type {
  * @interleave_granularity: data stride per dport
  * @target_type: accelerator vs expander (type2 vs type3) selector
  * @flags: memory type capabilities and locking
- * @target_lock: coordinate coherent reads of the target list
- * @nr_targets: number of elements in @target
- * @target: active ordered target list in current decoder configuration
  */
 struct cxl_decoder {
 	struct device dev;
 	int id;
-	union {
-		struct resource platform_res;
-		struct range decoder_range;
-	};
 	int interleave_ways;
 	int interleave_granularity;
 	enum cxl_decoder_type target_type;
 	unsigned long flags;
-	seqlock_t target_lock;
-	int nr_targets;
-	struct cxl_dport *target[];
 };
 
+/**
+ * struct cxl_endpoint_decoder - A decoder residing in a CXL endpoint.
+ * @base: Base class decoder
+ * @range: Host physical address space consumed by this decoder.
+ */
+struct cxl_endpoint_decoder {
+	struct cxl_decoder base;
+	struct range range;
+};
+
+/**
+ * struct cxl_switch_decoder - A decoder in a switch or hostbridge.
+ * @base: Base class decoder
+ * @range: Host physical address space consumed by this decoder.
+ * @targets: Downstream targets for this switch.
+ */
+struct cxl_switch_decoder {
+	struct cxl_decoder base;
+	struct range range;
+	struct cxl_decoder_targets *targets;
+};
+
+/**
+ * struct cxl_root_decoder - A toplevel/platform decoder
+ * @base: Base class decoder
+ * @res: host address space owned by this decoder
+ * @targets: Downstream targets (i.e. hostbridges).
+ */
+struct cxl_root_decoder {
+	struct cxl_decoder base;
+	struct resource res;
+	struct cxl_decoder_targets *targets;
+};
+
+#define _to_cxl_decoder(x)                                                     \
+	static inline struct cxl_##x##_decoder *to_cxl_##x##_decoder(          \
+		struct cxl_decoder *cxld)                                      \
+	{                                                                      \
+		return container_of(cxld, struct cxl_##x##_decoder, base);     \
+	}
+
+_to_cxl_decoder(root)
+_to_cxl_decoder(switch)
+_to_cxl_decoder(endpoint)
+
 /**
  * enum cxl_nvdimm_brige_state - state machine for managing bus rescans
@@ -364,11 +410,61 @@ struct cxl_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
 struct cxl_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
 					     unsigned int nr_targets);
 int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);
-struct cxl_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port);
+struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port);
 int cxl_decoder_add_locked(struct cxl_decoder *cxld, int *target_map);
 int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld);
 int cxl_endpoint_autoremove(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
 
+static inline struct cxl_decoder_targets *
+cxl_get_decoder_targets(struct cxl_decoder *cxld)
+{
+	if (is_root_decoder(&cxld->dev))
+		return to_cxl_root_decoder(cxld)->targets;
+	else if (is_endpoint_decoder(&cxld->dev))
+		return NULL;
+	else
+		return to_cxl_switch_decoder(cxld)->targets;
+}
+
+static inline void cxl_set_decoder_extent(struct cxl_decoder *cxld,
+					  resource_size_t base,
+					  resource_size_t size)
+{
+	if (is_root_decoder(&cxld->dev))
+		to_cxl_root_decoder(cxld)->res =
+			(struct resource)DEFINE_RES_MEM(base, size);
+	else if (is_endpoint_decoder(&cxld->dev))
+		to_cxl_endpoint_decoder(cxld)->range = (struct range){
+			.start = base,
+			.end = base + size - 1
+		};
+	else
+		to_cxl_switch_decoder(cxld)->range = (struct range){
+			.start = base,
+			.end = base + size - 1
+		};
+}
+
+static inline struct range cxl_get_decoder_extent(struct cxl_decoder *cxld)
+{
+	struct range ret;
+
+	if (is_root_decoder(&cxld->dev)) {
+		struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld);
+
+		ret = (struct range) {
+			.start = cxlrd->res.start,
+			.end = cxlrd->res.end
+		};
+	} else if (is_endpoint_decoder(&cxld->dev)) {
+		ret = to_cxl_endpoint_decoder(cxld)->range;
+	} else {
+		ret = to_cxl_switch_decoder(cxld)->range;
+	}
+
+	return ret;
+}
+
 struct cxl_hdm;
 struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port);
 int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm);
diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c
index 431f2bddf6c8..5b9fe64e4582 100644
--- a/tools/testing/cxl/test/cxl.c
+++ b/tools/testing/cxl/test/cxl.c
@@ -454,17 +454,14 @@ static int mock_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 		if (target_count)
 			cxld = cxl_switch_decoder_alloc(port, target_count);
 		else
-			cxld = cxl_endpoint_decoder_alloc(port);
+			cxld = &cxl_endpoint_decoder_alloc(port)->base;
 		if (IS_ERR(cxld)) {
 			dev_warn(&port->dev,
 				 "Failed to allocate the decoder\n");
 			return PTR_ERR(cxld);
 		}
 
-		cxld->decoder_range = (struct range) {
-			.start = 0,
-			.end = -1,
-		};
+		cxl_set_decoder_extent(cxld, 0, 0);
 
 		cxld->flags = CXL_DECODER_F_ENABLE;
 		cxld->interleave_ways = min_not_zero(target_count, 1);
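[Editor's note: one detail worth calling out from the patch above is that
cxl_decoder_targets ends in a flexible array member sized with
struct_size(), so the header and target list share one allocation. A small
userspace approximation of that pattern, with calloc standing in for
kzalloc:]

#include <stdio.h>
#include <stdlib.h>

struct cxl_dport;

/* Trailing flexible array: one allocation holds header plus target list. */
struct cxl_decoder_targets {
	int nr_targets;
	struct cxl_dport *target[];
};

int main(void)
{
	int nr = 4;
	/* Userspace equivalent of struct_size(t, target, nr): */
	struct cxl_decoder_targets *t =
		calloc(1, sizeof(*t) + nr * sizeof(t->target[0]));

	if (!t)
		return 1;
	t->nr_targets = nr;
	printf("allocated room for %d targets\n", t->nr_targets);
	free(t);
	return 0;
}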
From patchwork Wed Mar 16 23:02:59 2022
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 12783248
From: Ben Widawsky
To: linux-cxl@vger.kernel.org
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams,
 Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 3/7] cxl/port: Surface ram and pmem resources
Date: Wed, 16 Mar 2022 16:02:59 -0700
Message-Id: <20220316230303.1813397-4-ben.widawsky@intel.com>
In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com>
References: <20220316230303.1813397-1-ben.widawsky@intel.com>

CXL Type 2 and Type 3 endpoints may contain Host-managed Device Memory
(HDM). This memory can be volatile, persistent, or some combination of
both. Similar to the root decoder, a port's resources can be considered
the memory from which its decoders allocate. Unlike the root decoder
resource, device resources are in the device physical address space
domain.
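[Editor's note: to make the capacity split concrete, here is a small
standalone sketch of the arithmetic the patch encodes with
gen_pool_add_owner(): device capacity below pmem_offset is volatile, and
the remainder is persistent. The type and function names here are
illustrative, not the kernel's.]

#include <stdint.h>
#include <stdio.h>

/* A device-physical address span belonging to one capacity pool. */
struct dpa_partition { uint64_t start, len; };

static void partition_media(uint64_t capacity, uint64_t pmem_offset,
			    struct dpa_partition *vol,
			    struct dpa_partition *pmem)
{
	vol->start = 0;
	vol->len = pmem_offset;			/* [0, pmem_offset) is volatile */
	pmem->start = pmem_offset;
	pmem->len = capacity - pmem_offset;	/* the remainder is persistent */
}

int main(void)
{
	struct dpa_partition vol, pmem;

	partition_media(512ULL << 20, 256ULL << 20, &vol, &pmem);
	printf("volatile: %#llx+%#llx, pmem: %#llx+%#llx\n",
	       (unsigned long long)vol.start, (unsigned long long)vol.len,
	       (unsigned long long)pmem.start, (unsigned long long)pmem.len);
	return 0;
}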
Signed-off-by: Ben Widawsky
---
 drivers/cxl/core/port.c | 54 +++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h       |  7 ++++++
 drivers/cxl/mem.c       |  7 ++++--
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index c46f0b01ce3c..6653de4dfb43 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -2,6 +2,7 @@
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
 #include
 #include
+#include
 #include
 #include
 #include
@@ -503,6 +504,59 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
 
+static int *gen_pool_vcookie;
+static int *gen_pool_pcookie;
+
+struct cxl_port *devm_cxl_add_endpoint_port(struct device *host,
+					    struct device *uport,
+					    resource_size_t component_reg_phys,
+					    u64 capacity, u64 pmem_offset,
+					    struct cxl_port *parent_port)
+{
+	int rc;
+	struct cxl_port *ep =
+		devm_cxl_add_port(host, uport, component_reg_phys, parent_port);
+	if (IS_ERR(ep) || !capacity)
+		return ep;
+
+	ep->media = devm_gen_pool_create(&ep->dev, ilog2(SZ_256M),
+					 NUMA_NO_NODE, NULL);
+	if (IS_ERR(ep->media)) {
+		ep = ERR_CAST(ep->media);
+		goto err_out;
+	}
+
+	if (pmem_offset) {
+		rc = gen_pool_add_owner(ep->media, 0, -1, pmem_offset,
+					NUMA_NO_NODE, &gen_pool_vcookie);
+		if (rc) {
+			ep = ERR_PTR(rc);
+			goto err_out;
+		}
+		dev_dbg(&ep->dev, "Created volatile capacity pool: %zx\n",
+			gen_pool_avail(ep->media));
+	}
+
+	if (pmem_offset < capacity) {
+		rc = gen_pool_add_owner(ep->media, pmem_offset, -1,
+					capacity - pmem_offset, NUMA_NO_NODE,
+					&gen_pool_pcookie);
+		if (rc) {
+			ep = ERR_PTR(rc);
+			goto err_out;
+		}
+		dev_dbg(&ep->dev, "Created persistent capacity pool: %zx\n",
+			gen_pool_avail(ep->media));
+	}
+
+	return ep;
+
+err_out:
+	dev_err(&ep->dev, "Failed to allocate gen pools\n");
+	put_device(&ep->dev);
+	return ep;
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_add_endpoint_port, CXL);
+
 struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port)
 {
 	/* There is no pci_bus associated with a CXL platform-root port */
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f523268060fd..d18e93e77f7e 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -330,6 +330,7 @@ struct cxl_nvdimm {
  * @component_reg_phys: component register capability base address (optional)
  * @dead: last ep has been removed, force port re-creation
  * @depth: How deep this port is relative to the root. depth 0 is the root.
+ * @media: Media's address space (endpoint only)
  */
 struct cxl_port {
 	struct device dev;
@@ -341,6 +342,7 @@ struct cxl_port {
 	resource_size_t component_reg_phys;
 	bool dead;
 	unsigned int depth;
+	struct gen_pool *media;
 };
 
 /**
@@ -389,6 +391,11 @@ struct pci_bus *cxl_port_to_pci_bus(struct cxl_port *port);
 struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 				   resource_size_t component_reg_phys,
 				   struct cxl_port *parent_port);
+struct cxl_port *devm_cxl_add_endpoint_port(struct device *host,
+					    struct device *uport,
+					    resource_size_t component_reg_phys,
+					    u64 capacity, u64 pmem_offset,
+					    struct cxl_port *parent_port);
 struct cxl_port *find_cxl_root(struct device *dev);
 int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd);
 int cxl_bus_rescan(void);
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 91fb8d5b21a7..b6f8edaed802 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -50,9 +50,12 @@ static int create_endpoint(struct cxl_memdev *cxlmd,
 {
 	struct cxl_dev_state *cxlds = cxlmd->cxlds;
 	struct cxl_port *endpoint;
+	u64 partition = range_len(&cxlds->ram_range);
+	u64 size = range_len(&cxlds->ram_range) + range_len(&cxlds->pmem_range);
 
-	endpoint = devm_cxl_add_port(&parent_port->dev, &cxlmd->dev,
-				     cxlds->component_reg_phys, parent_port);
+	endpoint = devm_cxl_add_endpoint_port(&parent_port->dev, &cxlmd->dev,
+					      cxlds->component_reg_phys, size,
+					      partition, parent_port);
 	if (IS_ERR(endpoint))
 		return PTR_ERR(endpoint);
From patchwork Wed Mar 16 23:03:00 2022
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 12783249
From: Ben Widawsky
To: linux-cxl@vger.kernel.org
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams,
 Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 4/7] cxl/core/hdm: Allocate resources from the media
Date: Wed, 16 Mar 2022 16:03:00 -0700
Message-Id: <20220316230303.1813397-5-ben.widawsky@intel.com>
In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com>
References: <20220316230303.1813397-1-ben.widawsky@intel.com>

Similar to how decoders consume address space from the root decoder,
they also consume space on the device's physical media. For future
allocations, it's important to mark that space as used/busy.
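[Editor's note: the mechanism described above amounts to a fixed-address
reservation against a chunked pool: enumeration already knows which device
addresses a committed decoder consumes, so the allocation must land exactly
at that offset or fail. A standalone sketch, with a plain bitmap standing
in for the kernel's gen_pool and 256MiB chunks assumed to mirror the pool's
minimum order:]

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK	(256ULL << 20)	/* assumed pool granularity, like SZ_256M */
#define NCHUNKS	16

static bool busy[NCHUNKS];

/* Reserve [base, base + len) exactly, or fail without side effects. */
static bool reserve_fixed(uint64_t base, uint64_t len)
{
	uint64_t first = base / CHUNK, n = len / CHUNK, i;

	for (i = first; i < first + n; i++)
		if (i >= NCHUNKS || busy[i])
			return false;
	for (i = first; i < first + n; i++)
		busy[i] = true;
	return true;
}

int main(void)
{
	printf("%d\n", reserve_fixed(0, CHUNK));	/* 1: was free */
	printf("%d\n", reserve_fixed(0, CHUNK));	/* 0: now busy */
	return 0;
}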
Signed-off-by: Ben Widawsky
---
 drivers/cxl/core/core.h |  6 +++
 drivers/cxl/core/hdm.c  | 44 +++++++++++++++++-
 drivers/cxl/core/port.c | 99 +++++++++++++++++++++++++++++++--------
 drivers/cxl/cxl.h       | 16 +++++--
 4 files changed, 142 insertions(+), 23 deletions(-)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 1a50c0fc399c..1dea4dbb4f33 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -9,6 +9,12 @@ extern const struct device_type cxl_nvdimm_type;
 
 extern struct attribute_group cxl_base_attribute_group;
 
+extern struct device_attribute dev_attr_create_pmem_region;
+extern struct device_attribute dev_attr_delete_region;
+
+extern int *gen_pool_vcookie;
+extern int *gen_pool_pcookie;
+
 struct cxl_send_command;
 struct cxl_mem_query_commands;
 int cxl_query_cmd(struct cxl_memdev *cxlmd,
diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 83404cdb846b..4a4e07a010ec 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
 #include
+#include
 #include
 #include
 
@@ -188,8 +189,11 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 	else
 		cxld->target_type = CXL_DECODER_ACCELERATOR;
 
-	if (is_endpoint_decoder(&cxld->dev))
+	if (is_endpoint_decoder(&cxld->dev)) {
+		to_cxl_endpoint_decoder(cxld)->skip =
+			ioread64_hi_lo(hdm + CXL_HDM_DECODER0_TL_LOW(which));
 		return 0;
+	}
 
 	target_list.value =
 		ioread64_hi_lo(hdm + CXL_HDM_DECODER0_TL_LOW(which));
@@ -199,6 +203,35 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
 	return 0;
 }
 
+static void cxl_request_regions(struct cxl_endpoint_decoder *cxled,
+				resource_size_t base)
+{
+	struct cxl_port *port = to_cxl_port(cxled->base.dev.parent);
+	struct genpool_data_fixed gpdf;
+	struct range r = cxled->hrange;
+	unsigned long addr;
+	void *type;
+
+	if (!(cxled->base.flags & CXL_DECODER_F_ENABLE))
+		return;
+
+	gpdf.offset = base;
+	addr = gen_pool_alloc_algo_owner(port->media, range_len(&r),
+					 gen_pool_fixed_alloc, &gpdf, &type);
+	if (addr != base || (base == 0 && !type)) {
+		dev_warn(&port->dev, "Couldn't allocate media\n");
+		return;
+	} else if (type == &gen_pool_pcookie) {
+		dev_warn(&port->dev, "Enumerated a persistent capacity\n");
+		return;
+	}
+
+	cxled->drange = (struct range) {
+		.start = addr,
+		.end = addr + range_len(&r) - 1,
+	};
+}
+
 /**
  * devm_cxl_enumerate_decoders - add decoder objects per HDM register set
  * @cxlhdm: Structure to populate with HDM capabilities
@@ -208,6 +241,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 	void __iomem *hdm = cxlhdm->regs.hdm_decoder;
 	struct cxl_port *port = cxlhdm->port;
 	int i, committed, failed;
+	u64 base = 0;
 	u32 ctrl;
 
 	/*
@@ -230,6 +264,7 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 	for (i = 0, failed = 0; i < cxlhdm->decoder_count; i++) {
 		int target_map[CXL_DECODER_MAX_INTERLEAVE] = { 0 };
 		int rc, target_count = cxlhdm->target_count;
+		struct cxl_endpoint_decoder *cxled;
 		struct cxl_decoder *cxld;
 
 		if (is_cxl_endpoint(port))
@@ -255,6 +290,13 @@ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm)
 				"Failed to add decoder to port\n");
 			return rc;
 		}
+
+		if (!is_cxl_endpoint(port))
+			continue;
+
+		cxled = to_cxl_endpoint_decoder(cxld);
+		cxl_request_regions(cxled, base + cxled->skip);
+		base += cxled->skip + range_len(&cxled->hrange);
 	}
 
 	if (failed == cxlhdm->decoder_count) {
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 6653de4dfb43..fe50a42bed7b 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -452,16 +452,44 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 	return ERR_PTR(rc);
 }
 
-/**
- * devm_cxl_add_port - register a cxl_port in CXL memory decode hierarchy
- * @host: host device for devm operations
- * @uport: "physical" device implementing this upstream port
- * @component_reg_phys: (optional) for configurable cxl_port instances
- * @parent_port: next hop up in the CXL memory decode hierarchy
- */
-struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
-				   resource_size_t component_reg_phys,
-				   struct cxl_port *parent_port)
+int *gen_pool_vcookie;
+int *gen_pool_pcookie;
+
+static int create_gen_pools(struct cxl_port *ep, u64 capacity, u64 pmem_offset)
+{
+	int rc;
+
+	ep->media = gen_pool_create(ilog2(SZ_256M), NUMA_NO_NODE);
+	if (IS_ERR(ep->media))
+		return PTR_ERR(ep->media);
+
+	if (pmem_offset) {
+		rc = gen_pool_add_owner(ep->media, 0, -1, pmem_offset,
+					NUMA_NO_NODE, &gen_pool_vcookie);
+		if (rc) {
+			gen_pool_destroy(ep->media);
+			return rc;
+		}
+	}
+
+	if (pmem_offset < capacity) {
+		rc = gen_pool_add_owner(ep->media, pmem_offset, -1,
+					capacity - pmem_offset, NUMA_NO_NODE,
+					&gen_pool_pcookie);
+		if (rc) {
+			gen_pool_destroy(ep->media);
+			return rc;
+		}
+	}
+
+	return 0;
+}
+
+static struct cxl_port *__devm_cxl_add_port(struct device *host,
+					    struct device *uport,
+					    resource_size_t component_reg_phys,
+					    u64 capacity, u64 pmem_offset,
+					    struct cxl_port *parent_port)
 {
 	struct cxl_port *port;
 	struct device *dev;
@@ -474,12 +502,15 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 	if (parent_port)
 		port->depth = parent_port->depth + 1;
 	dev = &port->dev;
-	if (is_cxl_memdev(uport))
+	if (is_cxl_memdev(uport)) {
 		rc = dev_set_name(dev, "endpoint%d", port->id);
-	else if (parent_port)
+		if (!rc)
+			rc = create_gen_pools(port, capacity, pmem_offset);
+	} else if (parent_port) {
 		rc = dev_set_name(dev, "port%d", port->id);
-	else
+	} else {
 		rc = dev_set_name(dev, "root%d", port->id);
+	}
 	if (rc)
 		goto err;
 
@@ -502,10 +533,22 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
 	put_device(dev);
 	return ERR_PTR(rc);
 }
-EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
 
-static int *gen_pool_vcookie;
-static int *gen_pool_pcookie;
+/**
+ * devm_cxl_add_port - register a cxl_port in CXL memory decode hierarchy
+ * @host: host device for devm operations
+ * @uport: "physical" device implementing this upstream port
+ * @component_reg_phys: (optional) for configurable cxl_port instances
+ * @parent_port: next hop up in the CXL memory decode hierarchy
+ */
+struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
+				   resource_size_t component_reg_phys,
+				   struct cxl_port *parent_port)
+{
+	return __devm_cxl_add_port(host, uport, component_reg_phys, 0, 0,
+				   parent_port);
+}
+EXPORT_SYMBOL_NS_GPL(devm_cxl_add_port, CXL);
 
 struct cxl_port *devm_cxl_add_endpoint_port(struct device *host,
 					    struct device *uport,
@@ -513,9 +556,9 @@ struct cxl_port *devm_cxl_add_endpoint_port(struct device *host,
 					    u64 capacity, u64 pmem_offset,
 					    struct cxl_port *parent_port)
 {
-	int rc;
 	struct cxl_port *ep =
-		devm_cxl_add_port(host, uport, component_reg_phys, parent_port);
+		__devm_cxl_add_port(host, uport, component_reg_phys, capacity,
+				    pmem_offset, parent_port);
 	if (IS_ERR(ep) || !capacity)
 		return ep;
 
@@ -526,6 +569,8 @@ struct cxl_port *devm_cxl_add_endpoint_port(struct device *host,
 	}
 
 	if (pmem_offset) {
+		int rc;
+
 		rc = gen_pool_add_owner(ep->media, 0, -1, pmem_offset,
 					NUMA_NO_NODE, &gen_pool_vcookie);
 		if (rc) {
@@ -537,6 +582,8 @@ struct cxl_port *devm_cxl_add_endpoint_port(struct device *host,
 	}
 
 	if (pmem_offset < capacity) {
+		int rc;
+
 		rc = gen_pool_add_owner(ep->media, pmem_offset, -1,
 					capacity - pmem_offset, NUMA_NO_NODE,
 					&gen_pool_pcookie);
@@ -1250,6 +1297,7 @@ static struct cxl_decoder *__cxl_decoder_alloc(struct cxl_port *port,
 		cxled = kzalloc(sizeof(*cxled), GFP_KERNEL);
 		if (!cxled)
 			return NULL;
+		cxled->drange = (struct range){ 0, -1 };
 		cxld = &cxled->base;
 	} else if (is_cxl_root(port)) {
 		struct cxl_root_decoder *cxlrd;
@@ -1504,6 +1552,21 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL);
 
 static void cxld_unregister(void *dev)
 {
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+
+	if (is_endpoint_decoder(&cxld->dev)) {
+		struct cxl_endpoint_decoder *cxled =
+			to_cxl_endpoint_decoder(cxld);
+		struct cxl_port *ep = to_cxl_port(cxld->dev.parent);
+
+		if (!range_len(&cxled->drange))
+			goto out;
+
+		gen_pool_free(ep->media, cxled->drange.start,
+			      range_len(&cxled->drange));
+	}
+
+out:
	device_unregister(dev);
 }
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index d18e93e77f7e..e88b1efe54d3 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -245,11 +245,15 @@ struct cxl_decoder {
 /**
  * struct cxl_endpoint_decoder - A decoder residing in a CXL endpoint.
  * @base: Base class decoder
- * @range: Host physical address space consumed by this decoder.
+ * @hrange: Host physical address space consumed by this decoder.
+ * @drange: Device address space consumed by this decoder.
+ * @skip: The skip count as specified in the CXL specification.
  */
 struct cxl_endpoint_decoder {
 	struct cxl_decoder base;
-	struct range range;
+	struct range hrange;
+	struct range drange;
+	u64 skip;
 };
 
 /**
@@ -269,11 +273,15 @@ struct cxl_switch_decoder {
  * @base: Base class decoder
  * @res: host address space owned by this decoder
  * @targets: Downstream targets (i.e. hostbridges).
+ * @next_region_id: The pre-cached next region id.
+ * @id_lock: Protects next_region_id
  */
 struct cxl_root_decoder {
 	struct cxl_decoder base;
 	struct resource res;
 	struct cxl_decoder_targets *targets;
+	int next_region_id;
+	struct mutex id_lock; /* synchronizes access to next_region_id */
 };
 
 #define _to_cxl_decoder(x) \
@@ -441,7 +449,7 @@ static inline void cxl_set_decoder_extent(struct cxl_decoder *cxld,
 		to_cxl_root_decoder(cxld)->res =
 			(struct resource)DEFINE_RES_MEM(base, size);
 	else if (is_endpoint_decoder(&cxld->dev))
-		to_cxl_endpoint_decoder(cxld)->range = (struct range){
+		to_cxl_endpoint_decoder(cxld)->drange = (struct range){
 			.start = base,
 			.end = base + size - 1
 		};
@@ -464,7 +472,7 @@ static inline struct range cxl_get_decoder_extent(struct cxl_decoder *cxld)
 			.end = cxlrd->res.end
 		};
 	} else if (is_endpoint_decoder(&cxld->dev)) {
-		ret = to_cxl_endpoint_decoder(cxld)->range;
+		ret = to_cxl_endpoint_decoder(cxld)->drange;
 	} else {
 		ret = to_cxl_switch_decoder(cxld)->range;
 	}
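[Editor's note: the enumeration loop in this patch maintains a running
device-physical base, where each committed decoder consumes its skip plus
its mapped length. A standalone sketch of that accumulation; names and
sizes are illustrative only.]

#include <stdint.h>
#include <stdio.h>

struct ep_decoder { uint64_t skip, len; };

int main(void)
{
	/* Two committed decoders; the second skips 256MiB of device space. */
	struct ep_decoder decs[] = {
		{ .skip = 0,            .len = 512ULL << 20 },
		{ .skip = 256ULL << 20, .len = 256ULL << 20 },
	};
	uint64_t base = 0;

	/*
	 * HDM decoders decode increasing addresses, so each decoder's
	 * device offset is the running total of all prior skip + size.
	 */
	for (int i = 0; i < 2; i++) {
		uint64_t dpa = base + decs[i].skip;

		printf("decoder%d maps DPA %#llx+%#llx\n", i,
		       (unsigned long long)dpa,
		       (unsigned long long)decs[i].len);
		base += decs[i].skip + decs[i].len;
	}
	return 0;
}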
From patchwork Wed Mar 16 23:03:01 2022
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 12783251
From: Ben Widawsky
To: linux-cxl@vger.kernel.org
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams,
 Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 5/7] cxl/core/port: add decoder attrs for size and volatility
Date: Wed, 16 Mar 2022 16:03:01 -0700
Message-Id: <20220316230303.1813397-6-ben.widawsky@intel.com>
In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com>
References: <20220316230303.1813397-1-ben.widawsky@intel.com>

Endpoint decoders have the decoder-unique properties of having their
range constrained by the media they're a part of, and of having a
concrete need to disambiguate between volatile and persistent capacity
(due to partitioning). As part of region programming, these decoders
will be required to be pre-configured, i.e. have the size and volatility
set.

Endpoint decoders must consider two different address spaces for address
allocation. System RAM will need to be mapped for use of this memory if
it is not set up in the EFI memory map. Additionally, the CXL device
itself has its own address space domain which requires allocation and
management. To handle the device address space, a gen_pool is used per
device. Host physical address space will get allocated as needed when
the region is created.

The existing gen_pool API is an almost perfect fit for managing the
device memory. There is one impediment, however: HDM decoders (as of the
CXL 2.0 specification) must map incrementally increasing addresses.
Section 8.2.5.12.20 states, "Decoder[m+1].Base >= (Decoder[m].Base +
Decoder[m].Size)". To handle this, a custom gen_pool algorithm is
implemented which searches for the last enabled decoder and allocates at
the next address after it. This is like a first-fit + fixed algorithm.

/sys/bus/cxl/devices/decoder3.0
├── devtype
├── interleave_granularity
├── interleave_ways
├── locked
├── modalias
├── size
├── start
├── subsystem -> ../../../../../../../bus/cxl
├── target_type
├── uevent
└── volatile
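[Editor's note: to visualize the allocation constraint, here is a
standalone sketch of the ordering rule. It computes where the next
decoder's allocation must begin given the previously enabled decoders; the
structures are simplified stand-ins for the kernel's.]

#include <stdint.h>
#include <stdio.h>

struct dec { int id; int enabled; uint64_t end; /* exclusive */ };

/*
 * Decoder[m+1].Base >= Decoder[m].Base + Decoder[m].Size means a new
 * allocation may only start where the highest enabled lower-id decoder
 * left off.
 */
static uint64_t next_alloc_base(const struct dec *decs, int n, int id)
{
	uint64_t base = 0;

	for (int i = 0; i < n; i++)
		if (decs[i].id < id && decs[i].enabled && decs[i].end > base)
			base = decs[i].end;
	return base;
}

int main(void)
{
	struct dec decs[] = {
		{ .id = 0, .enabled = 1, .end = 256ULL << 20 },
		{ .id = 1, .enabled = 0, .end = 0 },
	};

	printf("decoder2 must start at %#llx\n",
	       (unsigned long long)next_alloc_base(decs, 2, 2));
	return 0;
}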
Signed-off-by: Ben Widawsky
---
 Documentation/ABI/testing/sysfs-bus-cxl |  13 +-
 drivers/cxl/Kconfig                     |   3 +-
 drivers/cxl/core/port.c                 | 168 +++++++++++++++++++++++-
 drivers/cxl/cxl.h                       |   6 +
 4 files changed, 187 insertions(+), 3 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index 7c2b846521f3..01fee09b8473 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -117,7 +117,9 @@ Description:
 		range is fixed. For decoders of devtype "cxl_decoder_switch" the
 		address is bounded by the decode range of the cxl_port ancestor
 		of the decoder's cxl_port, and dynamically updates based on the
-		active memory regions in that address space.
+		active memory regions in that address space. For decoders of
+		devtype "cxl_decoder_endpoint", size is a mutable value which
+		carves out space from the physical media.
 
 What: /sys/bus/cxl/devices/decoderX.Y/locked
 Date: June, 2021
@@ -163,3 +165,12 @@ Description:
 		memory (type-3). The 'target_type' attribute indicates the
 		current setting which may dynamically change based on what
 		memory regions are activated in this decode hierarchy.
+
+What: /sys/bus/cxl/devices/decoderX.Y/volatile
+Date: March, 2022
+KernelVersion: v5.19
+Contact: linux-cxl@vger.kernel.org
+Description:
+		Provide a knob to set/get whether the desired media is volatile
+		or persistent. This applies only to decoders of devtype
+		"cxl_decoder_endpoint".
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index b88ab956bb7c..8796fd4b22bc 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -95,7 +95,8 @@ config CXL_MEM
 	  If unsure say 'm'.
 
 config CXL_PORT
-	default CXL_BUS
 	tristate
+	default CXL_BUS
+	select DEVICE_PRIVATE
 
 endif
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index fe50a42bed7b..89505de0843a 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -86,8 +87,170 @@ static ssize_t size_show(struct device *dev, struct device_attribute *attr,
 	struct range r = cxl_get_decoder_extent(cxld);
 
 	return sysfs_emit(buf, "%#llx\n", range_len(&r));
+}
+
+struct genpool_data_cxl {
+	struct cxl_port *port;
+	struct cxl_endpoint_decoder *cxled;
+};
+
+struct maximum_addr {
+	int id;
+	unsigned long addr;
+};
+
+/* Find the last enabled decoder below the one we care about */
+static int cxl_find_max_addr(struct device *dev, void *data)
+{
+	struct maximum_addr *ma = (struct maximum_addr *)data;
+	struct cxl_decoder *cxld;
+	struct cxl_endpoint_decoder *cxled;
+
+	if (!is_cxl_decoder(dev))
+		return 0;
+
+	cxld = to_cxl_decoder(dev);
+
+	if (cxld->id >= ma->id)
+		return 0;
+
+	if (!(cxld->flags & CXL_DECODER_F_ENABLE))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(cxld);
+	if (cxled->drange.end) {
+		ma->addr = cxled->drange.end + 1;
+		return 1;
+	}
+
+	return 0;
 }
-static DEVICE_ATTR_RO(size);
+
+unsigned long gen_pool_cxl_alloc(unsigned long *map, unsigned long size,
+				 unsigned long start, unsigned int nr,
+				 void *data, struct gen_pool *pool,
+				 unsigned long start_addr)
+{
+	struct genpool_data_cxl *port_dec = data;
+	struct cxl_port *port = port_dec->port;
+	struct cxl_endpoint_decoder *cxled = port_dec->cxled;
+	unsigned long offset_bit;
+	unsigned long start_bit;
+	struct maximum_addr ma;
+
+	lockdep_assert_held(&port->media_lock);
+
+	ma.id = cxled->base.id;
+	ma.addr = 0;
+
+	device_for_each_child(&port->dev, &ma, cxl_find_max_addr);
+
+	/* From here on, it's fixed offset algo */
+	offset_bit = ma.addr >> pool->min_alloc_order;
+
+	start_bit = bitmap_find_next_zero_area(map, size, start + offset_bit,
+					       nr, 0);
+	if (start_bit != offset_bit)
+		start_bit = size;
+	return start_bit;
+}
+
+static ssize_t size_store(struct device *dev, struct device_attribute *attr,
+			  const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(cxld);
+	struct cxl_port *port = to_cxl_port(cxled->base.dev.parent);
+	struct genpool_data_cxl gpdc = {
+		.port = port,
+		.cxled = cxled,
+	};
+	unsigned long addr;
+	void *type;
+	u64 size;
+	int rc;
+
+	rc = kstrtou64(buf, 0, &size);
+	if (rc)
+		return rc;
+
+	if (size % SZ_256M)
+		return -EINVAL;
+
+	rc = mutex_lock_interruptible(&cxled->res_lock);
+	if (rc)
+		return rc;
+
+	/* No change */
+	if (range_len(&cxled->drange) == size)
+		goto out;
+
+	/* Extent was previously set */
+	if (range_len(&cxled->drange)) {
+		dev_dbg(dev, "freeing previous reservation %#llx-%#llx\n",
+			cxled->drange.start, cxled->drange.end);
+		gen_pool_free(port->media, cxled->drange.start,
+			      range_len(&cxled->drange));
+		cxled->drange = (struct range){ 0, -1 };
+		if (!size)
+			goto out;
+	}
+
+	rc = mutex_lock_interruptible(&port->media_lock);
+	if (rc)
+		goto out;
+
+	addr = gen_pool_alloc_algo_owner(port->media, size, gen_pool_cxl_alloc,
+					 &gpdc, &type);
+	mutex_unlock(&port->media_lock);
+	if (!addr && !type) {
+		dev_dbg(dev, "couldn't find %llu bytes\n", size);
+		cxl_set_decoder_extent(cxld, 0, 0);
+		rc = -ENOSPC;
+	} else {
+		cxl_set_decoder_extent(cxld, addr, size);
+	}
+
+out:
+	mutex_unlock(&cxled->res_lock);
+	return rc ? rc : len;
+}
+static DEVICE_ATTR_RW(size);
+
+static ssize_t volatile_show(struct device *dev, struct device_attribute *attr,
+			     char *buf)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(cxld);
+
+	return sysfs_emit(buf, "%u\n", cxled->volatil);
+}
+
+static ssize_t volatile_store(struct device *dev, struct device_attribute *attr,
+			      const char *buf, size_t len)
+{
+	struct cxl_decoder *cxld = to_cxl_decoder(dev);
+	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(cxld);
+	bool p;
+	int rc;
+
+	rc = kstrtobool(buf, &p);
+	if (rc)
+		return rc;
+
+	rc = mutex_lock_interruptible(&cxled->res_lock);
+	if (rc)
+		return rc;
+
+	if (range_len(&cxled->drange) > 0)
+		rc = -EBUSY;
+	mutex_unlock(&cxled->res_lock);
+	if (rc)
+		return rc;
+
+	cxled->volatil = p;
+	return len;
+}
+static DEVICE_ATTR_RW(volatile);
 
 static ssize_t interleave_ways_show(struct device *dev,
 				    struct device_attribute *attr, char *buf)
@@ -243,6 +406,7 @@ static const struct attribute_group *cxl_decoder_switch_attribute_groups[] = {
 
 static struct attribute *cxl_decoder_endpoint_attrs[] = {
 	&dev_attr_target_type.attr,
+	&dev_attr_volatile.attr,
 	NULL,
 };
 
@@ -439,6 +603,7 @@ static struct cxl_port *cxl_port_alloc(struct device *uport,
 	ida_init(&port->decoder_ida);
 	INIT_LIST_HEAD(&port->dports);
 	INIT_LIST_HEAD(&port->endpoints);
+	mutex_init(&port->media_lock);
 	device_initialize(dev);
 	device_set_pm_not_required(dev);
 
@@ -1298,6 +1463,7 @@ static struct cxl_decoder *__cxl_decoder_alloc(struct cxl_port *port,
 		if (!cxled)
 			return NULL;
 		cxled->drange = (struct range){ 0, -1 };
+		mutex_init(&cxled->res_lock);
 		cxld = &cxled->base;
 	} else if (is_cxl_root(port)) {
 		struct cxl_root_decoder *cxlrd;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index e88b1efe54d3..8944d0fdd58a 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -248,12 +248,16 @@ struct cxl_decoder {
  * @hrange: Host physical address space consumed by this decoder.
  * @drange: Device address space consumed by this decoder.
  * @skip: The skip count as specified in the CXL specification.
+ * @res_lock: Synchronize the device's resource usage
+ * @volatil: Configuration param. Decoder target is non-persistent mem
  */
 struct cxl_endpoint_decoder {
 	struct cxl_decoder base;
 	struct range hrange;
 	struct range drange;
 	u64 skip;
+	struct mutex res_lock; /* sync access to decoder's resource */
+	bool volatil;
 };
 
 /**
@@ -338,6 +342,7 @@ struct cxl_nvdimm {
  * @component_reg_phys: component register capability base address (optional)
  * @dead: last ep has been removed, force port re-creation
  * @depth: How deep this port is relative to the root. depth 0 is the root.
+ * @media_lock: Protects the media gen_pool
  * @media: Media's address space (endpoint only)
  */
 struct cxl_port {
@@ -350,6 +355,7 @@ struct cxl_port {
 	resource_size_t component_reg_phys;
 	bool dead;
 	unsigned int depth;
+	struct mutex media_lock;
 	struct gen_pool *media;
 };
From patchwork Wed Mar 16 23:03:02 2022
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 12783252
From: Ben Widawsky
To: linux-cxl@vger.kernel.org
Cc: patches@lists.linux.dev, Ben Widawsky, Alison Schofield, Dan Williams,
 Ira Weiny, Jonathan Cameron, Vishal Verma
Subject: [RFC PATCH 6/7] cxl/region: Add region creation ABI
Date: Wed, 16 Mar 2022 16:03:02 -0700
Message-Id: <20220316230303.1813397-7-ben.widawsky@intel.com>
In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com>
References: <20220316230303.1813397-1-ben.widawsky@intel.com>

Regions are created as a child of the decoder that encompasses an
address space with constraints. Regions have a number of attributes that
must be configured before the region can be activated. Multiple
processes trying not to race with each other shouldn't need special
userspace synchronization to do so.

// Allocate a new region name
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)

// Create a new region by name
while
	region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
	! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region
do true; done

// Region now exists in sysfs
stat -t /sys/bus/cxl/devices/decoder0.0/$region

// Delete the region, and name
echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region
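[Editor's note: the create_pmem_region handshake above can be sketched as
a compare-and-advance protocol: reads report a cached next id, and only a
write echoing that id wins; losers re-read and retry. A schematic
standalone C model follows; the kernel actually uses memregion_alloc() and
a mutex, so this is illustrative only.]

#include <stdio.h>

static int next_region_id;	/* normally guarded by cxlrd->id_lock */

/* Read side: report the id that a successful create must echo back. */
static int create_region_show(char *buf, int len)
{
	return snprintf(buf, len, "%d", next_region_id);
}

/* Write side: only a caller echoing the current id wins the race. */
static int create_region_store(const char *buf)
{
	int id;

	if (sscanf(buf, "%d", &id) != 1 || id != next_region_id)
		return -1;	/* loser re-reads and retries */
	next_region_id++;	/* stand-in for memregion_alloc() */
	return 0;		/* regionN now exists */
}

int main(void)
{
	char buf[16];

	create_region_show(buf, sizeof(buf));
	printf("first write of '%s': %d\n", buf, create_region_store(buf));
	printf("stale replay of '%s': %d\n", buf, create_region_store(buf));
	return 0;
}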
echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region do true; done // Region now exists in sysfs stat -t /sys/bus/cxl/devices/decoder0.0/$region // Delete the region and its name echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region Signed-off-by: Ben Widawsky --- Changes since v5 (all Dan): - Fix erroneous return on create - forward declare to_cxl_region instead of cxl_region_release - Use REGION_DEAD in the right place - Allocate next id in region_alloc - Use memregion_alloc/free - Remove port/decoder from region name - Use simple integer for create/delete - Update commit message (Dan) - Make attr be called create_pmem_region (Dan) --- Documentation/ABI/testing/sysfs-bus-cxl | 23 + .../driver-api/cxl/memory-devices.rst | 11 + drivers/cxl/Kconfig | 5 + drivers/cxl/core/Makefile | 1 + drivers/cxl/core/port.c | 27 +- drivers/cxl/core/region.c | 586 +++++------------- drivers/cxl/cxl.h | 5 + drivers/cxl/region.h | 27 + tools/testing/cxl/Kbuild | 1 + 9 files changed, 241 insertions(+), 445 deletions(-) create mode 100644 drivers/cxl/region.h diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 01fee09b8473..5229f4bd109a 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -174,3 +174,26 @@ Description: Provide a knob to set/get whether the desired media is volatile or persistent. This applies only to decoders of devtype "cxl_decoder_endpoint", + +What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Write an integer value to instantiate a new region to be named + regionZ within the decode range bounded by decoderX.Y, where X, + Y, and Z are unsigned integers and decoderX.Y exists in + the CXL sysfs topology. The value written must match the current + value returned from reading this attribute. This behavior lets + the kernel arbitrate racing attempts to create a region. The + thread that fails to write loops and tries the next value. + Regions must subsequently be configured and bound to a region + driver before they can be used. + +What: /sys/bus/cxl/devices/decoderX.Y/delete_region +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Deletes the named region. The attribute expects a region number + as an integer. diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index db476bb170b6..66ddc58a21b1 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -362,6 +362,17 @@ CXL Core .. kernel-doc:: drivers/cxl/core/mbox.c :doc: cxl mbox +CXL Regions +----------- +.. kernel-doc:: drivers/cxl/region.h + :identifiers: + +.. kernel-doc:: drivers/cxl/core/region.c + :doc: cxl core region + +.. 
kernel-doc:: drivers/cxl/core/region.c + :identifiers: + External Interfaces =================== diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 8796fd4b22bc..7ce86eee8bda 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -99,4 +99,9 @@ config CXL_PORT default CXL_BUS select DEVICE_PRIVATE +config CXL_REGION + tristate + default CXL_BUS + select MEMREGION + endif diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile index 6d37cd78b151..39ce8f2f2373 100644 --- a/drivers/cxl/core/Makefile +++ b/drivers/cxl/core/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_CXL_BUS) += cxl_core.o ccflags-y += -I$(srctree)/drivers/cxl cxl_core-y := port.o cxl_core-y += pmem.o +cxl_core-y += region.o cxl_core-y += regs.o cxl_core-y += memdev.o cxl_core-y += mbox.o diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 89505de0843a..3916669e6f11 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright(c) 2020 Intel Corporation. All rights reserved. */ #include +#include #include #include #include @@ -11,6 +12,7 @@ #include #include #include +#include #include #include "core.h" @@ -368,6 +370,8 @@ static struct attribute_group cxl_decoder_base_attribute_group = { }; static struct attribute *cxl_decoder_root_attrs[] = { + &dev_attr_create_pmem_region.attr, + &dev_attr_delete_region.attr, &dev_attr_cap_pmem.attr, &dev_attr_cap_ram.attr, &dev_attr_cap_type2.attr, @@ -426,6 +430,8 @@ static void cxl_decoder_release(struct device *dev) struct cxl_decoder *cxld = to_cxl_decoder(dev); struct cxl_port *port = to_cxl_port(dev->parent); + if (is_root_decoder(dev)) + memregion_free(to_cxl_root_decoder(cxld)->next_region_id); ida_free(&port->decoder_ida, cxld->id); kfree(cxld); put_device(&port->dev); @@ -1545,12 +1551,21 @@ static struct cxl_decoder *cxl_decoder_alloc(struct cxl_port *port, device_set_pm_not_required(dev); dev->parent = &port->dev; dev->bus = &cxl_bus_type; - if (is_cxl_root(port)) + if (is_cxl_root(port)) { + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + cxld->dev.type = &cxl_decoder_root_type; - else if (is_cxl_endpoint(port)) + mutex_init(&cxlrd->id_lock); + rc = memregion_alloc(GFP_KERNEL); + if (rc < 0) + goto err; + + cxlrd->next_region_id = rc; + } else if (is_cxl_endpoint(port)) { cxld->dev.type = &cxl_decoder_endpoint_type; - else + } else { cxld->dev.type = &cxl_decoder_switch_type; + } /* Pre initialize an "empty" decoder */ cxld->interleave_ways = 1; @@ -1828,6 +1843,12 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd) } EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL); +bool schedule_cxl_region_unregister(struct cxl_region *cxlr) +{ + return queue_work(cxl_bus_wq, &cxlr->detach_work); +} +EXPORT_SYMBOL_NS_GPL(schedule_cxl_region_unregister, CXL); + /* for user tooling to ensure port disable work has completed */ static ssize_t flush_store(struct bus_type *bus, const char *buf, size_t count) { diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index f0a821de94cf..52baa8c1526a 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1,417 +1,46 @@ // SPDX-License-Identifier: GPL-2.0-only -/* Copyright(c) 2021 Intel Corporation. All rights reserved. */ -#include +/* Copyright(c) 2022 Intel Corporation. All rights reserved. 
*/ +#include #include #include -#include #include -#include #include #include -#include #include #include "core.h" -#include "core.h" - /** * DOC: cxl core region * - * Regions are managed through the Linux device model. Each region instance is a - * unique struct device. CXL core provides functionality to create, destroy, and - * configure regions. This is all implemented here. Binding a region - * (programming the hardware) is handled by a separate region driver. + * CXL Regions represent mapped memory capacity in system physical address + * space. Whereas the CXL Root Decoders identify the bounds of potential CXL + * Memory ranges, Regions represent the active mapped capacity by the HDM + * Decoder Capability structures throughout the Host Bridges, Switches, and + * Endpoints in the topology. */ -struct cxl_region *to_cxl_region(struct device *dev); -static const struct attribute_group region_interleave_group; +static struct cxl_region *to_cxl_region(struct device *dev); -static bool is_region_active(struct cxl_region *cxlr) -{ - return cxlr->active; -} - -/* - * Most sanity checking is left up to region binding. This does the most basic - * check to determine whether or not the core should try probing the driver. - */ -bool is_cxl_region_configured(const struct cxl_region *cxlr) -{ - /* zero sized regions aren't a thing. */ - if (cxlr->config.size <= 0) - return false; - - /* all regions have at least 1 target */ - if (!cxlr->config.targets[0]) - return false; - - return true; -} -EXPORT_SYMBOL_GPL(is_cxl_region_configured); - -static void remove_target(struct cxl_region *cxlr, int target) -{ - struct cxl_memdev *cxlmd; - - cxlmd = cxlr->config.targets[target]; - if (cxlmd) - put_device(&cxlmd->dev); - cxlr->config.targets[target] = NULL; -} - -static ssize_t interleave_ways_show(struct device *dev, - struct device_attribute *attr, char *buf) +static void cxl_region_release(struct device *dev) { struct cxl_region *cxlr = to_cxl_region(dev); - return sysfs_emit(buf, "%d\n", cxlr->config.interleave_ways); + memregion_free(cxlr->id); + kfree(cxlr); } -static ssize_t interleave_ways_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t len) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - int ret, prev_iw; - int val; - - prev_iw = cxlr->config.interleave_ways; - ret = kstrtoint(buf, 0, &val); - if (ret) - return ret; - if (ret < 0 || ret > CXL_DECODER_MAX_INTERLEAVE) - return -EINVAL; - - cxlr->config.interleave_ways = val; - - ret = sysfs_update_group(&dev->kobj, ®ion_interleave_group); - if (ret < 0) - goto err; - - sysfs_notify(&dev->kobj, NULL, "target_interleave"); - - while (prev_iw > cxlr->config.interleave_ways) - remove_target(cxlr, --prev_iw); - - return len; - -err: - cxlr->config.interleave_ways = prev_iw; - return ret; -} -static DEVICE_ATTR_RW(interleave_ways); - -static ssize_t interleave_granularity_show(struct device *dev, - struct device_attribute *attr, - char *buf) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - - return sysfs_emit(buf, "%d\n", cxlr->config.interleave_granularity); -} - -static ssize_t interleave_granularity_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t len) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - int val, ret; - - ret = kstrtoint(buf, 0, &val); - if (ret) - return ret; - cxlr->config.interleave_granularity = val; - - return len; -} -static DEVICE_ATTR_RW(interleave_granularity); - -static ssize_t offset_show(struct device *dev, struct device_attribute *attr, - 
char *buf) -{ - struct cxl_decoder *cxld = to_cxl_decoder(dev->parent); - struct cxl_region *cxlr = to_cxl_region(dev); - resource_size_t offset; - - if (!cxlr->res) - return sysfs_emit(buf, "\n"); - - offset = cxld->platform_res.start - cxlr->res->start; - - return sysfs_emit(buf, "%pa\n", &offset); -} -static DEVICE_ATTR_RO(offset); - -static ssize_t size_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - - return sysfs_emit(buf, "%llu\n", cxlr->config.size); -} - -static ssize_t size_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t len) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - unsigned long long val; - ssize_t rc; - - rc = kstrtoull(buf, 0, &val); - if (rc) - return rc; - - device_lock(&cxlr->dev); - if (is_region_active(cxlr)) - rc = -EBUSY; - else - cxlr->config.size = val; - device_unlock(&cxlr->dev); - - return rc ? rc : len; -} -static DEVICE_ATTR_RW(size); - -static ssize_t uuid_show(struct device *dev, struct device_attribute *attr, - char *buf) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - - return sysfs_emit(buf, "%pUb\n", &cxlr->config.uuid); -} - -static ssize_t uuid_store(struct device *dev, struct device_attribute *attr, - const char *buf, size_t len) -{ - struct cxl_region *cxlr = to_cxl_region(dev); - ssize_t rc; - - if (len != UUID_STRING_LEN + 1) - return -EINVAL; - - device_lock(&cxlr->dev); - if (is_region_active(cxlr)) - rc = -EBUSY; - else - rc = uuid_parse(buf, &cxlr->config.uuid); - device_unlock(&cxlr->dev); - - return rc ? rc : len; -} -static DEVICE_ATTR_RW(uuid); - -static struct attribute *region_attrs[] = { - &dev_attr_interleave_ways.attr, - &dev_attr_interleave_granularity.attr, - &dev_attr_offset.attr, - &dev_attr_size.attr, - &dev_attr_uuid.attr, - NULL, -}; - -static const struct attribute_group region_group = { - .attrs = region_attrs, -}; - -static size_t show_targetN(struct cxl_region *cxlr, char *buf, int n) -{ - int ret; - - device_lock(&cxlr->dev); - if (!cxlr->config.targets[n]) - ret = sysfs_emit(buf, "\n"); - else - ret = sysfs_emit(buf, "%s\n", - dev_name(&cxlr->config.targets[n]->dev)); - device_unlock(&cxlr->dev); - - return ret; -} - -static size_t set_targetN(struct cxl_region *cxlr, const char *buf, int n, - size_t len) -{ - struct device *memdev_dev; - struct cxl_memdev *cxlmd; - - device_lock(&cxlr->dev); - - if (len == 1 || cxlr->config.targets[n]) - remove_target(cxlr, n); - - /* Remove target special case */ - if (len == 1) { - device_unlock(&cxlr->dev); - return len; - } - - memdev_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf); - if (!memdev_dev) { - device_unlock(&cxlr->dev); - return -ENOENT; - } - - /* reference to memdev held until target is unset or region goes away */ - - cxlmd = to_cxl_memdev(memdev_dev); - cxlr->config.targets[n] = cxlmd; - - device_unlock(&cxlr->dev); - - return len; -} - -#define TARGET_ATTR_RW(n) \ - static ssize_t target##n##_show( \ - struct device *dev, struct device_attribute *attr, char *buf) \ - { \ - return show_targetN(to_cxl_region(dev), buf, (n)); \ - } \ - static ssize_t target##n##_store(struct device *dev, \ - struct device_attribute *attr, \ - const char *buf, size_t len) \ - { \ - return set_targetN(to_cxl_region(dev), buf, (n), len); \ - } \ - static DEVICE_ATTR_RW(target##n) - -TARGET_ATTR_RW(0); -TARGET_ATTR_RW(1); -TARGET_ATTR_RW(2); -TARGET_ATTR_RW(3); -TARGET_ATTR_RW(4); -TARGET_ATTR_RW(5); -TARGET_ATTR_RW(6); -TARGET_ATTR_RW(7); -TARGET_ATTR_RW(8); 
-TARGET_ATTR_RW(9); -TARGET_ATTR_RW(10); -TARGET_ATTR_RW(11); -TARGET_ATTR_RW(12); -TARGET_ATTR_RW(13); -TARGET_ATTR_RW(14); -TARGET_ATTR_RW(15); - -static struct attribute *interleave_attrs[] = { - &dev_attr_target0.attr, - &dev_attr_target1.attr, - &dev_attr_target2.attr, - &dev_attr_target3.attr, - &dev_attr_target4.attr, - &dev_attr_target5.attr, - &dev_attr_target6.attr, - &dev_attr_target7.attr, - &dev_attr_target8.attr, - &dev_attr_target9.attr, - &dev_attr_target10.attr, - &dev_attr_target11.attr, - &dev_attr_target12.attr, - &dev_attr_target13.attr, - &dev_attr_target14.attr, - &dev_attr_target15.attr, - NULL, -}; - -static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n) -{ - struct device *dev = container_of(kobj, struct device, kobj); - struct cxl_region *cxlr = to_cxl_region(dev); - - if (n < cxlr->config.interleave_ways) - return a->mode; - return 0; -} - -static const struct attribute_group region_interleave_group = { - .attrs = interleave_attrs, - .is_visible = visible_targets, -}; - -static const struct attribute_group *region_groups[] = { - &cxl_base_attribute_group, - ®ion_group, - ®ion_interleave_group, - NULL, -}; - -static void cxl_region_release(struct device *dev); - -const struct device_type cxl_region_type = { +static const struct device_type cxl_region_type = { .name = "cxl_region", .release = cxl_region_release, - .groups = region_groups }; -static ssize_t create_region_show(struct device *dev, - struct device_attribute *attr, char *buf) +bool is_cxl_region(struct device *dev) { - struct cxl_port *port = to_cxl_port(dev->parent); - struct cxl_decoder *cxld = to_cxl_decoder(dev); - int rc; - - if (dev_WARN_ONCE(dev, !is_root_decoder(dev), - "Invalid decoder selected for region.")) { - return -ENODEV; - } - - rc = ida_alloc(&cxld->region_ida, GFP_KERNEL); - if (rc < 0) { - dev_err(&cxld->dev, "Couldn't get a new id\n"); - return rc; - } - - return sysfs_emit(buf, "region%d.%d:%d\n", port->id, cxld->id, rc); + return dev->type == &cxl_region_type; } +EXPORT_SYMBOL_NS_GPL(is_cxl_region, CXL); -static ssize_t create_region_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t len) -{ - struct cxl_port *port = to_cxl_port(dev->parent); - struct cxl_decoder *cxld = to_cxl_decoder(dev); - int decoder_id, port_id, region_id; - struct cxl_region *cxlr; - ssize_t rc; - - if (sscanf(buf, "region%d.%d:%d", &port_id, &decoder_id, ®ion_id) != 3) - return -EINVAL; - - if (decoder_id != cxld->id) - return -EINVAL; - - if (port_id != port->id) - return -EINVAL; - - cxlr = cxl_alloc_region(cxld, region_id); - if (IS_ERR(cxlr)) - return PTR_ERR(cxlr); - - rc = cxl_add_region(cxld, cxlr); - if (rc) { - kfree(cxlr); - return rc; - } - - return len; -} -DEVICE_ATTR_RW(create_region); - -static ssize_t delete_region_store(struct device *dev, - struct device_attribute *attr, - const char *buf, size_t len) -{ - struct cxl_decoder *cxld = to_cxl_decoder(dev); - int rc; - - rc = cxl_delete_region(cxld, buf); - if (rc) - return rc; - - return len; -} -DEVICE_ATTR_WO(delete_region); - -struct cxl_region *to_cxl_region(struct device *dev) +static struct cxl_region *to_cxl_region(struct device *dev) { if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type, "not a cxl_region device\n")) @@ -419,39 +48,58 @@ struct cxl_region *to_cxl_region(struct device *dev) return container_of(dev, struct cxl_region, dev); } -EXPORT_SYMBOL_GPL(to_cxl_region); -static void cxl_region_release(struct device *dev) -{ - struct cxl_decoder *cxld = 
to_cxl_decoder(dev->parent); - struct cxl_region *cxlr = to_cxl_region(dev); - int i; - - ida_free(&cxld->region_ida, cxlr->id); - for (i = 0; i < cxlr->config.interleave_ways; i++) - remove_target(cxlr, i); - kfree(cxlr); -} - -struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld, int id) +static void unregister_region(struct work_struct *work) { struct cxl_region *cxlr; - cxlr = kzalloc(sizeof(*cxlr), GFP_KERNEL); - if (!cxlr) - return ERR_PTR(-ENOMEM); + cxlr = container_of(work, typeof(*cxlr), detach_work); + device_unregister(&cxlr->dev); +} - INIT_LIST_HEAD(&cxlr->staged_list); - INIT_LIST_HEAD(&cxlr->commit_list); - cxlr->id = id; +static void schedule_unregister(void *cxlr) +{ + schedule_cxl_region_unregister(cxlr); +} + +static struct cxl_region *cxl_region_alloc(struct cxl_decoder *cxld) +{ + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + struct cxl_region *cxlr; + struct device *dev; + int rc; + + lockdep_assert_held(&cxlrd->id_lock); + + rc = memregion_alloc(GFP_KERNEL); + if (rc < 0) { + dev_dbg(&cxld->dev, "Failed to get next cached id (%d)\n", rc); + return ERR_PTR(rc); + } + + cxlr = kzalloc(sizeof(*cxlr), GFP_KERNEL); + if (!cxlr) { + memregion_free(rc); + return ERR_PTR(-ENOMEM); + } + + cxlr->id = cxlrd->next_region_id; + cxlrd->next_region_id = rc; + + dev = &cxlr->dev; + device_initialize(dev); + dev->parent = &cxld->dev; + device_set_pm_not_required(dev); + dev->bus = &cxl_bus_type; + dev->type = &cxl_region_type; + INIT_WORK(&cxlr->detach_work, unregister_region); return cxlr; } /** - * cxl_add_region - Adds a region to a decoder + * devm_cxl_add_region - Adds a region to a decoder * @cxld: Parent decoder. - * @cxlr: Region to be added to the decoder. * * This is the second step of region initialization. Regions exist within an * address space which is mapped by a @cxld. That @cxld must be a root decoder, @@ -461,35 +109,98 @@ struct cxl_region *cxl_alloc_region(struct cxl_decoder *cxld, int id) * code. The region will be named "regionZ" where Z is a globally unique region number. 
*/ -int cxl_add_region(struct cxl_decoder *cxld, struct cxl_region *cxlr) +static struct cxl_region *devm_cxl_add_region(struct cxl_decoder *cxld) { struct cxl_port *port = to_cxl_port(cxld->dev.parent); - struct device *dev = &cxlr->dev; + struct cxl_region *cxlr; + struct device *dev; int rc; - device_initialize(dev); - dev->parent = &cxld->dev; - device_set_pm_not_required(dev); - dev->bus = &cxl_bus_type; - dev->type = &cxl_region_type; - rc = dev_set_name(dev, "region%d.%d:%d", port->id, cxld->id, cxlr->id); + cxlr = cxl_region_alloc(cxld); + if (IS_ERR(cxlr)) + return cxlr; + + dev = &cxlr->dev; + + rc = dev_set_name(dev, "region%d", cxlr->id); if (rc) - goto err; + goto err_out; cxl_set_lock_class(dev); rc = device_add(dev); if (rc) - goto err; + goto err_put; - dev_dbg(dev, "Added to %s\n", dev_name(&cxld->dev)); + rc = devm_add_action_or_reset(port->uport, schedule_unregister, cxlr); + if (rc) + goto err_put; - return 0; + return cxlr; -err: +err_put: + put_device(&cxld->dev); + +err_out: put_device(dev); + return ERR_PTR(rc); +} + +static ssize_t create_pmem_region_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + ssize_t rc; + + /* + * There's no point in returning known bad answers when the lock is held + * on the store side. Even though the answer given here may be + * immediately invalidated as soon as the lock is dropped, it's still + * useful to throttle readers in the presence of writers. + */ + rc = mutex_lock_interruptible(&cxlrd->id_lock); + if (rc) + return rc; + rc = sysfs_emit(buf, "%d\n", cxlrd->next_region_id); + mutex_unlock(&cxlrd->id_lock); + return rc; } +static ssize_t create_pmem_region_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxld); + struct cxl_region *cxlr; + unsigned long id; + ssize_t rc; + + rc = kstrtoul(buf, 10, &id); + if (rc) + return rc; + + rc = mutex_lock_interruptible(&cxlrd->id_lock); + if (rc) + return rc; + + if (cxlrd->next_region_id != id) { + rc = -EINVAL; + goto out; + } + + cxlr = devm_cxl_add_region(cxld); + if (IS_ERR(cxlr)) { + rc = PTR_ERR(cxlr); + goto out; + } + + rc = 0; + dev_dbg(dev, "Created %s\n", dev_name(&cxlr->dev)); + +out: + mutex_unlock(&cxlrd->id_lock); + if (rc) + return rc; + return len; +} +DEVICE_ATTR_RW(create_pmem_region); + static struct cxl_region *cxl_find_region_by_name(struct cxl_decoder *cxld, const char *name) { @@ -502,30 +213,21 @@ static struct cxl_region *cxl_find_region_by_name(struct cxl_decoder *cxld, return to_cxl_region(region_dev); } -/** - * cxl_delete_region - Deletes a region - * @cxld: Parent decoder - * @region_name: Named region, ie. 
regionX.Y:Z - */ -int cxl_delete_region(struct cxl_decoder *cxld, const char *region_name) +static ssize_t delete_region_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) { + struct cxl_port *port = to_cxl_port(dev->parent); + struct cxl_decoder *cxld = to_cxl_decoder(dev); struct cxl_region *cxlr; - device_lock(&cxld->dev); - - cxlr = cxl_find_region_by_name(cxld, region_name); - if (IS_ERR(cxlr)) { - device_unlock(&cxld->dev); + cxlr = cxl_find_region_by_name(cxld, buf); + if (IS_ERR(cxlr)) return PTR_ERR(cxlr); - } - dev_dbg(&cxld->dev, "Requested removal of %s from %s\n", - dev_name(&cxlr->dev), dev_name(&cxld->dev)); + /* Reference held for wq */ + devm_release_action(port->uport, schedule_unregister, cxlr); - device_unregister(&cxlr->dev); - device_unlock(&cxld->dev); - - put_device(&cxlr->dev); - - return 0; + return len; } +DEVICE_ATTR_WO(delete_region); diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 8944d0fdd58a..1631b0aeca85 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -491,6 +491,8 @@ struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port); int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm); int devm_cxl_add_passthrough_decoder(struct cxl_port *port); +bool is_cxl_region(struct device *dev); + extern struct bus_type cxl_bus_type; struct cxl_driver { @@ -545,6 +547,7 @@ enum cxl_lock_class { CXL_ROOT_LOCK, CXL_NVDIMM_LOCK, CXL_NVDIMM_BRIDGE_LOCK, + CXL_REGION_LOCK, CXL_PORT_LOCK = 2, /* * Be careful to add new lock classes here, CXL_PORT_LOCK is @@ -585,6 +588,8 @@ static inline int cxl_lock_class(struct device *dev) return CXL_NVDIMM_BRIDGE_LOCK; else if (is_cxl_nvdimm(dev)) return CXL_NVDIMM_LOCK; + else if (is_cxl_region(dev)) + return CXL_REGION_LOCK; else return CXL_ANON_LOCK; } diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h new file mode 100644 index 000000000000..6a0118dcdf2f --- /dev/null +++ b/drivers/cxl/region.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright(c) 2021 Intel Corporation. */ +#ifndef __CXL_REGION_H__ +#define __CXL_REGION_H__ + +#include + +#include "cxl.h" + +/** + * struct cxl_region - CXL region + * @dev: This region's device. + * @id: This region's id. Id is globally unique across all regions. + * @flags: Flags representing the current state of the region. + * @detach_work: Async unregister to allow attrs to take device_lock. 
+ */ +struct cxl_region { + struct device dev; + int id; + unsigned long flags; +#define REGION_DEAD 0 + struct work_struct detach_work; +}; + +bool schedule_cxl_region_unregister(struct cxl_region *cxlr); + +#endif diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild index 82e49ab0937d..3fe6d34e6d59 100644 --- a/tools/testing/cxl/Kbuild +++ b/tools/testing/cxl/Kbuild @@ -46,6 +46,7 @@ cxl_core-y += $(CXL_CORE_SRC)/memdev.o cxl_core-y += $(CXL_CORE_SRC)/mbox.o cxl_core-y += $(CXL_CORE_SRC)/pci.o cxl_core-y += $(CXL_CORE_SRC)/hdm.o +cxl_core-y += $(CXL_CORE_SRC)/region.o cxl_core-y += config_check.o obj-m += test/ From patchwork Wed Mar 16 23:03:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 12783253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4454C4332F for ; Wed, 16 Mar 2022 23:03:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230013AbiCPXEh (ORCPT ); Wed, 16 Mar 2022 19:04:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47634 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229812AbiCPXEh (ORCPT ); Wed, 16 Mar 2022 19:04:37 -0400 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9506DF89 for ; Wed, 16 Mar 2022 16:03:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1647471800; x=1679007800; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pVuifyKJmLAnfsPY+9wV8c+rrzVGz18sO6H1e91/mw4=; b=kgLxrTWPKbFcgzUzEB2g952gdLuSSydKUVy4K5sksxQF+SS0FDYn6DNU ckCbvSNGkPnV93wsi4zsIPyqrCY+wZ4+bg5mSLm1CyxMIfsnxn47waD5k V4MDliNChkm9xDhbr5ZyHAnXI0eHF2c9f2+o/GeeduypGxSnQbEZlSRqa fXnxoYEk+h4JyFx7Du+BJvmeR4liyJl6HqcXheLKCQHrfLOQwPVNM8lYZ RLUAQ38CsaOhDloYEdjfbElyFSZKuLGnZKC05jyJzekrvCvH3SmtjPBV4 y4LaGbzVECZoRaOTmAZirwV+Bcne36CjvEJ4vSAQj948StXMysfeD9m5U w==; X-IronPort-AV: E=McAfee;i="6200,9189,10288"; a="236677538" X-IronPort-AV: E=Sophos;i="5.90,187,1643702400"; d="scan'208";a="236677538" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Mar 2022 16:03:18 -0700 X-IronPort-AV: E=Sophos;i="5.90,187,1643702400"; d="scan'208";a="498621413" Received: from msimpso1-cxt.amr.corp.intel.com (HELO localhost.localdomain) ([10.252.132.128]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Mar 2022 16:03:17 -0700 From: Ben Widawsky To: linux-cxl@vger.kernel.org Cc: patches@lists.linux.dev, Ben Widawsky , kernel test robot , Alison Schofield , Dan Williams , Ira Weiny , Jonathan Cameron , Vishal Verma Subject: [RFC PATCH 7/7] cxl/region: Introduce concept of region configuration Date: Wed, 16 Mar 2022 16:03:03 -0700 Message-Id: <20220316230303.1813397-8-ben.widawsky@intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220316230303.1813397-1-ben.widawsky@intel.com> References: <20220316230303.1813397-1-ben.widawsky@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-cxl@vger.kernel.org The region creation APIs create a vacant region. 
Configuring the region works in the same way as similar subsystems such as devdax. Sysfs attrs will be provided to allow userspace to configure the region. Finally, once all configuration is complete, userspace may activate the region. Introduced here are the most basic attributes needed to configure a region. Details of these attributes are described in the ABI Documentation. An example is provided below: /sys/bus/cxl/devices/region0.0:0 ├── devtype ├── interleave_granularity ├── interleave_ways ├── modalias ├── offset ├── size ├── subsystem -> ../../../../../../bus/cxl ├── target0 ├── uevent └── uuid Reported-by: kernel test robot (v2) Signed-off-by: Ben Widawsky --- Changes since v3: - Make target be a decoder - Use device_lock for protecting config/probe race - Teardown region on decoder removal --- Documentation/ABI/testing/sysfs-bus-cxl | 55 ++++ drivers/cxl/core/port.c | 33 ++- drivers/cxl/core/region.c | 357 +++++++++++++++++++++++- drivers/cxl/cxl.h | 16 ++ drivers/cxl/region.h | 66 +++++ 5 files changed, 516 insertions(+), 11 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index 5229f4bd109a..fe9f8d3a1fc8 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -197,3 +197,58 @@ Contact: linux-cxl@vger.kernel.org Description: Deletes the named region. The attribute expects a region number as an integer. + +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/resource +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + A region is a contiguous partition of a CXL Root decoder address + space. Region capacity is allocated by writing to the size + attribute; the resulting physical address base determined by the + driver is reflected here. + +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/size +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + System physical address space to be consumed by the region. + +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/interleave_ways +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + The number of devices participating in the region is + set by writing this value. Each device will provide + 1/interleave_ways of storage for the region. + +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/interleave_granularity +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Set the number of consecutive bytes each device in the + interleave set will claim. The possible interleave granularity + values are determined by the CXL spec and the participating + devices. + +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/uuid +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Write a unique identifier for the region. This field must be set + for persistent regions and it must not conflict with the UUID of + another region. If this field is set for volatile regions, the + value is ignored. + +What: /sys/bus/cxl/devices/decoderX.Y/regionX.Y:Z/endpoint_decoder[0..interleave_ways] +Date: January, 2022 +KernelVersion: v5.19 +Contact: linux-cxl@vger.kernel.org +Description: + Write a decoder object that is unused and will participate in + decoding memory transactions for the interleave set, ie. + decoderX.Y. All attributes must be populated. 
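Taken together with the creation ABI from the previous patch, a plausible end-to-end flow looks roughly like the following untested sketch. The decoder0.0, decoder3.0/decoder4.0 names, the size value, and the region's sysfs path are illustrative assumptions, and the ordering follows the configuration constraints documented in region.c below:

# Claim the next region id from the root decoder (racing writers retry)
id=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
echo $id > /sys/bus/cxl/devices/decoder0.0/create_pmem_region
region=/sys/bus/cxl/devices/decoder0.0/region$id

# Configure in the required order: granularity, then ways, then targets, then size
echo 256 > $region/interleave_granularity
echo 2 > $region/interleave_ways
echo -n decoder3.0 > $region/target0   # -n: the decoder name must match exactly
echo -n decoder4.0 > $region/target1
echo $((512 << 20)) > $region/size

# The UUID may be set at any time before the region driver binds
uuidgen > $region/uuid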
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 3916669e6f11..cfeff49f6b66 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -216,6 +216,15 @@ static ssize_t size_store(struct device *dev, struct device_attribute *attr, mutex_unlock(&cxled->res_lock); return rc ? rc : len; } + +static ssize_t size_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct range r = cxl_get_decoder_extent(cxld); + + return sysfs_emit(buf, "%#llx\n", range_len(&r)); +} static DEVICE_ATTR_RW(size); static ssize_t volatile_show(struct device *dev, struct device_attribute *attr, @@ -1734,18 +1743,24 @@ EXPORT_SYMBOL_NS_GPL(cxl_decoder_add, CXL); static void cxld_unregister(void *dev) { struct cxl_decoder *cxld = to_cxl_decoder(dev); + struct cxl_endpoint_decoder *cxled; + struct cxl_port *ep; - if (is_endpoint_decoder(&cxld->dev)) { - struct cxl_endpoint_decoder *cxled = - to_cxl_endpoint_decoder(cxld); - struct cxl_port *ep = to_cxl_port(cxld->dev.parent); + if (!is_endpoint_decoder(&cxld->dev)) + goto out; - if (!range_len(&cxled->drange)) - goto out; + cxled = to_cxl_endpoint_decoder(cxld); + ep = to_cxl_port(cxld->dev.parent); - gen_pool_free(ep->media, cxled->drange.start, - range_len(&cxled->drange)); - } + if (!range_len(&cxled->drange)) + goto out; + + gen_pool_free(ep->media, cxled->drange.start, + range_len(&cxled->drange)); + + mutex_lock(&cxled->cxlr->remove_lock); + device_release_driver(&cxled->cxlr->dev); + mutex_unlock(&cxled->cxlr->remove_lock); out: device_unregister(dev); diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 52baa8c1526a..9c1a581b7035 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -3,9 +3,12 @@ #include #include #include +#include #include +#include #include #include +#include #include #include "core.h" @@ -17,21 +20,369 @@ * Memory ranges, Regions represent the active mapped capacity by the HDM * Decoder Capability structures throughout the Host Bridges, Switches, and * Endpoints in the topology. + * + * Region configuration has some ordering constraints: + * - Size: Must be set after all targets + * - Targets: Must be set after interleave ways + * - Interleave ways: Must be set after Interleave Granularity + * + * UUID may be set at any time before binding the driver to the region. 
*/ -static struct cxl_region *to_cxl_region(struct device *dev); +static const struct attribute_group region_interleave_group; + +static void remove_target(struct cxl_region *cxlr, int target) +{ + struct cxl_endpoint_decoder *cxled; + + mutex_lock(&cxlr->remove_lock); + cxled = cxlr->targets[target]; + if (cxled) { + cxled->cxlr = NULL; + put_device(&cxled->base.dev); + } + cxlr->targets[target] = NULL; + mutex_unlock(&cxlr->remove_lock); +} static void cxl_region_release(struct device *dev) { struct cxl_region *cxlr = to_cxl_region(dev); + int i; memregion_free(cxlr->id); + for (i = 0; i < cxlr->interleave_ways; i++) + remove_target(cxlr, i); kfree(cxlr); } +static ssize_t interleave_ways_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%d\n", cxlr->interleave_ways); +} + +static ssize_t interleave_ways_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + struct cxl_decoder *rootd; + int rc, val; + + rc = kstrtoint(buf, 0, &val); + if (rc) + return rc; + + if (device_lock_interruptible(dev) < 0) + return -EINTR; + + if (dev->driver) { + device_unlock(dev); + return -EBUSY; + } + + if (cxlr->interleave_ways) { + device_unlock(dev); + return -EEXIST; + } + + if (!cxlr->interleave_granularity) { + dev_dbg(&cxlr->dev, "IG must be set before IW\n"); + device_unlock(dev); + return -EILSEQ; + } + + rootd = to_cxl_decoder(cxlr->dev.parent); + if (!cxl_is_interleave_ways_valid(cxlr, rootd, val)) { + device_unlock(dev); + return -EINVAL; + } + + cxlr->interleave_ways = val; + device_unlock(dev); + + rc = sysfs_update_group(&cxlr->dev.kobj, ®ion_interleave_group); + if (rc < 0) { + cxlr->interleave_ways = 0; + return rc; + } + + return len; +} +static DEVICE_ATTR_RW(interleave_ways); + +static ssize_t interleave_granularity_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%d\n", cxlr->interleave_granularity); +} + +static ssize_t interleave_granularity_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + struct cxl_decoder *rootd; + int val, ret; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if (device_lock_interruptible(dev) < 0) + return -EINTR; + + if (dev->driver) { + device_unlock(dev); + return -EBUSY; + } + + if (cxlr->interleave_granularity) { + device_unlock(dev); + return -EEXIST; + } + + rootd = to_cxl_decoder(cxlr->dev.parent); + if (!cxl_is_interleave_granularity_valid(rootd, val)) { + device_unlock(dev); + return -EINVAL; + } + + cxlr->interleave_granularity = val; + device_unlock(dev); + + return len; +} +static DEVICE_ATTR_RW(interleave_granularity); + +static ssize_t offset_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + if (!cxlr->res) + return sysfs_emit(buf, "\n"); + + return sysfs_emit(buf, "%pa\n", &cxlr->res->start); +} +static DEVICE_ATTR_RO(offset); + +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + + return sysfs_emit(buf, "%pUb\n", &cxlr->uuid); +} + +static int is_dupe(struct device *match, void *_cxlr) +{ + struct cxl_region *c, *cxlr = _cxlr; + + if (!is_cxl_region(match)) + return 0; + + if (&cxlr->dev == 
match) + return 0; + + c = to_cxl_region(match); + if (uuid_equal(&c->uuid, &cxlr->uuid)) + return -EEXIST; + + return 0; +} + +static ssize_t uuid_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t len) +{ + struct cxl_region *cxlr = to_cxl_region(dev); + ssize_t rc; + uuid_t temp; + + if (len != UUID_STRING_LEN + 1) + return -EINVAL; + + rc = uuid_parse(buf, &temp); + if (rc) + return rc; + + if (device_lock_interruptible(dev) < 0) + return -EINTR; + + if (dev->driver) { + device_unlock(dev); + return -EBUSY; + } + + if (!uuid_is_null(&cxlr->uuid)) { + device_unlock(dev); + return -EEXIST; + } + + rc = bus_for_each_dev(&cxl_bus_type, NULL, cxlr, is_dupe); + if (rc < 0) { + device_unlock(dev); + return rc; + } + + cxlr->uuid = temp; + device_unlock(dev); + return len; +} +static DEVICE_ATTR_RW(uuid); + +static struct attribute *region_attrs[] = { + &dev_attr_interleave_ways.attr, + &dev_attr_interleave_granularity.attr, + &dev_attr_offset.attr, + &dev_attr_size.attr, + &dev_attr_uuid.attr, + NULL, +}; + +static const struct attribute_group region_group = { + .attrs = region_attrs, +}; + +static ssize_t show_targetN(struct cxl_region *cxlr, char *buf, int n) +{ + if (!cxlr->targets[n]) + return sysfs_emit(buf, "\n"); + + return sysfs_emit(buf, "%s\n", dev_name(&cxlr->targets[n]->base.dev)); +} + +static ssize_t store_targetN(struct cxl_region *cxlr, const char *buf, int n, + size_t len) +{ + struct cxl_endpoint_decoder *cxled; + struct cxl_decoder *cxld; + struct device *cxld_dev; + + if (device_lock_interruptible(&cxlr->dev) < 0) + return -EINTR; + + if (cxlr->dev.driver) { + device_unlock(&cxlr->dev); + return -EBUSY; + } + + if (cxlr->targets[n]) { + device_unlock(&cxlr->dev); + return -EEXIST; + } + + cxld_dev = bus_find_device_by_name(&cxl_bus_type, NULL, buf); + if (!cxld_dev) { + device_unlock(&cxlr->dev); + return -ENOENT; + } + + if (!is_cxl_decoder(cxld_dev)) { + device_unlock(&cxlr->dev); + put_device(cxld_dev); + return -EINVAL; + } + + if (!is_cxl_endpoint(to_cxl_port(cxld_dev->parent))) { + device_unlock(&cxlr->dev); + put_device(cxld_dev); + return -EINVAL; + } + + cxld = to_cxl_decoder(cxld_dev); + if (cxld->flags & CXL_DECODER_F_ENABLE) { + device_unlock(&cxlr->dev); + put_device(cxld_dev); + return -EBUSY; + } + + /* decoder reference is held until teardown */ + cxled = to_cxl_endpoint_decoder(cxld); + cxlr->targets[n] = cxled; + cxled->cxlr = cxlr; + + device_unlock(&cxlr->dev); + return len; +} + +#define TARGET_ATTR_RW(n) \ + static ssize_t target##n##_show( \ + struct device *dev, struct device_attribute *attr, char *buf) \ + { \ + return show_targetN(to_cxl_region(dev), buf, (n)); \ + } \ + static ssize_t target##n##_store(struct device *dev, \ + struct device_attribute *attr, \ + const char *buf, size_t len) \ + { \ + return store_targetN(to_cxl_region(dev), buf, (n), len); \ + } \ + static DEVICE_ATTR_RW(target##n) + +TARGET_ATTR_RW(0); +TARGET_ATTR_RW(1); +TARGET_ATTR_RW(2); +TARGET_ATTR_RW(3); +TARGET_ATTR_RW(4); +TARGET_ATTR_RW(5); +TARGET_ATTR_RW(6); +TARGET_ATTR_RW(7); +TARGET_ATTR_RW(8); +TARGET_ATTR_RW(9); +TARGET_ATTR_RW(10); +TARGET_ATTR_RW(11); +TARGET_ATTR_RW(12); +TARGET_ATTR_RW(13); +TARGET_ATTR_RW(14); +TARGET_ATTR_RW(15); + +static struct attribute *interleave_attrs[] = { + &dev_attr_target0.attr, + &dev_attr_target1.attr, + &dev_attr_target2.attr, + &dev_attr_target3.attr, + &dev_attr_target4.attr, + &dev_attr_target5.attr, + &dev_attr_target6.attr, + &dev_attr_target7.attr, + &dev_attr_target8.attr, + &dev_attr_target9.attr, + &dev_attr_target10.attr, + &dev_attr_target11.attr, + &dev_attr_target12.attr, + &dev_attr_target13.attr, + &dev_attr_target14.attr, + 
&dev_attr_target15.attr, + NULL, +}; + +static umode_t visible_targets(struct kobject *kobj, struct attribute *a, int n) +{ + struct device *dev = container_of(kobj, struct device, kobj); + struct cxl_region *cxlr = to_cxl_region(dev); + + if (n < cxlr->interleave_ways) + return a->mode; + return 0; +} + +static const struct attribute_group region_interleave_group = { + .attrs = interleave_attrs, + .is_visible = visible_targets, +}; + +static const struct attribute_group *region_groups[] = { + &region_group, + &region_interleave_group, + &cxl_base_attribute_group, + NULL, +}; + static const struct device_type cxl_region_type = { .name = "cxl_region", .release = cxl_region_release, + .groups = region_groups }; bool is_cxl_region(struct device *dev) @@ -40,7 +391,7 @@ bool is_cxl_region(struct device *dev) } EXPORT_SYMBOL_NS_GPL(is_cxl_region, CXL); -static struct cxl_region *to_cxl_region(struct device *dev) +struct cxl_region *to_cxl_region(struct device *dev) { if (dev_WARN_ONCE(dev, dev->type != &cxl_region_type, "not a cxl_region device\n")) @@ -48,6 +399,7 @@ static struct cxl_region *to_cxl_region(struct device *dev) return container_of(dev, struct cxl_region, dev); } +EXPORT_SYMBOL_NS_GPL(to_cxl_region, CXL); static void unregister_region(struct work_struct *work) { @@ -93,6 +445,7 @@ static struct cxl_region *cxl_region_alloc(struct cxl_decoder *cxld) dev->bus = &cxl_bus_type; dev->type = &cxl_region_type; INIT_WORK(&cxlr->detach_work, unregister_region); + mutex_init(&cxlr->remove_lock); return cxlr; } diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 1631b0aeca85..fffe85764aae 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -81,6 +81,19 @@ static inline int cxl_to_interleave_ways(u8 eniw) } } +static inline int cxl_from_ways(u8 ways) +{ + if (is_power_of_2(ways)) + return ilog2(ways); + + /* CXL 2.0 encodes 3, 6, and 12 way interleaves as 8, 9, and 10 */ + return ilog2(ways / 3) + 8; +} + +static inline int cxl_from_granularity(u16 g) +{ + return ilog2(g) - 8; +} + /* CXL 2.0 8.2.8.1 Device Capabilities Array Register */ #define CXLDEV_CAP_ARRAY_OFFSET 0x0 #define CXLDEV_CAP_ARRAY_CAP_ID 0 @@ -250,6 +263,7 @@ struct cxl_decoder { * @skip: The skip count as specified in the CXL specification. * @res_lock: Synchronize device's resource usage * @volatil: Configuration param. Decoder target is non-persistent mem + * @cxlr: Region this decoder belongs to. */ struct cxl_endpoint_decoder { struct cxl_decoder base; @@ -258,6 +272,7 @@ struct cxl_endpoint_decoder { u64 skip; struct mutex res_lock; /* sync access to decoder's resource */ bool volatil; + struct cxl_region *cxlr; }; /** @@ -286,6 +301,7 @@ struct cxl_root_decoder { struct cxl_decoder_targets *targets; int next_region_id; struct mutex id_lock; /* synchronizes access to next_region_id */ + struct list_head regions; }; #define _to_cxl_decoder(x) \ diff --git a/drivers/cxl/region.h b/drivers/cxl/region.h index 6a0118dcdf2f..4bc8e2529890 100644 --- a/drivers/cxl/region.h +++ b/drivers/cxl/region.h @@ -13,6 +13,14 @@ * @id: This region's id. Id is globally unique across all regions. * @flags: Flags representing the current state of the region. * @detach_work: Async unregister to allow attrs to take device_lock. + * @remove_lock: Coordinates region removal against decoder removal + * @list: Node in decoder's region list. + * @res: Resource this region carves out of the platform decode range. + * @size: Size of the region determined from LSA or userspace. + * @uuid: The UUID for this region. + * @interleave_ways: Number of interleave ways this region is configured for. 
+ * @interleave_granularity: Interleave granularity of region + * @targets: The memory devices comprising the region. */ struct cxl_region { struct device dev; @@ -20,8 +28,66 @@ struct cxl_region { unsigned long flags; #define REGION_DEAD 0 struct work_struct detach_work; + struct mutex remove_lock; /* serialize region removal */ + + struct list_head list; + struct resource *res; + + u64 size; + uuid_t uuid; + int interleave_ways; + int interleave_granularity; + struct cxl_endpoint_decoder *targets[CXL_DECODER_MAX_INTERLEAVE]; }; +bool is_cxl_region(struct device *dev); +struct cxl_region *to_cxl_region(struct device *dev); bool schedule_cxl_region_unregister(struct cxl_region *cxlr); +static inline bool cxl_is_interleave_ways_valid(const struct cxl_region *cxlr, + const struct cxl_decoder *rootd, + u8 ways) +{ + int root_ig, region_ig, root_eniw; + + switch (ways) { + case 0 ... 4: + case 6: + case 8: + case 12: + case 16: + break; + default: + return false; + } + + if (rootd->interleave_ways == 1) + return true; + + root_ig = cxl_from_granularity(rootd->interleave_granularity); + region_ig = cxl_from_granularity(cxlr->interleave_granularity); + root_eniw = cxl_from_ways(rootd->interleave_ways); + + return ((1 << (root_ig - region_ig)) * (1 << root_eniw)) <= ways; +} + +static inline bool +cxl_is_interleave_granularity_valid(const struct cxl_decoder *rootd, int ig) +{ + int rootd_hbig; + + if (!is_power_of_2(ig)) + return false; + + /* 16K is the max */ + if (ig >> 15) + return false; + + rootd_hbig = cxl_from_granularity(rootd->interleave_granularity); + if (rootd_hbig < cxl_from_granularity(ig)) + return false; + + return true; +} + #endif
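For a sense of what these validity helpers accept and reject, here is a hypothetical session against an already-created region. The region0/decoder0.0 names are assumed, matching the sketch in the previous patch, and the exact errno comes from the calling store handler:

region=/sys/bus/cxl/devices/decoder0.0/region0

# cxl_is_interleave_granularity_valid(): power-of-2, at most 16K, and no
# finer than the root decoder's own granularity
echo 300 > $region/interleave_granularity     # rejected, not a power of 2
echo 32768 > $region/interleave_granularity   # rejected, above the 16K cap
echo 256 > $region/interleave_granularity     # accepted if <= root granularity

# cxl_is_interleave_ways_valid(): only spec-encodable ways counts
# (1, 2, 3, 4, 6, 8, 12, 16), further bounded by the root decoder's
# own interleave configuration
echo 5 > $region/interleave_ways              # rejected, not encodable
echo 2 > $region/interleave_ways              # accepted for a 1-way root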